Jesse Gross
|
f66216e399
ggml: Support heterogeneous KV cache layer sizes in memory estimation
|
1 month ago |
frob
|
7c168b08c9
server: add missing function parens to debug log (#9255)
|
2 months ago |
Michael Yang
|
58245413f4
next ollama runner (#7913)
|
2 months ago |
Stefan Weil
|
abfdc4710f
all: fix typos in documentation, code, and comments (#7021)
|
4 months ago |
Jesse Gross
|
6cd566872b
sched: Lift parallel restriction for multimodal models except mllama
|
6 months ago |
Daniel Hiltgen
|
05cd82ef94
Rename gpu package discover (#7143)
|
6 months ago |
Patrick Devine
|
abed273de3
add "stop" command (#6739)
|
7 months ago |
Daniel Hiltgen
|
90ca84172c
Fix embeddings memory corruption (#6467)
|
8 months ago |
Richard Lyons
|
885cf45087
Fix white space.
|
8 months ago |
Richard Lyons
|
9352eeb752
Reset NumCtx.
|
8 months ago |
Richard Lyons
|
0ad0e738cd
Override numParallel only if unset.
|
8 months ago |
Michael Yang
|
2697d7f5aa
lint
|
8 months ago |
Michael Yang
|
b732beba6a
lint
|
9 months ago |
Michael Yang
|
5c1912769e
Merge pull request #5473 from ollama/mxyng/environ
|
9 months ago |
Daniel Hiltgen
|
345420998e
Prevent partial loading on mixed GPU brands
|
9 months ago |
Michael Yang
|
85d9d73a72
comments
|
9 months ago |
Michael Yang
|
0f1910129f
int
|
10 months ago |
Michael Yang
|
8570c1c0ef
keepalive
|
10 months ago |
Michael Yang
|
55cd3ddcca
bool
|
10 months ago |
Jeffrey Morgan
|
791650ddef
sched: only error when over-allocating system memory (#5626)
|
9 months ago |
Jeffrey Morgan
|
e4ff73297d
server: fix model reloads when setting `OLLAMA_NUM_PARALLEL` (#5560)
|
9 months ago |
Jeffrey Morgan
|
0ee87615c7
sched: don't error if paging to disk on Windows and macOS (#5523)
|
9 months ago |
Daniel Hiltgen
|
af28b94533
Merge pull request #5469 from dhiltgen/prevent_system_oom
|
10 months ago |
Daniel Hiltgen
|
955f2a4e03
Only set default keep_alive on initial model load
|
10 months ago |
Daniel Hiltgen
|
3c75113e37
Prevent loading models larger than total memory
|
10 months ago |
Daniel Hiltgen
|
cff3f44f4a
Fix case for NumCtx
|
10 months ago |
Daniel Hiltgen
|
3518aaef33
Merge pull request #4218 from dhiltgen/auto_parallel
|
10 months ago |
Blake Mizerany
|
cb42e607c5
llm: speed up gguf decoding by a lot (#5246)
|
10 months ago |
Daniel Hiltgen
|
9929751cc8
Disable concurrency for AMD + Windows
|
10 months ago |
Daniel Hiltgen
|
17b7186cd7
Enable concurrency by default
|
1 year ago |