Jesse Gross
|
f66216e399
ggml: Support heterogeneous KV cache layer sizes in memory estimation
|
1 month ago |
Parth Sareen
|
314573bfe8
config: allow setting context length through env var (#8938)
|
2 months ago |
Michael Yang
|
58245413f4
next ollama runner (#7913)
|
2 months ago |
Stefan Weil
|
abfdc4710f
all: fix typos in documentation, code, and comments (#7021)
|
4 months ago |
Sam
|
1bdab9fdb1
llm: introduce k/v context quantization (vRAM improvements) (#6279)
|
5 months ago |
Daniel Hiltgen
|
05cd82ef94
Rename gpu package discover (#7143)
|
6 months ago |
Michael Yang
|
77903ab8b4
llama3.1
|
9 months ago |
Michael Yang
|
b732beba6a
lint
|
9 months ago |
Michael Yang
|
df993fa37b
comments
|
9 months ago |
Michael Yang
|
5e9db9fb0b
refactor convert
|
11 months ago |
Michael Yang
|
35b89b2eab
rfc: dynamic environ lookup
|
10 months ago |
Blake Mizerany
|
cb42e607c5
llm: speed up gguf decoding by a lot (#5246)
|
10 months ago |
Daniel Hiltgen
|
6f351bf586
review comments and coverage
|
11 months ago |
Daniel Hiltgen
|
6fd04ca922
Improve multi-gpu handling at the limit
|
11 months ago |