Michael Yang
|
58245413f4
next ollama runner (#7913)
|
2 months ago |
Stefan Weil
|
abfdc4710f
all: fix typos in documentation, code, and comments (#7021)
|
4 months ago |
Daniel Hiltgen
|
05cd82ef94
Rename gpu package discover (#7143)
|
6 months ago |
Daniel Hiltgen
|
d632e23fba
Add Windows arm64 support to official builds (#5712)
|
7 months ago |
Patrick Devine
|
abed273de3
add "stop" command (#6739)
|
7 months ago |
Michael Yang
|
77903ab8b4
llama3.1
|
9 months ago |
Jeffrey Morgan
|
15c2d8fe14
server: parallelize embeddings in API web handler instead of in subprocess runner (#6220)
|
8 months ago |
Michael Yang
|
b732beba6a
lint
|
9 months ago |
Michael Yang
|
df993fa37b
comments
|
9 months ago |
Michael Yang
|
5e9db9fb0b
refactor convert
|
11 months ago |
Michael Yang
|
5c1912769e
Merge pull request #5473 from ollama/mxyng/environ
|
9 months ago |
royjhan
|
1b44d873e7
Add Metrics to `api\embed` response (#5709)
|
9 months ago |
Daniel Hiltgen
|
345420998e
Prevent partial loading on mixed GPU brands
|
9 months ago |
Michael Yang
|
0f1910129f
int
|
10 months ago |
Jeffrey Morgan
|
80ee9b5e47
Remove out of space test temporarily (#5825)
|
9 months ago |
Daniel Hiltgen
|
06e5d74e34
Merge pull request #5506 from dhiltgen/sched_tests
|
9 months ago |
royjhan
|
b9f5e16c80
Introduce `/api/embed` endpoint supporting batch embedding (#5127)
|
9 months ago |
Daniel Hiltgen
|
f4408219e9
Refine scheduler unit tests for reliability
|
10 months ago |
Daniel Hiltgen
|
af28b94533
Merge pull request #5469 from dhiltgen/prevent_system_oom
|
10 months ago |
Daniel Hiltgen
|
955f2a4e03
Only set default keep_alive on initial model load
|
10 months ago |
Daniel Hiltgen
|
3c75113e37
Prevent loading models larger than total memory
|
10 months ago |
Daniel Hiltgen
|
3518aaef33
Merge pull request #4218 from dhiltgen/auto_parallel
|
10 months ago |
Blake Mizerany
|
cb42e607c5
llm: speed up gguf decoding by a lot (#5246)
|
10 months ago |
Daniel Hiltgen
|
17b7186cd7
Enable concurrency by default
|
1 year ago |
Daniel Hiltgen
|
45cacbaf05
Merge pull request #4517 from dhiltgen/gpu_incremental
|
10 months ago |
Daniel Hiltgen
|
6f351bf586
review comments and coverage
|
11 months ago |
Daniel Hiltgen
|
fc37c192ae
Refine CPU load behavior with system memory visibility
|
11 months ago |
Daniel Hiltgen
|
6fd04ca922
Improve multi-gpu handling at the limit
|
11 months ago |
Jeffrey Morgan
|
dd7c9ebeaf
server: longer timeout in `TestRequests` (#5046)
|
10 months ago |
Michael Yang
|
e40145a39d
lint
|
11 months ago |