Michael Yang
|
cf442cd57e
fix typo
|
11 months ago |
Michael Yang
|
ce3b212d12
only forward some env vars
|
11 months ago |
Michael Yang
|
58876091f7
log clean up
|
11 months ago |
Daniel Hiltgen
|
d0425f26cf
Merge pull request #4294 from dhiltgen/harden_subprocess_reaping
|
11 months ago |
Bruce MacDonald
|
cfa84b8470
add done_reason to the api (#4235)
|
11 months ago |
Daniel Hiltgen
|
84ac7ce139
Refine subprocess reaping
|
11 months ago |
Daniel Hiltgen
|
920a4b0794
Merge remote-tracking branch 'upstream/main' into pr3702
|
11 months ago |
Daniel Hiltgen
|
ee49844d09
Merge pull request #4153 from dhiltgen/gpu_verbose_response
|
11 months ago |
Daniel Hiltgen
|
bee2f4a3b0
Record GPU usage information
|
1 year ago |
Daniel Hiltgen
|
72700279e2
Detect noexec and report a better error
|
1 year ago |
Daniel Hiltgen
|
380378cc80
Use our libraries first
|
1 year ago |
Jeffrey Morgan
|
ed740a2504
Fix `no slots available` error with concurrent requests (#4160)
|
1 year ago |
Jeffrey Morgan
|
1b0e6c9c0e
Fix llava models not working after first request (#4164)
|
1 year ago |
Daniel Hiltgen
|
f56aa20014
Centralize server config handling
|
1 year ago |
Mark Ward
|
321d57e1a0
Removing go routine calling .wait from load.
|
1 year ago |
Mark Ward
|
ba26c7aa00
it will always return an error due to Kill() discarding Wait() errors
|
1 year ago |
Mark Ward
|
63c763685f
log when the waiting for the process to stop to help debug when other tasks execute during this wait.
|
1 year ago |
Mark Ward
|
948114e3e3
fix sched to wait for the runner to terminate to ensure following vram check will be more accurate
|
1 year ago |
Jeffrey Morgan
|
7aa08a77ca
llm: dont cap context window limit to training context window (#3988)
|
1 year ago |
Jeffrey Morgan
|
bb31def011
return code `499` when user cancels request while a model is loading (#3955)
|
1 year ago |
Jeffrey Morgan
|
993cf8bf55
llm: limit generation to 10x context size to avoid run on generations (#3918)
|
1 year ago |
Daniel Hiltgen
|
6e76348df7
Merge pull request #3834 from dhiltgen/not_found_in_path
|
1 year ago |
Daniel Hiltgen
|
58888a74bc
Detect and recover if runner removed
|
1 year ago |
Daniel Hiltgen
|
34b9db5afc
Request and model concurrency
|
1 year ago |
Daniel Hiltgen
|
8711d03df7
Report errors on server lookup instead of path lookup failure
|
1 year ago |
Daniel Hiltgen
|
aa72281eae
Trim spaces and quotes from llm lib override
|
1 year ago |
ManniX-ITA
|
c496967e56
Merge branch 'ollama:main' into mannix-server
|
1 year ago |
Michael Yang
|
3cf483fe48
add stablelm graph calculation
|
1 year ago |
Michael Yang
|
a8b9b930b4
account for all non-repeating layers
|
1 year ago |
ManniX-ITA
|
bd54b08261
Streamlined WaitUntilRunning
|
1 year ago |