Commit History

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Patrick Devine | b73a512f24 | fix the cpu estimatedTotal memory + get the expiry time for loading models | 11 months ago |
| Daniel Hiltgen | 853ae490e1 | Sanitize the env var debug log | 11 months ago |
| Patrick Devine | 6845988807 | Ollama `ps` command for showing currently loaded models (#4327) | 11 months ago |
| jmorganca | 92ca2cca95 | Revert "only forward some env vars" | 1 year ago |
| Daniel Hiltgen | c4014e73a2 | Fall back to CPU runner with zero layers | 1 year ago |
| Jeffrey Morgan | bb6fd02298 | Don't clamp ctx size in `PredictServerFit` (#4317) | 1 year ago |
| Michael Yang | cf442cd57e | fix typo | 1 year ago |
| Michael Yang | ce3b212d12 | only forward some env vars | 1 year ago |
| Michael Yang | 58876091f7 | log clean up | 1 year ago |
| Daniel Hiltgen | d0425f26cf | Merge pull request #4294 from dhiltgen/harden_subprocess_reaping | 1 year ago |
| Bruce MacDonald | cfa84b8470 | add done_reason to the api (#4235) | 1 year ago |
| Daniel Hiltgen | 84ac7ce139 | Refine subprocess reaping | 1 year ago |
| Daniel Hiltgen | 920a4b0794 | Merge remote-tracking branch 'upstream/main' into pr3702 | 1 year ago |
| Daniel Hiltgen | ee49844d09 | Merge pull request #4153 from dhiltgen/gpu_verbose_response | 1 year ago |
| Daniel Hiltgen | bee2f4a3b0 | Record GPU usage information | 1 year ago |
| Daniel Hiltgen | 72700279e2 | Detect noexec and report a better error | 1 year ago |
| Daniel Hiltgen | 380378cc80 | Use our libraries first | 1 year ago |
| Jeffrey Morgan | ed740a2504 | Fix `no slots available` error with concurrent requests (#4160) | 1 year ago |
| Jeffrey Morgan | 1b0e6c9c0e | Fix llava models not working after first request (#4164) | 1 year ago |
| Daniel Hiltgen | f56aa20014 | Centralize server config handling | 1 year ago |
| Mark Ward | 321d57e1a0 | Removing go routine calling .wait from load. | 1 year ago |
| Mark Ward | ba26c7aa00 | it will always return an error due to Kill() discarding Wait() errors | 1 year ago |
| Mark Ward | 63c763685f | log when the waiting for the process to stop to help debug when other tasks execute during this wait. | 1 year ago |
| Mark Ward | 948114e3e3 | fix sched to wait for the runner to terminate to ensure following vram check will be more accurate | 1 year ago |
| Jeffrey Morgan | 7aa08a77ca | llm: dont cap context window limit to training context window (#3988) | 1 year ago |
| Jeffrey Morgan | bb31def011 | return code `499` when user cancels request while a model is loading (#3955) | 1 year ago |
| Jeffrey Morgan | 993cf8bf55 | llm: limit generation to 10x context size to avoid run on generations (#3918) | 1 year ago |
| Daniel Hiltgen | 6e76348df7 | Merge pull request #3834 from dhiltgen/not_found_in_path | 1 year ago |
| Daniel Hiltgen | 58888a74bc | Detect and recover if runner removed | 1 year ago |
| Daniel Hiltgen | 34b9db5afc | Request and model concurrency | 1 year ago |