Roy Han 80c1a3f812 playing around with truncate stuff 11 months ago
ext_server 80c1a3f812 playing around with truncate stuff 11 months ago
generate b0930626c5 Add back lower level parallel flags 11 months ago
llama.cpp @ 7c26775adb 152fc202f5 llm: update llama.cpp commit to `7c26775` (#4896) 11 months ago
patches 152fc202f5 llm: update llama.cpp commit to `7c26775` (#4896) 11 months ago
filetype.go d6f692ad1a Add support for IQ1_S, IQ3_S, IQ2_S, IQ4_XS. IQ4_NL (#4322) 1 year ago
ggla.go 171eb040fc simplify safetensors reading 1 year ago
ggml.go 6fd04ca922 Improve multi-gpu handling at the limit 11 months ago
gguf.go 7bdcd1da94 Revert "Merge pull request #4938 from ollama/mxyng/fix-byte-order" 11 months ago
llm.go 829ff87bd1 revert tokenize ffi (#4761) 11 months ago
llm_darwin_amd64.go 58d95cc9bd Switch back to subprocessing for llama.cpp 1 year ago
llm_darwin_arm64.go 58d95cc9bd Switch back to subprocessing for llama.cpp 1 year ago
llm_linux.go 58d95cc9bd Switch back to subprocessing for llama.cpp 1 year ago
llm_windows.go 058f6cd2cc Move nested payloads to installer and zip file on windows 1 year ago
memory.go 359b15a597 Handle models with divergent layer sizes 11 months ago
memory_test.go 6f351bf586 review comments and coverage 11 months ago
payload.go 6f351bf586 review comments and coverage 11 months ago
server.go c111d8bb51 normalization 11 months ago
status.go 58d95cc9bd Switch back to subprocessing for llama.cpp 1 year ago