Latest commit: 34f142797a "llm: always add bos token to prompt" (#4941) by Jeffrey Morgan, 10 months ago

Name                     Commit      Last commit message                                            Age
ext_server/              34f142797a  llm: always add bos token to prompt (#4941)                    10 months ago
generate/                ab8c929e20  Add ability to skip oneapi generate                            10 months ago
llama.cpp @ 5921b8f089   22f5c12ced  Update llama.cpp submodule to `5921b8f0` (#4731)               11 months ago
patches/                 ce0dc33cb8  llm: patch to fix qwen 2 temporarily on nvidia (#4897)         10 months ago
filetype.go              d6f692ad1a  Add support for IQ1_S, IQ3_S, IQ2_S, IQ4_XS. IQ4_NL (#4322)    11 months ago
ggla.go                  171eb040fc  simplify safetensors reading                                   11 months ago
ggml.go                  9b6c2e6eb6  detect chat template from KV                                   10 months ago
gguf.go                  030e765e76  fix create model when template detection errors                10 months ago
llm.go                   829ff87bd1  revert tokenize ffi (#4761)                                    11 months ago
llm_darwin_amd64.go      58d95cc9bd  Switch back to subprocessing for llama.cpp                     1 year ago
llm_darwin_arm64.go      58d95cc9bd  Switch back to subprocessing for llama.cpp                     1 year ago
llm_linux.go             58d95cc9bd  Switch back to subprocessing for llama.cpp                     1 year ago
llm_windows.go           058f6cd2cc  Move nested payloads to installer and zip file on windows      1 year ago
memory.go                6297f85606  gofmt, goimports                                               11 months ago
payload.go               04f3c12bb7  replace x/exp/slices with slices                               11 months ago
server.go                e40145a39d  lint                                                           11 months ago
status.go                58d95cc9bd  Switch back to subprocessing for llama.cpp                     1 year ago