Roy Han 0ac5cbc00e separate deprecation changes 10 months ago
ext_server 34f142797a llm: always add bos token to prompt (#4941) 10 months ago
generate ab8c929e20 Add ability to skip oneapi generate 10 months ago
llama.cpp @ 7c26775adb 0ac5cbc00e separate deprecation changes 10 months ago
patches ce0dc33cb8 llm: patch to fix qwen 2 temporarily on nvidia (#4897) 10 months ago
filetype.go d6f692ad1a Add support for IQ1_S, IQ3_S, IQ2_S, IQ4_XS. IQ4_NL (#4322) 11 months ago
ggla.go 171eb040fc simplify safetensors reading 11 months ago
ggml.go 620d5c569e fix parsing big endian gguf 10 months ago
gguf.go 620d5c569e fix parsing big endian gguf 10 months ago
llm.go 829ff87bd1 revert tokenize ffi (#4761) 11 months ago
llm_darwin_amd64.go 58d95cc9bd Switch back to subprocessing for llama.cpp 1 year ago
llm_darwin_arm64.go 58d95cc9bd Switch back to subprocessing for llama.cpp 1 year ago
llm_linux.go 58d95cc9bd Switch back to subprocessing for llama.cpp 1 year ago
llm_windows.go 058f6cd2cc Move nested payloads to installer and zip file on windows 1 year ago
memory.go 6297f85606 gofmt, goimports 11 months ago
payload.go 04f3c12bb7 replace x/exp/slices with slices 11 months ago
server.go b84aea1685 Critical fix from llama.cpp JSON grammar to forbid un-escaped escape characters inside strings, which breaks parsing. (#3782) 10 months ago
status.go 58d95cc9bd Switch back to subprocessing for llama.cpp 1 year ago