| Name | Last commit | Last commit message | Committed |
| --- | --- | --- | --- |
| ext_server | fcf4d60eee | llm: add back check for empty token cache | 1 year ago |
| generate | 8a65717f55 | Do not build AVX runners on ARM64 | 1 year ago |
| llama.cpp @ 952d03dbea | e33d5c2dbc | update llama.cpp commit to `952d03d` | 1 year ago |
| patches | 85801317d1 | Fix clip log import | 1 year ago |
| ggla.go | 8b2c10061c | refactor tensor query | 1 year ago |
| ggml.go | 435cc866a3 | fix: mixtral graph | 1 year ago |
| gguf.go | 14476d48cc | fixes for gguf (#3863) | 1 year ago |
| llm.go | 86e67fc4a9 | Add import declaration for windows,arm64 to llm.go | 1 year ago |
| llm_darwin_amd64.go | 58d95cc9bd | Switch back to subprocessing for llama.cpp | 1 year ago |
| llm_darwin_arm64.go | 58d95cc9bd | Switch back to subprocessing for llama.cpp | 1 year ago |
| llm_linux.go | 58d95cc9bd | Switch back to subprocessing for llama.cpp | 1 year ago |
| llm_windows.go | 058f6cd2cc | Move nested payloads to installer and zip file on windows | 1 year ago |
| memory.go | f81f308118 | fix gemma, command-r layer weights | 1 year ago |
| payload.go | 058f6cd2cc | Move nested payloads to installer and zip file on windows | 1 year ago |
| server.go | 7aa08a77ca | llm: dont cap context window limit to training context window (#3988) | 1 year ago |
| status.go | 58d95cc9bd | Switch back to subprocessing for llama.cpp | 1 year ago |