Jesse Gross
|
1c36f36c41
llm: Fix array out-of-bounds memory access when tokenizing
|
9 months ago |
Jesse Gross
|
23c7c1326e
llm: Fix lint
|
9 months ago |
Daniel Hiltgen
|
751009a5d7
Runtime selection of new or old runners
|
9 months ago |
Daniel Hiltgen
|
e9dd656ff5
Update sync with latest llama.cpp layout, and run against b3485
|
9 months ago |
jmorganca
|
e1dfc757b3
revert llm changes
|
11 months ago |
jmorganca
|
01ccbc07fe
replace static build in `llm`
|
11 months ago |
Michael Yang
|
b732beba6a
lint
|
9 months ago |
Josh
|
10e768826c
fix: quant err message (#5616)
|
10 months ago |
Daniel Hiltgen
|
b51e3b63ac
Statically link c++ and thread lib
|
10 months ago |
jmorganca
|
a08f20d910
release: remove unwanted mingw dll.a files
|
10 months ago |
jmorganca
|
6cea036027
Revert "llm: only statically link libstdc++"
|
10 months ago |
jmorganca
|
5796bfc401
llm: only statically link libstdc++
|
10 months ago |
jmorganca
|
f1a379aa56
llm: statically link pthread and stdc++ dependencies in windows build
|
10 months ago |
Jeffrey Morgan
|
5304b765b2
llm: put back old include dir (#5507)
|
10 months ago |
Jeffrey Morgan
|
78fb33dd07
fix typo in cgo directives in `llm.go` (#5501)
|
10 months ago |
Jeffrey Morgan
|
8f8e736b13
update llama.cpp submodule to `d7fd29f` (#5475)
|
10 months ago |
Michael Yang
|
829ff87bd1
revert tokenize ffi (#4761)
|
11 months ago |
Jeffrey Morgan
|
763bb65dbb
use `int32_t` for call to tokenize (#4738)
|
11 months ago |
Michael Yang
|
bf54c845e9
vocab only
|
11 months ago |
Michael Yang
|
26a00a0410
use ffi for tokenizing/detokenizing
|
1 year ago |
Michael Yang
|
01811c176a
comments
|
1 year ago |
Michael Yang
|
9685c34509
quantize any fp16/fp32 model
|
1 year ago |
Hernan Martinez
|
86e67fc4a9
Add import declaration for windows,arm64 to llm.go
|
1 year ago |
Michael Yang
|
9502e5661f
cgo quantize
|
1 year ago |
Daniel Hiltgen
|
58d95cc9bd
Switch back to subprocessing for llama.cpp
|
1 year ago |
Michael Yang
|
91b3e4d282
update memory calculations
|
1 year ago |
Michael Yang
|
d338d70492
refactor model parsing
|
1 year ago |
Patrick Devine
|
1b272d5bcd
change `github.com/jmorganca/ollama` to `github.com/ollama/ollama` (#3347)
|
1 year ago |
Jeffrey Morgan
|
f9cd55c70b
disable gpu for certain model architectures and fix divide-by-zero on memory estimation
|
1 year ago |
Daniel Hiltgen
|
6c5ccb11f9
Revamp ROCm support
|
1 year ago |