Jeffrey Morgan
|
f11bf0740b
use `llm.ImageData`
|
1 year ago |
Michael Yang
|
8450bf66e6
trim images
|
1 year ago |
Michael Yang
|
4a33cede20
remove unused fields and functions
|
1 year ago |
Jeffrey Morgan
|
08f1e18965
Offload layers to GPU based on new model size estimates (#1850)
|
1 year ago |
Bruce MacDonald
|
0b3118e0af
fix: relay request opts to loaded llm prediction (#1761)
|
1 year ago |
Daniel Hiltgen
|
d966b730ac
Switch windows build to fully dynamic
|
1 year ago |
Daniel Hiltgen
|
7555ea44f8
Revamp the dynamic library shim
|
1 year ago |
Daniel Hiltgen
|
54dbfa4c4a
Carry ggml-metal.metal as payload
|
1 year ago |
Daniel Hiltgen
|
35934b2e05
Adapted rocm support to cgo based llama.cpp
|
1 year ago |
Daniel Hiltgen
|
d4cd695759
Add cgo implementation for llama.cpp
|
1 year ago |
Bruce MacDonald
|
811b1f03c8
deprecate ggml
|
1 year ago |
Bruce MacDonald
|
6ee8c80199
restore model load duration on generate response (#1524)
|
1 year ago |
Bruce MacDonald
|
3144e2a439
exponential back-off (#1484)
|
1 year ago |
Bruce MacDonald
|
c0960e29b5
retry on concurrent request failure (#1483)
|
1 year ago |
Patrick Devine
|
910e9401d0
Multimodal support (#1216)
|
1 year ago |
Jeffrey Morgan
|
fa2f095bd9
fix model name returned by `/api/generate` being different than the model name provided
|
1 year ago |
Jeffrey Morgan
|
2dd040d04c
do not use `--parallel 2` for old runners
|
1 year ago |
Bruce MacDonald
|
bbe41ce41a
fix: parallel queueing race condition caused silent failure (#1445)
|
1 year ago |
Michael Yang
|
b9495ea162
load projectors
|
1 year ago |
Bruce MacDonald
|
195e3d9dbd
chat api endpoint (#1392)
|
1 year ago |
Jeffrey Morgan
|
00d06619a1
Revert "chat api (#991)" while context variable is fixed
|
1 year ago |
Bruce MacDonald
|
7a0899d62d
chat api (#991)
|
1 year ago |
Jing Zhang
|
82b9b329ff
windows CUDA support (#1262)
|
1 year ago |
Jeffrey Morgan
|
a3fcecf943
only set `main_gpu` if value > 0 is provided
|
1 year ago |
Purinda Gunasekara
|
be61a81758
main-gpu argument is not getting passed to llamacpp, fixed. (#1192)
|
1 year ago |
Jeffrey Morgan
|
36a3bbf65f
Update llm/llama.go
|
1 year ago |
Bruce MacDonald
|
43a726149d
fix potentially inaccurate error message
|
1 year ago |
Jeffrey Morgan
|
41434a7cdc
build intel mac with correct binary and compile flags
|
1 year ago |
Jeffrey Morgan
|
5cba29b9d6
JSON mode: add `"format" as an api parameter (#1051)
|
1 year ago |
Bruce MacDonald
|
1ae84bc2a2
skip gpu if less than 2GB VRAM are available (#1059)
|
1 year ago |