Jeffrey Morgan
|
96a97adf9b
build: use correct GGML_HIP_NO_VMM compiler definition for ggml-hip (#9451)
|
2 months ago |
Jeffrey Morgan
|
e75c6126e9
build: set GGML_CUDA_NO_VMM for ggml-hip target (#9449)
|
2 months ago |
Blake Mizerany
|
cda6f5c66c
server/internal/internal/names: validate names (#9400)
|
2 months ago |
Bruce MacDonald
|
bebb6823c0
server: validate local path on safetensor create (#9379)
|
2 months ago |
Michael Yang
|
31e472baa4
runner: defer context cancel
|
2 months ago |
Michael Yang
|
657685e85d
fix: replace deprecated functions
|
2 months ago |
Jeffrey Morgan
|
a14912858e
build: add compute capability 12.0 to CUDA 12 preset (#9426)
|
2 months ago |
Blake Mizerany
|
eed11ded30
server/.../safetensors: fix offsets and include all model parts (#9427)
|
2 months ago |
Michael Yang
|
b42aba40ed
cuda: enable flash attention
|
2 months ago |
王贺
|
25885e5335
docs: Add 1Panel to Community Integrations (#9312)
|
2 months ago |
Jeffrey Morgan
|
98d44fa39d
llama: add phi4 mini support (#9403)
|
2 months ago |
Blake Mizerany
|
2099e2d267
CONTRIBUTING: provide clarity on good commit messages, and bad (#9405)
|
2 months ago |
Bruce MacDonald
|
0c1041ad85
runner: default to greedy sampler for performance (#9407)
|
2 months ago |
Parth Sareen
|
c245b0406f
sample: remove transforms from greedy sampling (#9377)
|
2 months ago |
Michael Yang
|
8b194b7520
kvcache: update tests
|
2 months ago |
Michael Yang
|
3e8b8a1933
ml: update Context.Forward interface
|
2 months ago |
Blake Mizerany
|
41dc280491
server/internal/registry: implement CloseNotify and Flush (for now) (#9402)
|
2 months ago |
Michael Yang
|
53d2990d9b
model: add bos token if configured
|
2 months ago |
Jesse Gross
|
e185c08ad9
go.mod: Use full version for go 1.24.0
|
2 months ago |
Blake Mizerany
|
2412adf42b
server/internal: replace model delete API with new registry handler. (#9347)
|
2 months ago |
Steven Hartland
|
be2ac1ed93
docs: fix api examples link (#9360)
|
2 months ago |
Eries Trisnadi
|
dc13813a03
server: allow vscode-file origins (#9313)
|
2 months ago |
Michael Yang
|
d6af13efed
runner: simplify tensor split parsing
|
2 months ago |
Michael Yang
|
a59f665235
ml/backend/ggml: fix debug logging
|
2 months ago |
Daniel Hiltgen
|
688925aca9
Windows ARM build (#9120)
|
2 months ago |
Blake Mizerany
|
76e903cf9d
.github/workflows: swap order of go test and golangci-lint (#9389)
|
2 months ago |
Jeffrey Morgan
|
a5272130c4
ml/backend/ggml: follow on fixes after updating vendored code (#9388)
|
2 months ago |
Jeffrey Morgan
|
d7d7e99662
llama: update llama.cpp vendor code to commit d7cfe1ff (#9356)
|
2 months ago |
Gordon Kamer
|
2db96c18e7
readme: add Nichey to community integrations (#9370)
|
2 months ago |
Daniel Hiltgen
|
e12af460ed
Add cuda Blackwell architecture for v12 (#9350)
|
2 months ago |