Jeffrey Morgan
|
1579c4f06d
build: install binutils alongside gcc in Dockerfile (#9475)
|
2 months ago |
Blake Mizerany
|
3519dd1c6e
server/internal/client/ollama: hold DiskCache on Registry (#9463)
|
2 months ago |
Jeffrey Morgan
|
e41c4cbea7
build: install ccache manually in Dockerfile (#9464)
|
2 months ago |
Blake Mizerany
|
ee048b76d4
server/internal/client/ollama: handle extended names in client/ollama (#9454)
|
2 months ago |
Soulter
|
af68d60a58
readme: add AstrBot to community integrations (#9442)
|
2 months ago |
Jesse Gross
|
21aa666a1e
ml: Enable support for flash attention
|
2 months ago |
Jesse Gross
|
ee141cc821
ml: Empty tensor constructor for tensors
|
2 months ago |
Jesse Gross
|
55e5776c44
ggml-backend: Store parent backend as part of tensor
|
2 months ago |
Jesse Gross
|
854a9195f3
attention: Remove unnecessary contiguous operations
|
2 months ago |
Jeffrey Morgan
|
96a97adf9b
build: use correct GGML_HIP_NO_VMM compiler definition for ggml-hip (#9451)
|
2 months ago |
Jeffrey Morgan
|
e75c6126e9
build: set GGML_CUDA_NO_VMM for ggml-hip target (#9449)
|
2 months ago |
Blake Mizerany
|
cda6f5c66c
server/internal/internal/names: validate names (#9400)
|
2 months ago |
Bruce MacDonald
|
bebb6823c0
server: validate local path on safetensor create (#9379)
|
2 months ago |
Michael Yang
|
31e472baa4
runner: defer context cancel
|
2 months ago |
Michael Yang
|
657685e85d
fix: replace deprecated functions
|
2 months ago |
Jeffrey Morgan
|
a14912858e
build: add compute capability 12.0 to CUDA 12 preset (#9426)
|
2 months ago |
Blake Mizerany
|
eed11ded30
server/.../safetensors: fix offsets and include all model parts (#9427)
|
2 months ago |
Michael Yang
|
b42aba40ed
cuda: enable flash attention
|
2 months ago |
王贺
|
25885e5335
docs: Add 1Panel to Community Integrations (#9312)
|
2 months ago |
Jeffrey Morgan
|
98d44fa39d
llama: add phi4 mini support (#9403)
|
2 months ago |
Blake Mizerany
|
2099e2d267
CONTRIBUTING: provide clarity on good commit messages, and bad (#9405)
|
2 months ago |
Bruce MacDonald
|
0c1041ad85
runner: default to greedy sampler for performance (#9407)
|
2 months ago |
Parth Sareen
|
c245b0406f
sample: remove transforms from greedy sampling (#9377)
|
2 months ago |
Michael Yang
|
8b194b7520
kvcache: update tests
|
2 months ago |
Michael Yang
|
3e8b8a1933
ml: update Context.Forward interface
|
2 months ago |
Blake Mizerany
|
41dc280491
server/internal/registry: implement CloseNotify and Flush (for now) (#9402)
|
2 months ago |
Michael Yang
|
53d2990d9b
model: add bos token if configured
|
2 months ago |
Jesse Gross
|
e185c08ad9
go.mod: Use full version for go 1.24.0
|
2 months ago |
Blake Mizerany
|
2412adf42b
server/internal: replace model delete API with new registry handler. (#9347)
|
2 months ago |
Steven Hartland
|
be2ac1ed93
docs: fix api examples link (#9360)
|
2 months ago |