Commit History

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Jeffrey Morgan | 1579c4f06d | build: install binutils alongside gcc in Dockerfile (#9475) | 2 months ago |
| Blake Mizerany | 3519dd1c6e | server/internal/client/ollama: hold DiskCache on Registry (#9463) | 2 months ago |
| Jeffrey Morgan | e41c4cbea7 | build: install ccache manually in Dockerfile (#9464) | 2 months ago |
| Blake Mizerany | ee048b76d4 | server/internal/client/ollama: handle extended names in client/ollama (#9454) | 2 months ago |
| Soulter | af68d60a58 | readme: add AstrBot to community integrations (#9442) | 2 months ago |
| Jesse Gross | 21aa666a1e | ml: Enable support for flash attention | 2 months ago |
| Jesse Gross | ee141cc821 | ml: Empty tensor constructor for tensors | 2 months ago |
| Jesse Gross | 55e5776c44 | ggml-backend: Store parent backend as part of tensor | 2 months ago |
| Jesse Gross | 854a9195f3 | attention: Remove unnecessary contiguous operations | 2 months ago |
| Jeffrey Morgan | 96a97adf9b | build: use correct GGML_HIP_NO_VMM compiler definition for ggml-hip (#9451) | 2 months ago |
| Jeffrey Morgan | e75c6126e9 | build: set GGML_CUDA_NO_VMM for ggml-hip target (#9449) | 2 months ago |
| Blake Mizerany | cda6f5c66c | server/internal/internal/names: validate names (#9400) | 2 months ago |
| Bruce MacDonald | bebb6823c0 | server: validate local path on safetensor create (#9379) | 2 months ago |
| Michael Yang | 31e472baa4 | runner: defer context cancel | 2 months ago |
| Michael Yang | 657685e85d | fix: replace deprecated functions | 2 months ago |
| Jeffrey Morgan | a14912858e | build: add compute capability 12.0 to CUDA 12 preset (#9426) | 2 months ago |
| Blake Mizerany | eed11ded30 | server/.../safetensors: fix offsets and include all model parts (#9427) | 2 months ago |
| Michael Yang | b42aba40ed | cuda: enable flash attention | 2 months ago |
| 王贺 | 25885e5335 | docs: Add 1Panel to Community Integrations (#9312) | 2 months ago |
| Jeffrey Morgan | 98d44fa39d | llama: add phi4 mini support (#9403) | 2 months ago |
| Blake Mizerany | 2099e2d267 | CONTRIBUTING: provide clarity on good commit messages, and bad (#9405) | 2 months ago |
| Bruce MacDonald | 0c1041ad85 | runner: default to greedy sampler for performance (#9407) | 2 months ago |
| Parth Sareen | c245b0406f | sample: remove transforms from greedy sampling (#9377) | 2 months ago |
| Michael Yang | 8b194b7520 | kvcache: update tests | 2 months ago |
| Michael Yang | 3e8b8a1933 | ml: update Context.Forward interface | 2 months ago |
| Blake Mizerany | 41dc280491 | server/internal/registry: implement CloseNotify and Flush (for now) (#9402) | 2 months ago |
| Michael Yang | 53d2990d9b | model: add bos token if configured | 2 months ago |
| Jesse Gross | e185c08ad9 | go.mod: Use full version for go 1.24.0 | 2 months ago |
| Blake Mizerany | 2412adf42b | server/internal: replace model delete API with new registry handler. (#9347) | 2 months ago |
| Steven Hartland | be2ac1ed93 | docs: fix api examples link (#9360) | 2 months ago |