Commit History

Author SHA1 Message Date
  Jesse Gross 4100ed7bdd ml: Add support for quantized KV cache 2 months ago
  Jesse Gross 25f9b152f9 ggml-backend: Ensure allocation meet backend requirements 2 months ago
  Jesse Gross 98272fbd58 additional review comments 2 months ago
  Michael Yang b27e8f3f10 ml/backend/ggml: use backend buffer type 2 months ago
  Michael Yang 45df786f09 comments 2 months ago
  Michael Yang daaf42e4a4 ml/backend/ggml: clean up 2 months ago
  Michael Yang 2dc60d4620 ml/backend/ggml: offload vision to cpu 2 months ago
  Michael Yang b5312f30e8 ml/backend/ggml: handle tensor split 2 months ago
  Michael Yang 26c2e0bd35 ml/backend/ggml: handle user specified cpu offloading 2 months ago
  Michael Yang bf920883d5 ml/backend/ggml: set cpu n_threads 2 months ago
  Michael Yang 7bae7fa5ce ml/backend/ggml: create tensor on specific backend 2 months ago
  Michael Yang 764e199d67 kvcache: create cache ctx per layer 2 months ago
  Michael Yang bfce55db3d model: load non-repeated tensors into multiple backends 2 months ago
  Michael Yang bab6f34dc0 ml/backend/ggml: update model loading for hybrid/multi backends 2 months ago
  Michael Yang 05a01fdecb ml/backend/ggml: consolidate system info logging 2 months ago
  Jesse Gross 21aa666a1e ml: Enable support for flash attention 2 months ago
  Jesse Gross ee141cc821 ml: Empty tensor constructor for tensors 2 months ago
  Jesse Gross 55e5776c44 ggml-backend: Store parent backend as part of tensor 2 months ago
  Jesse Gross 854a9195f3 attention: Remove unnecessary contiguous operations 2 months ago
  Michael Yang 3e8b8a1933 ml: update Context.Forward interface 2 months ago
  Jesse Gross f53f4198c3 ml: Abstract attention out of model definitions 3 months ago
  Michael Yang 2192a28eed ml/backend/ggml: fix rms norm 2 months ago
  Jesse Gross e5bcc51ae1 ggml-backend: Don't recreate the scheduler for each context 2 months ago
  Jesse Gross bd6a7d5e64 ollamarunner: Pass runner performance parameters to backends 2 months ago
  Daniel Hiltgen df2680b4b9 Wire up system info log for new engine (#9123) 3 months ago
  Jesse Gross ed443a0393 Runner for Ollama engine 4 months ago
  Jesse Gross d223f3b697 ggml-backend: Close on nil should be a no-op 3 months ago
  Jesse Gross 60830695c2 ggml-backend: Ensure data is available after async computation 3 months ago
  Jesse Gross 01d9a46854 ggml-backend: Let GGML allocate context memory 3 months ago
  Jesse Gross d773b7d671 backend: API to support full precision matmul 3 months ago