Commit history

Author SHA1 Message Date
  royjhan b9f5e16c80 Introduce `/api/embed` endpoint supporting batch embedding (#5127) 9 months ago
  Jeffrey Morgan ef98803d63 llm: looser checks for minimum memory (#5677) 9 months ago
  Jeffrey Morgan c4cf8ad559 llm: avoid loading model if system memory is too small (#5637) 9 months ago
  Jeffrey Morgan 791650ddef sched: only error when over-allocating system memory (#5626) 9 months ago
  Daniel Hiltgen 22c81f62ec Remove duplicate merge glitch 9 months ago
  Michael Yang 9bbddc37a7 Merge pull request #5126 from ollama/mxyng/messages 9 months ago
  Jeffrey Morgan 53da2c6965 llm: remove ambiguous comment when putting upper limit on predictions to avoid infinite generation (#5535) 10 months ago
  Michael Yang ac7a842e55 fix model reloading 10 months ago
  Daniel Hiltgen ccd7785859 Merge pull request #5243 from dhiltgen/modelfile_use_mmap 10 months ago
  Daniel Hiltgen 0e982bc1f4 Fix corner cases on tmp cleaner on mac 10 months ago
  Josh Yan 33a65e3ba3 error 10 months ago
  Daniel Hiltgen 97c9e11768 Switch use_mmap to a pointer type 10 months ago
  Daniel Hiltgen 3518aaef33 Merge pull request #4218 from dhiltgen/auto_parallel 10 months ago
  Blake Mizerany cb42e607c5 llm: speed up gguf decoding by a lot (#5246) 10 months ago
  Daniel Hiltgen 17b7186cd7 Enable concurrency by default 1 year ago
  Daniel Hiltgen 5bf5aeec01 Refine mmap default logic on linux 10 months ago
  Daniel Hiltgen 96624aa412 Merge pull request #5072 from dhiltgen/windows_path 10 months ago
  Daniel Hiltgen 7784ca33ce Tighten up memory prediction logging 10 months ago
  Daniel Hiltgen 171796791f Adjust mmap logic for cuda windows for faster model load 10 months ago
  Daniel Hiltgen b2799f111b Move libraries out of users path 10 months ago
  Daniel Hiltgen da3bf23354 Workaround gfx900 SDMA bugs 11 months ago
  Daniel Hiltgen 6f351bf586 review comments and coverage 11 months ago
  Daniel Hiltgen fc37c192ae Refine CPU load behavior with system memory visibility 11 months ago
  Daniel Hiltgen 6fd04ca922 Improve multi-gpu handling at the limit 11 months ago
  Craig Hughes b84aea1685 Critical fix from llama.cpp JSON grammar to forbid un-escaped escape characters inside strings, which breaks parsing. (#3782) 10 months ago
  Michael Yang e40145a39d lint 11 months ago
  Michael Yang c895a7d13f some gocritic 11 months ago
  Michael Yang 829ff87bd1 revert tokenize ffi (#4761) 11 months ago
  Jeffrey Morgan a50a87a7b8 partial offloading: allow flash attention and disable mmap (#4734) 11 months ago
  Michael Yang 26a00a0410 use ffi for tokenizing/detokenizing 11 months ago