Commit History

| Author | SHA | Message | Date |
|---|---|---|---|
| Daniel Hiltgen | fb9cdfa723 | Fix server.cpp for the new cuda build macros | 11 months ago |
| Jeffrey Morgan | ead259d877 | llm: fix seed value not being applied to requests (#4986) | 10 months ago |
| Jeffrey Morgan | 34f142797a | llm: always add bos token to prompt (#4941) | 10 months ago |
| Michael Yang | 829ff87bd1 | revert tokenize ffi (#4761) | 11 months ago |
| Michael Yang | de781b37c8 | rm unused infill | 11 months ago |
| Michael Yang | 3e21799377 | rm unused system prompt | 11 months ago |
| Michael Yang | 26a00a0410 | use ffi for tokenizing/detokenizing | 11 months ago |
| Michael Yang | 714adb8bd1 | bump (#4597) | 11 months ago |
| Daniel Hiltgen | b37b496a12 | Wire up load progress | 11 months ago |
| Sam | e15307fdf4 | feat: add support for flash_attn (#4120) | 11 months ago |
| Michael Yang | 58876091f7 | log clean up | 11 months ago |
| Daniel Hiltgen | 920a4b0794 | Merge remote-tracking branch 'upstream/main' into pr3702 | 1 year ago |
| Michael Yang | 44869c59d6 | omit prompt and generate settings from final response | 1 year ago |
| jmorganca | fcf4d60eee | llm: add back check for empty token cache | 1 year ago |
| Jeffrey Morgan | 18d9a7e1f1 | update llama.cpp submodule to `f364eb6` (#4060) | 1 year ago |
| Daniel Hiltgen | 23d23409a0 | Update llama.cpp (#4036) | 1 year ago |
| ManniX-ITA | c942e4a07b | Fixed startup sequence to report model loading | 1 year ago |
| Jeffrey Morgan | 7c9792a6e0 | Support unicode characters in model path (#3681) | 1 year ago |
| Daniel Hiltgen | 0a0e9f3e0f | Apply 01-cache.diff | 1 year ago |
| Daniel Hiltgen | 58d95cc9bd | Switch back to subprocessing for llama.cpp | 1 year ago |
| Jeffrey Morgan | f5ca7f8c8e | add license in file header for vendored llama.cpp code (#3351) | 1 year ago |
| Daniel Hiltgen | 43799532c1 | Bump llama.cpp to b2474 | 1 year ago |
| Jeffrey Morgan | e95ffc7448 | llama: remove server static assets (#3174) | 1 year ago |
| Daniel Hiltgen | 85129d3a32 | Adapt our build for imported server.cpp | 1 year ago |
| Daniel Hiltgen | 9ac6440da3 | Import server.cpp as of b2356 | 1 year ago |