Commit History

Author SHA1 Message Date
  Michael Yang d790bf9916 Merge pull request #783 from jmorganca/mxyng/fix-gpu-offloading 1 year ago
  Michael Yang 35afac099a do not use gpu binary when num_gpu == 0 1 year ago
  Michael Yang 811c3d1900 no gpu if vram < 2GB 1 year ago
  Bruce MacDonald 6fe178134d improve api error handling (#781) 1 year ago
  Bruce MacDonald 56497663c8 relay model runner error message to client (#720) 1 year ago
  Michael Yang b599946b74 add format bytes 1 year ago
  Bruce MacDonald 77295f716e prevent waiting on exited command (#752) 1 year ago
  Bruce MacDonald f2ba1311aa improve vram safety with 5% vram memory buffer (#724) 1 year ago
  Bruce MacDonald 5d22319a2c rename server subprocess (#700) 1 year ago
  Bruce MacDonald 9e2de1bd2c increase streaming buffer size (#692) 1 year ago
  Michael Yang c02c0cd483 starcoder 1 year ago
  Bruce MacDonald b1f7123301 clean up num_gpu calculation code (#673) 1 year ago
  Bruce MacDonald 1fbf3585d6 Relay default values to llama runner (#672) 1 year ago
  Bruce MacDonald 9771b1ec51 windows runner fixes (#637) 1 year ago
  Michael Yang f40b3de758 use int64 consistently 1 year ago
  Bruce MacDonald 86279f4ae3 unbound max num gpu layers (#591) 1 year ago
  Bruce MacDonald 4cba75efc5 remove tmp directories created by previous servers (#559) 1 year ago
  Bruce MacDonald 1255bc9b45 only package 11.8 runner 1 year ago
  Bruce MacDonald 4e8be787c7 pack in cuda libs 1 year ago
  Bruce MacDonald 66003e1d05 subprocess improvements (#524) 1 year ago
  Bruce MacDonald 2540c9181c support for packaging in multiple cuda runners (#509) 1 year ago
  Michael Yang 7dee25a07f fix falcon decode 1 year ago
  Bruce MacDonald f221637053 first pass at linux gpu support (#454) 1 year ago
  Bruce MacDonald 09dd2aeff9 GGUF support (#441) 1 year ago
  Bruce MacDonald 42998d797d subprocess llama.cpp server (#401) 1 year ago
  Quinn Slack f4432e1dba treat stop as stop sequences, not exact tokens (#442) 1 year ago
  Michael Yang 5ca05c2e88 fix ModelType() 1 year ago
  Michael Yang a894cc792d model and file type as strings 1 year ago
  Bruce MacDonald 4b2d366c37 Update llama.go 1 year ago
  Bruce MacDonald 56fd4e4ef2 log embedding eval timing 1 year ago