Commit History

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Michael Yang | 7e33a017c0 | partial offloading | 1 year ago |
| Daniel Hiltgen | be330174dd | Allow setting max vram for workarounds | 1 year ago |
| peanut256 | a189810df6 | Determine max VRAM on macOS using `recommendedMaxWorkingSetSize` (#2354) | 1 year ago |
| Daniel Hiltgen | 7427fa1387 | Fix up the CPU fallback selection | 1 year ago |
| Daniel Hiltgen | 39928a42e8 | Always dynamically load the llm server library | 1 year ago |
| Daniel Hiltgen | d88c527be3 | Build multiple CPU variants and pick the best | 1 year ago |
| Jeffrey Morgan | c336693f07 | calculate overhead based number of gpu devices (#1875) | 1 year ago |
| Jeffrey Morgan | 08f1e18965 | Offload layers to GPU based on new model size estimates (#1850) | 1 year ago |
| Jeffrey Morgan | c7ea8f237e | set `num_gpu` to 1 only by default on darwin arm64 (#1771) | 1 year ago |
| Daniel Hiltgen | a2ad952440 | Fix windows system memory lookup | 1 year ago |
| Daniel Hiltgen | d966b730ac | Switch windows build to fully dynamic | 1 year ago |
| Daniel Hiltgen | 7555ea44f8 | Revamp the dynamic library shim | 1 year ago |
| Daniel Hiltgen | 6558f94ed0 | Fix darwin intel build | 1 year ago |
| Daniel Hiltgen | 35934b2e05 | Adapted rocm support to cgo based llama.cpp | 1 year ago |