Commit History

Author SHA1 Message Date
  Daniel Hiltgen a1dfab43b9 Ensure the libraries are present 1 year ago
  Jeffrey Morgan 4458efb73a Load all layers on `arm64` macOS if model is small enough (#2149) 1 year ago
  Daniel Hiltgen fedd705aea Mechanical switch from log to slog 1 year ago
  Michael Yang eaed6f8c45 add max context length check 1 year ago
  Daniel Hiltgen 7427fa1387 Fix up the CPU fallback selection 1 year ago
  Daniel Hiltgen de2fbdec99 Merge pull request #1819 from dhiltgen/multi_variant 1 year ago
  Michael Yang f4f939de28 Merge pull request #1552 from jmorganca/mxyng/lint-test 1 year ago
  Daniel Hiltgen 39928a42e8 Always dynamically load the llm server library 1 year ago
  Daniel Hiltgen d88c527be3 Build multiple CPU variants and pick the best 1 year ago
  Jeffrey Morgan ab6be852c7 revisit memory allocation to account for full kv cache on main gpu 1 year ago
  Daniel Hiltgen 8da7bef05f Support multiple variants for a given llm lib type 1 year ago
  Jeffrey Morgan b24e8d17b2 Increase minimum CUDA memory allocation overhead and fix minimum overhead for multi-gpu (#1896) 1 year ago
  Michael Yang f921e2696e typo 1 year ago
  Jeffrey Morgan f387e9631b use runner if cuda alloc won't fit 1 year ago
  Jeffrey Morgan cb534e6ac2 use 10% vram overhead for cuda 1 year ago
  Jeffrey Morgan 58ce2d8273 better estimate scratch buffer size 1 year ago
  Jeffrey Morgan 08f1e18965 Offload layers to GPU based on new model size estimates (#1850) 1 year ago
  Daniel Hiltgen e9ce91e9a6 Load dynamic cpu lib on windows 1 year ago
  Jeffrey Morgan c0285158a9 tweak memory requirements error text 1 year ago
  Jeffrey Morgan 77a66df72c add macOS memory check for 47B models 1 year ago
  Jeffrey Morgan 5b4837f881 remove unused filetype check 1 year ago
  Daniel Hiltgen 7555ea44f8 Revamp the dynamic library shim 1 year ago
  Daniel Hiltgen 3269535a4c Refine handling of shim presence 1 year ago
  Daniel Hiltgen 35934b2e05 Adapted rocm support to cgo based llama.cpp 1 year ago
  Daniel Hiltgen d4cd695759 Add cgo implementation for llama.cpp 1 year ago
  Bruce MacDonald 811b1f03c8 deprecate ggml 1 year ago
  Michael Yang b9495ea162 load projectors 1 year ago
  Bruce MacDonald 195e3d9dbd chat api endpoint (#1392) 1 year ago
  Jeffrey Morgan 00d06619a1 Revert "chat api (#991)" while context variable is fixed 1 year ago
  Bruce MacDonald 7a0899d62d chat api (#991) 1 year ago