Daniel Hiltgen
|
69be940bf6
gpu: Group GPU Library sets by variant (#6483)
|
8 months ago |
Daniel Hiltgen
|
4fe3a556fa
Add cuda v12 variant and selection logic
|
10 months ago |
Daniel Hiltgen
|
fc3b4cda89
Report GPU variant in log
|
10 months ago |
Daniel Hiltgen
|
d470ebe78b
Add Jetson cuda variants for arm
|
11 months ago |
Jeffrey Morgan
|
c4cf8ad559
llm: avoid loading model if system memory is too small (#5637)
|
9 months ago |
Daniel Hiltgen
|
f6f759fc5f
Detect CUDA OS Overhead
|
9 months ago |
Daniel Hiltgen
|
9929751cc8
Disable concurrency for AMD + Windows
|
10 months ago |
Daniel Hiltgen
|
da3bf23354
Workaround gfx900 SDMA bugs
|
11 months ago |
Daniel Hiltgen
|
6f351bf586
review comments and coverage
|
11 months ago |
Daniel Hiltgen
|
4e2b7e181d
Refactor intel gpu discovery
|
11 months ago |
Daniel Hiltgen
|
6fd04ca922
Improve multi-gpu handling at the limit
|
11 months ago |
Daniel Hiltgen
|
43ed358f9a
Refine GPU discovery to bootstrap once
|
11 months ago |
Daniel Hiltgen
|
8727a9c140
Record more GPU information
|
1 year ago |
Daniel Hiltgen
|
34b9db5afc
Request and model concurrency
|
1 year ago |
Michael Yang
|
7e33a017c0
partial offloading
|
1 year ago |
Michael Yang
|
91b3e4d282
update memory calculations
|
1 year ago |
Daniel Hiltgen
|
6d84f07505
Detect AMD GPU info via sysfs and block old cards
|
1 year ago |
Daniel Hiltgen
|
8da7bef05f
Support multiple variants for a given llm lib type
|
1 year ago |
Jeffrey Morgan
|
c336693f07
calculate overhead based number of gpu devices (#1875)
|
1 year ago |
Daniel Hiltgen
|
a2ad952440
Fix windows system memory lookup
|
1 year ago |
Daniel Hiltgen
|
d966b730ac
Switch windows build to fully dynamic
|
1 year ago |
Daniel Hiltgen
|
7555ea44f8
Revamp the dynamic library shim
|
1 year ago |
Daniel Hiltgen
|
35934b2e05
Adapted rocm support to cgo based llama.cpp
|
1 year ago |