Arne Müller
|
ee94693b1a
handling unescaped json marshaling
|
1 year ago |
Michael Yang
|
11d82d7b9b
update checkvram
|
1 year ago |
Michael Yang
|
92189a5855
fix memory check
|
1 year ago |
Michael Yang
|
d790bf9916
Merge pull request #783 from jmorganca/mxyng/fix-gpu-offloading
|
1 year ago |
Michael Yang
|
35afac099a
do not use gpu binary when num_gpu == 0
|
1 year ago |
Michael Yang
|
811c3d1900
no gpu if vram < 2GB
|
1 year ago |
Bruce MacDonald
|
6fe178134d
improve api error handling (#781)
|
1 year ago |
Bruce MacDonald
|
56497663c8
relay model runner error message to client (#720)
|
1 year ago |
Michael Yang
|
b599946b74
add format bytes
|
1 year ago |
Bruce MacDonald
|
77295f716e
prevent waiting on exited command (#752)
|
1 year ago |
Bruce MacDonald
|
f2ba1311aa
improve vram safety with 5% vram memory buffer (#724)
|
1 year ago |
Bruce MacDonald
|
5d22319a2c
rename server subprocess (#700)
|
1 year ago |
Bruce MacDonald
|
9e2de1bd2c
increase streaming buffer size (#692)
|
1 year ago |
Michael Yang
|
c02c0cd483
starcoder
|
1 year ago |
Bruce MacDonald
|
b1f7123301
clean up num_gpu calculation code (#673)
|
1 year ago |
Bruce MacDonald
|
1fbf3585d6
Relay default values to llama runner (#672)
|
1 year ago |
Bruce MacDonald
|
9771b1ec51
windows runner fixes (#637)
|
1 year ago |
Michael Yang
|
f40b3de758
use int64 consistently
|
1 year ago |
Bruce MacDonald
|
86279f4ae3
unbound max num gpu layers (#591)
|
1 year ago |
Bruce MacDonald
|
4cba75efc5
remove tmp directories created by previous servers (#559)
|
1 year ago |
Bruce MacDonald
|
1255bc9b45
only package 11.8 runner
|
1 year ago |
Bruce MacDonald
|
4e8be787c7
pack in cuda libs
|
1 year ago |
Bruce MacDonald
|
66003e1d05
subprocess improvements (#524)
|
1 year ago |
Bruce MacDonald
|
2540c9181c
support for packaging in multiple cuda runners (#509)
|
1 year ago |
Michael Yang
|
7dee25a07f
fix falcon decode
|
1 year ago |
Bruce MacDonald
|
f221637053
first pass at linux gpu support (#454)
|
1 year ago |
Bruce MacDonald
|
09dd2aeff9
GGUF support (#441)
|
1 year ago |
Bruce MacDonald
|
42998d797d
subprocess llama.cpp server (#401)
|
1 year ago |
Quinn Slack
|
f4432e1dba
treat stop as stop sequences, not exact tokens (#442)
|
1 year ago |
Michael Yang
|
5ca05c2e88
fix ModelType()
|
1 year ago |