Parth Sareen
|
b816ff86c9
docs: make context length faq readable (#10006)
|
1 month ago |
molbal
|
e5d84fb90b
docs: add molbal/orca-cli to community integrations (#9909)
|
1 month ago |
Hengky Steen
|
dd66712e31
docs: add ollamb to community projects
|
1 month ago |
Jesse Gross
|
f66216e399
ggml: Support heterogeneous KV cache layer sizes in memory estimation
|
1 month ago |
Jesse Gross
|
f4f0992b6e
llm: Fix debug logging for memory estimates
|
1 month ago |
Jesse Gross
|
1feff61977
kvcache: Sliding window cache only needs a single batch total
|
1 month ago |
copeland3300
|
5e0b904e88
docs: add flags to example linux log output command (#9852)
|
1 month ago |
Matheus C. França
|
131f0355a5
readme: add ollama-d library (#9907)
|
1 month ago |
Blake Mizerany
|
ce929984a3
server/internal/client/ollama: fix file descriptor management in Pull (#9931)
|
1 month ago |
Michael Yang
|
4b34930a31
Merge pull request #9897 from ollama/mxyng/chunk-load
|
1 month ago |
Michael Yang
|
74bd09652d
ml/backend/ggml: load tensors in 32KiB chunks
|
1 month ago |
Bruce MacDonald
|
fb6252d786
benchmark: performance of running ollama server (#8643)
|
1 month ago |
Blake Mizerany
|
c794fef2f2
server/internal/client/ollama: persist through chunk download errors (#9923)
|
1 month ago |
Parth Sareen
|
00ebda8cc4
Revert "parser: remove role validation from Modelfile parser" (#9917)
|
1 month ago |
Parth Sareen
|
d14ce75b95
docs: update final response for /api/chat stream (#9919)
|
1 month ago |
Jesse Gross
|
2d6eac9084
kvcache: Optimize sliding window attention
|
1 month ago |
Jesse Gross
|
3ed7ad3ab3
kvcache: Pass granular cache size into implementations
|
1 month ago |
Patrick Devine
|
6d1103048e
fix: show correct bool value for kv in verbose show information (#9928)
|
1 month ago |
Jesse Gross
|
0ff28758b3
ollamarunner: Provide mechanism for backends to report loading progress
|
1 month ago |
Jesse Gross
|
d3e9ca3eda
kvcache: Account for source tensors in defrag operation count
|
1 month ago |
Jesse Gross
|
0fbfcf3c9c
model: Pass input tensor instead of raw data to models
|
1 month ago |
Jesse Gross
|
0c220935bd
input: Rename Options to Batch
|
1 month ago |
rylativity
|
ffbfe833da
parser: remove role validation from Modelfile parser (#9874)
|
1 month ago |
Parth Sareen
|
42a14f7f63
sample: add error handling for empty logits (#9740)
|
1 month ago |
Patrick Devine
|
f8c3dbe5b5
templates: add autotemplate for gemma3 (#9880)
|
1 month ago |
Jesse Gross
|
b078dd157c
gemma2: Remove second call to Rows
|
1 month ago |
Blake Mizerany
|
2ddacd7516
server/internal/client/ollama: confirm all chunksums were received (#9893)
|
1 month ago |
Jeffrey Morgan
|
da0e345200
ml: use input context for extracting outputs (#9875)
|
1 month ago |
Bruce MacDonald
|
df94175a0f
ggml: return error on failure to read tensor data (#9872)
|
1 month ago |
Bruce MacDonald
|
61a8825216
convert: return name of unsupported architecture (#9862)
|
1 month ago |