OpenSource/ollama

Author	SHA1	Message	Date
Jesse Gross	1feff61977	kvcache: Sliding window cache only needs a single batch total	1 month ago
Jesse Gross	2d6eac9084	kvcache: Optimize sliding window attention	1 month ago
Jesse Gross	3ed7ad3ab3	kvcache: Pass granular cache size into implementations	1 month ago
Jesse Gross	d3e9ca3eda	kvcache: Account for source tensors in defrag operation count	1 month ago
Jesse Gross	0c220935bd	input: Rename Options to Batch	1 month ago
Jesse Gross	a8e83a7654	Disable causal attention based on batch index	1 month ago
Michael Yang	e95278932b	use non-causal mask only for image positions	1 month ago
Jesse Gross	a1cda80bcb	model: Update encoder cache to use multimodal input processing handler	1 month ago
Jesse Gross	f52b2615ef	kvcache: Set context for shift offsets	1 month ago
Jesse Gross	6da8b6a879	kvcache: Support non-causal attention	1 month ago
Michael Yang	7bae7fa5ce	ml/backend/ggml: create tensor on specific backend	2 months ago
Michael Yang	764e199d67	kvcache: create cache ctx per layer	2 months ago
Jesse Gross	21aa666a1e	ml: Enable support for flash attention	2 months ago
Jesse Gross	854a9195f3	attention: Remove unnecessary contiguous operations	2 months ago
Michael Yang	3e8b8a1933	ml: update Context.Forward interface	2 months ago
Jesse Gross	ed443a0393	Runner for Ollama engine	4 months ago