An easy, fast runtime for large language models, powered by llama.cpp.
Note: this project is a work in progress. Certain models that can be run with ollama are intended for research and/or non-commercial use only.
Using pip:

```
pip install ollama
```
Using docker:

```
docker run ollama/ollama
```
To run a model, use `ollama run`:

```
ollama run orca-mini-3b
```
You can also run models from Hugging Face:

```
ollama run huggingface.co/TheBloke/orca_mini_3B-GGML
```
Or directly via downloaded model files:

```
ollama run ~/Downloads/orca-mini-13b.ggmlv3.q4_0.bin
```
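The repo layout above includes a `server` directory, so `ollama run` presumably talks to a local server over HTTP. As a minimal sketch of what a client request could look like — where the port `11434`, the `/api/generate` endpoint, the JSON field names, and the line-delimited streaming format are all assumptions not confirmed by this README:

```python
import json
import urllib.request

def build_payload(model, prompt):
    # Hypothetical request body: the field names are assumptions,
    # not a documented API.
    return json.dumps({"model": model, "prompt": prompt}).encode("utf-8")

def generate(prompt, model="orca-mini-3b", host="http://localhost:11434"):
    # Assumed endpoint and streaming format; adjust to the actual server API.
    req = urllib.request.Request(
        host + "/api/generate",
        data=build_payload(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # The server is assumed to stream one JSON object per line.
        for line in resp:
            yield json.loads(line)

if __name__ == "__main__":
    for chunk in generate("Why is the sky blue?"):
        print(chunk.get("response", ""), end="", flush=True)
```

The generator shape lets a caller print tokens as they arrive rather than waiting for the full completion.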
To build from source:

```
go generate ./...
go build .
```