No description

Patrick Devine 402babdad0 change push to chunked uploads from monolithic 1 year ago
api 6d6b0d3321 change error handler behavior and fix error when a model isn't found (#173) 1 year ago
app 9657314ae2 address comment 1 year ago
cmd 402babdad0 change push to chunked uploads from monolithic 1 year ago
docs 65d93a86b2 Update modelfile.md (#177) 1 year ago
examples 8454f298ac fix example `Modelfile`s 1 year ago
format 5bea29f610 add new list command (#97) 1 year ago
library 6a19724d5f remove colon from library modelfiles 1 year ago
llama 8526e1f5f1 add llama.cpp mpi, opencl files 1 year ago
parser d59b164fa2 add prompt back to parser 1 year ago
progressbar e4d7f3e287 vendor in progress bar and change to bytes instead of bibytes (#130) 1 year ago
scripts 4dd296e155 build app in publish script 1 year ago
server 402babdad0 change push to chunked uploads from monolithic 1 year ago
web 3c8f4c03d7 web: tweak homepage text 1 year ago
.dockerignore 6292f4b64c update `Dockerfile` 1 year ago
.gitignore e6c427ce4d Update .gitignore 1 year ago
.prettierrc.json 8685a5ad18 move .prettierrc.json to root 1 year ago
Dockerfile 7c71c10d4f fix compilation issue in Dockerfile, remove from `README.md` until ready 1 year ago
LICENSE df5fdd6647 `proto` -> `ollama` 1 year ago
README.md 91cd54016c add basic REST api documentation 1 year ago
ggml-metal.metal e64ef69e34 look for ggml-metal in the same directory as the binary 1 year ago
go.mod 8609db77ea use gin-contrib/cors middleware 1 year ago
go.sum 8609db77ea use gin-contrib/cors middleware 1 year ago
main.go 1775647f76 continue conversation 1 year ago

README.md

Ollama

Discord

Note: Ollama is in early preview. Please report any issues you find.

Run, create, and share large language models (LLMs).

Download

  • Download for macOS on Apple Silicon (Intel coming soon)
  • Download for Windows and Linux (coming soon)
  • Build from source

Quickstart

To run and chat with Llama 2, the new model by Meta:

ollama run llama2

Model library

ollama includes a library of open-source models:

Model                    | Parameters | Size  | Download
-------------------------|------------|-------|---------------------------
Llama2                   | 7B         | 3.8GB | ollama pull llama2
Llama2 13B               | 13B        | 7.3GB | ollama pull llama2:13b
Orca Mini                | 3B         | 1.9GB | ollama pull orca
Vicuna                   | 7B         | 3.8GB | ollama pull vicuna
Nous-Hermes              | 13B        | 7.3GB | ollama pull nous-hermes
Wizard Vicuna Uncensored | 13B        | 7.3GB | ollama pull wizard-vicuna

Note: You should have at least 8 GB of RAM to run the 3B models, 16 GB to run the 7B models, and 32 GB to run the 13B models.

Examples

Run a model

ollama run llama2
>>> hi
Hello! How can I help you today?

Create a custom model

Pull a base model:

ollama pull llama2

Create a Modelfile:

FROM llama2

# set the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1

# set the system prompt
SYSTEM """
You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.
"""

Next, create and run the model:

ollama create mario -f ./Modelfile
ollama run mario
>>> hi
Hello! It's your friend Mario.

For more examples, see the examples directory.

Pull a model from the registry

ollama pull orca

Listing local models

ollama list

Model packages

Overview

Ollama bundles model weights, configuration, and data into a single package, defined by a Modelfile.
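
As a sketch of what such a package pins together, the hypothetical Modelfile below layers configuration and a system prompt on top of the llama2 base weights; it uses only the FROM, PARAMETER, and SYSTEM instructions shown earlier, and the values are illustrative, not a recommended setup.

# base weights the package builds on
FROM llama2
# configuration carried inside the package
PARAMETER temperature 0.7
# prompt data bundled alongside the weights
SYSTEM """
You are a concise assistant that answers in one short paragraph.
"""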

Building

go build .

To run it, start the server:

./ollama serve &

Finally, run a model!

./ollama run llama2

REST API

POST /api/generate

Generate text from a model.

curl -X POST http://localhost:11434/api/generate -d '{"model": "llama2", "prompt":"Why is the sky blue?"}'
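
For programmatic use, here is a minimal sketch of calling the same endpoint from Go and printing the reply as it streams. It assumes the server is listening on the default localhost:11434 and that it streams newline-delimited JSON objects; the `response` and `done` field names are assumptions about that payload rather than documented API fields.

// generate.go: hedged sketch of a client for POST /api/generate.
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	// Same request body as the curl example above.
	body, err := json.Marshal(map[string]string{
		"model":  "llama2",
		"prompt": "Why is the sky blue?",
	})
	if err != nil {
		panic(err)
	}

	resp, err := http.Post("http://localhost:11434/api/generate", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// Read the response line by line, printing each fragment as it arrives.
	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		var chunk struct {
			Response string `json:"response"` // assumed field name for the text fragment
			Done     bool   `json:"done"`     // assumed field name for the end-of-stream flag
		}
		if err := json.Unmarshal(scanner.Bytes(), &chunk); err != nil {
			continue // skip lines that are not JSON objects
		}
		fmt.Print(chunk.Response)
		if chunk.Done {
			fmt.Println()
			break
		}
	}
}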