Ollama

Run, create, and share large language models (LLMs).

Note: Ollama is in early preview. Please report any issues you find.

Download

Download for macOS
Download for Windows and Linux (coming soon)
Build from source

Quickstart

To run and chat with Llama 2, the new model by Meta:

ollama run llama2

Model library

Ollama includes a library of open-source models:

Model	Parameters	Size	Download
Llama2	7B	3.8GB	`ollama pull llama2`
Llama2 13B	13B	7.3GB	`ollama pull llama2:13b`
Llama2 70B	70B	39GB	`ollama pull llama2:70b`
Llama2 Uncensored	7B	3.8GB	`ollama pull llama2-uncensored`
Orca Mini	3B	1.9GB	`ollama pull orca`
Vicuna	7B	3.8GB	`ollama pull vicuna`
Nous-Hermes	13B	7.3GB	`ollama pull nous-hermes`
Wizard Vicuna Uncensored	13B	7.3GB	`ollama pull wizard-vicuna`

Note: You should have at least 8 GB of RAM to run the 3B models, 16 GB to run the 7B models, and 32 GB to run the 13B models.
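The note above can be turned into a quick pre-flight check. The following is a minimal sketch, not part of Ollama: `MIN_RAM_GB` and `can_run` are hypothetical names, and the figures are copied from the note, which does not state a requirement for the 70B model.

```python
# Minimum RAM (in GB) per model size, copied from the note above.
# Hypothetical helper: the note gives no figure for 70B, so it is not listed.
MIN_RAM_GB = {"3B": 8, "7B": 16, "13B": 32}


def can_run(parameters: str, available_ram_gb: float) -> bool:
    """Return True if available_ram_gb meets the stated minimum for a model size.

    `parameters` is a label such as "7B" (case-insensitive). A KeyError is
    raised for sizes the note does not cover, rather than guessing a value.
    """
    return available_ram_gb >= MIN_RAM_GB[parameters.upper()]


if __name__ == "__main__":
    # Show which of the listed sizes a 16 GB machine satisfies.
    for size in MIN_RAM_GB:
        print(size, can_run(size, 16))
```

Raising on unknown sizes is deliberate: a missing entry is reported instead of being silently treated as runnable.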

Examples

Run a model

ollama run llama2
>>> hi
Hello! How can I help you today?

For multiline input, you can wrap text with """:

>>> """Hello,
... world!
... """
I'm a basic program that prints the famous "Hello, world!" message to the console.

Create a custom model

Pull a base model:

ollama pull llama2

To update a model to the latest version, run `ollama pull llama2` again. The model is downloaded again only if a newer version is available.

Create a Modelfile:

FROM llama2

# set the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1

# set the system prompt
SYSTEM """
You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.
"""

Next, create and run the model:

ollama create mario -f ./Modelfile
ollama run mario
>>> hi
Hello! It's your friend Mario.
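If you generate Modelfiles from a script, the text can be assembled with a small helper. This is a sketch under stated assumptions: `render_modelfile` is a hypothetical function, and it uses only the `FROM`, `PARAMETER temperature`, and `SYSTEM` instructions shown in the example above.

```python
def render_modelfile(base: str, system_prompt: str, temperature: float = 1) -> str:
    """Build Modelfile text from the three instructions used in the example above.

    Hypothetical helper, not part of Ollama. The system prompt is wrapped in
    triple quotes, so a prompt that itself contains triple quotes is rejected.
    """
    if '"""' in system_prompt:
        raise ValueError('system prompt must not contain """')
    return (
        f"FROM {base}\n"
        f"PARAMETER temperature {temperature}\n"
        f'SYSTEM """\n{system_prompt.strip()}\n"""\n'
    )


if __name__ == "__main__":
    # Print the Modelfile for the Mario example.
    print(render_modelfile(
        "llama2",
        "You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.",
    ))
```

Write the returned text to `./Modelfile`, then run `ollama create mario -f ./Modelfile` as shown above.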

For more examples, see the examples directory. For more information on creating a Modelfile, see the Modelfile documentation.

Pull a model from the registry

ollama pull orca

List local models

ollama list

Model packages

Overview

Ollama bundles model weights, configuration, and data into a single package, defined by a Modelfile.

Building

go build .

To run it, start the server:

./ollama serve &

Finally, run a model!

./ollama run llama2
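Because `./ollama serve &` returns before the server is necessarily ready, a script may want to wait until the port accepts connections. The sketch below is one way to do that. `wait_for_port` is a hypothetical helper, and the default port 11434 is taken from the REST API example in this README. It checks only that a TCP connection succeeds, not that the server is healthy.

```python
import socket
import time


def wait_for_port(host: str = "localhost", port: int = 11434, timeout: float = 10.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or `timeout` seconds pass.

    Hypothetical helper. Returns True once a connection is accepted and
    False if the deadline passes first.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Each attempt gets a short timeout so the loop stays responsive.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(0.2)
```

For example, call `wait_for_port()` after starting the server and run `./ollama run llama2` only if it returns `True`.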

REST API

See the API documentation for all endpoints.

Ollama has an API for running and managing models. For example, to generate text from a model:

curl -X POST http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt":"Why is the sky blue?"
}'
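The same request can be sent from Python's standard library. This is a sketch, not an official client: `build_generate_request`, `join_stream`, and `generate` are hypothetical names. It assumes the endpoint streams newline-delimited JSON objects that each carry a `response` text fragment. The API documentation is the authority on the actual response shape.

```python
import json
import urllib.request


def build_generate_request(model: str, prompt: str,
                           base_url: str = "http://localhost:11434") -> urllib.request.Request:
    """Build the POST request that the curl example above sends."""
    body = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=body,
        # The curl example sets no content type; declaring JSON is an addition here.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def join_stream(lines) -> str:
    """Concatenate text fragments from newline-delimited JSON lines.

    ASSUMPTION: each non-empty line is a JSON object with an optional
    "response" field holding a piece of the generated text.
    """
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        parts.append(json.loads(line).get("response", ""))
    return "".join(parts)


def generate(model: str, prompt: str) -> str:
    """Send the request to a running server and return the joined text."""
    with urllib.request.urlopen(build_generate_request(model, prompt)) as resp:
        return join_stream(resp)
```

With the server running, `generate("llama2", "Why is the sky blue?")` would return the full answer as one string. Splitting request building from stream parsing keeps both parts testable without a server.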

Tools using Ollama

LangChain and LangChain.js with a question-answering example.
Continue - embeds Ollama inside Visual Studio Code. The extension lets you highlight code to add to the prompt, ask questions in the sidebar, and generate code inline.
Discord AI Bot - interact with Ollama as a chatbot on Discord.
Raycast Ollama - Raycast extension to use Ollama for local llama inference on Raycast.
Simple HTML UI for Ollama