Get up and running with large language models locally.

To install on Linux:

```shell
curl -fsSL https://ollama.com/install.sh | sh
```

The official Ollama Docker image `ollama/ollama` is available on Docker Hub.

To run and chat with Llama 3:

```shell
ollama run llama3
```
Ollama supports a list of models available on ollama.com/library. Here are some example models that can be downloaded:
| Model | Parameters | Size | Download |
| --- | --- | --- | --- |
| Llama 3 | 8B | 4.7GB | `ollama run llama3` |
| Llama 3 | 70B | 40GB | `ollama run llama3:70b` |
| Phi-3 | 3.8B | 2.3GB | `ollama run phi3` |
| Mistral | 7B | 4.1GB | `ollama run mistral` |
| Neural Chat | 7B | 4.1GB | `ollama run neural-chat` |
| Starling | 7B | 4.1GB | `ollama run starling-lm` |
| Code Llama | 7B | 3.8GB | `ollama run codellama` |
| Llama 2 Uncensored | 7B | 3.8GB | `ollama run llama2-uncensored` |
| LLaVA | 7B | 4.5GB | `ollama run llava` |
| Gemma | 2B | 1.4GB | `ollama run gemma:2b` |
| Gemma | 7B | 4.8GB | `ollama run gemma:7b` |
| Solar | 10.7B | 6.1GB | `ollama run solar` |
Note: You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models.
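The rule of thumb above can be sketched as a tiny helper. This is purely illustrative, not part of Ollama; `recommended_ram_gb` is a hypothetical name, and the thresholds simply restate the note.

```python
def recommended_ram_gb(parameters_billion: float) -> int:
    """Return the minimum RAM (GB) suggested for a model size,
    restating the rule of thumb: 8 GB for 7B models,
    16 GB for 13B models, 32 GB for 33B models."""
    if parameters_billion <= 7:
        return 8
    if parameters_billion <= 13:
        return 16
    return 32

print(recommended_ram_gb(7))   # 8
print(recommended_ram_gb(13))  # 16
print(recommended_ram_gb(33))  # 32
```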
Ollama supports importing GGUF models in the Modelfile:

1. Create a file named `Modelfile`, with a `FROM` instruction with the local filepath to the model you want to import.

   ```
   FROM ./vicuna-33b.Q4_0.gguf
   ```

2. Create the model in Ollama:

   ```shell
   ollama create example -f Modelfile
   ```

3. Run the model:

   ```shell
   ollama run example
   ```
See the guide on importing models for more information.
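For imports driven by a script, the one-line Modelfile above can also be written programmatically. A minimal sketch (the `write_modelfile` helper is illustrative, not part of Ollama's tooling):

```python
from pathlib import Path

def write_modelfile(gguf_path: str, dest: str = "Modelfile") -> str:
    """Write a minimal Modelfile containing a FROM instruction
    that points at a local GGUF file, and return its contents."""
    contents = f"FROM {gguf_path}\n"
    Path(dest).write_text(contents)
    return contents

print(write_modelfile("./vicuna-33b.Q4_0.gguf"))  # FROM ./vicuna-33b.Q4_0.gguf
```

The generated file is then used exactly as above: `ollama create example -f Modelfile`.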
Models from the Ollama library can be customized with a prompt. For example, to customize the `llama3` model:

```shell
ollama pull llama3
```

Create a `Modelfile`:

```
FROM llama3

# set the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1

# set the system message
SYSTEM """
You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.
"""
```

Next, create and run the model:

```
ollama create mario -f ./Modelfile
ollama run mario
>>> hi
Hello! It's your friend Mario.
```
For more examples, see the examples directory. For more information on working with a Modelfile, see the Modelfile documentation.
`ollama create` is used to create a model from a Modelfile.

```shell
ollama create mymodel -f ./Modelfile
```
To pull a model:

```shell
ollama pull llama3
```

This command can also be used to update a local model. Only the diff will be pulled.
To remove a model:

```shell
ollama rm llama3
```
To copy a model:

```shell
ollama cp llama3 my-model
```
For multiline input, you can wrap text with `"""`:

```
>>> """Hello,
... world!
... """
I'm a basic program that prints the famous "Hello, world!" message to the console.
```

To use a multimodal model such as LLaVA, include an image path in the prompt:

```
>>> What's in this image? /Users/jmorgan/Desktop/smile.png
The image features a yellow smiley face, which is likely the central focus of the picture.
```
To pass a prompt as an argument:

```shell
$ ollama run llama3 "Summarize this file: $(cat README.md)"
Ollama is a lightweight, extensible framework for building and running language models on the local machine. It provides a simple API for creating, running, and managing models, as well as a library of pre-built models that can be easily used in a variety of applications.
```
To list the models on your machine:

```shell
ollama list
```
`ollama serve` is used when you want to start ollama without running the desktop application.
Install `cmake` and `go`:

```shell
brew install cmake go
```

Then generate dependencies:

```shell
go generate ./...
```

Then build the binary:

```shell
go build .
```
More detailed instructions can be found in the developer guide.
Next, start the server:

```shell
./ollama serve
```

Finally, in a separate shell, run a model:

```shell
./ollama run llama3
```
Ollama has a REST API for running and managing models.

To generate a response:

```shell
curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?"
}'
```

To chat with a model:

```shell
curl http://localhost:11434/api/chat -d '{
  "model": "llama3",
  "messages": [
    { "role": "user", "content": "why is the sky blue?" }
  ]
}'
```
See the API documentation for all endpoints.
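By default these endpoints stream their reply as newline-delimited JSON objects, each carrying a `response` fragment and a final object with `"done": true`. A minimal sketch of assembling such a stream (the `collect_response` helper and the sample lines are illustrative; a real client would read the lines from the HTTP response body of a POST to `http://localhost:11434/api/generate`):

```python
import json

def collect_response(ndjson_lines):
    """Concatenate the `response` fragments of a streamed
    /api/generate reply, stopping at the final `done` object."""
    parts = []
    for line in ndjson_lines:
        obj = json.loads(line)
        parts.append(obj.get("response", ""))
        if obj.get("done"):
            break
    return "".join(parts)

# Sample fragments in the shape the server streams them:
sample = [
    '{"model":"llama3","response":"The sky ","done":false}',
    '{"model":"llama3","response":"is blue.","done":false}',
    '{"model":"llama3","response":"","done":true}',
]
print(collect_response(sample))  # The sky is blue.
```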