Note: OpenAI compatibility is experimental and is subject to major adjustments including breaking changes. For fully-featured access to the Ollama API, see the Ollama Python library, JavaScript library and REST API.
Ollama provides experimental compatibility with parts of the OpenAI API to help connect existing applications to Ollama.
OpenAI Python library
from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:11434/v1/',

    # required but ignored
    api_key='ollama',
)

chat_completion = client.chat.completions.create(
    messages=[
        {
            'role': 'user',
            'content': 'Say this is a test',
        }
    ],
    model='llama3',
)

list_completion = client.models.list()

model = client.models.retrieve("llama3")

embeddings = client.embeddings.create(
    model="all-minilm",
    input=["why is the sky blue?", "why is the grass green?"]
)
OpenAI JavaScript library
import OpenAI from 'openai'

const openai = new OpenAI({
  baseURL: 'http://localhost:11434/v1/',

  // required but ignored
  apiKey: 'ollama',
})

const chatCompletion = await openai.chat.completions.create({
  messages: [{ role: 'user', content: 'Say this is a test' }],
  model: 'llama3',
})

const listCompletion = await openai.models.list()

const model = await openai.models.retrieve("llama3");

const embedding = await openai.embeddings.create({
  model: "all-minilm",
  input: ["why is the sky blue?", "why is the grass green?"],
});
curl
curl http://localhost:11434/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "llama3",
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "user",
                "content": "Hello!"
            }
        ]
    }'
curl http://localhost:11434/v1/models
curl http://localhost:11434/v1/models/llama3
curl http://localhost:11434/v1/embeddings \
    -H "Content-Type: application/json" \
    -d '{
        "model": "all-minilm",
        "input": ["why is the sky blue?", "why is the grass green?"]
    }'
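Where the OpenAI client library is not installed, the chat completion request shown with curl above can be reproduced with Python's standard library alone. This is a minimal sketch rather than anything shipped with Ollama: the helper names `build_chat_request` and `send` are hypothetical, and it assumes a server listening on the default `http://localhost:11434` address used throughout this page.

```python
import json
import urllib.request

# Assumed default local endpoint, matching the examples on this page.
BASE_URL = "http://localhost:11434/v1"


def build_chat_request(prompt, model="llama3", base_url=BASE_URL):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and decode the JSON reply; needs a running server."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With a server running and the model pulled, `send(build_chat_request("Hello!"))` should return a JSON object in the OpenAI chat completion shape, with the reply text under `choices[0].message.content`.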
/v1/chat/completions
model
messages
  content (text)
  content (array of parts)
frequency_penalty
presence_penalty
response_format
seed
stop
stream
temperature
top_p
max_tokens
tools
tool_choice
logit_bias
user
n
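Fields from the list above, such as `stream`, `seed` and `temperature`, are passed as ordinary keyword arguments when using the OpenAI Python library. The sketch below is illustrative only: `stream_reply` is a hypothetical helper, `client` is assumed to be an `OpenAI` client pointed at the local `/v1/` base URL, and whether a given field takes effect depends on the Ollama version in use.

```python
def stream_reply(client, prompt, model="llama3", seed=None):
    """Request a streamed chat completion and join the pieces into one string."""
    options = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,      # ask for incremental chunks instead of one reply
        "temperature": 0,    # reduce sampling randomness
    }
    if seed is not None:
        options["seed"] = seed  # only sent when the caller supplies one

    pieces = []
    for chunk in client.chat.completions.create(**options):
        if not chunk.choices:
            continue  # some chunks may carry no choices
        text = chunk.choices[0].delta.content
        if text:
            pieces.append(text)
    return "".join(pieces)
```

Collecting the chunks into a list and joining once at the end keeps the helper simple; an interactive application would more likely print or forward each piece as it arrives.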
/v1/models
created corresponds to when the model was last modified
owned_by corresponds to the Ollama username, defaulting to "library"
/v1/models/{model}
created corresponds to when the model was last modified
owned_by corresponds to the Ollama username, defaulting to "library"
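Because `created` carries the last-modified time rather than a creation time, it can be useful to convert it when reading a model listing. The following is a small sketch that assumes the response follows the OpenAI model-list shape (a `data` array of objects with `id`, `created` as a Unix timestamp in seconds, and `owned_by`). The helper name `summarize_models` and the sample values are hypothetical.

```python
from datetime import datetime, timezone


def summarize_models(listing):
    """Return (id, owned_by, last_modified) tuples from a /v1/models body."""
    rows = []
    for entry in listing.get("data", []):
        # `created` is assumed to be seconds since the Unix epoch.
        modified = datetime.fromtimestamp(entry["created"], tz=timezone.utc)
        rows.append((entry["id"], entry["owned_by"], modified))
    return rows


# Hypothetical response body for illustration; the values are made up.
sample_listing = {
    "object": "list",
    "data": [
        {
            "id": "llama3:latest",
            "object": "model",
            "created": 1700000000,
            "owned_by": "library",
        }
    ],
}

for model_id, owner, modified in summarize_models(sample_listing):
    print(model_id, owner, modified.isoformat())
```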
/v1/embeddings
model
input
encoding_format
dimensions
user
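When `input` is a list, each returned embedding has to be matched back to the text it came from. A minimal sketch of that pairing follows, assuming an `OpenAI` client configured for the local `/v1/` base URL and a response whose `data` items expose `index` and `embedding` as in the OpenAI Python library. The helper name `embed_texts` is hypothetical.

```python
def embed_texts(client, texts, model="all-minilm"):
    """Return (text, vector) pairs in the same order as the input list."""
    response = client.embeddings.create(model=model, input=texts)
    # Sort by the reported index so the pairing does not rely on reply order.
    ordered = sorted(response.data, key=lambda item: item.index)
    return [(text, item.embedding) for text, item in zip(texts, ordered)]
```

Returning pairs rather than a dictionary keeps duplicate input strings from overwriting one another.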
Before using a model, pull it locally with ollama pull:
ollama pull llama3
For tooling that relies on default OpenAI model names such as gpt-3.5-turbo, use ollama cp to copy an existing model name to a temporary name:
ollama cp llama3 gpt-3.5-turbo
Afterwards, this new model name can be specified in the model field:
curl http://localhost:11434/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "gpt-3.5-turbo",
        "messages": [
            {
                "role": "user",
                "content": "Hello!"
            }
        ]
    }'
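If the pull and copy steps need to run from a setup script instead of by hand, they can be wrapped in a few lines of Python. This is only a sketch: `alias_commands` and `create_alias` are hypothetical helper names, and it assumes the `ollama` executable is on the `PATH`.

```python
import subprocess


def alias_commands(source, alias):
    """Return the CLI invocations that pull `source` and copy it to `alias`."""
    return [
        ["ollama", "pull", source],
        ["ollama", "cp", source, alias],
    ]


def create_alias(source, alias):
    """Run the commands in order, stopping at the first failure."""
    for command in alias_commands(source, alias):
        subprocess.run(command, check=True)
```

Calling `create_alias("llama3", "gpt-3.5-turbo")` would run the same two commands shown above. Keeping the command list separate from the code that executes it makes the steps easy to inspect or log before anything runs.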