Ollama provides experimental compatibility with parts of the OpenAI API to help connect existing applications to Ollama.
Note: OpenAI compatibility is experimental and is subject to major adjustments including breaking changes. For fully-featured access to the Ollama API, see the Ollama Python library, JavaScript library and REST API.
from openai import OpenAI
client = OpenAI(
base_url='http://localhost:11434/v1/',
# required but ignored
api_key='ollama',
)
chat_completion = client.chat.completions.create(
messages=[
{
'role': 'user',
'content': 'Say this is a test',
}
],
model='llama2',
)
import OpenAI from 'openai'
const openai = new OpenAI({
baseURL: 'http://localhost:11434/v1/',
// required but ignored
apiKey: 'ollama',
})
const chatCompletion = await openai.chat.completions.create({
messages: [{ role: 'user', content: 'Say this is a test' }],
model: 'llama2',
})
curl
curl http://localhost:11434/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "llama2",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "Hello!"
}
]
}'
/v1/chat/completions
model
messages
content
content
partsfrequency_penalty
presence_penalty
response_format
seed
stop
stream
temperature
top_p
max_tokens
logit_bias
tools
tool_choice
user
seed
will always set temperature
to 0
finish_reason
will always be stop
usage.prompt_tokens
will be 0 for completions where prompt evaluation is cachedBefore using a model, pull it locally ollama pull
:
ollama pull llama2
For tooling that relies on default OpenAI model names such as gpt-3.5-turbo
, use ollama cp
to copy an existing model name to a temporary name:
ollama cp llama2 gpt-3.5-turbo
Afterwards, this new model name can be specified the model
field:
curl http://localhost:11434/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-3.5-turbo",
"messages": [
{
"role": "user",
"content": "Hello!"
}
]
}'