|
@@ -17,7 +17,7 @@
|
|
|
|
|
|
### Model names
|
|
|
|
|
|
-Model names follow a `model:tag` format. Some examples are `orca-mini:3b-q4_1` and `llama2:70b`. The tag is optional and, if not provided, will default to `latest`. The tag is used to identify a specific version.
|
|
|
+Model names follow a `model:tag` format, where `model` can have an optional namespace such as `example/model`. Some examples are `orca-mini:3b-q4_1` and `llama2:70b`. The tag is optional and, if not provided, will default to `latest`. The tag is used to identify a specific version.
|
|
|
|
|
|
### Durations
|
|
|
|
|
@@ -25,7 +25,8 @@ All durations are returned in nanoseconds.
|
|
|
|
|
|
### Streaming responses
|
|
|
|
|
|
-Certain endpoints stream responses as JSON objects.
|
|
|
+Certain endpoints stream responses as JSON objects and can optionally return non-streamed responses.
|
|
|
+
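Each chunk of a streamed reply is a standalone JSON object on its own line. A minimal Python sketch of parsing such a stream (the sample payload is illustrative, not actual server output):

```python
import json

def parse_stream(body: str) -> list[dict]:
    """Split a newline-delimited JSON stream into individual objects."""
    return [json.loads(line) for line in body.splitlines() if line.strip()]

# Illustrative two-chunk stream; the final object carries "done": true.
sample = '{"response": "The", "done": false}\n{"response": "", "done": true}'
chunks = parse_stream(sample)
```

The final object in a stream has `done` set to `true` and carries the request statistics.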
|
|
|
|
|
|
## Generate a completion
|
|
|
|
|
@@ -39,7 +40,7 @@ Generate a response for a given prompt with a provided model. This is a streamin
|
|
|
|
|
|
- `model`: (required) the [model name](#model-names)
|
|
|
- `prompt`: the prompt to generate a response for
|
|
|
-- `images`: a list of base64-encoded images (for multimodal models such as `llava`)
|
|
|
+- `images`: (optional) a list of base64-encoded images (for multimodal models such as `llava`)
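The `images` entries are plain base64 strings, which any standard encoder can produce. A minimal Python sketch (the `encode_image` helper is illustrative, not part of the API):

```python
import base64

def encode_image(path: str) -> str:
    """Return the base64-encoded contents of an image file as a string."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("ascii")
```

The resulting string goes directly into the `images` array of the request body.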
|
|
|
|
|
|
Advanced parameters (optional):
|
|
|
|
|
@@ -51,15 +52,17 @@ Advanced parameters (optional):
|
|
|
- `stream`: if `false` the response will be returned as a single response object, rather than a stream of objects
|
|
|
- `raw`: if `true` no formatting will be applied to the prompt. You may choose to use the `raw` parameter if you are specifying a full templated prompt in your request to the API.
|
|
|
|
|
|
-### JSON mode
|
|
|
+#### JSON mode
|
|
|
|
|
|
-Enable JSON mode by setting the `format` parameter to `json`. This will structure the response as valid JSON. See the JSON mode [example](#request-json-mode) below.
|
|
|
+Enable JSON mode by setting the `format` parameter to `json`. This will structure the response as a valid JSON object. See the JSON mode [example](#generate-request-json-mode) below.
|
|
|
|
|
|
> Note: it's important to instruct the model to use JSON in the `prompt`. Otherwise, the model may generate large amounts of whitespace.
|
|
|
|
|
|
### Examples
|
|
|
|
|
|
-#### Request
|
|
|
+#### Generate request (Streaming)
|
|
|
+
|
|
|
+##### Request
|
|
|
|
|
|
```shell
|
|
|
curl http://localhost:11434/api/generate -d '{
|
|
@@ -68,7 +71,7 @@ curl http://localhost:11434/api/generate -d '{
|
|
|
}'
|
|
|
```
|
|
|
|
|
|
-#### Response
|
|
|
+##### Response
|
|
|
|
|
|
A stream of JSON objects is returned:
|
|
|
|
|
@@ -99,20 +102,22 @@ To calculate how fast the response is generated in tokens per second (token/s),
|
|
|
"model": "llama2",
|
|
|
"created_at": "2023-08-04T19:22:45.499127Z",
|
|
|
"response": "",
|
|
|
- "context": [1, 2, 3],
|
|
|
"done": true,
|
|
|
- "total_duration": 5589157167,
|
|
|
- "load_duration": 3013701500,
|
|
|
- "prompt_eval_count": 46,
|
|
|
- "prompt_eval_duration": 1160282000,
|
|
|
- "eval_count": 113,
|
|
|
- "eval_duration": 1325948000
|
|
|
+ "context": [1, 2, 3],
|
|
|
+ "total_duration":10706818083,
|
|
|
+ "load_duration":6338219291,
|
|
|
+ "prompt_eval_count":26,
|
|
|
+ "prompt_eval_duration":130079000,
|
|
|
+ "eval_count":259,
|
|
|
+ "eval_duration":4232710000
|
|
|
}
|
|
|
```
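Since all durations are reported in nanoseconds, the token/s figure can be derived directly from the final response fields. A minimal Python sketch (the helper name is illustrative):

```python
def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """token/s = eval_count / eval_duration, with the duration in nanoseconds."""
    return eval_count / eval_duration_ns * 1e9

# With the figures above: 259 tokens over 4232710000 ns is roughly 61 token/s.
speed = tokens_per_second(259, 4232710000)
```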
|
|
|
|
|
|
#### Request (No streaming)
|
|
|
|
|
|
-A response can be recieved in one reply when streaming is off.
|
|
|
+##### Request
|
|
|
+
|
|
|
+A response can be received in one reply when streaming is off.
|
|
|
|
|
|
```shell
|
|
|
curl http://localhost:11434/api/generate -d '{
|
|
@@ -122,7 +127,7 @@ curl http://localhost:11434/api/generate -d '{
|
|
|
}'
|
|
|
```
|
|
|
|
|
|
-#### Response
|
|
|
+##### Response
|
|
|
|
|
|
If `stream` is set to `false`, the response will be a single JSON object:
|
|
|
|
|
@@ -131,14 +136,66 @@ If `stream` is set to `false`, the response will be a single JSON object:
|
|
|
"model": "llama2",
|
|
|
"created_at": "2023-08-04T19:22:45.499127Z",
|
|
|
"response": "The sky is blue because it is the color of the sky.",
|
|
|
+ "done": true,
|
|
|
"context": [1, 2, 3],
|
|
|
+ "total_duration": 5043500667,
|
|
|
+ "load_duration": 5025959,
|
|
|
+ "prompt_eval_count": 26,
|
|
|
+ "prompt_eval_duration": 325953000,
|
|
|
+ "eval_count": 290,
|
|
|
+ "eval_duration": 4709213000
|
|
|
+}
|
|
|
+```
|
|
|
+
|
|
|
+#### Request (JSON mode)
|
|
|
+
|
|
|
+> When `format` is set to `json`, the output will always be a well-formed JSON object. It's important to also instruct the model to respond in JSON.
|
|
|
+
|
|
|
+##### Request
|
|
|
+
|
|
|
+```shell
|
|
|
+curl http://localhost:11434/api/generate -d '{
|
|
|
+ "model": "llama2",
|
|
|
+ "prompt": "What color is the sky at different times of the day? Respond using JSON",
|
|
|
+ "format": "json",
|
|
|
+ "stream": false
|
|
|
+}'
|
|
|
+```
|
|
|
+
|
|
|
+##### Response
|
|
|
+
|
|
|
+```json
|
|
|
+{
|
|
|
+ "model": "llama2",
|
|
|
+ "created_at": "2023-11-09T21:07:55.186497Z",
|
|
|
+ "response": "{\n\"morning\": {\n\"color\": \"blue\"\n},\n\"noon\": {\n\"color\": \"blue-gray\"\n},\n\"afternoon\": {\n\"color\": \"warm gray\"\n},\n\"evening\": {\n\"color\": \"orange\"\n}\n}\n",
|
|
|
"done": true,
|
|
|
- "total_duration": 5589157167,
|
|
|
- "load_duration": 3013701500,
|
|
|
- "prompt_eval_count": 46,
|
|
|
- "prompt_eval_duration": 1160282000,
|
|
|
- "eval_count": 13,
|
|
|
- "eval_duration": 1325948000
|
|
|
+ "context": [1, 2, 3],
|
|
|
+ "total_duration": 4648158584,
|
|
|
+ "load_duration": 4071084,
|
|
|
+ "prompt_eval_count": 36,
|
|
|
+ "prompt_eval_duration": 439038000,
|
|
|
+ "eval_count": 180,
|
|
|
+ "eval_duration": 4196918000
|
|
|
+}
|
|
|
+```
|
|
|
+
|
|
|
+The value of `response` will be a string containing JSON similar to:
|
|
|
+
|
|
|
+```json
|
|
|
+{
|
|
|
+ "morning": {
|
|
|
+ "color": "blue"
|
|
|
+ },
|
|
|
+ "noon": {
|
|
|
+ "color": "blue-gray"
|
|
|
+ },
|
|
|
+ "afternoon": {
|
|
|
+ "color": "warm gray"
|
|
|
+ },
|
|
|
+ "evening": {
|
|
|
+ "color": "orange"
|
|
|
+ }
|
|
|
}
|
|
|
```
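Because `response` is delivered as a string, extracting the structured data takes one more decoding step. A minimal Python sketch (`data` stands in for the full API response, abridged here):

```python
import json

# Abridged stand-in for the parsed API response shown above.
data = {"response": "{\"morning\": {\"color\": \"blue\"}}", "done": True}

# The outer object is already parsed; the "response" field holds JSON as a string.
colors = json.loads(data["response"])
```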
|
|
|
|
|
@@ -146,6 +203,8 @@ If `stream` is set to `false`, the response will be a single JSON object:
|
|
|
|
|
|
To submit images to multimodal models such as `llava` or `bakllava`, provide a list of base64-encoded `images`:
|
|
|
|
|
|
+#### Request
|
|
|
+
|
|
|
```shell
|
|
|
curl http://localhost:11434/api/generate -d '{
|
|
|
"model": "llava",
|
|
@@ -162,20 +221,21 @@ curl http://localhost:11434/api/generate -d '{
|
|
|
"model": "llava",
|
|
|
"created_at": "2023-11-03T15:36:02.583064Z",
|
|
|
"response": "A happy cartoon character, which is cute and cheerful.",
|
|
|
- "context": [1, 2, 3],
|
|
|
"done": true,
|
|
|
- "total_duration": 14648695333,
|
|
|
- "load_duration": 3302671417,
|
|
|
- "prompt_eval_count": 14,
|
|
|
- "prompt_eval_duration": 286243000,
|
|
|
- "eval_count": 129,
|
|
|
- "eval_duration": 10931424000
|
|
|
+ "context": [1, 2, 3],
|
|
|
+ "total_duration": 2938432250,
|
|
|
+ "load_duration": 2559292,
|
|
|
+ "prompt_eval_count": 1,
|
|
|
+ "prompt_eval_duration": 2195557000,
|
|
|
+ "eval_count": 44,
|
|
|
+ "eval_duration": 736432000
|
|
|
}
|
|
|
```
|
|
|
|
|
|
#### Request (Raw Mode)
|
|
|
|
|
|
-In some cases you may wish to bypass the templating system and provide a full prompt. In this case, you can use the `raw` parameter to disable formatting.
|
|
|
+In some cases, you may wish to bypass the templating system and provide a full prompt; you can use the `raw` parameter to disable templating. Also note that raw mode will not return a context.
|
|
|
+
+##### Request
|
|
|
|
|
|
```shell
|
|
|
curl http://localhost:11434/api/generate -d '{
|
|
@@ -186,75 +246,29 @@ curl http://localhost:11434/api/generate -d '{
|
|
|
}'
|
|
|
```
|
|
|
|
|
|
-#### Response
|
|
|
+##### Response
|
|
|
|
|
|
```json
|
|
|
{
|
|
|
"model": "mistral",
|
|
|
"created_at": "2023-11-03T15:36:02.583064Z",
|
|
|
"response": " The sky appears blue because of a phenomenon called Rayleigh scattering.",
|
|
|
- "context": [1, 2, 3],
|
|
|
"done": true,
|
|
|
- "total_duration": 14648695333,
|
|
|
- "load_duration": 3302671417,
|
|
|
+ "total_duration": 8493852375,
|
|
|
+ "load_duration": 6589624375,
|
|
|
"prompt_eval_count": 14,
|
|
|
- "prompt_eval_duration": 286243000,
|
|
|
- "eval_count": 129,
|
|
|
- "eval_duration": 10931424000
|
|
|
-}
|
|
|
-```
|
|
|
-
|
|
|
-#### Request (JSON mode)
|
|
|
-
|
|
|
-```shell
|
|
|
-curl http://localhost:11434/api/generate -d '{
|
|
|
- "model": "llama2",
|
|
|
- "prompt": "What color is the sky at different times of the day? Respond using JSON",
|
|
|
- "format": "json",
|
|
|
- "stream": false
|
|
|
-}'
|
|
|
-```
|
|
|
-
|
|
|
-#### Response
|
|
|
-
|
|
|
-```json
|
|
|
-{
|
|
|
- "model": "llama2",
|
|
|
- "created_at": "2023-11-09T21:07:55.186497Z",
|
|
|
- "response": "{\n\"morning\": {\n\"color\": \"blue\"\n},\n\"noon\": {\n\"color\": \"blue-gray\"\n},\n\"afternoon\": {\n\"color\": \"warm gray\"\n},\n\"evening\": {\n\"color\": \"orange\"\n}\n}\n",
|
|
|
- "done": true,
|
|
|
- "total_duration": 4661289125,
|
|
|
- "load_duration": 1714434500,
|
|
|
- "prompt_eval_count": 36,
|
|
|
- "prompt_eval_duration": 264132000,
|
|
|
- "eval_count": 75,
|
|
|
- "eval_duration": 2112149000
|
|
|
+ "prompt_eval_duration": 119039000,
|
|
|
+ "eval_count": 110,
|
|
|
+ "eval_duration": 1779061000
|
|
|
}
|
|
|
```
|
|
|
|
|
|
-The value of `response` will be a string containing JSON similar to:
|
|
|
-
|
|
|
-```json
|
|
|
-{
|
|
|
- "morning": {
|
|
|
- "color": "blue"
|
|
|
- },
|
|
|
- "noon": {
|
|
|
- "color": "blue-gray"
|
|
|
- },
|
|
|
- "afternoon": {
|
|
|
- "color": "warm gray"
|
|
|
- },
|
|
|
- "evening": {
|
|
|
- "color": "orange"
|
|
|
- }
|
|
|
-}
|
|
|
-```
|
|
|
-
|
|
|
-#### Request (With options)
|
|
|
+#### Generate request (With options)
|
|
|
|
|
|
If you want to set custom options for the model at runtime rather than in the Modelfile, you can do so with the `options` parameter. This example sets every available option, but you can set any of them individually and omit the ones you do not want to override.
|
|
|
|
|
|
+##### Request
|
|
|
+
|
|
|
```shell
|
|
|
curl http://localhost:11434/api/generate -d '{
|
|
|
"model": "llama2",
|
|
@@ -297,7 +311,7 @@ curl http://localhost:11434/api/generate -d '{
|
|
|
}'
|
|
|
```
|
|
|
|
|
|
-#### Response
|
|
|
+##### Response
|
|
|
|
|
|
```json
|
|
|
{
|
|
@@ -305,12 +319,38 @@ curl http://localhost:11434/api/generate -d '{
|
|
|
"created_at": "2023-08-04T19:22:45.499127Z",
|
|
|
"response": "The sky is blue because it is the color of the sky.",
|
|
|
"done": true,
|
|
|
- "total_duration": 5589157167,
|
|
|
- "load_duration": 3013701500,
|
|
|
- "prompt_eval_count": 46,
|
|
|
- "prompt_eval_duration": 1160282000,
|
|
|
- "eval_count": 13,
|
|
|
- "eval_duration": 1325948000
|
|
|
+ "context": [1, 2, 3],
|
|
|
+ "total_duration": 4935886791,
|
|
|
+ "load_duration": 534986708,
|
|
|
+ "prompt_eval_count": 26,
|
|
|
+ "prompt_eval_duration": 107345000,
|
|
|
+ "eval_count": 237,
|
|
|
+ "eval_duration": 4289432000
|
|
|
+}
|
|
|
+```
|
|
|
+
|
|
|
+#### Load a model
|
|
|
+
|
|
|
+If an empty prompt is provided, the model will be loaded into memory.
|
|
|
+
|
|
|
+##### Request
|
|
|
+
|
|
|
+```shell
|
|
|
+curl http://localhost:11434/api/generate -d '{
|
|
|
+ "model": "llama2"
|
|
|
+}'
|
|
|
+```
|
|
|
+
|
|
|
+##### Response
|
|
|
+
|
|
|
+A single JSON object is returned:
|
|
|
+
|
|
|
+```json
|
|
|
+{
|
|
|
+ "model":"llama2",
|
|
|
+ "created_at":"2023-12-18T19:52:07.071755Z",
|
|
|
+ "response":"",
|
|
|
+ "done":true
|
|
|
}
|
|
|
```
|
|
|
|
|
@@ -320,7 +360,7 @@ curl http://localhost:11434/api/generate -d '{
|
|
|
POST /api/chat
|
|
|
```
|
|
|
|
|
|
-Generate the next message in a chat with a provided model. This is a streaming endpoint, so there will be a series of responses. The final response object will include statistics and additional data from the request.
|
|
|
+Generate the next message in a chat with a provided model. This is a streaming endpoint, so there will be a series of responses. Streaming can be disabled using `"stream": false`. The final response object will include statistics and additional data from the request.
|
|
|
|
|
|
### Parameters
|
|
|
|
|
@@ -342,7 +382,9 @@ Advanced parameters (optional):
|
|
|
|
|
|
### Examples
|
|
|
|
|
|
-#### Request
|
|
|
+#### Chat Request (Streaming)
|
|
|
+
|
|
|
+##### Request
|
|
|
|
|
|
Send a chat message with a streaming response.
|
|
|
|
|
@@ -358,7 +400,7 @@ curl http://localhost:11434/api/chat -d '{
|
|
|
}'
|
|
|
```
|
|
|
|
|
|
-#### Response
|
|
|
+##### Response
|
|
|
|
|
|
A stream of JSON objects is returned:
|
|
|
|
|
@@ -368,7 +410,8 @@ A stream of JSON objects is returned:
|
|
|
"created_at": "2023-08-04T08:52:19.385406455-07:00",
|
|
|
"message": {
|
|
|
      "role": "assistant",
|
|
|
- "content": "The"
|
|
|
+ "content": "The",
|
|
|
+ "images": null
|
|
|
},
|
|
|
"done": false
|
|
|
}
|
|
@@ -381,18 +424,57 @@ Final response:
|
|
|
"model": "llama2",
|
|
|
"created_at": "2023-08-04T19:22:45.499127Z",
|
|
|
"done": true,
|
|
|
- "total_duration": 5589157167,
|
|
|
- "load_duration": 3013701500,
|
|
|
- "prompt_eval_count": 46,
|
|
|
- "prompt_eval_duration": 1160282000,
|
|
|
- "eval_count": 113,
|
|
|
- "eval_duration": 1325948000
|
|
|
+ "total_duration":4883583458,
|
|
|
+ "load_duration":1334875,
|
|
|
+ "prompt_eval_count":26,
|
|
|
+ "prompt_eval_duration":342546000,
|
|
|
+ "eval_count":282,
|
|
|
+ "eval_duration":4535599000
|
|
|
}
|
|
|
```
|
|
|
|
|
|
-#### Request (With History)
|
|
|
+#### Chat request (No streaming)
|
|
|
|
|
|
-Send a chat message with a conversation history.
|
|
|
+##### Request
|
|
|
+
|
|
|
+```shell
|
|
|
+curl http://localhost:11434/api/chat -d '{
|
|
|
+ "model": "llama2",
|
|
|
+ "messages": [
|
|
|
+ {
|
|
|
+ "role": "user",
|
|
|
+ "content": "why is the sky blue?"
|
|
|
+ }
|
|
|
+ ],
|
|
|
+ "stream": false
|
|
|
+}'
|
|
|
+```
|
|
|
+
|
|
|
+##### Response
|
|
|
+
|
|
|
+```json
|
|
|
+{
|
|
|
+ "model": "registry.ollama.ai/library/llama2:latest",
|
|
|
+ "created_at": "2023-12-12T14:13:43.416799Z",
|
|
|
+ "message": {
|
|
|
+ "role": "assistant",
|
|
|
+ "content": "Hello! How are you today?"
|
|
|
+ },
|
|
|
+ "done": true,
|
|
|
+ "total_duration": 5191566416,
|
|
|
+ "load_duration": 2154458,
|
|
|
+ "prompt_eval_count": 26,
|
|
|
+ "prompt_eval_duration": 383809000,
|
|
|
+ "eval_count": 298,
|
|
|
+ "eval_duration": 4799921000
|
|
|
+}
|
|
|
+```
|
|
|
+
|
|
|
+#### Chat request (With History)
|
|
|
+
|
|
|
+Send a chat message with a conversation history. You can use this same approach to start the conversation using multi-shot or chain-of-thought prompting.
|
|
|
+
|
|
|
+##### Request
|
|
|
|
|
|
```shell
|
|
|
curl http://localhost:11434/api/chat -d '{
|
|
@@ -414,7 +496,7 @@ curl http://localhost:11434/api/chat -d '{
|
|
|
}'
|
|
|
```
|
|
|
|
|
|
-#### Response
|
|
|
+##### Response
|
|
|
|
|
|
A stream of JSON objects is returned:
|
|
|
|
|
@@ -437,22 +519,24 @@ Final response:
|
|
|
"model": "llama2",
|
|
|
"created_at": "2023-08-04T19:22:45.499127Z",
|
|
|
"done": true,
|
|
|
- "total_duration": 5589157167,
|
|
|
- "load_duration": 3013701500,
|
|
|
- "prompt_eval_count": 46,
|
|
|
- "prompt_eval_duration": 1160282000,
|
|
|
- "eval_count": 113,
|
|
|
- "eval_duration": 1325948000
|
|
|
+ "total_duration":8113331500,
|
|
|
+ "load_duration":6396458,
|
|
|
+ "prompt_eval_count":61,
|
|
|
+ "prompt_eval_duration":398801000,
|
|
|
+ "eval_count":468,
|
|
|
+ "eval_duration":7701267000
|
|
|
}
|
|
|
```
|
|
|
|
|
|
-#### Request (with images)
|
|
|
+#### Chat request (with images)
|
|
|
+
|
|
|
+##### Request
|
|
|
|
|
|
Send a chat message with a conversation history.
|
|
|
|
|
|
```shell
|
|
|
curl http://localhost:11434/api/chat -d '{
|
|
|
- "model": "llama2",
|
|
|
+ "model": "llava",
|
|
|
"messages": [
|
|
|
{
|
|
|
"role": "user",
|
|
@@ -463,13 +547,34 @@ curl http://localhost:11434/api/chat -d '{
|
|
|
}'
|
|
|
```
|
|
|
|
|
|
+##### Response
|
|
|
+
|
|
|
+```json
|
|
|
+{
|
|
|
+ "model": "llava",
|
|
|
+ "created_at": "2023-12-13T22:42:50.203334Z",
|
|
|
+ "message": {
|
|
|
+ "role": "assistant",
|
|
|
+ "content": " The image features a cute, little pig with an angry facial expression. It's wearing a heart on its shirt and is waving in the air. This scene appears to be part of a drawing or sketching project.",
|
|
|
+ "images": null
|
|
|
+ },
|
|
|
+ "done": true,
|
|
|
+ "total_duration":1668506709,
|
|
|
+ "load_duration":1986209,
|
|
|
+ "prompt_eval_count":26,
|
|
|
+ "prompt_eval_duration":359682000,
|
|
|
+ "eval_count":83,
|
|
|
+ "eval_duration":1303285000
|
|
|
+}
|
|
|
+```
|
|
|
+
|
|
|
## Create a Model
|
|
|
|
|
|
```shell
|
|
|
POST /api/create
|
|
|
```
|
|
|
|
|
|
-Create a model from a [`Modelfile`](./modelfile.md). It is recommended to set `modelfile` to the content of the Modelfile rather than just set `path`. This is a requirement for remote create. Remote model creation should also create any file blobs, fields such as `FROM` and `ADAPTER`, explicitly with the server using [Create a Blob](#create-a-blob) and the value to the path indicated in the response.
|
|
|
+Create a model from a [`Modelfile`](./modelfile.md). It is recommended to set `modelfile` to the content of the Modelfile rather than just set `path`. This is a requirement for remote create. Remote model creation must also register any file blobs, such as those referenced by `FROM` and `ADAPTER` fields, explicitly with the server using [Create a Blob](#create-a-blob), setting the value to the path indicated in the response.
|
|
|
|
|
|
### Parameters
|
|
|
|
|
@@ -480,7 +585,11 @@ Create a model from a [`Modelfile`](./modelfile.md). It is recommended to set `m
|
|
|
|
|
|
### Examples
|
|
|
|
|
|
-#### Request
|
|
|
+#### Create a new model
|
|
|
+
|
|
|
+Create a new model from a `Modelfile`.
|
|
|
+
|
|
|
+##### Request
|
|
|
|
|
|
```shell
|
|
|
curl http://localhost:11434/api/create -d '{
|
|
@@ -489,14 +598,22 @@ curl http://localhost:11434/api/create -d '{
|
|
|
}'
|
|
|
```
|
|
|
|
|
|
-#### Response
|
|
|
+##### Response
|
|
|
|
|
|
-A stream of JSON objects. When finished, `status` is `success`.
|
|
|
+A stream of JSON objects. Notice that the final JSON object shows `"status": "success"`.
|
|
|
|
|
|
```json
|
|
|
-{
|
|
|
- "status": "parsing modelfile"
|
|
|
-}
|
|
|
+{"status":"reading model metadata"}
|
|
|
+{"status":"creating system layer"}
|
|
|
+{"status":"using already created layer sha256:22f7f8ef5f4c791c1b03d7eb414399294764d7cc82c7e94aa81a1feb80a983a2"}
|
|
|
+{"status":"using already created layer sha256:8c17c2ebb0ea011be9981cc3922db8ca8fa61e828c5d3f44cb6ae342bf80460b"}
|
|
|
+{"status":"using already created layer sha256:7c23fb36d80141c4ab8cdbb61ee4790102ebd2bf7aeff414453177d4f2110e5d"}
|
|
|
+{"status":"using already created layer sha256:2e0493f67d0c8c9c68a8aeacdf6a38a2151cb3c4c1d42accf296e19810527988"}
|
|
|
+{"status":"using already created layer sha256:2759286baa875dc22de5394b4a925701b1896a7e3f8e53275c36f75a877a82c9"}
|
|
|
+{"status":"writing layer sha256:df30045fe90f0d750db82a058109cecd6d4de9c90a3d75b19c09e5f64580bb42"}
|
|
|
+{"status":"writing layer sha256:f18a68eb09bf925bb1b669490407c1b1251c5db98dc4d3d81f3088498ea55690"}
|
|
|
+{"status":"writing manifest"}
|
|
|
+{"status":"success"}
|
|
|
```
|
|
|
|
|
|
### Check if a Blob Exists
|
|
@@ -505,7 +622,8 @@ A stream of JSON objects. When finished, `status` is `success`.
|
|
|
HEAD /api/blobs/:digest
|
|
|
```
|
|
|
|
|
|
-Check if a blob is known to the server.
|
|
|
+Ensures that the file blob used for a `FROM` or `ADAPTER` field exists on the server. This checks your Ollama server, not Ollama.ai.
|
|
|
+
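`:digest` is the SHA-256 digest of the blob file, prefixed with `sha256:` (the same value `sha256sum` reports). A minimal Python sketch for computing it (the helper name is illustrative):

```python
import hashlib

def file_sha256(path: str) -> str:
    """Compute the sha256 digest of a file, streamed in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return "sha256:" + h.hexdigest()
```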
|
|
|
|
|
|
#### Query Parameters
|
|
|
|
|
@@ -529,7 +647,7 @@ Return 200 OK if the blob exists, 404 Not Found if it does not.
|
|
|
POST /api/blobs/:digest
|
|
|
```
|
|
|
|
|
|
-Create a blob from a file. Returns the server file path.
|
|
|
+Create a blob from a file on the server. Returns the server file path.
|
|
|
|
|
|
#### Query Parameters
|
|
|
|
|
@@ -545,7 +663,7 @@ curl -T model.bin -X POST http://localhost:11434/api/blobs/sha256:29fdb92e57cf08
|
|
|
|
|
|
##### Response
|
|
|
|
|
|
-Return 201 Created if the blob was successfully created.
|
|
|
+Return 201 Created if the blob was successfully created, or 400 Bad Request if the digest is not as expected.
|
|
|
|
|
|
## List Local Models
|
|
|
|
|
@@ -571,14 +689,30 @@ A single JSON object will be returned.
|
|
|
{
|
|
|
"models": [
|
|
|
{
|
|
|
- "name": "llama2",
|
|
|
- "modified_at": "2023-08-02T17:02:23.713454393-07:00",
|
|
|
- "size": 3791730596
|
|
|
+ "name": "codellama:13b",
|
|
|
+ "modified_at": "2023-11-04T14:56:49.277302595-07:00",
|
|
|
+ "size": 7365960935,
|
|
|
+ "digest": "9f438cb9cd581fc025612d27f7c1a6669ff83a8bb0ed86c94fcf4c5440555697",
|
|
|
+ "details": {
|
|
|
+ "format": "gguf",
|
|
|
+ "family": "llama",
|
|
|
+ "families": null,
|
|
|
+ "parameter_size": "13B",
|
|
|
+ "quantization_level": "Q4_0"
|
|
|
+ }
|
|
|
},
|
|
|
{
|
|
|
- "name": "llama2:13b",
|
|
|
- "modified_at": "2023-08-08T12:08:38.093596297-07:00",
|
|
|
- "size": 7323310500
|
|
|
+ "name": "llama2:latest",
|
|
|
+ "modified_at": "2023-12-07T09:32:18.757212583-08:00",
|
|
|
+ "size": 3825819519,
|
|
|
+ "digest": "fe938a131f40e6f6d40083c9f0f430a515233eb2edaa6d72eb85c50d64f2300e",
|
|
|
+ "details": {
|
|
|
+ "format": "gguf",
|
|
|
+ "family": "llama",
|
|
|
+ "families": null,
|
|
|
+ "parameter_size": "7B",
|
|
|
+ "quantization_level": "Q4_0"
|
|
|
+ }
|
|
|
}
|
|
|
]
|
|
|
}
|
|
@@ -610,12 +744,12 @@ curl http://localhost:11434/api/show -d '{
|
|
|
|
|
|
```json
|
|
|
{
|
|
|
- "modelfile": "# Modelfile generated by \"ollama show\"\n# To build a new Modelfile based on this one, replace the FROM line with:\n# FROM llava:latest\n\nFROM mike/llava:latest\nTEMPLATE \"\"\"\nUSER:{{ .Prompt }}\nASSISTANT:\n\"\"\"\nPARAMETER num_ctx 4096",
|
|
|
- "parameters": "num_ctx 4096",
|
|
|
- "template": "\nUSER:{{ .Prompt }}\nASSISTANT:\n",
|
|
|
- "license:": "<license>",
|
|
|
+ "modelfile": "# Modelfile generated by \"ollama show\"\n# To build a new Modelfile based on this one, replace the FROM line with:\n# FROM llava:latest\n\nFROM /Users/matt/.ollama/models/blobs/sha256:200765e1283640ffbd013184bf496e261032fa75b99498a9613be4e94d63ad52\nTEMPLATE \"\"\"{{ .System }}\nUSER: {{ .Prompt }}\nASSSISTANT: \"\"\"\nPARAMETER num_ctx 4096\nPARAMETER stop \"\u003c/s\u003e\"\nPARAMETER stop \"USER:\"\nPARAMETER stop \"ASSSISTANT:\"",
|
|
|
+ "parameters": "num_ctx 4096\nstop \u003c/s\u003e\nstop USER:\nstop ASSSISTANT:",
|
|
|
+ "template": "{{ .System }}\nUSER: {{ .Prompt }}\nASSSISTANT: ",
|
|
|
"details": {
|
|
|
"format": "gguf",
|
|
|
+ "family": "llama",
|
|
|
"families": ["llama", "clip"],
|
|
|
"parameter_size": "7B",
|
|
|
"quantization_level": "Q4_0"
|
|
@@ -644,7 +778,7 @@ curl http://localhost:11434/api/copy -d '{
|
|
|
|
|
|
#### Response
|
|
|
|
|
|
-The only response is a 200 OK if successful.
|
|
|
+Returns a 200 OK if successful, or a 404 Not Found if the source model doesn't exist.
|
|
|
|
|
|
## Delete a Model
|
|
|
|
|
@@ -670,7 +804,7 @@ curl -X DELETE http://localhost:11434/api/delete -d '{
|
|
|
|
|
|
#### Response
|
|
|
|
|
|
-If successful, the only response is a 200 OK.
|
|
|
+Returns a 200 OK if successful, or a 404 Not Found if the model to be deleted doesn't exist.
|
|
|
|
|
|
## Pull a Model
|
|
|
|