|
@@ -17,7 +17,7 @@
|
|
|
|
|
|
### Model names
|
|
|
|
|
|
-Model names follow a `model:tag` format. Some examples are `orca-mini:3b-q4_1` and `llama2:70b`. The tag is optional and, if not provided, will default to `latest`. The tag is used to identify a specific version.
|
|
|
+Model names follow a `model:tag` format, where `model` can have an optional namespace such as `example/model`. Some examples are `orca-mini:3b-q4_1` and `llama2:70b`. The tag is optional and, if not provided, will default to `latest`. The tag is used to identify a specific version.
|
|
|
|
|
|
### Durations
|
|
|
|
|
@@ -25,7 +25,8 @@ All durations are returned in nanoseconds.
|
|
|
|
|
|
### Streaming responses
|
|
|
|
|
|
-Certain endpoints stream responses as JSON objects.
|
|
|
+Certain endpoints stream responses as JSON objects and can optionally return non-streamed responses.
|
|
|
+
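Each chunk of a streamed reply is a standalone JSON object on its own line. A minimal Python sketch of parsing such a stream (the sample payload is illustrative, not actual server output):

```python
import json

def parse_stream(body: str) -> list[dict]:
    """Split a newline-delimited JSON stream into individual objects."""
    return [json.loads(line) for line in body.splitlines() if line.strip()]

# Illustrative two-chunk stream; the final object carries "done": true.
sample = '{"response": "The", "done": false}\n{"response": "", "done": true}'
chunks = parse_stream(sample)
```

The final object in a stream has `done` set to `true` and carries the request statistics.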
|
|
|
|
|
|
## Generate a completion
|
|
|
|
|
@@ -39,7 +40,7 @@ Generate a response for a given prompt with a provided model. This is a streamin
|
|
|
|
|
|
- `model`: (required) the [model name](#model-names)
|
|
|
- `prompt`: the prompt to generate a response for
|
|
|
-- `images`: a list of base64-encoded images (for multimodal models such as `llava`)
|
|
|
+- `images`: (optional) a list of base64-encoded images (for multimodal models such as `llava`)
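The `images` entries are plain base64 strings, which any standard encoder can produce. A minimal Python sketch (the `encode_image` helper is illustrative, not part of the API):

```python
import base64

def encode_image(path: str) -> str:
    """Return the base64-encoded contents of an image file as a string."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("ascii")
```

The resulting string goes directly into the `images` array of the request body.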
|
|
|
|
|
|
Advanced parameters (optional):
|
|
|
|
|
@@ -51,15 +52,17 @@ Advanced parameters (optional):
|
|
|
- `stream`: if `false` the response will be returned as a single response object, rather than a stream of objects
|
|
|
- `raw`: if `true` no formatting will be applied to the prompt. You may choose to use the `raw` parameter if you are specifying a full templated prompt in your request to the API.
|
|
|
|
|
|
-### JSON mode
|
|
|
+#### JSON mode
|
|
|
|
|
|
-Enable JSON mode by setting the `format` parameter to `json`. This will structure the response as valid JSON. See the JSON mode [example](#request-json-mode) below.
|
|
|
+Enable JSON mode by setting the `format` parameter to `json`. This will structure the response as a valid JSON object. See the JSON mode [example](#generate-request-json-mode) below.
|
|
|
|
|
|
> Note: it's important to instruct the model to use JSON in the `prompt`. Otherwise, the model may generate large amounts of whitespace.
|
|
|
|
|
|
### Examples
|
|
|
|
|
|
-#### Request
|
|
|
+#### Generate request (Streaming)
|
|
|
+
|
|
|
+##### Request
|
|
|
|
|
|
```shell
|
|
|
curl http://localhost:11434/api/generate -d '{
|
|
@@ -68,7 +71,7 @@ curl http://localhost:11434/api/generate -d '{
|
|
|
}'
|
|
|
```
|
|
|
|
|
|
-#### Response
|
|
|
+##### Response
|
|
|
|
|
|
A stream of JSON objects is returned:
|
|
|
|
|
@@ -99,20 +102,22 @@ To calculate how fast the response is generated in tokens per second (token/s),
|
|
|
"model": "llama2",
|
|
|
"created_at": "2023-08-04T19:22:45.499127Z",
|
|
|
"response": "",
|
|
|
- "context": [1, 2, 3],
|
|
|
"done": true,
|
|
|
- "total_duration": 5589157167,
|
|
|
- "load_duration": 3013701500,
|
|
|
- "prompt_eval_count": 46,
|
|
|
- "prompt_eval_duration": 1160282000,
|
|
|
- "eval_count": 113,
|
|
|
- "eval_duration": 1325948000
|
|
|
+ "context": [1, 2, 3],
|
|
|
+ "total_duration":10706818083,
|
|
|
+ "load_duration":6338219291,
|
|
|
+ "prompt_eval_count":26,
|
|
|
+ "prompt_eval_duration":130079000,
|
|
|
+ "eval_count":259,
|
|
|
+ "eval_duration":4232710000
|
|
|
}
|
|
|
```
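Since all durations are reported in nanoseconds, the token/s figure can be derived directly from the final response fields. A minimal Python sketch (the helper name is illustrative):

```python
def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """token/s = eval_count / eval_duration, with the duration in nanoseconds."""
    return eval_count / eval_duration_ns * 1e9

# With the figures above: 259 tokens over 4232710000 ns is roughly 61 token/s.
speed = tokens_per_second(259, 4232710000)
```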
|
|
|
|
|
|
#### Request (No streaming)
|
|
|
|
|
|
-A response can be recieved in one reply when streaming is off.
|
|
|
+##### Request
|
|
|
+
|
|
|
+A response can be received in one reply when streaming is off.
|
|
|
|
|
|
```shell
|
|
|
curl http://localhost:11434/api/generate -d '{
|
|
@@ -122,7 +127,7 @@ curl http://localhost:11434/api/generate -d '{
|
|
|
}'
|
|
|
```
|
|
|
|
|
|
-#### Response
|
|
|
+##### Response
|
|
|
|
|
|
If `stream` is set to `false`, the response will be a single JSON object:
|
|
|
|
|
@@ -131,14 +136,66 @@ If `stream` is set to `false`, the response will be a single JSON object:
|
|
|
"model": "llama2",
|
|
|
"created_at": "2023-08-04T19:22:45.499127Z",
|
|
|
"response": "The sky is blue because it is the color of the sky.",
|
|
|
+ "done": true,
|
|
|
"context": [1, 2, 3],
|
|
|
+ "total_duration": 5043500667,
|
|
|
+ "load_duration": 5025959,
|
|
|
+ "prompt_eval_count": 26,
|
|
|
+ "prompt_eval_duration": 325953000,
|
|
|
+ "eval_count": 290,
|
|
|
+ "eval_duration": 4709213000
|
|
|
+}
|
|
|
+```
|
|
|
+
|
|
|
+#### Request (JSON mode)
|
|
|
+
|
|
|
+> When `format` is set to `json`, the output will always be a well-formed JSON object. It's important to also instruct the model to respond in JSON.
|
|
|
+
|
|
|
+##### Request
|
|
|
+
|
|
|
+```shell
|
|
|
+curl http://localhost:11434/api/generate -d '{
|
|
|
+ "model": "llama2",
|
|
|
+ "prompt": "What color is the sky at different times of the day? Respond using JSON",
|
|
|
+ "format": "json",
|
|
|
+ "stream": false
|
|
|
+}'
|
|
|
+```
|
|
|
+
|
|
|
+##### Response
|
|
|
+
|
|
|
+```json
|
|
|
+{
|
|
|
+ "model": "llama2",
|
|
|
+ "created_at": "2023-11-09T21:07:55.186497Z",
|
|
|
+ "response": "{\n\"morning\": {\n\"color\": \"blue\"\n},\n\"noon\": {\n\"color\": \"blue-gray\"\n},\n\"afternoon\": {\n\"color\": \"warm gray\"\n},\n\"evening\": {\n\"color\": \"orange\"\n}\n}\n",
|
|
|
"done": true,
|
|
|
- "total_duration": 5589157167,
|
|
|
- "load_duration": 3013701500,
|
|
|
- "prompt_eval_count": 46,
|
|
|
- "prompt_eval_duration": 1160282000,
|
|
|
- "eval_count": 13,
|
|
|
- "eval_duration": 1325948000
|
|
|
+ "context": [1, 2, 3],
|
|
|
+ "total_duration": 4648158584,
|
|
|
+ "load_duration": 4071084,
|
|
|
+ "prompt_eval_count": 36,
|
|
|
+ "prompt_eval_duration": 439038000,
|
|
|
+ "eval_count": 180,
|
|
|
+ "eval_duration": 4196918000
|
|
|
+}
|
|
|
+```
|
|
|
+
|
|
|
+The value of `response` will be a string containing JSON similar to:
|
|
|
+
|
|
|
+```json
|
|
|
+{
|
|
|
+ "morning": {
|
|
|
+ "color": "blue"
|
|
|
+ },
|
|
|
+ "noon": {
|
|
|
+ "color": "blue-gray"
|
|
|
+ },
|
|
|
+ "afternoon": {
|
|
|
+ "color": "warm gray"
|
|
|
+ },
|
|
|
+ "evening": {
|
|
|
+ "color": "orange"
|
|
|
+ }
|
|
|
}
|
|
|
```
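Because `response` is delivered as a string, extracting the structured data takes one more decoding step. A minimal Python sketch (`data` stands in for the full API response, abridged here):

```python
import json

# Abridged stand-in for the parsed API response shown above.
data = {"response": "{\"morning\": {\"color\": \"blue\"}}", "done": True}

# The outer object is already parsed; the "response" field holds JSON as a string.
colors = json.loads(data["response"])
```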
|
|
|
|
|
@@ -146,6 +203,8 @@ If `stream` is set to `false`, the response will be a single JSON object:
|
|
|
|
|
|
To submit images to multimodal models such as `llava` or `bakllava`, provide a list of base64-encoded `images`:
|
|
|
|
|
|
+#### Request
|
|
|
+
|
|
|
```shell
|
|
|
curl http://localhost:11434/api/generate -d '{
|
|
|
"model": "llava",
|
|
@@ -162,20 +221,21 @@ curl http://localhost:11434/api/generate -d '{
|
|
|
"model": "llava",
|
|
|
"created_at": "2023-11-03T15:36:02.583064Z",
|
|
|
"response": "A happy cartoon character, which is cute and cheerful.",
|
|
|
- "context": [1, 2, 3],
|
|
|
"done": true,
|
|
|
- "total_duration": 14648695333,
|
|
|
- "load_duration": 3302671417,
|
|
|
- "prompt_eval_count": 14,
|
|
|
- "prompt_eval_duration": 286243000,
|
|
|
- "eval_count": 129,
|
|
|
- "eval_duration": 10931424000
|
|
|
+ "context": [1, 2, 3],
|
|
|
+ "total_duration": 2938432250,
|
|
|
+ "load_duration": 2559292,
|
|
|
+ "prompt_eval_count": 1,
|
|
|
+ "prompt_eval_duration": 2195557000,
|
|
|
+ "eval_count": 44,
|
|
|
+ "eval_duration": 736432000
|
|
|
}
|
|
|
```
|
|
|
|
|
|
#### Request (Raw Mode)
|
|
|
|
|
|
-In some cases you may wish to bypass the templating system and provide a full prompt. In this case, you can use the `raw` parameter to disable formatting.
|
|
|
+In some cases, you may wish to bypass the templating system and provide a full prompt; you can use the `raw` parameter to disable templating. Also note that raw mode will not return a context.
|
|
|
+
+##### Request
|
|
|
|
|
|
```shell
|
|
|
curl http://localhost:11434/api/generate -d '{
|
|
@@ -186,75 +246,29 @@ curl http://localhost:11434/api/generate -d '{
|
|
|
}'
|
|
|
```
|
|
|
|
|
|
-#### Response
|
|
|
+##### Response
|
|
|
|
|
|
```json
|
|
|
{
|
|
|
"model": "mistral",
|
|
|
"created_at": "2023-11-03T15:36:02.583064Z",
|
|
|
"response": " The sky appears blue because of a phenomenon called Rayleigh scattering.",
|
|
|
- "context": [1, 2, 3],
|
|
|
"done": true,
|
|
|
- "total_duration": 14648695333,
|
|
|
- "load_duration": 3302671417,
|
|
|
+ "total_duration": 8493852375,
|
|
|
+ "load_duration": 6589624375,
|
|
|
"prompt_eval_count": 14,
|
|
|
- "prompt_eval_duration": 286243000,
|
|
|
- "eval_count": 129,
|
|
|
- "eval_duration": 10931424000
|
|
|
-}
|
|
|
-```
|
|
|
-
|
|
|
-#### Request (JSON mode)
|
|
|
-
|
|
|
-```shell
|
|
|
-curl http://localhost:11434/api/generate -d '{
|
|
|
- "model": "llama2",
|
|
|
- "prompt": "What color is the sky at different times of the day? Respond using JSON",
|
|
|
- "format": "json",
|
|
|
- "stream": false
|
|
|
-}'
|
|
|
-```
|
|
|
-
|
|
|
-#### Response
|
|
|
-
|
|
|
-```json
|
|
|
-{
|
|
|
- "model": "llama2",
|
|
|
- "created_at": "2023-11-09T21:07:55.186497Z",
|
|
|
- "response": "{\n\"morning\": {\n\"color\": \"blue\"\n},\n\"noon\": {\n\"color\": \"blue-gray\"\n},\n\"afternoon\": {\n\"color\": \"warm gray\"\n},\n\"evening\": {\n\"color\": \"orange\"\n}\n}\n",
|
|
|
- "done": true,
|
|
|
- "total_duration": 4661289125,
|
|
|
- "load_duration": 1714434500,
|
|
|
- "prompt_eval_count": 36,
|
|
|
- "prompt_eval_duration": 264132000,
|
|
|
- "eval_count": 75,
|
|
|
- "eval_duration": 2112149000
|
|
|
+ "prompt_eval_duration": 119039000,
|
|
|
+ "eval_count": 110,
|
|
|
+ "eval_duration": 1779061000
|
|
|
}
|
|
|
```
|
|
|
|
|
|
-The value of `response` will be a string containing JSON similar to:
|
|
|
-
|
|
|
-```json
|
|
|
-{
|
|
|
- "morning": {
|
|
|
- "color": "blue"
|
|
|
- },
|
|
|
- "noon": {
|
|
|
- "color": "blue-gray"
|
|
|
- },
|
|
|
- "afternoon": {
|
|
|
- "color": "warm gray"
|
|
|
- },
|
|
|
- "evening": {
|
|
|
- "color": "orange"
|
|
|
- }
|
|
|
-}
|
|
|
-```
|
|
|
-
|
|
|
-#### Request (With options)
|
|
|
+#### Generate request (With options)
|
|
|
|
|
|
If you want to set custom options for the model at runtime rather than in the Modelfile, you can do so with the `options` parameter. This example sets every available option, but you can set any of them individually and omit the ones you do not want to override.
|
|
|
|
|
|
+##### Request
|
|
|
+
|
|
|
```shell
|
|
|
curl http://localhost:11434/api/generate -d '{
|
|
|
"model": "llama2",
|
|
@@ -297,7 +311,7 @@ curl http://localhost:11434/api/generate -d '{
|
|
|
}'
|
|
|
```
|
|
|
|
|
|
-#### Response
|
|
|
+##### Response
|
|
|
|
|
|
```json
|
|
|
{
|
|
@@ -305,12 +319,38 @@ curl http://localhost:11434/api/generate -d '{
|
|
|
"created_at": "2023-08-04T19:22:45.499127Z",
|
|
|
"response": "The sky is blue because it is the color of the sky.",
|
|
|
"done": true,
|
|
|
- "total_duration": 5589157167,
|
|
|
- "load_duration": 3013701500,
|
|
|
- "prompt_eval_count": 46,
|
|
|
- "prompt_eval_duration": 1160282000,
|
|
|
- "eval_count": 13,
|
|
|
- "eval_duration": 1325948000
|
|
|
+ "context": [1, 2, 3],
|
|
|
+ "total_duration": 4935886791,
|
|
|
+ "load_duration": 534986708,
|
|
|
+ "prompt_eval_count": 26,
|
|
|
+ "prompt_eval_duration": 107345000,
|
|
|
+ "eval_count": 237,
|
|
|
+ "eval_duration": 4289432000
|
|
|
+}
|
|
|
+```
|
|
|
+
|
|
|
+#### Load a model
|
|
|
+
|
|
|
+If an empty prompt is provided, the model will be loaded into memory.
|
|
|
+
|
|
|
+##### Request
|
|
|
+
|
|
|
+```shell
|
|
|
+curl http://localhost:11434/api/generate -d '{
|
|
|
+ "model": "llama2"
|
|
|
+}'
|
|
|
+```
|
|
|
+
|
|
|
+##### Response
|
|
|
+
|
|
|
+A single JSON object is returned:
|
|
|
+
|
|
|
+```json
|
|
|
+{
|
|
|
+ "model":"llama2",
|
|
|
+ "created_at":"2023-12-18T19:52:07.071755Z",
|
|
|
+ "response":"",
|
|
|
+ "done":true
|
|
|
}
|
|
|
```
|
|
|
|
|
@@ -320,7 +360,7 @@ curl http://localhost:11434/api/generate -d '{
|
|
|
POST /api/chat
|
|
|
```
|
|
|
|
|
|
-Generate the next message in a chat with a provided model. This is a streaming endpoint, so there will be a series of responses. The final response object will include statistics and additional data from the request.
|
|
|
+Generate the next message in a chat with a provided model. This is a streaming endpoint, so there will be a series of responses. Streaming can be disabled using `"stream": false`. The final response object will include statistics and additional data from the request.
|
|
|
|
|
|
### Parameters
|
|
|
|
|
@@ -342,7 +382,9 @@ Advanced parameters (optional):
|
|
|
|
|
|
### Examples
|
|
|
|
|
|
-#### Request
|
|
|
+#### Chat Request (Streaming)
|
|
|
+
|
|
|
+##### Request
|
|
|
|
|
|
Send a chat message with a streaming response.
|
|
|
|
|
@@ -358,7 +400,7 @@ curl http://localhost:11434/api/chat -d '{
|
|
|
}'
|
|
|
```
|
|
|
|
|
|
-#### Response
|
|
|
+##### Response
|
|
|
|
|
|
A stream of JSON objects is returned:
|
|
|
|
|
@@ -368,7 +410,8 @@ A stream of JSON objects is returned:
|
|
|
"created_at": "2023-08-04T08:52:19.385406455-07:00",
|
|
|
"message": {
|
|
|
      "role": "assistant",
|
|
|
- "content": "The"
|
|
|
+ "content": "The",
|
|
|
+ "images": null
|
|
|
},
|
|
|
"done": false
|
|
|
}
|
|
@@ -381,18 +424,57 @@ Final response:
|
|
|
"model": "llama2",
|
|
|
"created_at": "2023-08-04T19:22:45.499127Z",
|
|
|
"done": true,
|
|
|
- "total_duration": 5589157167,
|
|
|
- "load_duration": 3013701500,
|
|
|
- "prompt_eval_count": 46,
|
|
|
- "prompt_eval_duration": 1160282000,
|
|
|
- "eval_count": 113,
|
|
|
- "eval_duration": 1325948000
|
|
|
+ "total_duration":4883583458,
|
|
|
+ "load_duration":1334875,
|
|
|
+ "prompt_eval_count":26,
|
|
|
+ "prompt_eval_duration":342546000,
|
|
|
+ "eval_count":282,
|
|
|
+ "eval_duration":4535599000
|
|
|
}
|
|
|
```
|
|
|
|
|
|
-#### Request (With History)
|
|
|
+#### Chat request (No streaming)
|
|
|
|
|
|
-Send a chat message with a conversation history.
|
|
|
+##### Request
|
|
|
+
|
|
|
+```shell
|
|
|
+curl http://localhost:11434/api/chat -d '{
|
|
|
+ "model": "llama2",
|
|
|
+ "messages": [
|
|
|
+ {
|
|
|
+ "role": "user",
|
|
|
+ "content": "why is the sky blue?"
|
|
|
+ }
|
|
|
+ ],
|
|
|
+ "stream": false
|
|
|
+}'
|
|
|
+```
|
|
|
+
|
|
|
+##### Response
|
|
|
+
|
|
|
+```json
|
|
|
+{
|
|
|
+ "model": "registry.ollama.ai/library/llama2:latest",
|
|
|
+ "created_at": "2023-12-12T14:13:43.416799Z",
|
|
|
+ "message": {
|
|
|
+ "role": "assistant",
|
|
|
+ "content": "Hello! How are you today?"
|
|
|
+ },
|
|
|
+ "done": true,
|
|
|
+ "total_duration": 5191566416,
|
|
|
+ "load_duration": 2154458,
|
|
|
+ "prompt_eval_count": 26,
|
|
|
+ "prompt_eval_duration": 383809000,
|
|
|
+ "eval_count": 298,
|
|
|
+ "eval_duration": 4799921000
|
|
|
+}
|
|
|
+```
|
|
|
+
|
|
|
+#### Chat request (With History)
|
|
|
+
|
|
|
+Send a chat message with a conversation history. You can use this same approach to start the conversation using multi-shot or chain-of-thought prompting.
|
|
|
+
|
|
|
+##### Request
|
|
|
|
|
|
```shell
|
|
|
curl http://localhost:11434/api/chat -d '{
|
|
@@ -414,7 +496,7 @@ curl http://localhost:11434/api/chat -d '{
|
|
|
}'
|
|
|
```
|
|
|
|
|
|
-#### Response
|
|
|
+##### Response
|
|
|
|
|
|
A stream of JSON objects is returned:
|
|
|
|
|
@@ -437,22 +519,24 @@ Final response:
|
|
|
"model": "llama2",
|
|
|
"created_at": "2023-08-04T19:22:45.499127Z",
|
|
|
"done": true,
|
|
|
- "total_duration": 5589157167,
|
|
|
- "load_duration": 3013701500,
|
|
|
- "prompt_eval_count": 46,
|
|
|
- "prompt_eval_duration": 1160282000,
|
|
|
- "eval_count": 113,
|
|
|
- "eval_duration": 1325948000
|
|
|
+ "total_duration":8113331500,
|
|
|
+ "load_duration":6396458,
|
|
|
+ "prompt_eval_count":61,
|
|
|
+ "prompt_eval_duration":398801000,
|
|
|
+ "eval_count":468,
|
|
|
+ "eval_duration":7701267000
|
|
|
}
|
|
|
```
|
|
|
|
|
|
-#### Request (with images)
|
|
|
+#### Chat request (with images)
|
|
|
+
|
|
|
+##### Request
|
|
|
|
|
|
Send a chat message with a conversation history.
|
|
|
|
|
|
```shell
|
|
|
curl http://localhost:11434/api/chat -d '{
|
|
|
- "model": "llama2",
|
|
|
+ "model": "llava",
|
|
|
"messages": [
|
|
|
{
|
|
|
"role": "user",
|
|
@@ -463,13 +547,34 @@ curl http://localhost:11434/api/chat -d '{
|
|
|
}'
|
|
|
```
|
|
|
|
|
|
+##### Response
|
|
|
+
|
|
|
+```json
|
|
|
+{
|
|
|
+ "model": "llava",
|
|
|
+ "created_at": "2023-12-13T22:42:50.203334Z",
|
|
|
+ "message": {
|
|
|
+ "role": "assistant",
|
|
|
+ "content": " The image features a cute, little pig with an angry facial expression. It's wearing a heart on its shirt and is waving in the air. This scene appears to be part of a drawing or sketching project.",
|
|
|
+ "images": null
|
|
|
+ },
|
|
|
+ "done": true,
|
|
|
+ "total_duration":1668506709,
|
|
|
+ "load_duration":1986209,
|
|
|
+ "prompt_eval_count":26,
|
|
|
+ "prompt_eval_duration":359682000,
|
|
|
+ "eval_count":83,
|
|
|
+ "eval_duration":1303285000
|
|
|
+}
|
|
|
+```
|
|
|
+
|
|
|
## Create a Model
|
|
|
|
|
|
```shell
|
|
|
POST /api/create
|
|
|
```
|
|
|
|
|
|
-Create a model from a [`Modelfile`](./modelfile.md). It is recommended to set `modelfile` to the content of the Modelfile rather than just set `path`. This is a requirement for remote create. Remote model creation should also create any file blobs, fields such as `FROM` and `ADAPTER`, explicitly with the server using [Create a Blob](#create-a-blob) and the value to the path indicated in the response.
|
|
|
+Create a model from a [`Modelfile`](./modelfile.md). It is recommended to set `modelfile` to the content of the Modelfile rather than just set `path`. This is a requirement for remote create. Remote model creation must also register any file blobs, such as those referenced by `FROM` and `ADAPTER` fields, explicitly with the server using [Create a Blob](#create-a-blob), setting the value to the path indicated in the response.
|
|
|
|
|
|
### Parameters
|
|
|
|
|
@@ -480,7 +585,11 @@ Create a model from a [`Modelfile`](./modelfile.md). It is recommended to set `m
|
|
|
|
|
|
### Examples
|
|
|
|
|
|
-#### Request
|
|
|
+#### Create a new model
|
|
|
+
|
|
|
+Create a new model from a `Modelfile`.
|
|
|
+
|
|
|
+##### Request
|
|
|
|
|
|
```shell
|
|
|
curl http://localhost:11434/api/create -d '{
|
|
@@ -489,14 +598,22 @@ curl http://localhost:11434/api/create -d '{
|
|
|
}'
|
|
|
```
|
|
|
|
|
|
-#### Response
|
|
|
+##### Response
|
|
|
|
|
|
-A stream of JSON objects. When finished, `status` is `success`.
|
|
|
+A stream of JSON objects. Notice that the final JSON object shows `"status": "success"`.
|
|
|
|
|
|
```json
|
|
|
-{
|
|
|
- "status": "parsing modelfile"
|
|
|
-}
|
|
|
+{"status":"reading model metadata"}
|
|
|
+{"status":"creating system layer"}
|
|
|
+{"status":"using already created layer sha256:22f7f8ef5f4c791c1b03d7eb414399294764d7cc82c7e94aa81a1feb80a983a2"}
|
|
|
+{"status":"using already created layer sha256:8c17c2ebb0ea011be9981cc3922db8ca8fa61e828c5d3f44cb6ae342bf80460b"}
|
|
|
+{"status":"using already created layer sha256:7c23fb36d80141c4ab8cdbb61ee4790102ebd2bf7aeff414453177d4f2110e5d"}
|
|
|
+{"status":"using already created layer sha256:2e0493f67d0c8c9c68a8aeacdf6a38a2151cb3c4c1d42accf296e19810527988"}
|
|
|
+{"status":"using already created layer sha256:2759286baa875dc22de5394b4a925701b1896a7e3f8e53275c36f75a877a82c9"}
|
|
|
+{"status":"writing layer sha256:df30045fe90f0d750db82a058109cecd6d4de9c90a3d75b19c09e5f64580bb42"}
|
|
|
+{"status":"writing layer sha256:f18a68eb09bf925bb1b669490407c1b1251c5db98dc4d3d81f3088498ea55690"}
|
|
|
+{"status":"writing manifest"}
|
|
|
+{"status":"success"}
|
|
|
```
|
|
|
|
|
|
### Check if a Blob Exists
|
|
@@ -505,7 +622,8 @@ A stream of JSON objects. When finished, `status` is `success`.
|
|
|
HEAD /api/blobs/:digest
|
|
|
```
|
|
|
|
|
|
-Check if a blob is known to the server.
|
|
|
+Ensures that the file blob used for a `FROM` or `ADAPTER` field exists on the server. This checks your Ollama server, not Ollama.ai.
|
|
|
+
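`:digest` is the SHA-256 digest of the blob file, prefixed with `sha256:` (the same value `sha256sum` reports). A minimal Python sketch for computing it (the helper name is illustrative):

```python
import hashlib

def file_sha256(path: str) -> str:
    """Compute the sha256 digest of a file, streamed in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return "sha256:" + h.hexdigest()
```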
|
|
|
|
|
|
#### Query Parameters
|
|
|
|
|
@@ -529,7 +647,7 @@ Return 200 OK if the blob exists, 404 Not Found if it does not.
|
|
|
POST /api/blobs/:digest
|
|
|
```
|
|
|
|
|
|
-Create a blob from a file. Returns the server file path.
|
|
|
+Create a blob from a file on the server. Returns the server file path.
|
|
|
|
|
|
#### Query Parameters
|
|
|
|
|
@@ -545,7 +663,7 @@ curl -T model.bin -X POST http://localhost:11434/api/blobs/sha256:29fdb92e57cf08
|
|
|
|
|
|
##### Response
|
|
|
|
|
|
-Return 201 Created if the blob was successfully created.
|
|
|
+Return 201 Created if the blob was successfully created, or 400 Bad Request if the digest is not as expected.
|
|
|
|
|
|
## List Local Models
|
|
|
|
|
@@ -571,14 +689,30 @@ A single JSON object will be returned.
|
|
|
{
|
|
|
"models": [
|
|
|
{
|
|
|
- "name": "llama2",
|
|
|
- "modified_at": "2023-08-02T17:02:23.713454393-07:00",
|
|
|
- "size": 3791730596
|
|
|
+ "name": "codellama:13b",
|
|
|
+ "modified_at": "2023-11-04T14:56:49.277302595-07:00",
|
|
|
+ "size": 7365960935,
|
|
|
+ "digest": "9f438cb9cd581fc025612d27f7c1a6669ff83a8bb0ed86c94fcf4c5440555697",
|
|
|
+ "details": {
|
|
|
+ "format": "gguf",
|
|
|
+ "family": "llama",
|
|
|
+ "families": null,
|
|
|
+ "parameter_size": "13B",
|
|
|
+ "quantization_level": "Q4_0"
|
|
|
+ }
|
|
|
},
|
|
|
{
|
|
|
- "name": "llama2:13b",
|
|
|
- "modified_at": "2023-08-08T12:08:38.093596297-07:00",
|
|
|
- "size": 7323310500
|
|
|
+ "name": "llama2:latest",
|
|
|
+ "modified_at": "2023-12-07T09:32:18.757212583-08:00",
|
|
|
+ "size": 3825819519,
|
|
|
+ "digest": "fe938a131f40e6f6d40083c9f0f430a515233eb2edaa6d72eb85c50d64f2300e",
|
|
|
+ "details": {
|
|
|
+ "format": "gguf",
|
|
|
+ "family": "llama",
|
|
|
+ "families": null,
|
|
|
+ "parameter_size": "7B",
|
|
|
+ "quantization_level": "Q4_0"
|
|
|
+ }
|
|
|
}
|
|
|
]
|
|
|
}
|
|
@@ -610,12 +744,12 @@ curl http://localhost:11434/api/show -d '{
|
|
|
|
|
|
```json
|
|
|
{
|
|
|
- "modelfile": "# Modelfile generated by \"ollama show\"\n# To build a new Modelfile based on this one, replace the FROM line with:\n# FROM llava:latest\n\nFROM mike/llava:latest\nTEMPLATE \"\"\"\nUSER:{{ .Prompt }}\nASSISTANT:\n\"\"\"\nPARAMETER num_ctx 4096",
|
|
|
- "parameters": "num_ctx 4096",
|
|
|
- "template": "\nUSER:{{ .Prompt }}\nASSISTANT:\n",
|
|
|
- "license:": "<license>",
|
|
|
+ "modelfile": "# Modelfile generated by \"ollama show\"\n# To build a new Modelfile based on this one, replace the FROM line with:\n# FROM llava:latest\n\nFROM /Users/matt/.ollama/models/blobs/sha256:200765e1283640ffbd013184bf496e261032fa75b99498a9613be4e94d63ad52\nTEMPLATE \"\"\"{{ .System }}\nUSER: {{ .Prompt }}\nASSSISTANT: \"\"\"\nPARAMETER num_ctx 4096\nPARAMETER stop \"\u003c/s\u003e\"\nPARAMETER stop \"USER:\"\nPARAMETER stop \"ASSSISTANT:\"",
|
|
|
+ "parameters": "num_ctx 4096\nstop \u003c/s\u003e\nstop USER:\nstop ASSSISTANT:",
|
|
|
+ "template": "{{ .System }}\nUSER: {{ .Prompt }}\nASSSISTANT: ",
|
|
|
"details": {
|
|
|
"format": "gguf",
|
|
|
+ "family": "llama",
|
|
|
"families": ["llama", "clip"],
|
|
|
"parameter_size": "7B",
|
|
|
"quantization_level": "Q4_0"
|
|
@@ -644,7 +778,7 @@ curl http://localhost:11434/api/copy -d '{
|
|
|
|
|
|
#### Response
|
|
|
|
|
|
-The only response is a 200 OK if successful.
|
|
|
+Returns a 200 OK if successful, or a 404 Not Found if the source model doesn't exist.
|
|
|
|
|
|
## Delete a Model
|
|
|
|
|
@@ -670,7 +804,7 @@ curl -X DELETE http://localhost:11434/api/delete -d '{
|
|
|
|
|
|
#### Response
|
|
|
|
|
|
-If successful, the only response is a 200 OK.
|
|
|
+Returns a 200 OK if successful, or a 404 Not Found if the model to be deleted doesn't exist.
|
|
|
|
|
|
## Pull a Model
|
|
|
|