
docs: improve syntax highlighting in code blocks (#8854)

Azis Alvriyanto 2 months ago
commit b901a712c6
16 changed files with 158 additions and 127 deletions
  1. README.md (+23 -21)
  2. api/examples/README.md (+2 -1)
  3. app/README.md (+1 -1)
  4. docs/api.md (+17 -16)
  5. docs/development.md (+10 -10)
  6. docs/docker.md (+27 -23)
  7. docs/faq.md (+14 -4)
  8. docs/import.md (+2 -2)
  9. docs/linux.md (+1 -1)
  10. docs/modelfile.md (+35 -29)
  11. docs/openai.md (+8 -5)
  12. docs/troubleshooting.md (+7 -4)
  13. docs/windows.md (+1 -0)
  14. llama/README.md (+5 -5)
  15. llama/runner/README.md (+3 -3)
  16. macapp/README.md (+2 -2)

+ 23 - 21
README.md

@@ -18,7 +18,7 @@ Get up and running with large language models.

 ### Linux

-```
+```shell
 curl -fsSL https://ollama.com/install.sh | sh
 ```

@@ -42,7 +42,7 @@ The official [Ollama Docker image](https://hub.docker.com/r/ollama/ollama) `olla

 To run and chat with [Llama 3.2](https://ollama.com/library/llama3.2):

-```
+```shell
 ollama run llama3.2
 ```

@@ -92,13 +92,13 @@ Ollama supports importing GGUF models in the Modelfile:

 2. Create the model in Ollama

-   ```
+   ```shell
    ollama create example -f Modelfile
    ```

 3. Run the model

-   ```
+   ```shell
    ollama run example
    ```

@@ -110,7 +110,7 @@ See the [guide](docs/import.md) on importing models for more information.

 Models from the Ollama library can be customized with a prompt. For example, to customize the `llama3.2` model:

-```
+```shell
 ollama pull llama3.2
 ```

@@ -145,13 +145,13 @@ For more information on working with a Modelfile, see the [Modelfile](docs/model

 `ollama create` is used to create a model from a Modelfile.

-```
+```shell
 ollama create mymodel -f ./Modelfile
 ```

 ### Pull a model

-```
+```shell
 ollama pull llama3.2
 ```

@@ -159,13 +159,13 @@ ollama pull llama3.2

 ### Remove a model

-```
+```shell
 ollama rm llama3.2
 ```

 ### Copy a model

-```
+```shell
 ollama cp llama3.2 my-model
 ```

@@ -184,37 +184,39 @@ I'm a basic program that prints the famous "Hello, world!" message to the consol

 ```
 ollama run llava "What's in this image? /Users/jmorgan/Desktop/smile.png"
-The image features a yellow smiley face, which is likely the central focus of the picture.
 ```

+> **Output**: The image features a yellow smiley face, which is likely the central focus of the picture.
+
 ### Pass the prompt as an argument

+```shell
+ollama run llama3.2 "Summarize this file: $(cat README.md)"
 ```
-$ ollama run llama3.2 "Summarize this file: $(cat README.md)"
- Ollama is a lightweight, extensible framework for building and running language models on the local machine. It provides a simple API for creating, running, and managing models, as well as a library of pre-built models that can be easily used in a variety of applications.
-```
+
+> **Output**: Ollama is a lightweight, extensible framework for building and running language models on the local machine. It provides a simple API for creating, running, and managing models, as well as a library of pre-built models that can be easily used in a variety of applications.

 ### Show model information

-```
+```shell
 ollama show llama3.2
 ```

 ### List models on your computer

-```
+```shell
 ollama list
 ```

 ### List which models are currently loaded

-```
+```shell
 ollama ps
 ```

 ### Stop a model which is currently running

-```
+```shell
 ollama stop llama3.2
 ```

@@ -230,13 +232,13 @@ See the [developer guide](https://github.com/ollama/ollama/blob/main/docs/develo

 Next, start the server:

-```
+```shell
 ./ollama serve
 ```

 Finally, in a separate shell, run a model:

-```
+```shell
 ./ollama run llama3.2
 ```

@@ -246,7 +248,7 @@ Ollama has a REST API for running and managing models.

 ### Generate a response

-```
+```shell
 curl http://localhost:11434/api/generate -d '{
   "model": "llama3.2",
   "prompt":"Why is the sky blue?"
@@ -255,7 +257,7 @@ curl http://localhost:11434/api/generate -d '{

 ### Chat with a model

-```
+```shell
 curl http://localhost:11434/api/chat -d '{
   "model": "llama3.2",
   "messages": [

+ 2 - 1
api/examples/README.md

@@ -2,9 +2,10 @@

 Run the examples in this directory with:

-```
+```shell
 go run example_name/main.go
 ```
+
 ## Chat - Chat with a model
 - [chat/main.go](chat/main.go)


+ 1 - 1
app/README.md

@@ -17,6 +17,6 @@ If you want to build the installer, youll need to install
 In the top directory of this repo, run the following powershell script
 to build the ollama CLI, ollama app, and ollama installer.

-```
+```powershell
 powershell -ExecutionPolicy Bypass -File .\scripts\build_windows.ps1
 ```

+ 17 - 16
docs/api.md

@@ -31,7 +31,7 @@ Certain endpoints stream responses as JSON objects. Streaming can be disabled by

 ## Generate a completion

-```shell
+```
 POST /api/generate
 ```

@@ -485,7 +485,7 @@ A single JSON object is returned:

 ## Generate a chat completion

-```shell
+```
 POST /api/chat
 ```

@@ -878,6 +878,7 @@ curl http://localhost:11434/api/chat -d '{
 ```

 ##### Response
+
 ```json
 {
   "model": "llama3.2",
@@ -924,7 +925,7 @@ A single JSON object is returned:

 ## Create a Model

-```shell
+```
 POST /api/create
 ```

@@ -1020,7 +1021,7 @@ curl http://localhost:11434/api/create -d '{

 A stream of JSON objects is returned:

-```
+```json
 {"status":"quantizing F16 model to Q4_K_M"}
 {"status":"creating new layer sha256:667b0c1932bc6ffc593ed1d03f895bf2dc8dc6df21db3042284a6f4416b06a29"}
 {"status":"using existing layer sha256:11ce4ee3e170f6adebac9a991c22e22ab3f8530e154ee669954c4bc73061c258"}
@@ -1051,7 +1052,7 @@ curl http://localhost:11434/api/create -d '{

 A stream of JSON objects is returned:

-```
+```json
 {"status":"parsing GGUF"}
 {"status":"using existing layer sha256:432f310a77f4650a88d0fd59ecdd7cebed8d684bafea53cbff0473542964f0c3"}
 {"status":"writing manifest"}
@@ -1118,7 +1119,7 @@ Return 200 OK if the blob exists, 404 Not Found if it does not.

 ## Push a Blob

-```shell
+```
 POST /api/blobs/:digest
 ```

@@ -1142,7 +1143,7 @@ Return 201 Created if the blob was successfully created, 400 Bad Request if the

 ## List Local Models

-```shell
+```
 GET /api/tags
 ```

@@ -1195,7 +1196,7 @@ A single JSON object will be returned.

 ## Show Model Information

-```shell
+```
 POST /api/show
 ```

@@ -1261,7 +1262,7 @@ curl http://localhost:11434/api/show -d '{

 ## Copy a Model

-```shell
+```
 POST /api/copy
 ```

@@ -1284,7 +1285,7 @@ Returns a 200 OK if successful, or a 404 Not Found if the source model doesn't e

 ## Delete a Model

-```shell
+```
 DELETE /api/delete
 ```

@@ -1310,7 +1311,7 @@ Returns a 200 OK if successful, 404 Not Found if the model to be deleted doesn't

 ## Pull a Model

-```shell
+```
 POST /api/pull
 ```

@@ -1382,7 +1383,7 @@ if `stream` is set to false, then the response is a single JSON object:

 ## Push a Model

-```shell
+```
 POST /api/push
 ```

@@ -1447,7 +1448,7 @@ If `stream` is set to `false`, then the response is a single JSON object:

 ## Generate Embeddings

-```shell
+```
 POST /api/embed
 ```

@@ -1515,7 +1516,7 @@ curl http://localhost:11434/api/embed -d '{
 ```

 ## List Running Models
-```shell
+```
 GET /api/ps
 ```

@@ -1562,7 +1563,7 @@ A single JSON object will be returned.

 > Note: this endpoint has been superseded by `/api/embed`

-```shell
+```
 POST /api/embeddings
 ```

@@ -1602,7 +1603,7 @@ curl http://localhost:11434/api/embeddings -d '{

 ## Version

-```shell
+```
 GET /api/version
 ```


+ 10 - 10
docs/development.md

@@ -7,7 +7,7 @@ Install prerequisites:

 Then build and run Ollama from the root directory of the repository:

-```
+```shell
 go run . serve
 ```

@@ -23,14 +23,14 @@ Install prerequisites:

 Then, configure and build the project:

-```
+```shell
 cmake -B build
 cmake --build build
 ```

 Lastly, run Ollama:

-```
+```shell
 go run . serve
 ```

@@ -57,14 +57,14 @@ Install prerequisites:

 Then, configure and build the project:

-```
+```shell
 cmake -B build
 cmake --build build --config Release
 ```

 Lastly, run Ollama:

-```
+```shell
 go run . serve
 ```

@@ -88,26 +88,26 @@ Install prerequisites:

 Then, configure and build the project:

-```
+```shell
 cmake -B build
 cmake --build build
 ```

 Lastly, run Ollama:

-```
+```shell
 go run . serve
 ```

 ## Docker

-```
+```shell
 docker build .
 ```

 ### ROCm

-```
+```shell
 docker build --build-arg FLAVOR=rocm .
 ```

@@ -115,7 +115,7 @@ docker build --build-arg FLAVOR=rocm .

 To run tests, use `go test`:

-```
+```shell
 go test ./...
 ```


+ 27 - 23
docs/docker.md

@@ -2,7 +2,7 @@

 ### CPU only

-```bash
+```shell
 docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
 ```

@@ -11,42 +11,46 @@ Install the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-

 #### Install with Apt
 1.  Configure the repository
-```bash
-curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
-    | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
-curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \
-    | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' \
-    | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
-sudo apt-get update
-```
+
+    ```shell
+    curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
+        | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
+    curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \
+        | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' \
+        | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
+    sudo apt-get update
+    ```
+
 2.  Install the NVIDIA Container Toolkit packages
-```bash
-sudo apt-get install -y nvidia-container-toolkit
-```
+
+    ```shell
+    sudo apt-get install -y nvidia-container-toolkit
+    ```

 #### Install with Yum or Dnf
 1.  Configure the repository

-```bash
-curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo \
-    | sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
-```
+    ```shell
+    curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo \
+        | sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
+    ```

 2. Install the NVIDIA Container Toolkit packages

-```bash
-sudo yum install -y nvidia-container-toolkit
-```
+    ```shell
+    sudo yum install -y nvidia-container-toolkit
+    ```

 #### Configure Docker to use Nvidia driver
-```
+
+```shell
 sudo nvidia-ctk runtime configure --runtime=docker
 sudo systemctl restart docker
 ```

 #### Start the container

-```bash
+```shell
 docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
 ```

@@ -57,7 +61,7 @@ docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ol

 To run Ollama using Docker with AMD GPUs, use the `rocm` tag and the following command:

-```
+```shell
 docker run -d --device /dev/kfd --device /dev/dri -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama:rocm
 ```

@@ -65,7 +69,7 @@ docker run -d --device /dev/kfd --device /dev/dri -v ollama:/root/.ollama -p 114

 Now you can run a model:

-```
+```shell
 docker exec -it ollama ollama run llama3.2
 ```


+ 14 - 4
docs/faq.md

@@ -24,7 +24,7 @@ By default, Ollama uses a context window size of 2048 tokens.

 To change this when using `ollama run`, use `/set parameter`:

-```
+```shell
 /set parameter num_ctx 4096
 ```

@@ -46,10 +46,15 @@ Use the `ollama ps` command to see what models are currently loaded into memory.

 ```shell
 ollama ps
-NAME      	ID          	SIZE 	PROCESSOR	UNTIL
-llama3:70b	bcfb190ca3a7	42 GB	100% GPU 	4 minutes from now
 ```

+> **Output**:
+>
+> ```
+> NAME      	ID          	SIZE 	PROCESSOR	UNTIL
+> llama3:70b	bcfb190ca3a7	42 GB	100% GPU 	4 minutes from now
+> ```
+
 The `Processor` column will show which memory the model was loaded in to:
 * `100% GPU` means the model was loaded entirely into the GPU
 * `100% CPU` means the model was loaded entirely in system memory
@@ -88,7 +93,7 @@ If Ollama is run as a systemd service, environment variables should be set using

 4. Reload `systemd` and restart Ollama:

-   ```bash
+   ```shell
    systemctl daemon-reload
    systemctl restart ollama
    ```
@@ -221,16 +226,19 @@ properties.
 If you are using the API you can preload a model by sending the Ollama server an empty request. This works with both the `/api/generate` and `/api/chat` API endpoints.

 To preload the mistral model using the generate endpoint, use:
+
 ```shell
 curl http://localhost:11434/api/generate -d '{"model": "mistral"}'
 ```

 To use the chat completions endpoint, use:
+
 ```shell
 curl http://localhost:11434/api/chat -d '{"model": "mistral"}'
 ```

 To preload a model using the CLI, use the command:
+
 ```shell
 ollama run llama3.2 ""
 ```
@@ -250,11 +258,13 @@ If you're using the API, use the `keep_alive` parameter with the `/api/generate`
 * '0' which will unload the model immediately after generating a response

 For example, to preload a model and leave it in memory use:
+
 ```shell
 curl http://localhost:11434/api/generate -d '{"model": "llama3.2", "keep_alive": -1}'
 ```

 To unload the model and free up memory use:
+
 ```shell
 curl http://localhost:11434/api/generate -d '{"model": "llama3.2", "keep_alive": 0}'
 ```

+ 2 - 2
docs/import.md

@@ -20,13 +20,13 @@ Make sure that you use the same base model in the `FROM` command as you used to

 Now run `ollama create` from the directory where the `Modelfile` was created:

-```bash
+```shell
 ollama create my-model
 ```

 Lastly, test the model:

-```bash
+```shell
 ollama run my-model
 ```


+ 1 - 1
docs/linux.md

@@ -119,7 +119,7 @@ sudo systemctl status ollama

 To customize the installation of Ollama, you can edit the systemd service file or the environment variables by running:

-```
+```shell
 sudo systemctl edit ollama
 ```


+ 35 - 29
docs/modelfile.md

@@ -28,7 +28,7 @@ A model file is the blueprint to create and share models with Ollama.

 The format of the `Modelfile`:

-```modelfile
+```
 # comment
 INSTRUCTION arguments
 ```
@@ -49,7 +49,7 @@ INSTRUCTION arguments

 An example of a `Modelfile` creating a mario blueprint:

-```modelfile
+```
 FROM llama3.2
 # sets the temperature to 1 [higher is more creative, lower is more coherent]
 PARAMETER temperature 1
@@ -69,24 +69,30 @@ To use this:

 To view the Modelfile of a given model, use the `ollama show --modelfile` command.

-  ```bash
-  > ollama show --modelfile llama3.2
-  # Modelfile generated by "ollama show"
-  # To build a new Modelfile based on this one, replace the FROM line with:
-  # FROM llama3.2:latest
-  FROM /Users/pdevine/.ollama/models/blobs/sha256-00e1317cbf74d901080d7100f57580ba8dd8de57203072dc6f668324ba545f29
-  TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>
-
-  {{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>
+```shell
+ollama show --modelfile llama3.2
+```

-  {{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>
+> **Output**:
+>
+> ```
+> # Modelfile generated by "ollama show"
+> # To build a new Modelfile based on this one, replace the FROM line with:
+> # FROM llama3.2:latest
+> FROM /Users/pdevine/.ollama/models/blobs/sha256-00e1317cbf74d901080d7100f57580ba8dd8de57203072dc6f668324ba545f29
+> TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>
+>
+> {{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>
+>
+> {{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>
+>
+> {{ .Response }}<|eot_id|>"""
+> PARAMETER stop "<|start_header_id|>"
+> PARAMETER stop "<|end_header_id|>"
+> PARAMETER stop "<|eot_id|>"
+> PARAMETER stop "<|reserved_special_token"
+> ```

-  {{ .Response }}<|eot_id|>"""
-  PARAMETER stop "<|start_header_id|>"
-  PARAMETER stop "<|end_header_id|>"
-  PARAMETER stop "<|eot_id|>"
-  PARAMETER stop "<|reserved_special_token"
-  ```

 ## Instructions

@@ -94,13 +100,13 @@ To view the Modelfile of a given model, use the `ollama show --modelfile` comman

 The `FROM` instruction defines the base model to use when creating a model.

-```modelfile
+```
 FROM <model name>:<tag>
 ```

 #### Build from existing model

-```modelfile
+```
 FROM llama3.2
 ```

@@ -111,7 +117,7 @@ Additional models can be found at:

 #### Build from a Safetensors model

-```modelfile
+```
 FROM <model directory>
 ```

@@ -125,7 +131,7 @@ Currently supported model architectures:

 #### Build from a GGUF file

-```modelfile
+```
 FROM ./ollama-model.gguf
 ```

@@ -136,7 +142,7 @@ The GGUF file location should be specified as an absolute path or relative to th

 The `PARAMETER` instruction defines a parameter that can be set when the model is run.

-```modelfile
+```
 PARAMETER <parameter> <parametervalue>
 ```

@@ -183,7 +189,7 @@ TEMPLATE """{{ if .System }}<|im_start|>system

 The `SYSTEM` instruction specifies the system message to be used in the template, if applicable.

-```modelfile
+```
 SYSTEM """<system message>"""
 ```

@@ -193,7 +199,7 @@ The `ADAPTER` instruction specifies a fine tuned LoRA adapter that should apply

 #### Safetensor adapter

-```modelfile
+```
 ADAPTER <path to safetensor adapter>
 ```

@@ -204,7 +210,7 @@ Currently supported Safetensor adapters:

 #### GGUF adapter

-```modelfile
+```
 ADAPTER ./ollama-lora.gguf
 ```

@@ -212,7 +218,7 @@ ADAPTER ./ollama-lora.gguf

 The `LICENSE` instruction allows you to specify the legal license under which the model used with this Modelfile is shared or distributed.

-```modelfile
+```
 LICENSE """
 <license text>
 """
@@ -222,7 +228,7 @@ LICENSE """

 The `MESSAGE` instruction allows you to specify a message history for the model to use when responding. Use multiple iterations of the MESSAGE command to build up a conversation which will guide the model to answer in a similar way.

-```modelfile
+```
 MESSAGE <role> <message>
 ```

@@ -237,7 +243,7 @@ MESSAGE <role> <message>

 #### Example conversation

-```modelfile
+```
 MESSAGE user Is Toronto in Canada?
 MESSAGE assistant yes
 MESSAGE user Is Sacramento in Canada?

+ 8 - 5
docs/openai.md

@@ -1,6 +1,7 @@
 # OpenAI compatibility

-> **Note:** OpenAI compatibility is experimental and is subject to major adjustments including breaking changes. For fully-featured access to the Ollama API, see the Ollama [Python library](https://github.com/ollama/ollama-python), [JavaScript library](https://github.com/ollama/ollama-js) and [REST API](https://github.com/ollama/ollama/blob/main/docs/api.md).
+> [!NOTE]
+> OpenAI compatibility is experimental and is subject to major adjustments including breaking changes. For fully-featured access to the Ollama API, see the Ollama [Python library](https://github.com/ollama/ollama-python), [JavaScript library](https://github.com/ollama/ollama-js) and [REST API](https://github.com/ollama/ollama/blob/main/docs/api.md).

 Ollama provides experimental compatibility with parts of the [OpenAI API](https://platform.openai.com/docs/api-reference) to help connect existing applications to Ollama.

@@ -59,8 +60,10 @@ embeddings = client.embeddings.create(
     input=["why is the sky blue?", "why is the grass green?"],
 )
 ```
+
 #### Structured outputs
-```py
+
+```python
 from pydantic import BaseModel
 from openai import OpenAI

@@ -144,7 +147,7 @@ const embedding = await openai.embeddings.create({

 ### `curl`

-``` shell
+```shell
 curl http://localhost:11434/v1/chat/completions \
     -H "Content-Type: application/json" \
     -d '{
@@ -319,7 +322,7 @@ ollama pull llama3.2

 For tooling that relies on default OpenAI model names such as `gpt-3.5-turbo`, use `ollama cp` to copy an existing model name to a temporary name:

-```
+```shell
 ollama cp llama3.2 gpt-3.5-turbo
 ```

@@ -343,7 +346,7 @@ curl http://localhost:11434/v1/chat/completions \

 The OpenAI API does not have a way of setting the context size for a model. If you need to change the context size, create a `Modelfile` which looks like:

-```modelfile
+```
 FROM <some model>
 PARAMETER num_ctx <context size>
 ```

+ 7 - 4
docs/troubleshooting.md

@@ -17,6 +17,7 @@ When you run Ollama in a **container**, the logs go to stdout/stderr in the cont
 ```shell
 docker logs <container-name>
 ```
+
 (Use `docker ps` to find the container name)

 If manually running `ollama serve` in a terminal, the logs will be on that terminal.
@@ -28,6 +29,7 @@ When you run Ollama on **Windows**, there are a few different locations. You can
 - `explorer %TEMP%` where temporary executable files are stored in one or more `ollama*` directories

 To enable additional debug logging to help troubleshoot problems, first **Quit the running app from the tray menu** then in a powershell terminal
+
 ```powershell
 $env:OLLAMA_DEBUG="1"
 & "ollama app.exe"
@@ -49,12 +51,13 @@ Dynamic LLM libraries [rocm_v6 cpu cpu_avx cpu_avx2 cuda_v11 rocm_v5]

 You can set OLLAMA_LLM_LIBRARY to any of the available LLM libraries to bypass autodetection, so for example, if you have a CUDA card, but want to force the CPU LLM library with AVX2 vector support, use:

-```
+```shell
 OLLAMA_LLM_LIBRARY="cpu_avx2" ollama serve
 ```

 You can see what features your CPU has with the following.
-```
+
+```shell
 cat /proc/cpuinfo| grep flags | head -1
 ```

@@ -62,8 +65,8 @@ cat /proc/cpuinfo| grep flags | head -1

 If you run into problems on Linux and want to install an older version, or you'd like to try out a pre-release before it's officially released, you can tell the install script which version to install.

-```sh
-curl -fsSL https://ollama.com/install.sh | OLLAMA_VERSION="0.1.29" sh
+```shell
+curl -fsSL https://ollama.com/install.sh | OLLAMA_VERSION=0.5.7 sh
 ```

 ## Linux tmp noexec 

+ 1 - 0
docs/windows.md

@@ -47,6 +47,7 @@ If Ollama is already running, Quit the tray application and relaunch it from the
 ## API Access

 Here's a quick example showing API access from `powershell`
+
 ```powershell
 (Invoke-WebRequest -method POST -Body '{"model":"llama3.2", "prompt":"Why is the sky blue?", "stream": false}' -uri http://localhost:11434/api/generate ).Content | ConvertFrom-json
 ```

+ 5 - 5
llama/README.md

@@ -8,7 +8,7 @@ Ollama vendors [llama.cpp](https://github.com/ggerganov/llama.cpp/) and [ggml](h

 If you update the vendoring code, start by running the following command to establish the tracking llama.cpp repo in the `./vendor/` directory.

-```
+```shell
 make -f Makefile.sync apply-patches
 ```

@@ -22,7 +22,7 @@ When updating to a newer base commit, the existing patches may not apply cleanly

 Start by applying the patches. If any of the patches have conflicts, the `git am` will stop at the first failure.

-```
+```shell
 make -f Makefile.sync apply-patches
 ```

@@ -30,7 +30,7 @@ If there are conflicts, you will see an error message. Resolve the conflicts in

 Once all patches are applied, commit the changes to the tracking repository.

-```
+```shell
 make -f Makefile.sync format-patches sync
 ```

@@ -38,13 +38,13 @@ make -f Makefile.sync format-patches sync

 When working on new fixes or features that impact vendored code, use the following model. First get a clean tracking repo with all current patches applied:

-```
+```shell
 make -f Makefile.sync clean apply-patches
 ```

 Iterate until you're ready to submit PRs. Once your code is ready, commit a change in the `./vendor/` directory, then generate the patches for ollama with

-```
+```shell
 make -f Makefile.sync format-patches
 ```


+ 3 - 3
llama/runner/README.md

@@ -4,18 +4,18 @@

 A minimial runner for loading a model and running inference via a http web server.

-```
+```shell
 ./runner -model <model binary>
 ```

 ### Completion

-```
+```shell
 curl -X POST -H "Content-Type: application/json" -d '{"prompt": "hi"}' http://localhost:8080/completion
 ```

 ### Embeddings

-```
+```shell
 curl -X POST -H "Content-Type: application/json" -d '{"prompt": "turn me into an embedding"}' http://localhost:8080/embedding
 ```

+ 2 - 2
macapp/README.md

@@ -6,14 +6,14 @@ This app builds upon Ollama to provide a desktop experience for running models.

 First, build the `ollama` binary:

-```
+```shell
 cd ..
 go build .
 ```

 Then run the desktop app with `npm start`:

-```
+```shell
 cd macapp
 npm install
 npm start