
doc: faq gpu compatibility (#3142)

Bruce MacDonald committed 1 year ago (parent commit a5ba0fcf78)

4 changed files with 79 additions and 34 deletions:

1. Dockerfile (+1 −0)
2. docs/faq.md (+4 −0)
3. docs/gpu.md (+74 −0)
4. docs/troubleshooting.md (+0 −34)

+ 1 - 0
Dockerfile

@@ -1,5 +1,6 @@
 ARG GOLANG_VERSION=1.22.1
 ARG CMAKE_VERSION=3.22.1
+# this CUDA_VERSION corresponds with the one specified in docs/gpu.md
 ARG CUDA_VERSION=11.3.1
 ARG ROCM_VERSION=6.0
 

+ 4 - 0
docs/faq.md

@@ -14,6 +14,10 @@ curl -fsSL https://ollama.com/install.sh | sh
 
 Review the [Troubleshooting](./troubleshooting.md) docs for more about using logs.
 
+## Is my GPU compatible with Ollama?
+
+Please refer to the [GPU docs](./gpu.md).
+
 ## How can I specify the context window size?
 
 By default, Ollama uses a context window size of 2048 tokens.
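
For illustration, the context window can be raised per request through the REST API's `options` field; the model name `llama2` here is only an example and would need to be pulled already:

```shell
# Hypothetical example: request a 4096-token context window (num_ctx) for a
# single generation via the REST API. Falls back to a message if no local
# Ollama server is listening on the default port.
curl -s http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Why is the sky blue?",
  "options": { "num_ctx": 4096 }
}' || echo "Ollama server not running on localhost:11434"
```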

+ 74 - 0
docs/gpu.md

@@ -0,0 +1,74 @@
+# GPU
+## Nvidia
+Ollama supports Nvidia GPUs with compute capability 5.0+.
+
+Check your compute compatibility to see if your card is supported:
+[https://developer.nvidia.com/cuda-gpus](https://developer.nvidia.com/cuda-gpus)
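
Alternatively, recent NVIDIA drivers can report the compute capability directly; note that the `compute_cap` query field is only available on newer driver versions:

```shell
# Print the compute capability of each installed NVIDIA GPU.
# The compute_cap query field requires a recent driver; fall back with a
# message if nvidia-smi is missing or too old.
nvidia-smi --query-gpu=compute_cap --format=csv,noheader 2>/dev/null \
  || echo "nvidia-smi unavailable or driver does not support compute_cap"
```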
+
+| Compute Capability | Family              | Cards                                                                                                       |
+| ------------------ | ------------------- | ----------------------------------------------------------------------------------------------------------- |
+| 9.0                | NVIDIA              | `H100`                                                                                                      |
+| 8.9                | GeForce RTX 40xx    | `RTX 4090` `RTX 4080` `RTX 4070 Ti` `RTX 4060 Ti`                                                           |
+|                    | NVIDIA Professional | `L4` `L40` `RTX 6000`                                                                                       |
+| 8.6                | GeForce RTX 30xx    | `RTX 3090 Ti` `RTX 3090` `RTX 3080 Ti` `RTX 3080` `RTX 3070 Ti` `RTX 3070` `RTX 3060 Ti` `RTX 3060`         |
+|                    | NVIDIA Professional | `A40` `RTX A6000` `RTX A5000` `RTX A4000` `RTX A3000` `RTX A2000` `A10` `A16` `A2`                          |
+| 8.0                | NVIDIA              | `A100` `A30`                                                                                                |
+| 7.5                | GeForce GTX/RTX     | `GTX 1650 Ti` `TITAN RTX` `RTX 2080 Ti` `RTX 2080` `RTX 2070` `RTX 2060`                                    |
+|                    | NVIDIA Professional | `T4` `RTX 5000` `RTX 4000` `RTX 3000` `T2000` `T1200` `T1000` `T600` `T500`                                 |
+|                    | Quadro              | `RTX 8000` `RTX 6000` `RTX 5000` `RTX 4000`                                                                 |
+| 7.0                | NVIDIA              | `TITAN V` `V100` `Quadro GV100`                                                                             |
+| 6.1                | NVIDIA TITAN        | `TITAN Xp` `TITAN X`                                                                                        |
+|                    | GeForce GTX         | `GTX 1080 Ti` `GTX 1080` `GTX 1070 Ti` `GTX 1070` `GTX 1060` `GTX 1050`                                     |
+|                    | Quadro              | `P6000` `P5200` `P4200` `P3200` `P5000` `P4000` `P3000` `P2200` `P2000` `P1000` `P620` `P600` `P500` `P520` |
+|                    | Tesla               | `P40` `P4`                                                                                                  |
+| 6.0                | NVIDIA              | `Tesla P100` `Quadro GP100`                                                                                 |
+| 5.2                | GeForce GTX         | `GTX TITAN X` `GTX 980 Ti` `GTX 980` `GTX 970` `GTX 960` `GTX 950`                                          |
+|                    | Quadro              | `M6000 24GB` `M6000` `M5000` `M5500M` `M4000` `M2200` `M2000` `M620`                                        |
+|                    | Tesla               | `M60` `M40`                                                                                                 |
+| 5.0                | GeForce GTX         | `GTX 750 Ti` `GTX 750` `NVS 810`                                                                            |
+|                    | Quadro              | `K2200` `K1200` `K620` `M1200` `M520` `M5000M` `M4000M` `M3000M` `M2000M` `M1000M` `K620M` `M600M` `M500M`  |
+
+
+## AMD Radeon
+Ollama supports the following AMD GPUs:
+| Family         | Cards and accelerators                                                                                                               |
+| -------------- | ---------------------------------------------------------------------------------------------------------------------------------------------- |
+| AMD Radeon RX  | `7900 XTX` `7900 XT` `7900 GRE` `7800 XT` `7700 XT` `7600 XT` `7600` `6950 XT` `6900 XT` `6800 XT` `6800` `Vega 64` `Vega 56`                |
+| AMD Radeon PRO | `W7900` `W7800` `W7700` `W7600` `W7500` `W6900X` `W6800X Duo` `W6800X` `W6800` `V620` `V420` `V340` `V320` `Vega II Duo` `Vega II` `VII` `SSG` |
+| AMD Instinct   | `MI300X` `MI300A` `MI300` `MI250X` `MI250` `MI210` `MI200` `MI100` `MI60` `MI50`                                                               |
+
+### Overrides
+Ollama leverages the AMD ROCm library, which does not support all AMD GPUs. In
+some cases you can force the system to use a similar LLVM target that is close.
+For example, the Radeon RX 5400 is `gfx1034` (also known as 10.3.4); however,
+ROCm does not currently support this target. The closest supported target is
+`gfx1030`. You can set the environment variable `HSA_OVERRIDE_GFX_VERSION`,
+using `x.y.z` syntax. For example, to force the system to run on the RX 5400,
+you would set `HSA_OVERRIDE_GFX_VERSION="10.3.0"` as an environment variable
+for the server. If you have an unsupported AMD GPU, you can experiment using
+the list of supported types below.
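
The override described above can be set in the shell that launches the server. A minimal sketch (the RX 5400 used here is itself unsupported, which is exactly why the override is needed):

```shell
# Hypothetical example: make ROCm treat a Radeon RX 5400 (gfx1034) as gfx1030.
# Set the override in the server's environment, then start the server.
export HSA_OVERRIDE_GFX_VERSION="10.3.0"
ollama serve   # inherits HSA_OVERRIDE_GFX_VERSION from this shell
```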
+
+At this time, ROCm supports the following LLVM targets. This table shows some
+example GPUs that map to each target:
+| **LLVM Target** | **An Example GPU** |
+|-----------------|---------------------|
+| gfx900 | Radeon RX Vega 56 |
+| gfx906 | Radeon Instinct MI50 |
+| gfx908 | Radeon Instinct MI100 |
+| gfx90a | Radeon Instinct MI210 |
+| gfx940 | Radeon Instinct MI300 |
+| gfx941 | |
+| gfx942 | |
+| gfx1030 | Radeon PRO V620 |
+| gfx1100 | Radeon PRO W7900 |
+| gfx1101 | Radeon PRO W7700 |
+| gfx1102 | Radeon RX 7600 |
+
+AMD is working on enhancing ROCm v6 to broaden support for more GPU families
+in a future release.
+
+Reach out on [Discord](https://discord.gg/ollama) or file an
+[issue](https://github.com/ollama/ollama/issues) for additional help.
+
+## Metal (Apple GPUs)
+Ollama supports GPU acceleration on Apple devices via the Metal API.

+ 0 - 34
docs/troubleshooting.md

@@ -67,40 +67,6 @@ You can see what features your CPU has with the following.
cat /proc/cpuinfo | grep flags | head -1
 ```
 
-## AMD Radeon GPU Support
-
-Ollama leverages the AMD ROCm library, which does not support all AMD GPUs. In
-some cases you can force the system to try to use a similar LLVM target that is
-close.  For example The Radeon RX 5400 is `gfx1034` (also known as 10.3.4)
-however, ROCm does not currently support this target. The closest support is
-`gfx1030`.  You can use the environment variable `HSA_OVERRIDE_GFX_VERSION` with
-`x.y.z` syntax.  So for example, to force the system to run on the RX 5400, you
-would set `HSA_OVERRIDE_GFX_VERSION="10.3.0"` as an environment variable for the
-server.  If you have an unsupported AMD GPU you can experiment using the list of
-supported types below.
-
-At this time, the known supported GPU types are the following LLVM Targets.
-This table shows some example GPUs that map to these LLVM targets:
-| **LLVM Target** | **An Example GPU** |
-|-----------------|---------------------|
-| gfx900 | Radeon RX Vega 56 |
-| gfx906 | Radeon Instinct MI50 |
-| gfx908 | Radeon Instinct MI100 |
-| gfx90a | Radeon Instinct MI210 |
-| gfx940 | Radeon Instinct MI300 |
-| gfx941 | |
-| gfx942 | |
-| gfx1030 | Radeon PRO V620 |
-| gfx1100 | Radeon PRO W7900 |
-| gfx1101 | Radeon PRO W7700 |
-| gfx1102 | Radeon RX 7600 |
-
-AMD is working on enhancing ROCm v6 to broaden support for families of GPUs in a
-future release which should increase support for more GPUs.
-
-Reach out on [Discord](https://discord.gg/ollama) or file an
-[issue](https://github.com/ollama/ollama/issues) for additional help.
-
 ## Installing older or pre-release versions on Linux
 
 If you run into problems on Linux and want to install an older version, or you'd