Branch: jmorganca/execstack

Jeffrey Morgan 1ffb1e2874 update llama.cpp submodule to `77d1ac7` (#3030)		1 year ago
CMakeLists.txt	1b249748ab Add multiple CPU variants for Intel Mac	1 year ago
README.md	8da7bef05f Support multiple variants for a given llm lib type	1 year ago
ext_server.cpp	1ffb1e2874 update llama.cpp submodule to `77d1ac7` (#3030)	1 year ago
ext_server.h	4613a080e7 update llama.cpp submodule to `66c1968f7` (#2618)	1 year ago

Extern C Server

This directory contains a thin facade layered on top of the llama.cpp server. It exposes extern C interfaces so that the server's functionality can be reached through direct, in-process API calls. The llama.cpp code uses compile-time macros to configure the GPU type along with other settings. When `go generate ./...` runs, the build generates one or more copies of the llama.cpp extern C server, depending on which GPU libraries are detected, so that multiple GPU types as well as CPU-only operation are covered. The Ollama Go build then embeds these server variants so that different GPUs and settings can be supported at runtime.
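The facade pattern described above can be illustrated with a minimal sketch. Every name here (`DemoServer`, `DEMO_BACKEND`, the `demo_server_*` functions) is hypothetical and chosen for illustration only; none of them is taken from ext_server.h, and the real interface may look quite different. The sketch shows only the general idea: a C++ object kept behind C-linkage functions, with a compile-time macro standing in for the backend selection.

```cpp
// Hypothetical sketch of an extern "C" facade; not the actual ext_server API.
#include <cstddef>
#include <cstring>
#include <string>

// Compile-time backend selection. DEMO_BACKEND is an invented macro that
// stands in for the macros llama.cpp uses to configure the GPU type.
#ifndef DEMO_BACKEND
#define DEMO_BACKEND "cpu"
#endif

namespace {
// Plain C++ class playing the role of the wrapped server.
class DemoServer {
public:
    std::string complete(const std::string &prompt) const {
        return "[" DEMO_BACKEND "] " + prompt;
    }
};

DemoServer *g_server = nullptr;  // single in-process instance
}  // namespace

extern "C" {

// Creates the server. Returns 0 on success, -1 if it already exists.
int demo_server_init(void) {
    if (g_server != nullptr) return -1;
    g_server = new DemoServer();
    return 0;
}

// Writes the result into a caller-owned buffer so that no C++ types
// cross the C boundary. Returns 0 on success, -1 on any error.
int demo_server_complete(const char *prompt, char *out, size_t out_len) {
    if (g_server == nullptr || prompt == nullptr || out == nullptr) return -1;
    const std::string result = g_server->complete(prompt);
    if (result.size() + 1 > out_len) return -1;  // buffer too small
    std::memcpy(out, result.c_str(), result.size() + 1);
    return 0;
}

// Reports which backend this copy of the library was compiled for.
const char *demo_server_backend(void) { return DEMO_BACKEND; }

// Destroys the server; calling it when no server exists is harmless.
void demo_server_stop(void) {
    delete g_server;
    g_server = nullptr;
}

}  // extern "C"
```

Compiling the same file several times with different `-DDEMO_BACKEND=...` values would yield one library per variant, which is roughly the shape of the multi-variant build described above. Because the exported functions use only C types, a caller such as Go via cgo can invoke them without knowing anything about the C++ class behind them.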

If you are making changes to the code in this directory, disable caching during your Go build so that your changes are picked up; the -a flag forces packages to be rebuilt even when they appear up to date. A typical iteration cycle from the top of the source tree looks like:

go generate ./... && go build -a .
