llama.cpp example | Proval Docs

llama.cpp can serve a local model with an OpenAI-compatible HTTP API. This guide covers starting the server. To add the model provider in Proval, follow Set LLM.

Prerequisites

Proval is running on your server (Quick Start)
A GGUF model file on that server

Start the server

Run llama.cpp with the OpenAI-compatible HTTP server. Set an API key so only your clients can call the API:

llama-server -m /path/to/your-model.gguf --port 8080 --api-key <your-llm-api-key>

Confirm the server responds:

http://<llm-host>:8080/v1/models

Replace <llm-host> with the hostname or IP address of the machine running llama.cpp.

llama.cpp server running in a terminal — llama.cpp server listening for API requests

If you run without --api-key, enter any placeholder in that field. We recommend run llama.cpp with --api-key

In Proval, set Base URL to http://<llm-host>:8080/v1. See Set LLM for the rest of the form and connection test.

What's next

Set LLM: add the model provider in Proval
GitLab · Forgejo · GitHub: connect a repository