llama.cpp example
Run a local llama.cpp server for Proval.
llama.cpp can serve a local model with an OpenAI-compatible HTTP API. This guide covers starting the server. To add the model provider in Proval, follow Set LLM.
Prerequisites
- Proval is running on your server (Quick Start)
- A GGUF model file on that server
Start the server
Run llama.cpp with the OpenAI-compatible HTTP server. Set an API key so only your clients can call the API:
llama-server -m /path/to/your-model.gguf --port 8080 --api-key <your-llm-api-key>
Confirm the server responds:
http://<llm-host>:8080/v1/models
Replace <llm-host> with the hostname or IP address of the machine running llama.cpp.
If you run without
--api-key, enter any placeholder in that field. We recommend run llama.cpp with--api-key
In Proval, set Base URL to http://<llm-host>:8080/v1. See Set LLM for the rest of the form and connection test.