Local Bench
Validate your inference. Benchmark local LLM performance on your own hardware so you know exactly how fast your Ollama models run.
Install & availability
Available now. Install from the Companion Hub App Store. Verified on the test fleet: the published image pulls, starts, serves its web UI, and survives a container restart.
| Item | Details |
|---|---|
| Hub App Store | Search Local Bench → Install |
| Container image | ghcr.io/companionintelligence/ci-local-bench:latest (public) |
| Web UI port | 3000 |
| Architecture | linux/amd64 and linux/arm64 (multi-arch) |
Overview
Local Bench is a benchmarking tool for local language models served by Ollama. It runs prompts through your models, records throughput and timing alongside your hardware specs, and stores the results so you can compare runs over time. Use it before deploying a model to understand how it actually performs on your Hub.
Current capabilities
What Local Bench does today, as shipped:
- Benchmarks models served by Ollama
- Records results together with system specs for context
- Persists runs to disk and exposes them via a REST API
- Ships a curated model catalog as a fallback when Ollama isn't reachable
- Shows a friendly empty state on first run before any benchmarks exist
Connecting Ollama
Local Bench talks to Ollama via the OLLAMA_API_URL environment variable (default http://localhost:11434). Inside a container, localhost is the container itself, so point it at the host or a LAN address:

    # Ollama running on the Docker host
    # (on Linux, host.docker.internal may also need:
    #  --add-host=host.docker.internal:host-gateway)
    docker run -p 3000:3000 \
      -e OLLAMA_API_URL=http://host.docker.internal:11434 \
      -v localbench-data:/app \
      ghcr.io/companionintelligence/ci-local-bench:latest

    # Or a LAN address
    -e OLLAMA_API_URL=http://192.168.1.50:11434

When Ollama is unreachable, GET /api/models returns HTTP 503 along with a curated catalog of models, so the UI degrades gracefully instead of breaking. Fix the connection by correcting OLLAMA_API_URL.
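The fallback behaviour can also be checked from a script. The following is a minimal Python sketch, assuming the web UI is published on http://localhost:3000 as in the docker run example and that the response body is JSON. The shape of that body is not documented here, so it is passed through untouched; `interpret_models_response` and `fetch_models` are illustrative names, not part of Local Bench.

```python
import json
import urllib.error
import urllib.request


def interpret_models_response(status, body):
    """Map a GET /api/models response to (ollama_reachable, payload).

    200 means the list came from Ollama; 503 means Ollama was unreachable
    and the payload is the curated fallback catalog.
    """
    if status not in (200, 503):
        raise ValueError(f"unexpected status from /api/models: {status}")
    # Assumption: the body is JSON. Its structure is not specified here,
    # so it is returned as decoded, without picking out any fields.
    payload = json.loads(body.decode("utf-8"))
    return status == 200, payload


def fetch_models(base_url="http://localhost:3000", timeout=10.0):
    """Call GET /api/models and classify the result."""
    url = base_url.rstrip("/") + "/api/models"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return interpret_models_response(resp.status, resp.read())
    except urllib.error.HTTPError as err:
        # urllib raises on 5xx responses, but the 503 body is still readable.
        return interpret_models_response(err.code, err.read())
```

Calling `fetch_models()` returns a pair: a flag that is `False` when the catalog is the fallback, and the decoded payload. A `False` flag is the cue to revisit OLLAMA_API_URL.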
Persistence
Local Bench writes two files into /app:
- benchmark_data.db: the SQLite results database
- benchmark_results.csv: a CSV mirror of results
Mount a volume at /app (as in the example above) to keep your benchmark history across restarts.
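To confirm that history is actually being kept, the two files can be inspected directly. The following is a minimal Python sketch, assuming copies of benchmark_data.db and benchmark_results.csv are readable from wherever the script runs (how you reach the volume's contents depends on your setup). The table layout and CSV columns are not documented here, so the helpers only report what they find. `summarize_sqlite` and `summarize_csv` are illustrative names, and the CSV is assumed to start with a header row.

```python
import csv
import sqlite3
from contextlib import closing
from pathlib import Path


def summarize_sqlite(db_path):
    """Return {table_name: row_count} for every table in the database."""
    # mode=ro opens the file read-only, so inspection cannot alter history.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        counts = {}
        for name in tables:
            # Names come from sqlite_master; quote them as identifiers.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = conn.execute(
                f"SELECT COUNT(*) FROM {quoted}"
            ).fetchone()[0]
        return counts


def summarize_csv(csv_path):
    """Return (header_columns, data_row_count) for the CSV mirror."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])  # assumption: first row is a header
        return header, sum(1 for _ in reader)
```

If the counts grow after each benchmark and survive a container restart, the volume mount is working.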
API
| Endpoint | Method | Purpose |
|---|---|---|
| /api/results | GET | Stored benchmark results |
| /api/results-with-specs | GET | Results joined with system specs (?limit= supported) |
| /api/system-specs | GET | Detected hardware specs |
| /api/prompts | GET | Benchmark prompt set |
| /api/models | GET | Available models (503 + curated catalog if Ollama is down) |
| /api/run-benchmark | POST | Run a benchmark (blocks while the run is in progress) |
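The GET endpoints in the table can be called from any HTTP client. As a minimal Python sketch, the helper below builds the URLs, including the optional ?limit= parameter on /api/results-with-specs. It assumes the UI port is published on http://localhost:3000 and that responses are JSON; `endpoint_url` and `get_json` are illustrative names.

```python
import json
import urllib.parse
import urllib.request

# Assumption: port 3000 is published on the machine running this script.
BASE_URL = "http://localhost:3000"


def endpoint_url(path, base_url=BASE_URL, **params):
    """Build a full URL for an API path, adding query parameters if given."""
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    # Drop parameters left as None so optional ones can simply be omitted.
    query = {key: value for key, value in params.items() if value is not None}
    if query:
        url += "?" + urllib.parse.urlencode(query)
    return url


def get_json(path, timeout=10.0, **params):
    """GET an endpoint and decode the body, assuming it is JSON."""
    url = endpoint_url(path, **params)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

For example, `get_json("/api/results-with-specs", limit=5)` requests /api/results-with-specs?limit=5, and `get_json("/api/system-specs")` fetches the detected hardware specs.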
POST /api/run-benchmark runs synchronously and blocks until the benchmark completes. Expect the request to stay open for the duration of the run.
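Because the request stays open for the whole run, a client needs a timeout longer than the benchmark itself. The following is a minimal Python sketch of such a call. The request body is an assumption: this page does not list the fields the endpoint expects, so the caller supplies the payload, and JSON encoding is assumed. `build_run_request` and `run_benchmark` are illustrative names.

```python
import json
import urllib.request


def build_run_request(payload, base_url="http://localhost:3000"):
    """Create the POST request for /api/run-benchmark."""
    # Assumption: the endpoint accepts a JSON body. The required fields are
    # not listed in this document, so they are left to the caller.
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/run-benchmark",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_benchmark(payload, timeout_seconds=3600.0, base_url="http://localhost:3000"):
    """Send the request and wait for the run to finish.

    The call blocks for the whole benchmark, so the socket timeout must
    exceed the longest run you expect; one hour is an arbitrary default.
    """
    request = build_run_request(payload, base_url)
    with urllib.request.urlopen(request, timeout=timeout_seconds) as resp:
        return resp.status, resp.read()
```

A short default timeout, as many HTTP clients use, would abort the request while the benchmark is still running, so the timeout is made explicit here.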
Use Cases
- "Which model runs fastest on my hardware right now?"
- Capture before/after numbers when you change GPUs or quantizations
- Keep a CSV/SQLite history of how your Hub performs over time
- Document your Hub's specs alongside real benchmark figures
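The first use case, finding the fastest model, amounts to averaging a throughput figure per model and sorting. The following is a minimal Python sketch over rows loaded from the results (for instance with `csv.DictReader`). The column names are not specified on this page, so they are passed in as parameters; `rank_models` is an illustrative name, and the sketch assumes a higher value means a faster model.

```python
from collections import defaultdict
from statistics import mean


def rank_models(rows, model_key, speed_key):
    """Return [(model, mean_speed)] sorted fastest first.

    rows is a list of dicts. model_key and speed_key name the columns that
    hold the model and its throughput; check the real header first, since
    those names are not given here.
    """
    speeds = defaultdict(list)
    for row in rows:
        try:
            model, speed = row[model_key], float(row[speed_key])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with missing or non-numeric values
        speeds[model].append(speed)
    averaged = [(model, mean(values)) for model, values in speeds.items()]
    # Assumption: larger numbers are better (e.g. a tokens-per-second figure).
    return sorted(averaged, key=lambda item: item[1], reverse=True)
```

Running the same ranking before and after a GPU or quantization change gives the before/after comparison described above.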
Setup
Install from Hub
Search for Local Bench in the Hub App Store and install it.
Open Local Bench
Navigate to http://local-bench.ci.localhost. On first run you'll see a friendly empty state until you record a benchmark.
Point it at Ollama
Set OLLAMA_API_URL to a reachable Ollama endpoint (e.g. http://host.docker.internal:11434 from a container, or a LAN address). With the default http://localhost:11434, the container only sees itself.
Run a benchmark
Pick a model from your Ollama library and start a run. Results are saved and can be retrieved from /api/results.
Troubleshooting
No models listed
Ollama is likely unreachable, so /api/models returned 503 with the curated catalog. Check OLLAMA_API_URL points to a host/LAN address, not the container's own localhost.
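A quick way to catch this misconfiguration is to test whether the configured URL resolves to the container itself. The following is a minimal Python sketch; `points_at_loopback` is an illustrative helper, not something Local Bench provides.

```python
import ipaddress
from urllib.parse import urlsplit


def points_at_loopback(url):
    """Return True if the URL's host is localhost or a loopback IP."""
    host = urlsplit(url).hostname
    if host is None:
        raise ValueError(f"no host found in {url!r}")
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (e.g. host.docker.internal): not loopback.
        return False
```

If this returns `True` for the OLLAMA_API_URL value set on the container, the container is talking to itself and the URL should be changed to the host or a LAN address.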
Results disappear after restart
Mount a volume at /app so benchmark_data.db and benchmark_results.csv persist.
A benchmark request seems to hang
POST /api/run-benchmark blocks for the whole run by design. Wait for it to finish.