AI & Inference

Companion Hub includes built-in support for running and routing AI models. The Onboarding Wizard configures your inference backend, recommended models, and optional cloud API keys on first login.

Inference backends

Backend	Status	Notes
Ollama	Available	Default. Runs on your host or in a container. OpenAI-compatible `/v1` API.
Cloud providers	Available	OpenAI, Anthropic, Google, GitHub Copilot — configured in Settings
vLLM	Planned	High-throughput GPU serving
Lemonade	Planned	AMD-optimised inference

Hub detects your hardware (CPU, RAM, GPU vendor, VRAM) and recommends models that fit. You can override preferences in Settings → AI.

Ollama on your host

By default, Hub connects to Ollama on your host machine at port 11434. Containers reach it via host.docker.internal:11434.

Install Ollama

Download from ollama.com or:


curl -fsSL https://ollama.com/install.sh | sh

Pull a model:


ollama pull llama3.2
# or via Hub CLI:
cihub models install llama3.2
cihub models list

Bind Ollama to all interfaces

Ollama listens on 127.0.0.1 by default. Docker containers need it reachable on the host network interface. Set:


# Linux (systemd override)
sudo systemctl edit ollama

Add:


[Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"

Then restart:


sudo systemctl restart ollama

On macOS/Windows with Docker Desktop, host.docker.internal usually works without changing the bind address.

[GIF OF OLLAMA GPU SETUP] — Screen recording showing Ollama detecting a GPU and Hub’s System Inspector confirming VRAM.

GPU acceleration

NVIDIA (CUDA): Install the NVIDIA Container Toolkit so Docker can pass GPU devices. Ollama auto-detects CUDA when drivers are installed.

AMD (ROCm/HIP): Install ROCm drivers for your GPU. Ollama supports AMD GPUs on Linux with ROCm. Verify with ollama ps while a model is loaded.

Hub’s hardware inspector reads GPU info and tags recommended models accordingly in the wizard.

Standardized inference variables for apps

Apps that use AI can opt in to standardized Hub inference variables via hub_integration.inference in their config.json. At install time, Hub resolves the correct values based on your settings and hardware, then writes them into the app’s app.env.

Hub variable keys

Key	Resolved env (internal)	Description
`llm_base_url`	`CI_LLM_BASE_URL`	OpenAI-compatible base URL (Ollama `/v1` or cloud provider)
`llm_api_key`	`CI_LLM_API_KEY`	API key (`ollama` for local Ollama)
`chat_model`	`CI_CHAT_MODEL`	Default chat/general model ID
`embedding_model`	`CI_EMBEDDING_MODEL`	Default embedding model ID
`vision_model`	`CI_VISION_MODEL`	Default vision-capable model ID
`ollama_host`	`OLLAMA_HOST`	Native Ollama URL (not OpenAI-compatible)

Example in `config.json`

Map Hub-resolved values to the env variable names your app expects:


{
  "hub_integration": {
    "inference": {
      "llm_base_url": "LLM_API_BASE",
      "llm_api_key": "LLM_API_KEY",
      "chat_model": "LLM_DEFAULT_CHAT_MODEL",
      "embedding_model": "LLM_DEFAULT_EMBEDDING_MODEL",
      "ollama_host": "OLLAMA_HOST"
    }
  }
}

Apps without hub_integration.inference receive no AI variables — zero overhead for non-AI apps.

Resolution order

For each variable, Hub resolves in this order:

Your Hub-wide preference (Settings → AI)
Hardware-aware recommendation from the model registry
Omitted — the app must handle the variable being absent

If Ollama is unavailable and no cloud provider is configured, Hub omits all inference variables and logs a warning.

MCP integration for agent apps

Agent apps (like OpenClaw) can declare hub_integration.mcp_client: true to receive Hub MCP endpoints:

Variable	Purpose
`HUB_URL`	Internal Hub API URL
`HUB_MCP_URL`	MCP SSE endpoint
`HUB_MCP_MESSAGES_URL`	MCP messages endpoint
`HUB_MCP_API_KEY`	Auth key (when MCP is enabled)
`HUB_WAKE_SECRET`	Wake hook secret

Enable MCP on your Hub:


cihub mcp setup
cihub mcp config

See Companion Agent for the agent ecosystem.

CLI model management


cihub models list
cihub models install mistral
cihub models rm mistral
cihub status    # shows installed models in the Models section