🌌 Tools Cache Mount

Accelerated execution. A dedicated, optimized local caching layer designed to speed up tool deployments and manage your AI assets efficiently.

Overview

Tools Cache Mount is a persistent, content-addressed caching service for your Companion Hub. It eliminates redundant downloads and redundant computation by storing model weights, container build layers, agent tool binaries, and intermediate computation artifacts in a shared, optimized cache that all Hub applications can access.

For users with multiple AI apps on their Hub (Ollama, SD WebUI, ComfyUI, Local Bench, Static Container Builder), Tools Cache Mount provides a single authoritative store for shared assets, so a model file downloaded once is available to every app that needs it.

Key Features

  • Content-addressed storage: assets are addressed by their SHA-256 hash; no duplicates
  • Model weight deduplication: multiple apps can share the same model file without duplication
  • Build layer cache: the Static Container Builder uses this as its layer cache backend
  • Bandwidth throttling: configurable download rate limits to prevent network saturation during asset pulls
  • Prefetch: schedule asset downloads in advance (e.g. "download Llama 3.2 tonight")
  • Eviction policies: LRU and pinned-item policies to manage storage limits
  • Access control: per-app read/write permissions for the cache
  • Cache metrics: hit/miss ratios, storage breakdown, bandwidth saved
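
To make the content-addressing idea concrete, here is a minimal sketch of how a store keyed by SHA-256 avoids duplicates. It is not the service's actual implementation: the `objects/<prefix>/<hash>` layout and the function names `sha256_of_file` and `store` are assumptions made for illustration.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large model weights never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def store(cache_root: Path, source: Path) -> Path:
    """Copy `source` into the cache under its hash.

    Hypothetical layout: <cache_root>/objects/<first two hex chars>/<full hash>.
    If an object with the same hash already exists, nothing is copied.
    """
    digest = sha256_of_file(source)
    target = cache_root / "objects" / digest[:2] / digest
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(source, target)
    return target
```

Two files with identical bytes resolve to the same object path, which is why a model pulled by one app costs no extra disk when a second app requests it.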

Use Cases

  • Share a single copy of llama3.1:8b between Ollama, LocalAI, and Open WebUI without using 3× the disk
  • Speed up Static Container Builder builds by caching Docker base layers
  • Pre-download large model weights overnight before you need them
  • Track exactly which AI assets are on your Hub and how much space they consume
  • Free disk space by identifying and removing unused models across all apps

Architecture

Hub Applications (Ollama, SD WebUI, ComfyUI, Builder, LocalAI...)
        │  cache read/write via mount or API
        ▼
Tools Cache Mount
  ├── Content-addressed object store (NVMe optimized)
  ├── Metadata index (asset name → hash → path)
  ├── Download manager (queued, throttled pulls)
  └── Eviction manager (LRU / pinned policy)
        │
        └──▶ Hub file storage (configurable path)

Apps access the cache via a FUSE mount (so no code changes are needed), a bind mount, or the REST API.
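
The metadata index in the diagram maps an asset name to a hash and the hash to a path. A rough sketch of that two-level lookup follows; the class and method names are hypothetical and the real index format is not documented here.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class MetadataIndex:
    """Two-level lookup: asset name -> content hash -> on-disk path."""

    name_to_hash: Dict[str, str] = field(default_factory=dict)
    hash_to_path: Dict[str, str] = field(default_factory=dict)

    def register(self, name: str, digest: str, path: str) -> None:
        self.name_to_hash[name] = digest
        # setdefault keeps the first path, so a repeated hash adds no new object.
        self.hash_to_path.setdefault(digest, path)

    def resolve(self, name: str) -> Optional[str]:
        """Return the stored path for an asset name, or None if unknown."""
        digest = self.name_to_hash.get(name)
        if digest is None:
            return None
        return self.hash_to_path.get(digest)
```

Because names point at hashes rather than at files, several apps can register different names for the same weights and still resolve to one stored object.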

Setup

Install from Hub

Search for Tools Cache Mount in the Hub app store and install.

Configure storage path

In http://tools-cache.ci.localhost → Settings → Storage, set the cache root directory. This should be on your fastest available storage, ideally an NVMe SSD.

Set size limits

Configure Max Cache Size and choose an eviction policy:

  • LRU (default): evict least-recently-used items when full
  • Pinned first: never evict pinned items; evict LRU from unpinned only
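
The following sketch illustrates the "Pinned first" behaviour described above: least-recently-used eviction that skips pinned items. It is an illustrative model only; the class `EvictingCache` and its methods are hypothetical and say nothing about how the eviction manager is actually built.

```python
from collections import OrderedDict


class EvictingCache:
    """Size-bounded LRU bookkeeping that never evicts pinned keys."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.sizes = OrderedDict()  # key -> size, least recently used first
        self.pinned = set()

    def pin(self, key):
        self.pinned.add(key)

    def touch(self, key, size):
        """Record an access (insert or refresh), then evict if over budget."""
        self.sizes[key] = size
        self.sizes.move_to_end(key)
        return self.evict()

    def evict(self):
        """Drop unpinned keys, oldest first, until the total fits; return them."""
        total = sum(self.sizes.values())
        evicted = []
        for key in list(self.sizes):
            if total <= self.max_bytes:
                break
            if key in self.pinned:
                continue  # pinned items are skipped, never removed
            total -= self.sizes.pop(key)
            evicted.append(key)
        return evicted
```

Note the consequence of the skip: if every item is pinned, `evict` returns an empty list and the cache stays over its limit.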

Connect Ollama (example)

Add a volume entry to your Ollama compose configuration to route its model storage through the cache:

    # In Ollama's docker-compose.yml (managed by Hub)
    volumes:
      - ci-tools-cache:/root/.ollama/models

Hub automatically manages this for Companion-native apps. For third-party apps, see the integration guide.

Usage

Viewing Cache Contents

Navigate to http://tools-cache.ci.localhost → Assets for a full inventory of every cached asset, its size, last accessed time, and which apps reference it.

Pinning an Asset

Right-click any asset → Pin. Pinned assets are never evicted regardless of the eviction policy.

Prefetching

Queue a download for later:

    ci-cache prefetch \
      --url https://huggingface.co/bartowski/Llama-3.2-8B-Instruct-GGUF/resolve/main/Llama-3.2-8B-Instruct-Q4_K_M.gguf \
      --schedule "tonight at 2am"
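
How the `--schedule` string is parsed is not documented here. As a rough illustration of what a phrase like "tonight at 2am" could resolve to, the sketch below computes the next occurrence of a wall-clock time. The function `next_run` is hypothetical and works on naive local datetimes only.

```python
from datetime import datetime, time, timedelta


def next_run(now: datetime, at: time = time(2, 0)) -> datetime:
    """Return the next occurrence of `at` strictly after `now` (naive local time)."""
    candidate = datetime.combine(now.date(), at)
    if candidate <= now:
        # The target time has already passed today, so roll over to tomorrow.
        candidate += timedelta(days=1)
    return candidate
```

Called at 22:00, this yields 02:00 on the following day; called at 01:00, it yields 02:00 the same day.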

Clearing Unused Assets

From the Assets view, click Cleanup → Remove Unreferenced to delete assets with zero active references.
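
Conceptually, "unreferenced" means that no app currently points at the asset. A small sketch of that check, assuming a hypothetical mapping from asset hash to the set of apps referencing it:

```python
from typing import Dict, List, Set


def unreferenced(references: Dict[str, Set[str]]) -> List[str]:
    """Return the hashes that no app references, sorted for stable output."""
    return sorted(digest for digest, apps in references.items() if not apps)
```

An asset stays in the cache as long as at least one app is listed against it; only hashes with an empty set are cleanup candidates.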

API Usage

    # Check cache hit for a specific hash
    curl http://tools-cache.ci.localhost/api/objects/sha256:<hash>

    # List all cached objects
    curl http://tools-cache.ci.localhost/api/objects | jq '.[].name'
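
The same two calls can be made from a script. The sketch below uses only the Python standard library and the endpoints shown above. The response semantics are assumptions: it treats HTTP 200 as a hit and 404 as a miss, and expects the listing to be a JSON array of objects with a `name` field (as the `jq` filter implies). The function names are hypothetical.

```python
import json
import urllib.error
import urllib.request

BASE_URL = "http://tools-cache.ci.localhost/api/objects"


def object_url(digest: str) -> str:
    """Build the lookup URL from a bare hex digest or a 'sha256:'-prefixed one."""
    if not digest.startswith("sha256:"):
        digest = "sha256:" + digest
    return f"{BASE_URL}/{digest}"


def is_cached(digest: str, timeout: float = 5.0) -> bool:
    """Assumption: the API answers 200 for a hit and 404 for a miss."""
    try:
        with urllib.request.urlopen(object_url(digest), timeout=timeout) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise  # any other status is unexpected, so surface it


def list_names(timeout: float = 5.0) -> list:
    """Assumption: the listing is a JSON array of objects with a 'name' key."""
    with urllib.request.urlopen(BASE_URL, timeout=timeout) as resp:
        return [obj["name"] for obj in json.load(resp)]
```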

Troubleshooting

Hub apps still downloading models redundantly

The app must be configured to use the cache mount path. For Companion-native apps this is automatic. For manually installed apps, add the appropriate volume entry to their compose file.

Cache hit rate is low

Check the metrics dashboard for the hit/miss ratio. Low hit rates usually mean apps are accessing model files from non-cache paths. Review each app's volume configuration.
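
For reference, the hit ratio is the share of lookups served from the cache. A minimal sketch of the arithmetic, assuming you have raw hit and miss counts (the dashboard's exact definition is not documented here):

```python
def hit_ratio(hits: int, misses: int) -> float:
    """Fraction of lookups served from cache; 0.0 when there were no lookups."""
    total = hits + misses
    return hits / total if total else 0.0
```

For example, 3 hits and 1 miss give a ratio of 0.75.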

Disk full despite eviction policy

If all assets are pinned, eviction cannot free space. Review pinned items in the Assets view and unpin those no longer needed.

Permission denied accessing cache

Check the app's cache access permissions in Settings → Access Control. Hub-managed apps are granted access automatically at install time.
