Tools Cache Mount
Accelerated execution. A dedicated, optimized local caching layer designed to speed up tool deployments and manage your AI assets efficiently.
Overview
Tools Cache Mount is a persistent, content-addressed caching service for your Companion Hub. It eliminates redundant downloads and recomputation by storing model weights, container build layers, agent tool binaries, and intermediate computation artifacts in a shared, optimized cache that all Hub applications can access.
For users with multiple AI apps on their Hub (Ollama, SD WebUI, ComfyUI, Local Bench, Static Container Builder), Tools Cache Mount provides a single authoritative store for shared assets: a model file downloaded once is available to every app that needs it.
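Content addressing is what makes this dedup work: an object's key is the SHA-256 of its bytes, so the same file requested under two different names resolves to one stored copy. The few lines of shell below are an illustrative sketch only; the real object-store layout is internal to Tools Cache Mount.

```shell
# Illustrative sketch of content-addressed storage (not the real store layout).
store=$(mktemp -d)

put() {
  # The object key is the SHA-256 of the file's bytes.
  key="sha256:$(sha256sum "$1" | cut -d' ' -f1)"
  # Store only if the object is not already cached (dedup).
  [ -e "$store/$key" ] || cp "$1" "$store/$key"
  echo "$key"
}

printf 'model-weights' > /tmp/asset-a
printf 'model-weights' > /tmp/asset-b   # same bytes, different name

put /tmp/asset-a
put /tmp/asset-b                        # same key; nothing new is stored

ls "$store" | wc -l                     # the store holds exactly one object
```

Because the key is derived from content rather than filename, two apps pulling the same weights can never create a second copy.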
Key Features
- Content-addressed storage – assets are addressed by their SHA-256 hash; no duplicates
- Model weight deduplication – multiple apps can share the same model file without duplication
- Build layer cache – the Static Container Builder uses this as its layer cache backend
- Bandwidth throttling – configurable download rate limits to prevent network saturation during asset pulls
- Prefetch – schedule asset downloads in advance (e.g. "download Llama 3.2 tonight")
- Eviction policies – LRU and pinned-item policies to manage storage limits
- Access control – per-app read/write permissions for the cache
- Cache metrics – hit/miss ratios, storage breakdown, bandwidth saved
Use Cases
- Share a single copy of llama3.1:8b between Ollama, LocalAI, and Open WebUI without using 3× the disk
- Speed up Static Container Builder builds by caching Docker base layers
- Pre-download large model weights overnight before you need them
- Track exactly which AI assets are on your Hub and how much space they consume
- Free disk space by identifying and removing unused models across all apps
Architecture
Hub Applications
(Ollama, SD WebUI, ComfyUI, Builder, LocalAI...)
        │ cache read/write via mount or API
        ▼
Tools Cache Mount
  ├── Content-addressed object store (NVMe optimized)
  ├── Metadata index (asset name → hash → path)
  ├── Download manager (queued, throttled pulls)
  └── Eviction manager (LRU / pinned policy)
        │
        └──▶ Hub file storage (configurable path)

Apps access the cache via a FUSE mount (so no code changes are needed), a bind mount, or the REST API.
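The metadata index in the diagram maps a human-readable asset name to a content hash and then to an on-disk path. A toy lookup over a tab-separated file shows the idea; the actual index format is internal, and the names and paths below are made up for illustration.

```shell
# Toy metadata index: name -> hash -> path (tab-separated; format is invented).
index=$(mktemp)
printf 'llama3.1:8b\tsha256:ab12\t/cache/objects/ab12\n'  > "$index"
printf 'sdxl-base\tsha256:cd34\t/cache/objects/cd34\n'   >> "$index"

# Resolve an asset name to its on-disk path.
lookup() { awk -F'\t' -v n="$1" '$1 == n { print $3 }' "$index"; }

lookup 'llama3.1:8b'
```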
Setup
Install from Hub
Search for Tools Cache Mount in the Hub app store and install.
Configure storage path
In http://tools-cache.ci.localhost → Settings → Storage, set the cache root directory. This should be on your fastest available storage, ideally an NVMe SSD.
Set size limits
Configure Max Cache Size and choose an eviction policy:
- LRU (default) – evict least-recently-used items when full
- Pinned first – never evict pinned items; evict LRU items from unpinned items only
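The LRU policy amounts to: while the cache is over its limit, delete the object with the oldest access time. A rough shell illustration using file atimes follows; the real eviction manager also honors pins and runs in the background.

```shell
# Rough LRU sketch: evict the least-recently-accessed file until under the limit.
cache=$(mktemp -d)
limit_bytes=10

printf 'aaaaaa' > "$cache/obj-old"; touch -a -d '2 hours ago' "$cache/obj-old"
printf 'bbbbbb' > "$cache/obj-mid"; touch -a -d '1 hour ago'  "$cache/obj-mid"
printf 'cccccc' > "$cache/obj-new"   # accessed just now

# Total size of cached files in bytes.
total() { find "$cache" -type f -printf '%s\n' | awk '{ s += $1 } END { print s + 0 }'; }

while [ "$(total)" -gt "$limit_bytes" ]; do
  lru=$(ls -tur "$cache" | head -n 1)   # sort by access time, oldest first
  rm "$cache/$lru"
done

ls "$cache"   # only the most recently used object survives
```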
Connect Ollama (example)
Add a bind mount to your Ollama compose configuration to route its model storage through the cache:
# In Ollama's docker-compose.yml (managed by Hub)
volumes:
  - ci-tools-cache:/root/.ollama/models

Hub automatically manages this for Companion-native apps. For third-party apps, see the integration guide.
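If you maintain a third-party app's compose file by hand, the named volume also needs a top-level declaration. A hedged sketch, assuming Tools Cache Mount exposes its store as an external Docker volume named ci-tools-cache (service and image names here are illustrative):

```yaml
# Hypothetical standalone compose file for a third-party app (names assumed)
services:
  ollama:
    image: ollama/ollama
    volumes:
      - ci-tools-cache:/root/.ollama/models

volumes:
  ci-tools-cache:
    external: true   # created and owned by Tools Cache Mount, not this app
```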
Usage
Viewing Cache Contents
Navigate to http://tools-cache.ci.localhost → Assets for a full inventory of every cached asset, its size, last accessed time, and which apps reference it.
Pinning an Asset
Right-click any asset → Pin. Pinned assets are never evicted, regardless of the eviction policy.
Prefetching
Queue a download for later:
ci-cache prefetch \
  --url https://huggingface.co/bartowski/Llama-3.2-8B-Instruct-GGUF/resolve/main/Llama-3.2-8B-Instruct-Q4_K_M.gguf \
  --schedule "tonight at 2am"

Clearing Unused Assets
From the Assets view, click Cleanup → Remove Unreferenced to delete assets with zero active references.
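One way to picture "zero active references": if references were held as hard links from app directories into the object store, an object whose link count has dropped to 1 would be referenced by no app and safe to delete. This is only an analogy; Tools Cache Mount tracks references in its metadata index.

```shell
# Analogy for reference counting: apps hold hard links to cached objects.
store=$(mktemp -d); appdir=$(mktemp -d)

printf 'weights-a' > "$store/sha256-aaa"
printf 'weights-b' > "$store/sha256-bbb"
ln "$store/sha256-aaa" "$appdir/model.gguf"   # one app references object aaa

# Link count 1 means no app holds a reference: safe to remove.
find "$store" -type f -links 1 -delete

ls "$store"   # only the referenced object remains
```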
API Usage
# Check cache hit for a specific hash
curl http://tools-cache.ci.localhost/api/objects/sha256:<hash>
# List all cached objects
curl http://tools-cache.ci.localhost/api/objects | jq '.[].name'

Troubleshooting
Hub apps still downloading models redundantly
The app must be configured to use the cache mount path. For Companion-native apps this is automatic. For manually installed apps, add the appropriate volume bind in their compose file.

Cache hit rate is low
Check the metrics dashboard for the hit/miss ratio. Low hit rates usually mean apps are accessing model files from non-cache paths. Review each app's volume configuration.

Disk full despite eviction policy
If all assets are pinned, eviction cannot free space. Review pinned items in the Assets view and unpin those no longer needed.

Permission denied accessing cache
Check the app's cache access permissions in Settings → Access Control. Hub-managed apps are granted access automatically at install time.