One fetch from the cloud. Every repeat from cache.
Stop paying for the same bytes twice.
Large-file repositories generate massive object storage egress bills — every clone, fetch, and hydrate downloads gigabytes from S3, GCS, or Azure. The Crab cache service intercepts those requests and serves repeated reads from local NVMe, so you only pay for the first download.
Object Storage Bills Add Up Fast
Every crab clone, crab hydrate, and git fetch downloads xorbs, shards, and packs from your bucket. With large repos, a single clone can pull tens of gigabytes. Multiply that by your team size and CI runners, and egress charges can quickly dominate your cloud bill.
Egress Costs
Cloud providers charge per GB downloaded. S3 egress is $0.09/GB after the first 100 GB/month. A 50 GB repo cloned once a month by each of 20 developers = 1 TB of egress per month.
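To make the arithmetic above concrete, here is a minimal sketch of the monthly egress estimate. The function name and parameters are illustrative, and it assumes a single flat rate of $0.09/GB after a 100 GB free allowance, with 1 TB counted as 1,000 GB; real pricing tiers vary by provider and region.

```python
def monthly_egress_cost(repo_gb, clones_per_month, price_per_gb=0.09, free_gb=100):
    """Estimate monthly egress volume and cost (hypothetical helper).

    Assumes one flat per-GB rate after a free allowance; real bills
    may use several pricing tiers.
    """
    total_gb = repo_gb * clones_per_month
    billable_gb = max(0, total_gb - free_gb)
    return total_gb, billable_gb * price_per_gb


# 50 GB repo, 20 developers each cloning once in the month
total_gb, cost = monthly_egress_cost(repo_gb=50, clones_per_month=20)
print(f"{total_gb} GB egress, about ${cost:.2f} per month")
```

CI runners that clone on every job raise `clones_per_month` well beyond the developer count, which is why the estimate grows so quickly.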
Redundant Downloads
Without caching, every clone and fetch re-downloads the same immutable objects (xorbs, shards, packs) that another team member already fetched minutes ago.
Team Multiplier
Costs scale linearly with team size. CI pipelines make it worse — each job starts fresh and downloads everything from scratch on every run.
Cache Eliminates Repeats
The cache service stores objects on local NVMe after the first fetch. Every subsequent request for the same object is served from cache — zero egress, zero cost.
How the savings work
Crab objects are immutable and content-addressed (blake3 hashes). Once an xorb, shard, or pack is fetched from origin, it never changes. The cache service exploits this: it stores every fetched object on disk and serves all future requests for that hash from local storage. Push warming goes further — newly uploaded objects are written to the cache immediately, so teammates never hit origin at all.
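The idea above can be sketched in a few lines. This is an illustration under stated assumptions, not the actual crab-cache-server implementation: the class and method names are hypothetical, the store is an in-memory dict rather than NVMe, and BLAKE2b from the Python standard library stands in for BLAKE3, which the standard library does not provide.

```python
import hashlib


class ContentAddressedCache:
    """Toy content-addressed cache (hypothetical, in-memory)."""

    def __init__(self, fetch_from_origin):
        self._objects = {}                      # hash -> bytes
        self._fetch_from_origin = fetch_from_origin
        self.origin_fetches = 0                 # counts billable origin reads

    @staticmethod
    def address(data: bytes) -> str:
        # Stand-in hash: BLAKE2b, NOT the BLAKE3 hash Crab uses.
        return hashlib.blake2b(data, digest_size=32).hexdigest()

    def get(self, key: str) -> bytes:
        # Objects are immutable, so a cached copy never goes stale.
        if key in self._objects:
            return self._objects[key]
        data = self._fetch_from_origin(key)
        self.origin_fetches += 1
        self._objects[key] = data
        return data

    def warm(self, data: bytes) -> str:
        # Push warming: store a freshly uploaded object right away.
        key = self.address(data)
        self._objects[key] = data
        return key


origin = {}  # pretend bucket
old_key = ContentAddressedCache.address(b"existing xorb")
origin[old_key] = b"existing xorb"

cache = ContentAddressedCache(fetch_from_origin=origin.__getitem__)
cache.get(old_key)                      # first read goes to origin
cache.get(old_key)                      # repeat read is served from cache
new_key = cache.warm(b"new pack")       # uploaded object is cached on push
origin[new_key] = b"new pack"
cache.get(new_key)                      # teammate read never touches origin
print(cache.origin_fetches)
```

Because the key is derived from the content, there is no invalidation logic at all: a given hash always maps to the same bytes.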
Three-Tier Cache Hierarchy
Lookups cascade through each tier until a hit is found. Misses populate the cache on the way back.
Every Layer Optimized for Large Files
From in-process memory to shared network cache to cloud origin — each tier handles different access patterns.
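The lookup cascade described above can be sketched as follows. This is a simplified illustration with plain dicts standing in for the in-process memory tier and the shared network cache; the function name and the `origin_fetch` callback are assumptions made for the example.

```python
def tiered_get(key, tiers, origin_fetch):
    """Look up `key` in each tier, fastest first (hypothetical helper).

    On a hit, every faster tier that missed is populated on the way back.
    If all tiers miss, the object is fetched from origin and stored in all tiers.
    """
    for depth, tier in enumerate(tiers):
        if key in tier:
            value = tier[key]
            break
    else:
        depth = len(tiers)          # every tier missed
        value = origin_fetch(key)
    for tier in tiers[:depth]:      # fill only the tiers that missed
        tier[key] = value
    return value


memory, network = {}, {}
origin_calls = []


def origin_fetch(key):
    origin_calls.append(key)        # record each trip to cloud storage
    return f"bytes-for-{key}".encode()


tiered_get("abc", [memory, network], origin_fetch)   # miss everywhere, hits origin
memory.clear()                                       # simulate a fresh process
tiered_get("abc", [memory, network], origin_fetch)   # served by the network tier
print(len(origin_calls), "abc" in memory)
```

The second call shows the point of the shared tier: a new process with an empty memory cache still avoids the origin round trip.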
Orders of Magnitude Faster
Cache hits vs cache misses — the difference is dramatic for large-file workflows.
Built for Production Workloads
Enforced limits protect the service from abuse while supporting large-scale workflows.
- Max request body (MiB)
Tower middleware rejects oversized uploads with HTTP 413.
- Dedup batch size (thousands of hashes)
Chunk hashes per dedup query request to the index.
- Concurrency limit (requests)
Simultaneous requests handled before backpressure.
- Default cache budget (TiB)
Configurable max_cache_bytes with LRU eviction.
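As an illustration of a byte budget with LRU eviction, here is a minimal sketch. Only the `max_cache_bytes` name comes from the text above; the class is hypothetical, it keeps objects in memory, and it evicts inline on write, whereas the real server stores objects on NVMe and may evict in the background.

```python
from collections import OrderedDict


class LruByteCache:
    """Toy LRU cache bounded by total bytes (hypothetical)."""

    def __init__(self, max_cache_bytes):
        self.max_cache_bytes = max_cache_bytes
        self._entries = OrderedDict()   # oldest entry first
        self.used_bytes = 0

    def get(self, key):
        data = self._entries.get(key)
        if data is not None:
            self._entries.move_to_end(key)   # mark as most recently used
        return data

    def put(self, key, data):
        if key in self._entries:
            self.used_bytes -= len(self._entries.pop(key))
        self._entries[key] = data
        self.used_bytes += len(data)
        # Evict least recently used objects until back under budget.
        while self.used_bytes > self.max_cache_bytes and self._entries:
            _, evicted = self._entries.popitem(last=False)
            self.used_bytes -= len(evicted)


lru = LruByteCache(max_cache_bytes=10)
lru.put("a", b"aaaa")
lru.put("b", b"bbbb")
lru.get("a")                 # "a" is now more recent than "b"
lru.put("c", b"cccc")        # 12 bytes exceeds the budget, so "b" is evicted
print(lru.get("b"), lru.used_bytes)
```

Eviction is safe here because the cache is only an accelerator: an evicted object is simply fetched from origin again on the next miss.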
Inside crab-cache-server
A single binary with auth middleware, content-addressed storage on NVMe, cross-repo chunk index, and background eviction.
Production-Ready from Day One
Authentication, observability, eviction, and deployment — all built in.
Simple TOML Configuration
Point at your bucket, set an auth key, and the server handles the rest. Eviction, metrics, and health checks work out of the box.
With Cache vs Without
The cache service turns per-user egress costs into a one-time fetch. Here is what changes.
| Feature | With Cache Service | Without (Origin Only) |
|---|---|---|
| Cloud egress per repeated fetch | $0 — served from local NVMe | Full egress cost per download |
| Cost scaling with team size | Fixed (one fetch populates cache for all) | Linear (each user pays full egress) |
| CI runner egress | Cache hit on warm objects | Full download every pipeline run |
| Second clone/fetch latency | ~5 ms per object (network cache hit) | ~100 ms per object (S3 round-trip) |
| Cross-repo dedup | Yes | No |
| Push warming for teammates | Yes | No |
| Origin traffic reduction | Only first fetch hits cloud | Every fetch hits cloud |
| Observability | Prometheus metrics + structured logs | Client-side only |
| Infrastructure required | Single binary + NVMe volume | None |
Deploy in Minutes
A single Rust binary with no external dependencies. Choose your deployment model.
Docker
Multi-stage Dockerfile builds a minimal Debian image with just the binary and CA certificates. Mount your config and NVMe volume.
Kubernetes
Ready-made manifests with liveness (/v1/health/live) and readiness (/v1/health) probes. Use a local NVMe PVC — the cache is ephemeral and rebuilds on miss.
systemd
Unit file with filesystem hardening, a dedicated service user, and a 65,536 file descriptor limit. Graceful SIGTERM shutdown drains in-flight requests.
Stop paying for the same bytes twice
Deploy the cache service to collapse redundant cloud egress into a single fetch. Your team and CI runners get warm cache hits — your cloud bill drops.