Slash Cloud Egress Bills

One fetch from the cloud. Every repeat from cache.
Stop paying for the same bytes twice.

Large-file repositories generate massive object storage egress bills — every clone, fetch, and hydrate downloads gigabytes from S3, GCS, or Azure. The Crab cache service intercepts those requests and serves repeated reads from local NVMe, so you only pay for the first download.


Why Cache

Object Storage Bills Add Up Fast

Every crab clone, crab hydrate, and git fetch downloads xorbs, shards, and packs from your bucket. With large repos, a single clone can pull tens of gigabytes. Multiply that by your team size and CI runners, and egress charges can come to dominate your cloud bill.

Egress Costs

Cloud providers charge per GB downloaded. S3 egress is $0.09/GB after the first 100 GB/month. A 50 GB repo cloned once a month by each of 20 developers adds up to 1 TB of egress per month.
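As a rough worked example of the arithmetic above, the sketch below prices a month of repeated clones at the quoted S3 rate. The function name and the one-clone-per-developer assumption are illustrative only, and real bills vary with region and tiered pricing.

```python
def monthly_egress_cost(repo_gb, clones_per_month, rate_per_gb=0.09, free_gb=100):
    """Estimate total egress (GB) and cost (USD) for repeated full clones.

    Assumes a flat per-GB rate after a monthly free allowance, as quoted in the text.
    """
    total_gb = repo_gb * clones_per_month
    billable_gb = max(0, total_gb - free_gb)
    return total_gb, billable_gb * rate_per_gb


# 20 developers, each cloning a 50 GB repo once in the month (assumed usage pattern).
total_gb, cost = monthly_egress_cost(repo_gb=50, clones_per_month=20)
print(f"{total_gb} GB of egress, about ${cost:.2f} for the month")
```

CI runners that clone on every job raise `clones_per_month` quickly, which is where the bill tends to grow.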

Redundant Downloads

Without caching, every clone and fetch re-downloads the same immutable objects (xorbs, shards, packs) that another team member already fetched minutes ago.

Team Multiplier

Costs scale linearly with team size. CI pipelines make it worse — each job starts fresh and downloads everything from scratch on every run.

Cache Eliminates Repeats

The cache service stores objects on local NVMe after the first fetch. Every subsequent request for the same object is served from cache, with no cloud egress and no per-GB charge.

How the savings work

Crab objects are immutable and content-addressed (blake3 hashes). Once an xorb, shard, or pack is fetched from origin, it never changes. The cache service exploits this: it stores every fetched object on disk and serves future requests for that hash from local storage for as long as the object stays in the cache. Push warming goes further: newly uploaded objects are written to the cache immediately, so teammates can read them without hitting origin.

1×

Each object fetched from cloud exactly once

N×

Served from cache for all N subsequent requests

$0

Egress cost for every cache hit

Architecture

Three-Tier Cache Hierarchy

Lookups cascade through each tier until a hit is found. Misses populate the cache on the way back.
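The cascade described above can be sketched in a few lines. This is a minimal illustration, not the Crab implementation: the `TieredCache` class, its in-memory dictionaries, and the `fetch_origin` callback are all hypothetical stand-ins for the local tier, the shared cache service, and cloud storage.

```python
from typing import Callable, Dict


class TieredCache:
    """Local tier -> shared tier -> origin; misses fill the faster tiers on the way back."""

    def __init__(self, fetch_origin: Callable[[str], bytes]):
        self.local: Dict[str, bytes] = {}    # stand-in for the per-machine cache
        self.shared: Dict[str, bytes] = {}   # stand-in for the shared cache service
        self.fetch_origin = fetch_origin     # stand-in for a cloud bucket read
        self.origin_fetches = 0

    def get(self, key: str) -> bytes:
        data = self.local.get(key)
        if data is not None:
            return data                      # tier 1 hit
        data = self.shared.get(key)
        if data is None:
            data = self.fetch_origin(key)    # miss in both caches: pay egress once
            self.origin_fetches += 1
            self.shared[key] = data          # populate the shared tier on the way back
        self.local[key] = data               # populate the local tier on the way back
        return data


cache = TieredCache(fetch_origin=lambda key: b"bytes-for-" + key.encode())
first = cache.get("abc123")    # misses both caches, reads origin, fills both tiers
second = cache.get("abc123")   # served from the local tier
```

Because objects are immutable and content-addressed, the sketch never needs invalidation logic; a key either maps to the right bytes or is absent.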

Cache Components

Every Layer Optimized for Large Files

From in-process memory to shared network cache to cloud origin — each tier handles different access patterns.

Local Disk Cache

Hash-verified, LRU-evicted caching of shards, file-index entries, and manifests on disk at ~/.cache/crab/. Always active — zero configuration required.

Metadata Cache Warming

Proactively warms shard and file-index metadata on clone and fetch. Subsequent hydrations resolve chunk locations without round-trips to cloud storage.

Shared Cache Service

Optional HTTP cache (crab-cache-server) that sits between clients and cloud storage. Multiple developers share a warm cache backed by NVMe disk with redb metadata.

Cross-Repo Chunk Dedup

The cache service maintains a chunk index spanning multiple repositories. Dedup queries batch up to 100,000 chunk hashes per request, identifying data already stored.
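A client with more hashes than the batch cap would need to split them across several queries. The helper below is a hypothetical sketch of that splitting; only the 100,000 limit comes from the text, and the function name and hash format are assumptions.

```python
from typing import Iterator, List, Sequence

MAX_HASHES_PER_QUERY = 100_000  # batch cap stated in the text


def batch_hashes(hashes: Sequence[str],
                 limit: int = MAX_HASHES_PER_QUERY) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `limit` hashes, preserving order."""
    for start in range(0, len(hashes), limit):
        yield list(hashes[start:start + limit])


# 250,000 placeholder hex strings standing in for chunk hashes.
chunk_hashes = [f"{i:064x}" for i in range(250_000)]
batches = list(batch_hashes(chunk_hashes))
print([len(b) for b in batches])
```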

Push Warming

Newly pushed xorbs are written to the cache service immediately after upload to origin. Teammates benefit from warm cache hits without waiting for cold-start downloads.
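The ordering matters here: the object lands at origin first, then in the cache. A minimal sketch of that flow follows, using plain dictionaries as hypothetical stand-ins for the bucket and the cache service; the function names are not Crab APIs.

```python
from typing import Dict


def push_object(key: str, data: bytes,
                origin: Dict[str, bytes], cache: Dict[str, bytes]) -> None:
    """Upload to origin, then warm the shared cache with the same bytes."""
    origin[key] = data   # durable copy is written to origin first
    cache[key] = data    # warm copy, so the next reader does not touch origin


def read_object(key: str, origin: Dict[str, bytes],
                cache: Dict[str, bytes], stats: Dict[str, int]) -> bytes:
    """Read through the cache, counting hits and origin reads."""
    if key in cache:
        stats["hits"] += 1
        return cache[key]
    stats["origin_reads"] += 1
    cache[key] = origin[key]
    return cache[key]


origin, cache, stats = {}, {}, {"hits": 0, "origin_reads": 0}
push_object("xorb-1", b"new data", origin, cache)
teammate_copy = read_object("xorb-1", origin, cache, stats)
```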

Blake3-Verified Storage

Every object written to the cache is verified against its blake3 content hash. Hash mismatches from origin are rejected with HTTP 502 — no corrupt data served.
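The verify-before-store step can be sketched as below. Python's standard library has no BLAKE3, so this example substitutes `hashlib.blake2b` purely for illustration; the exception class, function names, and the way the 502 status is carried are assumptions, not the service's actual code.

```python
import hashlib
from typing import Dict


class OriginHashMismatch(Exception):
    """Raised when origin bytes do not match the requested content hash."""
    status_code = 502  # the text says mismatches are rejected with HTTP 502


def content_hash(data: bytes) -> str:
    # Stand-in hash: Crab uses blake3; blake2b is used here only because it ships with Python.
    return hashlib.blake2b(data, digest_size=32).hexdigest()


def store_verified(cache: Dict[str, bytes], expected_hash: str, data: bytes) -> None:
    """Store `data` under `expected_hash` only if the bytes hash to that value."""
    actual = content_hash(data)
    if actual != expected_hash:
        raise OriginHashMismatch(f"expected {expected_hash}, got {actual}")
    cache[expected_hash] = data


store = {}
good = b"chunk payload"
store_verified(store, content_hash(good), good)   # accepted and stored
try:
    store_verified(store, content_hash(good), b"corrupted payload")
except OriginHashMismatch as err:
    rejected_status = err.status_code             # corrupt bytes never reach the cache
```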

Performance

Orders of Magnitude Faster

Cache hits vs cache misses — the difference is dramatic for large-file workflows.

Cache Hit (Local Disk)

<1ms

Cache Hit (Service)

~5ms

Cache Miss (Cloud)

~100ms

Local disk hits are more than 100× faster than a cloud fetch (<1ms vs ~100ms); cache service hits are roughly 20× faster (~5ms vs ~100ms)

Service Limits

Built for Production Workloads

Enforced limits protect the service from abuse while supporting large-scale workflows.

256 MiB
Max request body
Tower middleware rejects oversized uploads with HTTP 413.
100k hashes
Dedup batch size
Chunk hashes per dedup query request to the index.
200 req
Concurrency limit
Simultaneous requests handled before backpressure.
Default cache budget
Configurable max_cache_bytes with LRU eviction.

Cache Service Internals

Inside crab-cache-server

A single binary with auth middleware, content-addressed storage on NVMe, cross-repo chunk index, and background eviction.

Cache Service Features

Production-Ready from Day One

Authentication, observability, eviction, and deployment — all built in.

Flexible Authentication

Three auth modes: pre-shared key (X-Cache-PSK header), bearer token (JWT via JWKS), or mutual TLS with client certificate identity. Choose based on your infrastructure.
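For the pre-shared key mode, the check reduces to comparing the header value against the configured key. The sketch below is an assumption about how such a check might look, not the server's code; it uses a constant-time comparison to avoid leaking key contents through timing, and it expects the header map to use the exact `X-Cache-PSK` spelling (real HTTP header names are case-insensitive).

```python
import hmac
from typing import Mapping


def psk_authorized(headers: Mapping[str, str], expected_psk: str) -> bool:
    """Return True only if the X-Cache-PSK header matches the configured key."""
    presented = headers.get("X-Cache-PSK", "")
    # compare_digest runs in time independent of where the inputs differ.
    return hmac.compare_digest(presented.encode(), expected_psk.encode())


allowed = psk_authorized({"X-Cache-PSK": "s3cret"}, "s3cret")
denied = psk_authorized({"X-Cache-PSK": "wrong"}, "s3cret")
missing = psk_authorized({}, "s3cret")
```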

Weighted LRU Eviction

Background evictor runs every 60 seconds with configurable high/low water marks. Nudged immediately after writes that cross the threshold. Type-weighted to prefer evicting xorbs over shards.
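One way to picture the water marks and type weighting is the selection pass below. It is a sketch under stated assumptions: the weights, the 0.9/0.7 water marks, and the idle-time scoring are invented for illustration, and the real evictor may rank objects differently.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical weights: a higher weight makes an object look "older", so it goes first.
TYPE_WEIGHT = {"xorb": 2.0, "shard": 1.0}


@dataclass
class Entry:
    key: str
    kind: str           # "xorb" or "shard"
    size: int           # bytes
    last_access: float  # seconds, same clock as `now`


def select_evictions(entries: List[Entry], now: float, max_bytes: int,
                     high: float = 0.9, low: float = 0.7) -> List[str]:
    """Pick keys to evict once usage exceeds the high mark, stopping at the low mark."""
    used = sum(e.size for e in entries)
    if used <= high * max_bytes:
        return []  # below the high water mark: nothing to do
    # Score is idle time scaled by type weight; the largest score is evicted first.
    ranked = sorted(entries,
                    key=lambda e: (now - e.last_access) * TYPE_WEIGHT.get(e.kind, 1.0),
                    reverse=True)
    victims = []
    for entry in ranked:
        if used <= low * max_bytes:
            break
        victims.append(entry.key)
        used -= entry.size
    return victims


entries = [
    Entry("a", "xorb", 40, last_access=90.0),   # idle 10s, weighted score 20
    Entry("b", "shard", 40, last_access=85.0),  # idle 15s, weighted score 15
    Entry("c", "xorb", 15, last_access=99.0),   # idle 1s, weighted score 2
]
victims = select_evictions(entries, now=100.0, max_bytes=100)
```

In this example the xorb `a` is evicted ahead of the older shard `b`, which is the effect the type weighting is meant to produce.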

Prometheus Metrics

Built-in /v1/metrics endpoint exposes cache hits, misses, bytes served, origin fetch latency, dedup query performance, push warming counts, and current disk usage.

Health Probes

Kubernetes-ready health endpoints: /v1/health checks origin connectivity (cached 5s TTL, returns 503 if unreachable), /v1/health/live always returns 200 for liveness.
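The readiness behaviour (probe origin, cache the result for 5 seconds, answer 503 when unreachable) can be sketched as follows. The `HealthCheck` class, its injected clock, and the `probe_origin` callback are hypothetical; only the status codes and the 5-second TTL come from the text.

```python
import time
from typing import Callable, Optional


class HealthCheck:
    """Readiness result is cached for `ttl` seconds; liveness always succeeds."""

    def __init__(self, probe_origin: Callable[[], bool], ttl: float = 5.0,
                 clock: Callable[[], float] = time.monotonic):
        self.probe_origin = probe_origin   # returns True if origin is reachable
        self.ttl = ttl
        self.clock = clock                 # injectable so the TTL can be tested
        self._checked_at: Optional[float] = None
        self._healthy = False

    def readiness_status(self) -> int:
        """Status for /v1/health: 200 if origin was reachable at the last probe, else 503."""
        now = self.clock()
        if self._checked_at is None or now - self._checked_at >= self.ttl:
            self._healthy = self.probe_origin()   # re-probe only after the TTL expires
            self._checked_at = now
        return 200 if self._healthy else 503

    @staticmethod
    def liveness_status() -> int:
        """Status for /v1/health/live: the process is up, so always 200."""
        return 200
```

Caching the probe keeps frequent Kubernetes readiness checks from turning into a steady stream of requests against the bucket.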

Deploy Anywhere

Ships as a single binary. Deploy via Docker (multi-stage Dockerfile), Kubernetes (Deployment + ConfigMap + PVC), or systemd with filesystem hardening.

Request Limits

Tower middleware enforces a 256 MiB max request body. Dedup queries are capped at 100,000 chunk hashes. Concurrency limited to 200 simultaneous requests with 300s timeout.

Configuration

Simple TOML Configuration

Point at your bucket, set an auth key, and the server handles the rest. Eviction, metrics, and health checks work out of the box.

Cost Impact

With Cache vs Without

The cache service turns per-user egress costs into a one-time fetch. Here is what changes.

Feature	With Cache Service	Without (Origin Only)
Cloud egress per repeated fetch	$0 — served from local NVMe	Full egress cost per download
Cost scaling with team size	Fixed (one fetch populates cache for all)	Linear (each user pays full egress)
CI runner egress	Cache hit on warm objects	Full download every pipeline run
Second clone/fetch latency	~5ms per object (network cache hit)	~100ms per object (S3 round-trip)
Cross-repo dedup	Yes	No
Push warming for teammates	Yes	No
Origin traffic reduction	Only first fetch hits cloud	Every fetch hits cloud
Observability	Prometheus metrics + structured logs	Client-side only
Infrastructure required	Single binary + NVMe volume	None

Deployment

Deploy in Minutes

A single Rust binary with no external dependencies. Choose your deployment model.

Docker

Multi-stage Dockerfile builds a minimal Debian image with just the binary and CA certificates. Mount your config and NVMe volume.

Kubernetes

Ready-made manifests with liveness (/v1/health/live) and readiness (/v1/health) probes. Use a local NVMe PVC — the cache is ephemeral and rebuilds on miss.

systemd

Unit file with filesystem hardening, dedicated service user, and 65,536 file descriptor limit. Graceful SIGTERM shutdown drains in-flight requests.

Stop paying for the same bytes twice

Deploy the cache service to collapse redundant cloud egress into a single fetch. Your team and CI runners get warm cache hits — your cloud bill drops.

