How It Works
The cache service sits between Crab clients and the object store. It does not own repository state and does not replace the object store. It caches immutable data and answers dedup queries.
Read Flow
For cacheable data, Crab reads in this order:
- Local machine cache.
- Shared cache service.
- Origin object store.
If the shared cache is unavailable, Crab falls back to origin access.
Push Flow
Crab uploads to the origin object store first. After origin upload succeeds, the client can warm the cache service with the uploaded data.
Push warming is best effort. A cache warming failure logs a warning but does not fail the push.
Before uploading new chunks, Crab can also ask the cache service which chunks are already known. Known chunks can be reused instead of being uploaded again.
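The push steps above could look roughly like the sketch below. All names here (`push_chunks`, `CacheError`, the callable parameters, the logger name) are hypothetical. Treating a failed dedup query as "nothing is known" is an assumption of this example, not documented behavior.

```python
import logging
from typing import Callable, Dict, List, Set

log = logging.getLogger("crab.push")  # logger name is illustrative


class CacheError(Exception):
    """Hypothetical error for any cache-service failure."""


def push_chunks(
    chunks: Dict[str, bytes],
    query_known: Callable[[Set[str]], Set[str]],
    upload_to_origin: Callable[[str, bytes], None],
    warm_cache: Callable[[str, bytes], None],
) -> List[str]:
    """Upload unknown chunks to origin, then warm the cache.

    Returns the IDs of the chunks that were uploaded.
    """
    try:
        known = query_known(set(chunks))
    except CacheError:
        known = set()  # assumption: a failed dedup query means "upload everything"

    uploaded: List[str] = []
    for chunk_id, data in chunks.items():
        if chunk_id in known:
            continue  # already known: reuse instead of uploading again
        upload_to_origin(chunk_id, data)  # an origin failure fails the push
        uploaded.append(chunk_id)

    # Warming happens only after the origin uploads succeeded, and is best effort.
    for chunk_id in uploaded:
        try:
            warm_cache(chunk_id, chunks[chunk_id])
        except CacheError as exc:
            log.warning("cache warming failed for %s: %s", chunk_id, exc)
    return uploaded
```

The origin upload is deliberately left outside any `try` block: it is the only step whose failure should abort the push.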
Cached Data
| Data | Cached |
|---|---|
| Crab xorbs | Yes |
| Crab shards | Yes |
| Crab file-index objects | Yes |
| Git pack files and indexes | Yes |
| Git refs, HEAD, locks, and mutable config | No |
Mutable repository state is never cached. This prevents stale refs and stale metadata from breaking clone, fetch, and push behavior.
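One way to encode the table is an explicit allowlist, so that any data kind not listed as immutable is treated as uncacheable by default. The `DataKind` enum and `is_cacheable` helper are hypothetical and exist only for this sketch.

```python
from enum import Enum


class DataKind(Enum):
    # Immutable data
    XORB = "crab xorb"
    SHARD = "crab shard"
    FILE_INDEX = "crab file-index object"
    GIT_PACK = "git pack file or index"
    # Mutable repository state
    GIT_REF = "git ref"
    HEAD = "HEAD"
    LOCK = "lock"
    MUTABLE_CONFIG = "mutable config"


# Allowlist: only the kinds marked "Yes" in the table.
CACHEABLE = frozenset({
    DataKind.XORB,
    DataKind.SHARD,
    DataKind.FILE_INDEX,
    DataKind.GIT_PACK,
})


def is_cacheable(kind: DataKind) -> bool:
    """Return True only for immutable data kinds."""
    return kind in CACHEABLE
```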
Availability Model
Treat the cache as an acceleration layer:
- Cache failure should cause slower reads, not broken reads.
- Origin access must remain configured for Crab clients.
- Readiness checks should remove a cache instance from rotation when origin is unreachable.
- Liveness checks should restart only a stuck cache process.
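The difference between the two probes can be illustrated with a small sketch. The function names, the heartbeat-based stall detection, the 30-second threshold, and the use of HTTP 200/503 as probe results are all assumptions made for the example.

```python
def readiness_status(origin_reachable: bool) -> int:
    """Readiness: take the instance out of rotation when origin is unreachable."""
    return 200 if origin_reachable else 503


def liveness_status(seconds_since_heartbeat: float,
                    stall_threshold: float = 30.0) -> int:
    """Liveness: fail only when the process itself appears stuck.

    Origin reachability is deliberately not an input here, so an origin
    outage does not trigger restarts of an otherwise healthy process.
    """
    return 200 if seconds_since_heartbeat < stall_threshold else 503
```

Keeping origin reachability out of the liveness probe avoids a restart loop during an origin outage, when restarting the cache process would not help.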
Isolation Model
Use one cache service per trust domain. If teams, tenants, or environments must not share dedup visibility, run separate cache services.
Authorization policies are useful for operational access control. They are not a substitute for separate cache instances in regulated multi-tenant deployments.
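A deployment check along these lines could flag trust domains that point at the same cache service. The function name, the mapping format, and the endpoint URLs are hypothetical, chosen only to illustrate the rule.

```python
from collections import defaultdict
from typing import Dict, List


def shared_cache_violations(cache_by_domain: Dict[str, str]) -> Dict[str, List[str]]:
    """Return cache endpoints that are used by more than one trust domain."""
    domains_by_cache: Dict[str, List[str]] = defaultdict(list)
    for domain, cache_url in cache_by_domain.items():
        domains_by_cache[cache_url].append(domain)
    return {
        url: sorted(domains)
        for url, domains in domains_by_cache.items()
        if len(domains) > 1
    }


# Hypothetical assignment of trust domains to cache endpoints.
assignments = {
    "tenant-a": "https://cache-1.example.internal",
    "tenant-b": "https://cache-1.example.internal",
    "staging": "https://cache-2.example.internal",
}
violations = shared_cache_violations(assignments)
```

Here `tenant-a` and `tenant-b` would be reported because they share `cache-1`, which exposes dedup visibility across the two tenants.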