Cache Service
Troubleshooting
Start with health, auth, and client configuration. Most cache-service incidents are one of those three.
Quick Checks
curl -fsS https://crab-cache.example.com:8443/v1/health/live
curl -fsS https://crab-cache.example.com:8443/v1/health
curl -i https://crab-cache.example.com:8443/v1/admin/stats
curl -i -H "X-Cache-PSK: $CRAB_CACHE_PSK" \
https://crab-cache.example.com:8443/v1/admin/stats
Expected:
- Liveness returns ok.
- Readiness returns ok.
- Admin without auth returns 401.
- Admin with valid auth returns 200.
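The four quick checks can be scripted so that each one prints a pass or fail line. This is a minimal sketch, not part of Crab: the function names and the BASE_URL variable are illustrative, and it assumes the health endpoints answer 200 when healthy, which the curl -f checks above imply but do not state.

```shell
# Base URL of the cache service; override via the environment (illustrative name).
BASE_URL="${BASE_URL:-https://crab-cache.example.com:8443}"

# Print PASS or FAIL for one check; return non-zero on mismatch.
expect_status() {
  local label="$1" expected="$2" actual="$3"
  if [ "$actual" = "$expected" ]; then
    echo "PASS $label ($actual)"
  else
    echo "FAIL $label (expected $expected, got $actual)"
    return 1
  fi
}

# Print only the HTTP status code for a path; extra arguments go to curl.
http_status() {
  local path="$1"
  shift
  curl -sS -o /dev/null -w '%{http_code}' --max-time 10 "$@" "$BASE_URL$path"
}

# Run all four checks and return non-zero if any of them failed.
run_quick_checks() {
  local rc=0
  expect_status "liveness" 200 "$(http_status /v1/health/live)" || rc=1
  expect_status "readiness" 200 "$(http_status /v1/health)" || rc=1
  expect_status "admin without auth" 401 "$(http_status /v1/admin/stats)" || rc=1
  expect_status "admin with PSK" 200 \
    "$(http_status /v1/admin/stats -H "X-Cache-PSK: ${CRAB_CACHE_PSK:-}")" || rc=1
  return "$rc"
}
```

Call run_quick_checks from a shell that has CRAB_CACHE_PSK exported. A status of 000 means curl could not connect at all, which points at network or TLS rather than the service.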
Client Is Not Using The Cache
Symptoms:
- No cache traffic in server logs.
- No cache hit or miss metrics move.
- Client logs say the service is unhealthy.
Checks:
- Confirm [cache].service_url is present in resolved Crab config.
- Confirm the client can reach /v1/health.
- Confirm auth mode matches server config.
- Confirm CRAB_CACHE_PSK or CRAB_CACHE_TOKEN is present in the environment running crab.
- Run with RUST_LOG=info,crab::cache=debug.
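The credential check is easy to get wrong in CI, where a secret exported in one step may not exist in the next. The sketch below reports whether either variable is set without printing its value. The function name is illustrative and not part of Crab.

```shell
# Succeed if a cache credential is present in this environment.
# Never echoes the secret itself, only the variable name.
check_cache_env() {
  if [ -n "${CRAB_CACHE_PSK:-}" ]; then
    echo "CRAB_CACHE_PSK is set"
    return 0
  fi
  if [ -n "${CRAB_CACHE_TOKEN:-}" ]; then
    echo "CRAB_CACHE_TOKEN is set"
    return 0
  fi
  echo "neither CRAB_CACHE_PSK nor CRAB_CACHE_TOKEN is set" >&2
  return 1
}
```

Run check_cache_env in the same step and shell that invokes crab. If it passes, rerun the failing crab command with RUST_LOG=info,crab::cache=debug set in front of it.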
401 Unauthorized
Likely causes:
- Missing PSK or bearer token.
- Wrong PSK.
- Client service_auth does not match server mechanism.
- mTLS proxy did not forward the validated client identity.
Fix:
- Recompute the PSK hash and compare against server config.
- Re-export secrets in CI steps.
- Confirm proxy identity forwarding for mTLS deployments.
403 Forbidden
The client authenticated, but policy denied the action.
Common fixes:
- Add read for fetch, clone, and hydrate.
- Add write for push warming.
- Add dedup for dedup queries.
- Add admin for admin stats and manual eviction.
- Include .crab in repo patterns for normal Crab object traffic.
404 Not Found
For object reads, 404 means the object is not in cache and not found at origin.
Check:
- The server [origin].url points at the correct bucket.
- The object exists in the object store.
- The client is using the expected repository remote.
503 Or 504 Origin Problems
Readiness failure or origin timeout usually means the cache service cannot reach the object store.
Check:
- Cloud provider status.
- Bucket name and region.
- IAM or service-account read permissions.
- S3-compatible endpoint URL.
- Network policy and DNS.
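The status codes covered on this page (401, 403, 404, 503 or 504, and 507) can be condensed into a lookup for scripts or runbooks. This is only a restatement of the checks listed in each section; the function name is illustrative.

```shell
# Map an HTTP status from the cache service to the first things to check.
triage_status() {
  case "$1" in
    200) echo "ok" ;;
    401) echo "auth: PSK or bearer token, service_auth vs server mechanism, mTLS identity forwarding" ;;
    403) echo "policy: read, write, dedup, admin grants and repo patterns" ;;
    404) echo "object: [origin].url, object store contents, repository remote" ;;
    503|504) echo "origin: provider status, bucket and region, IAM, endpoint URL, network and DNS" ;;
    507) echo "storage: disk size, cache.max_bytes, eviction watermarks" ;;
    *) echo "unclassified: inspect server logs for this request window" ;;
  esac
}
```

For example, triage_status 504 prints the origin checklist from this section.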
507 Storage Pressure
The cache disk is full or too close to full.
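To confirm disk pressure on the cache host before changing configuration, a sketch like the following reads usage from df. The function names and the threshold are illustrative, and the cache directory path is something you must supply; this page does not state where the service stores data.

```shell
# Print the used-space percentage (digits only) of the filesystem holding a path.
# df -P guarantees one line per filesystem, with capacity in column 5.
disk_used_pct() {
  df -P "$1" | awk 'NR == 2 { gsub(/%/, "", $5); print $5 }'
}

# Return 0 when usage is at or above the given threshold percentage.
disk_under_pressure() {
  local used
  used="$(disk_used_pct "$1")" || return 2
  [ "$used" -ge "$2" ]
}
```

For example, disk_under_pressure /path/to/cache 90 succeeds when the filesystem is at least 90 percent full.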
Fix:
- Increase disk size.
- Lower cache.max_bytes.
- Lower eviction high and low watermarks.
- Trigger manual eviction if needed:
curl -X POST -H "X-Cache-PSK: $CRAB_CACHE_PSK" \
https://crab-cache.example.com:8443/v1/admin/evict
Push Warming Does Not Increase
Checks:
- Confirm client config has push_warming = true.
- Confirm service_mode = "cache+dedup" or service_mode = "cache".
- Confirm the cache service health check succeeds from the client.
- Check server logs for 401, 403, or 507 during push.
- Confirm the push actually uploaded new large-file content.
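The two config checks above can be automated if the resolved client config is available as text. The sketch assumes one key = value pair per line, as the settings are written on this page. How you obtain the resolved config is not covered here, and the function name is illustrative.

```shell
# Read config text on stdin. Succeed only if push warming is enabled and
# service_mode is "cache" or "cache+dedup"; print what is missing otherwise.
check_push_warming_config() {
  awk '
    /^[ \t]*push_warming[ \t]*=[ \t]*true([ \t]|$)/ { warming = 1 }
    /^[ \t]*service_mode[ \t]*=[ \t]*"(cache[+]dedup|cache)"/ { mode = 1 }
    END {
      if (!warming) print "push_warming is not true"
      if (!mode) print "service_mode is not \"cache\" or \"cache+dedup\""
      exit !(warming && mode)
    }'
}
```

Pipe the config file into check_push_warming_config. No output and a zero exit status mean both settings are in place.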
Hit Rate Is Low
Common causes:
- Cache is new and still warming.
- Cache is smaller than the working set.
- Clients point to different cache instances.
- CI jobs read data that is rarely reused.
- Push warming is disabled.
Fix:
- Let the cache warm under normal traffic.
- Increase cache size.
- Route clients consistently.
- Enable push warming for team workflows.
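Before acting on a low hit rate, it helps to put a number on it. The sketch below computes a hit rate from two counters. It takes them as arguments because the field names in the stats output are not specified on this page; the function name is illustrative.

```shell
# Print the hit rate as a percentage with one decimal place,
# or "n/a" when there has been no cache traffic.
hit_rate() {
  awk -v hits="$1" -v misses="$2" 'BEGIN {
    total = hits + misses
    if (total == 0) { print "n/a"; exit }
    printf "%.1f\n", 100 * hits / total
  }'
}
```

For example, hit_rate 75 25 prints 75.0. To separate a warming cache from an undersized one, sample the counters twice and pass the differences, so the result reflects only the recent window.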
Slow Reads Despite Cache
Check:
- Are requests hitting cache or missing to origin?
- Is the cache disk saturated?
- Is the client far from the cache?
- Is TLS or proxy latency high?
- Is the working set being evicted too aggressively?
Use metrics first, then logs for a specific request window.
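For the distance and TLS questions, curl can break a single request into phases. The sketch below uses curl's standard write-out timing variables and reports which phase took longest. The function names and the CACHE_URL variable are illustrative, and it probes the health endpoint because object URLs are not described on this page.

```shell
# Base URL of the cache service; override via the environment (illustrative name).
CACHE_URL="${CACHE_URL:-https://crab-cache.example.com:8443}"

# Print cumulative timings in seconds: dns, tcp connect, tls, first byte, total.
probe_timing() {
  curl -sS -o /dev/null --max-time 30 \
    -w '%{time_namelookup} %{time_connect} %{time_appconnect} %{time_starttransfer} %{time_total}\n' \
    "$@"
}

# Given the five cumulative timings, print the phase with the largest share.
slowest_phase() {
  awk -v dns="$1" -v conn="$2" -v tls="$3" -v ttfb="$4" -v total="$5" 'BEGIN {
    d["dns"] = dns + 0
    d["tcp_connect"] = conn - dns
    d["tls_handshake"] = tls - conn
    d["server_wait"] = ttfb - tls
    d["transfer"] = total - ttfb
    best = "dns"
    for (k in d) if (d[k] > d[best]) best = k
    print best
  }'
}
```

Run slowest_phase $(probe_timing "$CACHE_URL/v1/health") from the client machine. A dominant tls_handshake or tcp_connect phase points at proxy or network distance, while server_wait points at the cache service or origin.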
When To Escalate
File an issue or escalate through your support path when you see:
- Repeated panics.
- Reproducible data mismatch errors.
- Non-deterministic responses for the same object.
- Cache failures that also prevent normal origin fallback.
Include:
- Server version.
- Sanitized config.
- Recent logs.
- /v1/admin/stats output.
- The exact Crab command that reproduced the issue.
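Sanitizing the config by hand is error-prone, so a small filter can help. The sketch below redacts the value of any key whose name contains psk, token, secret, or password. Those key patterns are assumptions, as is the one key = value pair per line layout, so review the output before attaching it.

```shell
# Read config text on stdin and replace the values of secret-looking keys.
sanitize_config() {
  awk '{
    eq = index($0, "=")
    key = (eq > 0) ? tolower(substr($0, 1, eq - 1)) : ""
    if (eq > 0 && key ~ /psk|token|secret|password/) {
      # Keep everything up to and including "=", drop the value.
      print substr($0, 1, eq) " \"<redacted>\""
    } else {
      print
    }
  }'
}
```

Pipe the server or client config through sanitize_config and attach the result instead of the original file.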