crab workflow
Manage workflow pipelines and synchronize local stage cache with the remote object store.
Synopsis
crab workflow push-cache --all [--json]Description
crab workflow provides subcommands for managing Crab's workflow pipeline
infrastructure. Currently it exposes push-cache, which backfills the remote
cache from locally-produced stage cache entries.
push-cache
Scans all local stage cache entries under .crab/cache/stages/ and pushes any
that lack a corresponding remote ref. This is the batch counterpart to
crab run --cache-push, which pushes incrementally after each stage commits.
Use push-cache when you have accumulated local cache entries (from runs
without --cache-push) and want to share them with collaborators or CI
runners that pull from the remote cache. It reuses the same credential chain
and store construction as crab push, so no separate credential configuration
is needed.
If no local cache entries exist (.crab/cache/ is absent), the command
reports zero entries and exits successfully.
Options
| Option | Short | Default | Description |
|---|---|---|---|
--all | false | Push all local stage cache entries missing from the remote. Required for batch operation. | |
--json | false | Emit structured JSON output instead of human-readable text. |
Prerequisites
- Workflow feature enabled with
crab config set workflow.enabled true. - A valid
crab://remote URL configured (viacrab initor git remote). - Valid cloud credentials for the configured object store (same as
crab push).
Examples
Push all local cache entries to the remote
crab workflow push-cache --allScans .crab/cache/stages/, identifies entries without a remote counterpart,
and uploads them. Output:
Pushed 12 entries, skipped 3 (already remote), 0 errorsPush cache entries with JSON output for scripting
crab workflow push-cache --all --jsonEmits a structured JSON envelope suitable for CI pipelines or automation scripts that need to parse the result programmatically.
Backfill remote cache after offline work
# After running stages locally without --cache-push:
crab run --name train --deps data/ --outs models/ -- python train.py
crab run --name eval --deps models/ --outs reports/ -- python eval.py
# Batch-push all accumulated cache entries:
crab workflow push-cache --allJSON Output
Supports --json for structured output.
crab workflow push-cache --json
{
"schema": "workflow.push_cache",
"version": "1",
"data": {
"pushed": 12,
"skipped": 3,
"errors": 0
}
}Schema Reference
| Field | Type | Description |
|---|---|---|
pushed | integer | Number of cache entries successfully uploaded to the remote |
skipped | integer | Number of entries already present on the remote (no upload needed) |
errors | integer | Number of entries that failed to upload |
See Structured Output for envelope details and error handling conventions.
Related Commands
crab run— execute stages with content-addressed caching (use--cache-pushfor incremental uploads).crab workflow journal— inspect run journals and stage history.crab workflow-lockfile— inspect workflow lockfiles.crab dag— visualize the stage dependency graph.crab metrics— track and query stage metrics.