Working with Files
Adding Files
Use crab add when a file should be stored by Crab instead of directly in
Git. It chunks matching files, deduplicates the content, stores new chunks in
.crab/staging/, and stages pointer blobs in Git.
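The chunk-and-deduplicate idea can be illustrated with a small sketch. This is not Crab's implementation: the fixed 4096-byte chunk size, the `store` directory, the `store_file` helper, and the use of SHA-256 are all assumptions chosen for illustration, and the script relies on GNU coreutils (`split`, `sha256sum`).

```shell
# Hypothetical stand-in for a chunk store; Crab's real layout and chunking may differ.
work=$(mktemp -d)
store="$work/store"
mkdir -p "$store"

# store_file FILE: split FILE into fixed-size chunks, save each chunk under its
# SHA-256 digest, and print how many chunks were new to the store.
store_file() {
  parts=$(mktemp -d)
  split -b 4096 "$1" "$parts/chunk."
  new=0
  for part in "$parts"/chunk.*; do
    digest=$(sha256sum "$part" | cut -d' ' -f1)
    if [ ! -e "$store/$digest" ]; then
      cp "$part" "$store/$digest"   # unseen content: keep it
      new=$((new + 1))
    fi                               # seen content: nothing to store
  done
  rm -rf "$parts"
  echo "$new"
}

# Two files that share their first 8192 bytes and differ only at the end.
head -c 8192 /dev/zero > "$work/base"
{ cat "$work/base"; printf 'version one'; } > "$work/v1.bin"
{ cat "$work/base"; printf 'version two'; } > "$work/v2.bin"

first=$(store_file "$work/v1.bin")
second=$(store_file "$work/v2.bin")
echo "new chunks: v1=$first v2=$second"
```

In this sketch the second file adds only one chunk to the store, because its leading chunks are already present. That is the general reason a chunked store tends to stay small when large files change only in part.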
crab add models/checkpoint.safetensors
crab add 'data/*.parquet'
crab add .
When to Use It
| Situation | Command |
|---|---|
| Add one large file | crab add models/model.bin |
| Add files by pattern | crab add '*.safetensors' |
| Add everything matched by Crab tracking rules | crab add . |
| Preview without staging | crab add . --dry-run |
| Tune local parallelism | crab add . --jobs 16 |
Run crab track <glob> first for reusable patterns, then commit the resulting
.gitattributes change so collaborators use the same rules.
After Adding
crab status
crab ls-files --size
git diff --cached
git commit -m "Add dataset"
crab push origin main
crab add stages Crab pointer blobs for Git. Keep using normal Git commands
for review and commits.
Common Pitfalls
- Shell globs can expand before Crab sees them. Quote patterns like
crab add '*.bin' when you want Crab to match them.
- crab add is local until you push. The staged chunks stay in .crab/staging/
until crab push uploads them.
- If a file is not matched, check .gitattributes and crab track --list.
- Re-run crab add after changing a large file before committing.
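The quoting pitfall can be seen without Crab at all. The sketch below uses a hypothetical `count_args` function as a stand-in for any command and made-up file names in a scratch directory; it only shows what the shell passes along, not how Crab matches a pattern.

```shell
# Scratch directory with illustrative file names.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch a.bin b.bin
mkdir sub && touch sub/c.bin

# count_args stands in for any command and reports the arguments it received.
count_args() { printf '%s args:' "$#"; printf ' %s' "$@"; printf '\n'; }

count_args *.bin     # unquoted: the shell expands the glob before the command runs
count_args '*.bin'   # quoted: the literal pattern reaches the command unchanged
```

The unquoted call receives the two file names from the current directory, while the quoted call receives the single string `*.bin`. Only in the quoted case does the command get to decide what the pattern matches.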
For complete syntax, flags, path matching details, and JSON/JSONL output, see the crab add reference.