Hydra Workflows
Crab can compose Hydra-style config groups before running experiments. This is useful when model, optimizer, dataset, or deployment choices live in separate config files instead of one flat params file.
Enable Hydra Composition
crab config set workflow.enabled true
crab config set hydra.enabled true
crab config set hydra.config_dir conf
crab config set hydra.config_name config.yaml

These settings live in .crab/config.toml, which is local by default. Commit workflow YAML and config files, but do not commit credentials or machine-local paths.
Recommended Layout
conf/
  config.yaml
  train/
    model/
      linear.yaml
      efficientnet.yaml
    optimizer/
      adam.yaml
      sgd.yaml
params.yaml
crab.yaml

Root config:
# conf/config.yaml
defaults:
  - train/model: linear
  - train/optimizer: adam
train:
  epochs: 20

Config group:
# conf/train/model/efficientnet.yaml
train:
  model:
    name: efficientnet
    variant: b0

Another group:
# conf/train/optimizer/adam.yaml
train:
  optimizer:
    name: adam
    lr: 0.001

Run with Config Groups
crab exp run \
  -S train/model=efficientnet \
  -S train.optimizer.lr=0.0005 \
  -n efficientnet-low-lr

Group overrides select config files. Dotted overrides update scalar values in the composed config. Crab then applies the composed params to the experiment worktree before executing the DAG.
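To make the distinction between the two override kinds concrete, here is a minimal Python sketch that classifies -S arguments. It is not Crab's parser: the Override type, the parse_override function, and the rule that a key containing "/" names a config group are assumptions made for illustration.

```python
from typing import NamedTuple


class Override(NamedTuple):
    kind: str   # "group" selects a config file, "value" sets a scalar
    key: str
    value: str


def parse_override(text: str) -> Override:
    """Split 'key=value' and classify it by the shape of the key.

    Assumption for illustration: a key containing '/' names a config
    group; any other key is a dotted path into the composed config.
    """
    key, sep, value = text.partition("=")
    if not sep or not key:
        raise ValueError(f"expected key=value, got {text!r}")
    kind = "group" if "/" in key else "value"
    return Override(kind, key, value)


overrides = [
    parse_override(arg)
    for arg in ("train/model=efficientnet", "train.optimizer.lr=0.0005")
]
for override in overrides:
    print(override.kind, override.key, override.value)
```

Values stay as strings in this sketch. A real implementation would also need to decide how to convert 0.0005 to a number.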
Override Precedence
Use this mental model:
- Load hydra.config_name from hydra.config_dir.
- Apply defaults listed in the root config.
- Apply config-group overrides such as train/model=efficientnet.
- Apply dotted scalar overrides such as train.optimizer.lr=0.0005.
- Run the workflow in the experiment worktree.
When the same value appears in multiple places, the later step wins.
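The steps above can be sketched in Python with plain dictionaries standing in for parsed YAML files. This is an illustration of "later step wins", not Crab's or Hydra's implementation. GROUPS, ROOT, DEFAULTS, compose, deep_merge, and set_dotted are hypothetical names, and the contents of the linear entry are invented because this page does not show linear.yaml.

```python
import copy


def deep_merge(base: dict, update: dict) -> dict:
    """Return a new dict in which values from `update` win over `base`."""
    merged = copy.deepcopy(base)
    for key, value in update.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def set_dotted(config: dict, dotted_key: str, value) -> None:
    """Set config['a']['b']['c'] = value for dotted_key 'a.b.c'."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for part in parents:
        node = node.setdefault(part, {})
    node[leaf] = value


# Hypothetical stand-ins for files under conf/, already parsed into dicts.
GROUPS = {
    ("train/model", "linear"): {"train": {"model": {"name": "linear"}}},
    ("train/model", "efficientnet"): {
        "train": {"model": {"name": "efficientnet", "variant": "b0"}}
    },
    ("train/optimizer", "adam"): {
        "train": {"optimizer": {"name": "adam", "lr": 0.001}}
    },
}
ROOT = {"train": {"epochs": 20}}
DEFAULTS = {"train/model": "linear", "train/optimizer": "adam"}


def compose(group_overrides: dict, value_overrides: dict) -> dict:
    # Group overrides replace the defaults' choice for the same group.
    selected = {**DEFAULTS, **group_overrides}
    config: dict = {}
    for group, option in selected.items():
        config = deep_merge(config, GROUPS[(group, option)])
    # Root values are merged after the groups here; that ordering is an
    # assumption of this sketch and does not matter for this example.
    config = deep_merge(config, ROOT)
    # Dotted scalar overrides are applied last, so they win.
    for key, value in value_overrides.items():
        set_dotted(config, key, value)
    return config


composed = compose(
    {"train/model": "efficientnet"},
    {"train.optimizer.lr": 0.0005},
)
print(composed)
```

The default lr of 0.001 from the adam entry is replaced by the dotted override, while epochs keeps its root value because nothing later touches it.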
Stage Declarations
Stages should depend on the values they actually use:
params:
  - params.yaml
stages:
  train:
    cmd: python src/train.py --params params.yaml
    deps:
      - src/train.py
      - data/features.parquet
    params:
      - train.model.name
      - train.model.variant
      - train.optimizer.lr
      - train.epochs
    outs:
      - models/model.pkl
    metrics:
      - metrics/train.json

This keeps stage hashes honest. If a config value changes the output, declare that value as a stage param.
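To see why declaring params matters, consider a simplified Python model of a stage hash that covers only the declared keys. How Crab actually hashes stages is not described here. The params_fingerprint and lookup functions and the SHA-256-over-JSON scheme are assumptions chosen for illustration.

```python
import hashlib
import json


def lookup(config: dict, dotted_key: str):
    """Resolve 'a.b.c' to config['a']['b']['c']."""
    node = config
    for part in dotted_key.split("."):
        node = node[part]
    return node


def params_fingerprint(config: dict, declared: list[str]) -> str:
    """Hash only the declared params (hypothetical scheme)."""
    selected = {key: lookup(config, key) for key in sorted(declared)}
    payload = json.dumps(selected, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


declared = ["train.model.name", "train.optimizer.lr", "train.epochs"]

base = {
    "train": {
        "model": {"name": "efficientnet", "variant": "b0"},
        "optimizer": {"name": "adam", "lr": 0.001},
        "epochs": 20,
    }
}

# Changing a declared value changes the fingerprint.
lower_lr = json.loads(json.dumps(base))
lower_lr["train"]["optimizer"]["lr"] = 0.0005

# Changing an undeclared value (variant is missing from `declared`) does not.
other_variant = json.loads(json.dumps(base))
other_variant["train"]["model"]["variant"] = "b3"

same_as_base = params_fingerprint(other_variant, declared) == params_fingerprint(base, declared)
differs_from_base = params_fingerprint(lower_lr, declared) != params_fingerprint(base, declared)
print(same_as_base, differs_from_base)
```

In this model, leaving train.model.variant out of the declared list means a b0 to b3 switch looks like no change, so a cached output could be reused incorrectly. That is the failure the stage declaration above avoids by listing the variant.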
Production Tips
- Keep config groups small and named by product choice, such as model family or optimizer, not by one-off experiment names.
- Prefer crab exp run -S group=value -S key=value over editing committed params for exploratory runs.
- Commit baseline config files before running experiments that teammates must reproduce.
- Use crab exp show <experiment-id> --json to record the composed settings in automation.
- Avoid secrets in Hydra configs. Use environment or cloud credential chains for credentials.
Troubleshooting
If an override does not take effect:
crab exp run -S train/model=efficientnet --dry --json

Check that hydra.enabled is true, the config directory exists, the root config name is correct, and the group path matches the directory structure.
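The filesystem checks above can be scripted. The following Python sketch is a standalone helper, not a Crab command. The check_group_override function is hypothetical, and it assumes a group override such as train/model=efficientnet maps to the file train/model/efficientnet.yaml under the config directory. It does not inspect hydra.enabled.

```python
import tempfile
from pathlib import Path


def check_group_override(config_dir: str, config_name: str, override: str) -> list[str]:
    """Return a list of human-readable problems; empty means all checks passed."""
    problems = []
    root = Path(config_dir)
    if not root.is_dir():
        problems.append(f"config directory not found: {root}")
        return problems
    if not (root / config_name).is_file():
        problems.append(f"root config not found: {root / config_name}")
    # Assumed mapping: group=option -> <config_dir>/<group>/<option>.yaml
    group, _, option = override.partition("=")
    candidate = root / group / f"{option}.yaml"
    if not candidate.is_file():
        problems.append(f"no config file for group override: {candidate}")
    return problems


# Demonstration against a throwaway directory that mirrors the layout.
with tempfile.TemporaryDirectory() as tmp:
    conf = Path(tmp) / "conf"
    (conf / "train" / "model").mkdir(parents=True)
    (conf / "config.yaml").write_text("defaults: []\n")
    (conf / "train" / "model" / "efficientnet.yaml").write_text("train: {}\n")

    ok = check_group_override(str(conf), "config.yaml", "train/model=efficientnet")
    bad = check_group_override(str(conf), "config.yaml", "train/model=resnet")

print(ok)
for problem in bad:
    print(problem)
```

In a real repository, pass the values stored in hydra.config_dir and hydra.config_name instead of the temporary directory.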
If an experiment works locally but not in another clone, confirm the config files are committed and the other clone has the same Hydra config settings.