My take on a solution for this is https://ossature.dev: .smd spec markdown files plus ossature audit / build, which together give you DAG orchestration, SHA-traced increments, and tiny focused contexts.
One thing I've been thinking about with tools like this: as you chain more Claude Code calls, the cost compounds fast. A single "cook" run with 5-6 steps could easily burn through $10-15 in API credits if you're not careful about which model handles which step.
Has anyone experimented with routing different recipe steps to different model tiers? E.g., using Haiku for boilerplate generation and Opus only for the steps that actually need deep reasoning? That's where the real savings are in multi-agent orchestration.
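To make the question concrete, here is a rough sketch of what per-step routing could look like and how you might estimate the savings before running anything. Everything in it is an assumption for illustration: the `Step` fields, the `needs_deep_reasoning` flag, the token counts, and especially the per-million-token prices, which are placeholders and not real pricing. It doesn't call any API; it only does the arithmetic.

```python
from dataclasses import dataclass

# HYPOTHETICAL prices in USD per million tokens. Placeholders only;
# substitute current numbers from your provider's pricing page.
PRICE_PER_MTOK = {
    "haiku": {"input": 1.0, "output": 5.0},
    "opus": {"input": 15.0, "output": 75.0},
}


@dataclass
class Step:
    """One recipe step with a rough token budget (all values are guesses)."""
    name: str
    needs_deep_reasoning: bool
    input_tokens: int
    output_tokens: int


def route_by_difficulty(step: Step) -> str:
    """Send only the steps flagged as hard to the expensive tier."""
    return "opus" if step.needs_deep_reasoning else "haiku"


def route_all_to_opus(step: Step) -> str:
    """Baseline: every step goes to the expensive tier."""
    return "opus"


def step_cost(step: Step, tier: str) -> float:
    """Estimated cost in USD of running one step on the given tier."""
    price = PRICE_PER_MTOK[tier]
    total = step.input_tokens * price["input"] + step.output_tokens * price["output"]
    return total / 1_000_000


def run_cost(steps, router) -> float:
    """Estimated cost in USD of a whole run under a routing policy."""
    return sum(step_cost(s, router(s)) for s in steps)


# Illustrative three-step recipe; the token counts are made up.
RECIPE = [
    Step("scaffold", needs_deep_reasoning=False, input_tokens=100_000, output_tokens=20_000),
    Step("design", needs_deep_reasoning=True, input_tokens=200_000, output_tokens=40_000),
    Step("tests", needs_deep_reasoning=False, input_tokens=100_000, output_tokens=20_000),
]

if __name__ == "__main__":
    baseline = run_cost(RECIPE, route_all_to_opus)
    routed = run_cost(RECIPE, route_by_difficulty)
    print(f"all on opus: ${baseline:.2f}, routed: ${routed:.2f}")
```

The router is just a function from step to tier, so you can swap in something smarter (per-step overrides in the recipe, a retry on the bigger model when the small one fails a check) without touching the cost math. The hard part in practice is probably deciding which steps really need the expensive model, which this sketch sidesteps with a hand-set flag.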