Models update. Prompts evolve. Small output shifts can silently break production logic.
If you're extracting structured data (invoices, tickets, reports) from LLMs, a tiny change in model output can cascade into incorrect downstream behavior.
Continuum records a multi-step LLM workflow once, then deterministically replays and verifies it later.
If anything changes — raw model output, parsed JSON, or derived memory — your CI fails.
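The record-then-verify idea can be sketched in a few lines. This is not Continuum's actual implementation — just a minimal illustration, assuming a workflow record with `raw`, `parsed`, and `memory` layers, where a digest of a canonical serialization is stored once and rechecked later:

```python
import hashlib
import json

def snapshot(record: dict) -> str:
    # Hash a canonical JSON serialization of one workflow record.
    # (record structure is hypothetical: raw output, parsed JSON, derived memory)
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

def verify(record: dict, stored_digest: str) -> bool:
    # Recompute the digest and compare against the one recorded earlier.
    return snapshot(record) == stored_digest

# Record once:
run = {"raw": "Total: $72", "parsed": {"total": 72}, "memory": {"last_total": 72}}
digest = snapshot(run)

# Later: a change to any layer flips the digest and fails verification.
tampered = dict(run, parsed={"total": 99})
assert verify(run, digest)
assert not verify(tampered, digest)
```

Because every layer feeds the same digest, a drift in the raw output, the parse, or the derived memory all fails the same check.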
Example:
1. Run `continuum invoice-demo`
2. It extracts structured fields from an invoice
3. Run `continuum verify-all --strict` → PASS
4. Modify a stored value (e.g., 72 → 99)
5. Run verify again → FAIL
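The PASS/FAIL behavior in step 3-5 comes down to a strict recursive diff over the stored fields. A rough sketch of what such a diff might look like (hypothetical helper, not the tool's code), showing the 72 → 99 tamper being caught with its exact path:

```python
def strict_diff(expected, actual, path="$"):
    # Return human-readable mismatches between two JSON-like values.
    if type(expected) is not type(actual):
        return [f"{path}: type {type(expected).__name__} != {type(actual).__name__}"]
    if isinstance(expected, dict):
        diffs = []
        for key in sorted(set(expected) | set(actual)):
            if key not in actual:
                diffs.append(f"{path}.{key}: missing")
            elif key not in expected:
                diffs.append(f"{path}.{key}: unexpected")
            else:
                diffs += strict_diff(expected[key], actual[key], f"{path}.{key}")
        return diffs
    if isinstance(expected, list):
        if len(expected) != len(actual):
            return [f"{path}: length {len(expected)} != {len(actual)}"]
        return [d for i, (e, a) in enumerate(zip(expected, actual))
                for d in strict_diff(e, a, f"{path}[{i}]")]
    return [] if expected == actual else [f"{path}: {expected!r} != {actual!r}"]

recorded = {"invoice": {"total": 72, "currency": "USD"}}
assert strict_diff(recorded, recorded) == []  # PASS
assert strict_diff(recorded, {"invoice": {"total": 99, "currency": "USD"}}) == \
    ["$.invoice.total: 72 != 99"]             # FAIL, with the exact path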
It’s a simple drift guard for LLM pipelines.
No hosted service. No external storage. Just deterministic replay + strict diffing.
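"No external storage" can mean as little as a plain JSON file on disk, re-read and exact-matched at verify time. A minimal local-only sketch under that assumption (the file name and record shape are made up for illustration):

```python
import json
import tempfile
from pathlib import Path

def record_run(path: Path, result: dict) -> None:
    # Recording pass: write the workflow result to a plain local JSON file.
    path.write_text(json.dumps(result, sort_keys=True, indent=2))

def verify_run(path: Path, result: dict) -> bool:
    # Verify pass: re-read the recorded file and require an exact match.
    return json.loads(path.read_text()) == result

store = Path(tempfile.mkdtemp()) / "invoice-demo.json"
run = {"fields": {"total": 72, "vendor": "Acme"}}
record_run(store, run)
assert verify_run(store, run)
assert not verify_run(store, {"fields": {"total": 99, "vendor": "Acme"}})
```

A flat file like this diffs cleanly in git, which is what makes the approach CI-friendly with no service to run.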
Repository: https://github.com/Mofa1245/Continuum
Feedback welcome.
- This isn’t trying to make LLMs deterministic.
- It records the full workflow output once, then replays and diffs it later.
- The goal is CI drift detection, not runtime enforcement.
Curious how others are currently guarding against silent output drift in production.