Modern AI agents operate with authority — executing tools, accessing credentials, and interacting with external systems. Many defenses focus on detecting malicious inputs. MVAR instead enforces deterministic security boundaries at execution sinks, where privileged actions occur.
Core design principle: separate influence from authority.
Untrusted data may influence reasoning; privileged execution is governed by policy invariants.
MVAR implements three enforcement layers:
1. Provenance-based information flow control All data carries integrity and confidentiality labels with conservative propagation. Policy decisions derive from data lineage rather than payload inspection.
2. Capability-based runtime constraints No ambient authority. Tools execute within explicitly declared permissions. Targets are enforced individually (e.g., api.gmail.com ≠ arbitrary domains).
3. Deterministic sink policy evaluation Privileged actions are evaluated against strict invariants:
UNTRUSTED + CRITICAL → BLOCK
Decisions are deterministic and produce evaluation traces.
When enabled, decisions may be cryptographically signed (QSEAL Ed25519) for tamper-evident auditability.
Validation
Evaluated against a reproducible 50 vector adversarial corpus spanning nine attack categories (command injection, encoding/obfuscation, multi-stage execution, credential theft, etc.).
Validation suite runs locally in ~2 minutes.
Scope, assumptions, and limitations are explicitly documented in THREAT_MODEL.md.
This release represents Phase 1, focused on deterministic enforcement rather than detection or behavioral scoring. Composition attacks and automatic sink discovery are future work.
Open source (Apache 2.0).
Repository: https://github.com/mvar-security/mvar Site: https://mvar.io
MVAR is an IFC-based reference monitor for AI agent runtimes. Rather than attempting to detect prompt injection at the model layer, it enforces deterministic policy at privileged execution sinks.
Core invariant:
UNTRUSTED + CRITICAL → BLOCK
All data carries integrity and confidentiality labels with conservative propagation. Policy decisions depend on provenance and sink classification, not payload inspection or intent scoring.
Enforcement is structural rather than content-aware. MVAR does not parse prompts or evaluate semantics; it evaluates data lineage flowing into privileged sinks.
The goal is impact reduction: preventing untrusted-derived outputs from triggering unsafe tool execution.
Phase 1 scope and known limitations are documented in THREAT_MODEL.md (manual sink registration, no composition attack modeling yet, etc.).
Reproduce locally: ./scripts/launch-gate.sh
Happy to answer technical questions and welcome adversarial feedback.