After two years of building AI agents in production, I've seen firsthand how frustrating it is to manage context at scale. Storing messages, iterating on system prompts, debugging behavior and multi-agent patterns—all while keeping track of everything without breaking anything. It was driving me insane.
So I built UltraContext. The mental model is git for context:
- Updates and deletes automatically create versions (history is never lost)
- Replay state at any point
The API is 5 methods:
uc.create() // new context (can fork from existing)
uc.append() // add message
uc.get() // retrieve by version, timestamp, or index
uc.update() // edit message → creates version
uc.delete() // remove message → creates version
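To make the model concrete, here's a toy in-memory stand-in for the client (purely illustrative, not the real implementation; the hosted API is async and persisted) showing how the five methods interact with version history:

```javascript
// Toy in-memory stand-in for the UltraContext client (hypothetical).
// Appends just grow the message list; updates and deletes snapshot
// the prior state first, so history is never lost.
class InMemoryUC {
  create({ from, at } = {}) {
    // Optionally fork from an existing context at a given index.
    const messages = from ? from.messages.slice(0, at) : [];
    return { messages, history: [] };
  }
  append(ctx, msg) {
    ctx.messages.push(msg); // no version created
  }
  get(ctx, { version } = {}) {
    // No version given: current state. Otherwise: a past snapshot.
    return version === undefined ? ctx.messages : ctx.history[version];
  }
  update(ctx, index, msg) {
    ctx.history.push([...ctx.messages]); // snapshot before editing
    ctx.messages[index] = msg;
  }
  delete(ctx, index) {
    ctx.history.push([...ctx.messages]); // snapshot before deleting
    ctx.messages.splice(index, 1);
  }
}
```

Forking (`create({ from, at })`) copies messages up to an index, which is the branch-on-change idea in miniature.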
Messages are schema-free. Store conversation history, tool calls, system prompts—whatever shape you need. Pass it straight to your LLM using any framework you'd like.

What it's for:
- Persisting conversation state across sessions
- Debugging agent behavior (rewind to decision point)
- Forking contexts to test different flows
- Audit trails without building audit infrastructure
- Multi-agent and sub-agent patterns
What it's NOT:
- Not a memory/RAG system (no semantic search)
- Not a vector database
- Not an orchestration or LLM framework
UltraContext handles versioning, branching, and history. You get time travel with one line.
Docs: https://ultracontext.ai/docs
Early access: https://ultracontext.ai
Would love feedback! Especially from anyone who's rolled their own context engineering and can tell me what I'm missing.
I’ve been working on AI memory backends and context management myself and the core insight here — that context needs to be versionable and inspectable, not just a growing blob — is spot on.
Tried UltraContext in my project TruthKeeper and it clicked immediately. Being able to trace back why an agent “remembered” something wrong is a game changer for production debugging.
One thing I’d love to see: any thoughts on compression strategies for long-running agents? I’ve been experimenting with semantic compression to keep context windows manageable without losing critical information. Great work, will be following this closely.
The goal is to focus on the core building blocks that enable more sophisticated use cases. Compression, compaction, and offloading strategies should be straightforward to build on top of UC rather than baked in at the core layer.
Quick backstory: on every agent project I worked on, I spent more time on context infrastructure than on the actual product. Same pattern every time—duct-tape a store, lose history, debug blind when things broke.
The "aha" was needing git semantics for a project where users wanted to edit messages while still being able to travel back. So that's what I built: immutable history, branch on change, rewind to any commit. But I didn't want to expose that complexity. So the API is just contexts and messages. Versioning happens automatically.
Still early. What context engineering problems are you hitting with your agents?
[see https://news.ycombinator.com/item?id=45988611 for explanation]
Also, let's say an agent runs 1000s of times: would each of those runs create a new version in the history?
I'm particularly interested in how parsing through agent context would work!
On scaling: appends don't create versions; only updates and deletes do. So for your 10k-message conversations, uc.get() is a plain O(n) read, i.e. standard database scaling. The versioning overhead only kicks in when you actually mutate context, and even then we handle the optimization so you don't have to think about it.
On version history: each agent run doesn't create a version. Versions are created when you update or delete a message. So if your agent appends 1000 messages across 1000 runs, that's just 1000 appends. No version explosion.
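A toy tally (hypothetical numbers) to make the arithmetic concrete:

```javascript
// Hypothetical tally: versions scale with mutations, not with runs.
const ops = [
  ...Array(1000).fill('append'), // 1000 runs, one message appended each
  'update', 'update', 'delete',  // three later corrections
];
const versions = ops.filter((op) => op !== 'append').length;
console.log(versions); // 3 versions across 1003 operations
```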
Time travel (rewinding to a specific point) is also O(n). This was my personal main bottleneck when deploying B2C agents, so the API is heavily optimized for it.
For your 5 accounts x 5 conversations setup: you'd have 25 separate contexts. Each scales independently. Parse through them however you want, filter by metadata, retrieve by timestamp or index.
Vector DBs are great for retrieval: "find relevant chunks to add to the prompt." But they're not designed for state management. When you update a message, there's no version history. When something breaks, you can't rewind to see exactly what the agent saw.
UltraContext manages the structured context (conversation history, tool calls, system prompts) with git semantics. Fork, rewind, merge. You can't git revert a vector embedding.
They're complementary. You'd use a vector DB to decide what goes in the context, and UltraContext to manage what's already there.
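A toy sketch of that split, with a keyword scorer standing in for the vector DB and a plain array standing in for the context store (all names here are hypothetical):

```javascript
// Retrieval decides what enters the context; the context store manages
// what's already there. The scorer below is a stand-in for a vector DB.
const knowledgeBase = [
  'Refund policy: refunds are accepted within 30 days.',
  'Shipping: orders ship within 2 business days.',
];

function retrieve(query) {
  // Toy similarity: count query words that appear in each document.
  const words = new Set(query.toLowerCase().split(/\W+/).filter(Boolean));
  const scored = knowledgeBase.map((doc) => ({
    doc,
    score: doc.toLowerCase().split(/\W+/).filter((w) => words.has(w)).length,
  }));
  return scored.sort((a, b) => b.score - a.score)[0].doc;
}

// The retriever picks the chunk; the context (a plain array here)
// tracks the state from then on.
const context = [];
const query = 'What is your refund policy?';
context.push({ role: 'system', content: retrieve(query) });
context.push({ role: 'user', content: query });
```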
For single-agent flows, parallel works fine.
With UltraContext, you'd do it like this:
// Shared context all agents read/write to
await uc.append(sharedCtx, { agent: 'planner', content: '...' })
await uc.append(sharedCtx, { agent: 'researcher', content: '...' })
// Or fork into separate branches per agent
const branch = await uc.create({ from: sharedCtx, at: 5 })
Schema-free, so you can tag messages by agent, track provenance in metadata, and branch/merge however you want. Full history on everything.
For sensitive use cases, I'm exploring a few options: client-side encryption, data anonymization, or a local-first deployment where context never leaves your infra. Not there yet, but it's on the roadmap.
What's your use case? Happy to chat about what privacy guarantees you'd need.