Let's say I submit a piece of code, or let it create one, and we're working on improving it. At some point I consider the code to be significantly better than what I had initially, so all those early interactions containing old code could be removed from the context.
I like how Google AI Studio lets you delete sections so they are no longer part of the context. That isn't possible in Claude, ChatGPT or Gemini; I think in those you can only delete the last response.
Maybe the AI could even suggest which parts to disable.
I have the same peeve. My assumption is that freely editing context is seen as unintuitive for most users - LLM products want to keep the illusion of a classic chat UI, where that kind of editing doesn't make sense. I do wish ChatGPT & co. had a pro or advanced mode that was more similar to Google AI Studio.
/compact we will now work on x, discard y, keep z
Each new prompt involves asking Claude to read CURRENT.md for additional context.
I'm not sure if I should move this to CLAUDE.md, but the stuff in CURRENT.md is very short-term information that becomes useless after a while.
---
There was one time when Claude entirely messed up the directory while moving things around and got stuck in a weird "panic" loop in chat for quite a while (lots of "oh no" / "oh dear"). Nothing git can't fix, but I suspect it was caused by the directory info in CLAUDE.md getting stale. Ever since then I've moved things that might get stale into a separate file and keep it updated/trimmed as needed.
You can of course simulate this by doing the reverse and maintaining explicit memory, via markdown files or whatever, of what you want to keep in context. I could see wanting both, since a lot of the time it would be easier to just say "forget that last exploration we did" while still having it remember everything from before that. Think of it like an exploratory twig on a branch that you don't want to keep.
Ultimately I just adapt by making my tasks smaller, using git branches and committing often, writing plans to markdown, etc.
I built a terminal TUI to manage my contexts/prompts: https://github.com/pluqqy/pluqqy-terminal
It's faster and cheaper (in most cases) to leave the history as-is and hit the cache.
There are third-party chat interfaces out there with much better context controls, if it matters enough to you that you're willing to resort to direct API usage.
But never fear, Gemini 3.0 is rumored to be coming out Tuesday.
Gemini outputs what I want with about the same regularity as the other bots.
I'm so tired of the religious thinking around these models. show me a measurement.
> show me a measurement
Your comment encapsulates why we have religious thinking around models.
Plus, the model was trained and RLed on continuous context, unless they now also tune it on contexts that have been edited this way.
https://manus.im/blog/Context-Engineering-for-AI-Agents-Less...
It's kind of qualitatively different from the human perspective, so not a useless concept, but I think that is mainly because we can't help anthropomorphizing these things.
Context editing: Replace tool call results in the message history (i.e., replace a file output with an indicator that it's no longer available).
Memory: Give the LLM access to read and write .md files, like a virtual file system.
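A rough sketch of both ideas in plain Python, assuming an Anthropic-style message list of dicts; the helper names and the `memories/` directory are made up for illustration:

```python
from pathlib import Path

MEMORY_DIR = Path("memories")  # hypothetical home for the .md "virtual file system"

def clear_old_tool_results(messages, keep_last=3):
    """Replace all but the most recent tool results with a short placeholder.

    Assumes Anthropic-style messages: {"role": ..., "content": [blocks]}.
    """
    locations = [
        (i, j)
        for i, msg in enumerate(messages)
        for j, block in enumerate(msg.get("content", []))
        if isinstance(block, dict) and block.get("type") == "tool_result"
    ]
    for i, j in locations[:-keep_last]:
        old = messages[i]["content"][j]
        messages[i]["content"][j] = {
            "type": "tool_result",
            "tool_use_id": old.get("tool_use_id"),
            "content": "[tool output cleared; no longer available]",
        }
    return messages

def memory_write(name: str, text: str) -> None:
    """Persist a note the model asked to remember."""
    MEMORY_DIR.mkdir(exist_ok=True)
    (MEMORY_DIR / f"{name}.md").write_text(text)

def memory_read(name: str) -> str:
    """Read a note back, or report that it doesn't exist."""
    path = MEMORY_DIR / f"{name}.md"
    return path.read_text() if path.exists() else "[no such memory]"
```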
I feel like these formalizations of tools are on the path towards managing message history on the server, which means better vendor lock-in but not necessarily a big boon to the user of the API (well, bandwidth and latency will improve). I see the ChatGPT Responses API going down a similar path, and together these changes will make it harder to swap transparently between providers, something I enjoy being able to do.
I feel that managing context should be doable with a non-SOTA model, even locally. You just need a way to manually select/deselect messages from the context, say in the Claude CLI.
Funny, I was just talking about my personal use of these techniques recently (tool output summarization/abliteration with a memory backend). This isn't something that needs to be Claude Code-specific, though; you can 100% implement this with tool wrappers.
I've been doing this for a bit; dropping summarized old tool output from context is a big win, but it's still level ~0 context engineering. It'll be interesting to see which of my tricks they figure out next.
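For what it's worth, here's a minimal version of that wrapper idea; `wrap_tool`, `read_tool_output`, and the on-disk store are hypothetical names, and plain truncation stands in for a real summarization pass:

```python
import hashlib
from pathlib import Path

OUTPUT_STORE = Path("tool_outputs")  # hypothetical memory backend: full outputs on disk

def wrap_tool(tool_fn, max_chars=2000):
    """Wrap a tool so the model only ever sees a bounded amount of its output."""
    def wrapped(*args, **kwargs):
        full = str(tool_fn(*args, **kwargs))
        if len(full) <= max_chars:
            return full
        OUTPUT_STORE.mkdir(exist_ok=True)
        key = hashlib.sha1(full.encode()).hexdigest()[:12]
        (OUTPUT_STORE / f"{key}.txt").write_text(full)
        return (
            full[:max_chars]
            + f"\n...[truncated; full output stored as {key}; "
            + f"call read_tool_output('{key}') to retrieve it]"
        )
    return wrapped

def read_tool_output(key: str) -> str:
    """Companion tool the agent can call to pull a stored output back into context."""
    path = OUTPUT_STORE / f"{key}.txt"
    return path.read_text() if path.exists() else "[unknown output key]"
```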
Outside of context manipulation, it'd also be nice to standardize a format to label sections of the context and later append keywords to the context to ignore / unignore those sections (or refer back to them, emphasize them, whatever). With that, I imagine we'd be able to have a pretty good set of standard LoRA adapters that enable all LLMs to be controlled in the same fashion. That way agents will be able to manipulate LLMs in a standard way, without having to rewrite context itself.
They fine-tuned 4.5 to understand `clear_tool_uses` markers without regressing the quality of future responses. You will, however, still pay the cache-invalidation hit, so it would take some evaluation to see how much this actually helps.
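For reference, the request-level version of this looks roughly like the sketch below. This is untested; the beta flag and edit-type strings are from memory, so treat them as assumptions and check Anthropic's current docs:

```python
import anthropic

client = anthropic.Anthropic()

# Ask the API to clear older tool results automatically as the context grows.
# "context-management-2025-06-27" and "clear_tool_uses_20250919" are assumptions.
response = client.beta.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    betas=["context-management-2025-06-27"],
    context_management={
        "edits": [{"type": "clear_tool_uses_20250919"}],
    },
    messages=[{"role": "user", "content": "Keep refactoring utils.py"}],
)
print(response.content)
```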
I'll wait for the day they release a JS library to do all this LLM context juggling programmatically, instead of through a UI, and then I'll adopt it by doing what I do now: writing code.
I will write code that orchestrates LLMs to write code.
Edit: This is obviously a joke... but is it really a joke?
We're trying to solve a similar problem for getting long documents into context. We built an MCP for Claude that lets you work with long PDFs that go beyond the context window limits: https://pageindex.ai/mcp.
For context: I have a background in CV and ML in general, and I'm currently reviewing and revising RL.
Any idea how I can get into RL?
I have 3 years of industry/research experience.
Whenever I see a post like this, it triggers massive FOMO and a sense of urgency that I should be working on these problems.
Not being able to work on this is making me anxious.
What does it take for someone in a non-US/non-EU region to get into big labs like these?
Do I really have to pursue a PhD? I'm old enough now that pursuing a PhD is a huge burden I can't afford.
The leading AI companies have the kind of capital to hire the very best in the industry. If you're only just starting now, and worse yet need your hand held to do so, you're totally out of the running…
You're basically like a junior dev applying to be a lead dev on a team at Google.
I really need to get my hands dirty here. I remember taking an RL course on Coursera during COVID in 2020, but I didn't get the chance to apply it to the problems I worked on post-COVID.
But I really want to start doing RL again. I'm interested in world models and simulation for RL.
Steam and others have figured it out, but Anthropic/Discord (who just had a breach like yesterday) still don't let you remove your payment info.
I haven't figured out* how to switch to a direct subscription other than cancelling and resubscribing, and I'm afraid of messing up my account access.
* Caveat that I haven't spent more than 20 minutes trying to solve this.
1. Multi-agent orchestration
2. Summarising and chunking large tool and agent responses
3. Passing large context objects by reference between agents and tools (see the sketch below)
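A tiny sketch of the third point, with a made-up in-memory store standing in for whatever backend (Redis, S3, a database) you would actually use:

```python
import uuid

# Hypothetical in-memory store; a real system would use Redis, S3, a DB, etc.
CONTEXT_STORE: dict[str, object] = {}

def put_context(obj: object) -> str:
    """Store a large payload and return a short handle agents can pass around."""
    ref = f"ctx://{uuid.uuid4().hex[:8]}"
    CONTEXT_STORE[ref] = obj
    return ref

def get_context(ref: str) -> object:
    """Resolve a handle back to the full payload only when a tool actually needs it."""
    return CONTEXT_STORE.get(ref, "[unknown context reference]")

# Usage: the next agent's prompt carries the handle, not the 200 KB blob itself.
report_ref = put_context("...very large scraped report...")
prompt = f"Summarise the report stored at {report_ref} into five bullet points."
```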
Two things to note that might be interesting to the community:
Firstly, when managing context, I recommend adding some evals to your context management flow, so you can measure effectiveness as you add improvements and changes.
For example, our evals measure the impact of using Anthropic's memory over time, allowing our team to make better-informed decisions about which tools to use with our agents.
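As a rough shape for that kind of eval (stub functions standing in for the actual agent call and grader, both of which are assumptions about your setup):

```python
def run_agent(task: str, use_memory: bool) -> str:
    """Stub: invoke your agent here, with the memory tool toggled on or off."""
    return ""  # replace with the agent's final answer

def grade(task: str, answer: str) -> float:
    """Stub grader: exact match, a rubric, or an LLM-as-judge call."""
    return 0.0  # replace with a real score in [0, 1]

def eval_memory_impact(tasks: list[str]) -> dict[str, float]:
    """Run the same task set with and without memory and compare mean scores."""
    results = {}
    for use_memory in (False, True):
        scores = [grade(t, run_agent(t, use_memory)) for t in tasks]
        key = "with_memory" if use_memory else "without_memory"
        results[key] = sum(scores) / max(len(scores), 1)
    return results
```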
Secondly, there's a tradeoff not mentioned in this article: speed vs. accuracy. Faster summarisation (or 'compaction') comes at a cost in accuracy; if you want good compaction, it can be slow. Depending on the use case, you should adjust your compaction strategy accordingly. For example (forgive the major generalisation), for consumer-facing products speed is usually preferred over a bump in accuracy, whereas in business settings accuracy is generally preferred over speed.