We've been working on a similar problem, but went the route of pushing this into the model and runtime layer instead of the orchestration layer. Less scaffolding, more baked into how the model reasons. Happy to share more if useful
Hiding details of reasoning and work traces to delay compaction as much as possible makes sense because in my experience compaction essentially borks our sessions
Good luck keeping your skills and prompts intact with a system wide compaction operation!