For the last year we've been building an AI mobile app builder. Last week we decided to generalize our agent harness and benchmark it against vanilla Claude Code. It turns out the tools we built for app building are significantly more efficient than Claude Code's native tools. In our benchmarks we beat Claude Code on speed (~40% faster), cost (25-55% cheaper), and performance (+17% on Terminal Bench 2.0). Today we're making our agent harness publicly available as a Claude Code plugin.