I have been using Claude Code for the past 6 months. In that time, multiple revisions of each model have come out. I have seen some improvement with recent iterations, especially with regard to sycophancy.
However, I can't tell the outputs of the two models apart. To me, Sonnet seems just as capable as Opus.
Have any of y'all run real-life tests? Mine seem to be too random to say either way.
My main example of "the models are different": I have a legacy codebase (dating back to 1999) that has a rare crashing bug. Multiple humans have been trying to debug this thing for over 10 years. I personally put in maybe 100 hours late last year trying to solve this one crashing bug. I've thrown this problem at every AI model that came out, too. The Sonnets didn't find anything. Opus 4.5 was the first to create a "workaround" that would shut down the program just before the crash and at least let a customer save their work. But Opus 4.6 actually solved the entire bug on its first try. That's the moment when I really wished AI had existed earlier, thinking of the 100s of hours of my life wasted trying to debug this thing - time I would rather have spent with loved ones.
As for Sonnet, just yesterday I used Sonnet 4.6 to write a USB driver for myself. I only chose Sonnet because I was forced to use the API yesterday, and I didn't want to pay Opus 4.7 premium API costs for this. The poor thing was hammering away for multiple hours, producing copious multi-turn thinking blocks with no tool actions. At one point, Sonnet even got stuck in a thinking loop, and I had to coax it to relax and just give its best effort at some code so we could at least try debugging... which actually worked. I'm impressed that Sonnet got a minimal but working USB audio driver on an obscure OS for just $30 of API costs.
That said - I then gave Sonnet's code to Opus 4.7 today when I had access to my Claude Max again. 4.7 immediately found lots of pitfalls in the code on the first turn and presented a much more coherent plan for continued development & debugging. Sonnet's code worked as long as you didn't touch any audio settings; if you did, it exploded with spectacular kernel panics.
Sonnet being faster would not, on its own, be worth the failure rate for me.
At home I just don't want to pay more than 20 bucks for incidental projects.
And Opus max would just consume my tokens in one round.
This may be worth the discount. Or not, if your time and attention are worth (quite) a lot.