Ask HN: Can you tell the difference between Claude Sonnet and Opus?
4 points | 9 hours ago | 6 comments
Hello

I have been using Claude Code for the past 6 months. In that time, multiple revisions of each model have come out, and I have seen some improvement with recent iterations, especially with regard to sycophancy.

However, I can't tell their outputs apart. To me, Sonnet seems just as capable as Opus.

Have any of y'all run real-life tests? Mine seem too random to say either way.

SyneRyder
2 hours ago
[-]
Big differences. But sometimes those differences aren't necessary; you don't need someone with a PhD to cook your hamburgers. You might not need Opus for what you're doing.

My main example of "the models are different": I have a legacy codebase (dating back to 1999) with a rare crashing bug. Multiple humans have been trying to debug this thing for over 10 years. I personally put in maybe 100 hours late last year trying to solve this one crashing bug. I've thrown the problem at every AI model that came out, too; the Sonnets didn't find anything. Opus 4.5 was the first to create a workaround, which would shut down the program just before the crash and at least let a customer save their work. But Opus 4.6 actually solved the entire bug on its first try. That's the moment I really wished AI had existed earlier, thinking of the hundreds of hours of my life wasted trying to debug this thing - time I would rather have spent with loved ones.

As for Sonnet, just yesterday I used Sonnet 4.6 to write a USB driver for myself. I only chose Sonnet because I was forced to use the API yesterday, and I didn't want to pay Opus 4.7's premium API costs for this. The poor thing was hammering away for multiple hours, producing copious multi-turn thinking blocks with no tool actions. At one point Sonnet even got stuck in a thinking loop, and I had to coax it to relax and just give its best effort at some code so we could at least try debugging... which actually worked. I'm impressed that Sonnet got a minimal but working USB audio driver on an obscure OS for just $30 of API costs.

That said - I then gave Sonnet's code to Opus 4.7 today when I had access to my Claude Max again. 4.7 immediately found lots of pitfalls in the code on the first turn and presented a much more coherent plan for continued development and debugging. Sonnet's code worked as long as you didn't touch any audio settings, because then it exploded in spectacular kernel panics.

reply
thom-gtdp
3 hours ago
[-]
Yes. I'm using them through GitHub Copilot. Since not all models are priced the same, I use cheap ones by default, then upgrade if needed. It has happened a few times that ChatGPT-5-mini could not solve something decently. Claude Sonnet is good enough most of the time; if not, I switch to Opus, and so far it has solved every problem Sonnet couldn't.
reply
eddyzh
7 hours ago
[-]
At work I use Opus max fast. It hardly ever fails for no reason, even if I forget to give it all the right context. At home I run Sonnet, and it does not get what I meant or expected 20-35% of the time. Given the enormous difference in cost, and depending on the value of your time (hourly rate), that might be a net benefit.

Sonnet being faster alone would not be worth the failure rate for me.

At home I just don't want to pay more than 20 bucks for incidental projects.

And Opus max would just consume my tokens in one round.

reply
sminchev
7 hours ago
[-]
Yes. When things get too complex, Sonnet misses some things. For example, it creates all the components but does not link them. Or it does not go deep enough into the code and misses certain usages and possible regressions. In other words, it does not proactively search for things I have forgotten to tell it about.
reply
eddyzh
7 hours ago
[-]
Exactly this.

This may be worth the discount - or not, if your time and attention are worth (quite) a lot.

reply
nawi
9 hours ago
[-]
You are not missing anything. For 95% of dev work, Sonnet (especially 3.5 and 3.7) has basically beaten Opus on value per price. In my experience the difference boils down to this:

1. Sonnet is the faster model. It's concise, follows instructions literally, and is significantly better at agentic tasks.

2. Opus is the philosopher. It's better at high-level architecture, creative writing, or spotting subtle nuances in a 50-page document.

The reason your tests feel random is that for standard coding, Sonnet is actually the superior model now: it is faster, less prone to over-engineering, and has much lower latency. If you have a massive, messy refactor where you need the model to reason through 10 files without adding bugs, Opus might still have a slight edge in coherence. For everything else, Sonnet is the meta. Stick with it and save the credits.
reply
aykutseker
5 hours ago
[-]
in short tasks they look identical and most people can't tell. opus shows its edge in long agent loops and 50k+ context, when sonnet starts dropping tool calls or rerunning steps. sonnet's fine for short stuff and the price is better. on longer agentic flows opus actually earns the cost in my experience.
reply