We swapped OpenAI out for Claude and it required updating about 15 lines of code. All these guys are just commodity to us. If next week there’s a better supplier of commodity AI we’ll spend an hour and swap to something else again. There’s zero loyalty here.
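A minimal sketch of why the swap is so cheap, assuming all model calls already go through one thin adapter. The provider objects below are illustrative stand-ins, not real SDKs; in real code each would wrap a vendor's client library.

```python
# Toy sketch: if every model call goes through one adapter, "swapping
# providers" is a one-line config change. Stub backends stand in for
# real vendor SDKs.

from dataclasses import dataclass
from typing import Callable

@dataclass
class Provider:
    name: str
    complete: Callable[[str], str]  # prompt -> completion

# Stand-in backends; real code would wrap each vendor's SDK here.
openai_like = Provider("openai", lambda p: f"[openai] {p}")
claude_like = Provider("claude", lambda p: f"[claude] {p}")

# The "15 lines of code" mostly reduces to this one line:
ACTIVE = claude_like

def ask(prompt: str) -> str:
    return ACTIVE.complete(prompt)

print(ask("hello"))
```

The point is the shape, not the code: once the call sites only know about `ask`, loyalty to any one vendor costs nothing.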
But right now we have 3-5 top contenders that are so evenly matched that the de-facto sticking point is mostly the harness, i.e. the collection of proven plugins/commands/tools/agent features that are tuned to the user's personal workflow.
It's because the real value of the models is in what we (humanity) fed them, and all of them have eaten the same thing for free.
I don’t see the core models getting dramatically better from where they are now. We’ve clearly hit a plateau.
When I use the planning mode and then code the success rate is much higher. When I ask it to work on specific isolated chunks of code with clear success/failure modes the success rate is again much higher.
Now imagine a world where it recognizes that from my simple, throwaway, non-specific prompt. If it were able to fire off 20 different prompts in quick succession, it could easily cut my time spent in front of the screen by a third.
The patterns are obvious but they don't do that right now because it's a lot of compute.
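The fan-out itself is the easy part; it's the compute bill that isn't. A toy sketch, where `model_call` and `score` are hypothetical stubs standing in for a real API call and a real evaluator:

```python
# Rough sketch of the fan-out idea: fire many prompt variants
# concurrently, then keep the best-scoring result. model_call and
# score are stubs, not real APIs.

import asyncio

async def model_call(prompt: str) -> str:
    await asyncio.sleep(0)   # stands in for network latency
    return prompt.upper()    # stands in for a real completion

def score(result: str) -> int:
    return len(result)       # stands in for a real evaluator

async def fan_out(base: str, n: int = 20) -> str:
    variants = [f"{base} (variant {i})" for i in range(n)]
    results = await asyncio.gather(*(model_call(v) for v in variants))
    return max(results, key=score)

best = asyncio.run(fan_out("fix the failing test", n=5))
print(best)
```

With a real model behind `model_call`, this is exactly the "20 prompts in quick succession" pattern: cheap to write, expensive to run.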
We'll be looking at this time where there's a progress bar showing context space the way we look at the Turbo button.
Because the truth is, reaching the baseline I'm talking about takes a finite amount of compute at a certain point.
But no, businesses are dumb. They always have been. Existing businesses get disrupted by new ideas and new technology all the time. This very site is a temple to disruption!
Proprietary advantage is, 99.999% of the time, just structural advantage. You can't compete with Procter & Gamble because they already built their brands and factories and supply chains and you'd have to do all that from scratch while selling cheaper products as upstart value options. And there's not enough money in consumer junk to make that worth it.
But if you did have funding and wanted to beat them on first principles? Would you really start by training an LLM on what they're already doing? No, you'd throw money at a bunch of hackers from YC. Duh.
They are neck-and-neck only because they are participating in the arms race. The only other way to keep up is mass distillation, which could prove fragile (though so far it has seemed sustainable).
I think one needs to at least recognize the possibility that... there just isn't any more data for training. We've done it all. The models we have today have already distilled all of the output of human cleverness throughout history. If there's more data to be had, we need to make it the hard way.
That's not true. Moreover, progress can slow to a crawl where it's barely noticeable. And in that world humans continue to stay ahead. That's the magic of humans: being aware of our surroundings and adapting sufficiently, while taking advantage of tools and leveraging them.
But if you say stuff like this on here you get down voted. Why?
It all remains free, but you need to email me for a username and password.
If I put in time and effort to make content and OpenAI et al copy it and sell it through their LLM such that no one comes to me any more, then plainly it makes no sense for me to create that content; and then it would not exist for OpenAI to take, or for anyone else. We all lose.
It seems parasitic.
Virtually no scraper has logic to handle that sort of situation, but it's trivial for humans. Way easier than an LLM
https://arstechnica.com/information-technology/2026/02/most-...
CloudBolt’s survey also examined how respondents are migrating workloads off of VMware. Currently, 36 percent of participants said they migrated 1–24 percent of their environment off of VMware. Another 32 percent said that they have migrated 25–49 percent; 10 percent said that they’ve migrated 50–74 percent of workloads; and 2 percent have migrated 75 percent or more of workloads. Five percent of respondents said that they have not migrated from VMware at all.
Among migrated workloads, 72 percent moved to public cloud infrastructure as a service, followed by Microsoft’s Hyper-V/Azure stack (43 percent of respondents).
Overall, 86 percent of respondents “are actively reducing their VMware footprint,” CloudBolt’s report said.
I feel like the company that can figure out how to 100% safely live-migrate any VMware workload to another "cheaper" solution will do quite well.
In my case, I always use Opus 4.6 in my work, but quite often I get a 504 error, and that's quite annoying. I get errors like that with Gemini too. I can't estimate if I'd get a similar number of errors with ChatGPT, since I use it very infrequently.
But imagine that at some point one of the big 3 (OpenAI, Anthropic, Google) gets very high availability, while the others have very poor availability. Then people would switch to them, even if their models were a bit worse.
Now, OpenAI has been building like crazy, and contracting for future builds like crazy too. Google has very deep pockets, so they'll probably have enough compute to stay in the game. But I fear that Anthropic will not be able to match OpenAI and Google in terms of datacenter build, so it's only a matter of time (and not a lot of time) until they'll be in a pretty tight spot.
Just want to note something there:
Okay, take the premise that AI really is 'intelligent' up to the point of business decisions.
So, this all then implies that 'intelligence' is then a commodity too?
Like, I'm trying to get at the idea that yours, mine, all of our 'intelligence' is now no longer a trait that I hold, but a thing to be used, at least as far as the economy is concerned.
We did this with muscles and memory previously. We invented writing, and so those with really good memories became just like everyone else. Then we did it with muscles and the industrial revolution, and so really strong or high-endurance people became just like everyone else. Yes, there are many exceptions here, but they mostly prove the rule, I think.
Now it seems that with really smart people, we've made AI, and so they're going to be like everyone else?
Who cares, as long as the end results are close (or close enough for the uses they are put to)?
Besides, "has nothing to do with how the human brain works" is an overstatement.
"The term “predictive brain” depicts one of the most relevant concepts in cognitive neuroscience which emphasizes the importance of “looking into the future”, namely prediction, preparation, anticipation, prospection or expectations in various cognitive domains. Analogously, it has been suggested that predictive processing represents one of the fundamental principles of neural computations and that errors of prediction may be crucial for driving neural and cognitive processes as well as behavior."
https://pmc.ncbi.nlm.nih.gov/articles/PMC2904053/
https://maxplanckneuroscience.org/our-brain-is-a-prediction-...
This is obviously already the case with the intelligence level required to produce blog-post and article slop, generate coding-agent-quality code, do mid-level translations, and things like that...
We have basically 4 companies in the world one can seriously consider, and they all seem to heavily subsidise usage, so under normal market conditions not all of them are going to survive.
The training runs aren’t priced in, but the cost of inference is clearly pretty cheap.
It's a market worth many billions so the prize is a slice of that market. Perhaps it is just a commodity, but you can build a big company if you can take a big slice of that commodity e.g. by building a good product (claude code) on top of your commodity model.
I have curated my youtube recommendations over the years. It knows my likes and dislikes very well. It knows about me a lot.
The same moat exists in interactions with Claude. Claude remembers so many of my preferences. It knows that I work in Python and Pandas and starts writing code for that combination. It knows what type of person I am and what kind of toys I want my nephews and nieces to play with. These "facts" about the person are the moat now. Stack Overflow was a repository of "facts" about what worked and what didn't. Those facts, or user chat sessions, are now Anthropic's moat.
The moat - at this point in time - is really not as deep and wide as you are making it out to be. What you are imagining doesn’t exist yet. Indexing prior conversations is trivially easy at this point, you can do it locally using an api client right this moment.
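To make "trivially easy" concrete, here's a toy local keyword index over an exported chat log. The message format below is made up; adapt it to whatever your actual export looks like.

```python
# Minimal sketch of indexing prior conversations locally: a toy
# inverted index over exported chat messages. The message schema is
# hypothetical, not any vendor's real export format.

from collections import defaultdict

messages = [
    {"id": 1, "text": "I work in Python and Pandas"},
    {"id": 2, "text": "Prefer type hints in all Python code"},
    {"id": 3, "text": "Gift ideas for nephews and nieces"},
]

# word -> set of message ids containing it
index = defaultdict(set)
for msg in messages:
    for word in msg["text"].lower().split():
        index[word].add(msg["id"])

def search(term: str) -> set:
    return index.get(term.lower(), set())

print(search("python"))
```

A real version would add stemming or embeddings, but the point stands: the raw material is just your own text, and it fits on your own disk.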
Besides all that, you will be shocked at how quickly a new service can reconstruct your preferences. I started a new YouTube account, and it was basically the same feed within a few days.
In any case, my feeling is that we should have learned at this point not to keep our data in someone else’s walled garden.
Because your location data, Wi-Fi name, etc. hone in on the fact that this is the same person as before. You are actually supporting my point rather than denying it.
Then you can feed them into another service.
is it ever clear? pretty much everything seems to be a senseless race to the bottom.
As a non-US person, that sounds far more concerning than no statement at all. Because if their tools weren't used for surveillance against Europeans they would have said so as a marketing message...
You have this one? You are subhuman, treated as such and you have very limited rights on our soil, we can do nasty things to you without any court, defense, or hope for fairness. You have that one? Please welcome back.
Sociopathic behavior. Then don't wonder why most of the world is again starting to hate the US with a passion. I don't mean the countries where you already killed hundreds of thousands of civilians, I mean the whole world. There isn't a single country out there currently even OK with the US, and that's more than 95% of mankind. Why the fuck do you guys allow this? It's not even the current gov; it's a long-term US tradition going back at least to 9/11.
Anthropic literally said the same, but seem to be getting positive PR.
https://www.cbsnews.com/news/ai-executive-dario-amodei-on-th...
https://www.lesswrong.com/posts/FSGfzDLFdFtRDADF4/openai-s-s...
For a lot of people (me included) the lack of integrity and the gaslighting is what has soured them on OpenAI, rather than them signing up to build surveillance and weaponry.
To non-US citizens, all AI companies are as dangerous as each other, OpenAI just really botched the optics here.
Plus, you know, you'd think they'd ask their cleaner or baker or something. Or hire someone.
Around 2005, a Yale Psychology PhD candidate asked me to write a web-based survey instrument with various questions, some on complex but straightforward business questions (the controls) and others with moral/ethical aspects. Senior executives participated and they answered similarly to rank & file, often completing the entire survey much faster. What they didn't know -- we were tracking how long they spent on each question. Questions with moral/ethical concerns took senior executives relatively longer than the rank & file.
Late Addendum: Sorry that I don't recall the author/paper. The survey population spanned multiple industries representing many Fortune 500s, including huge tech companies. The survey was the same for everyone. The questions were story problems from business and law school case reports. The participating companies were anonymized on our end. We provided HR departments with survey link; only subject rank (not identity) was collected. Survey was voluntary, with informed consent according to IRB approval.
Since executives have to make decisions where choosing the moral option may impose an economic (or operational) cost, this requires thinking through the actual choice.
Morality for the "rank and file" is just a signalling issue: there's nothing to think through, the answer they are "supposed to choose" is the one they do so, at no cost to them.
When making an operational decision that affects the direction of the business, morality is almost always a concern -- even at the level of "do our customers benefit from this vs., do we?" etc.
Meanwhile morality is almost always one of least important factors when making operational decisions.
This study showed executives spent relatively more time on questions with moral/ethical concerns. Perhaps the control questions were more similar to daily work and hence familiar, while questions with moral/ethical concerns were encountered less often. Perhaps executives decided more care was required on these questions to ensure people were not hurt.
Getting back to the grandparent post, executives are certainly aware of situations with moral/ethical concerns and need not consult their barber to answer them.
At first I found the Gemini Code Assist to be absolutely terrible, bordering on unusable. It would mess up parameter order for function calls in simple 200 line Python. But then I found out about the "model router" which is a layer on top which dynamically routes requests between the flash and pro model. Disabling it and always using the pro model did wonders for my results.
There are however some pretty aggressive rate limits that reset every 24 hours. For me it's okay though. As a hobbyist I only use it about 2-3 hours per day at most anyway.
Meanwhile codex is ... boring. It keeps chugging on, asking for "please proceed" once in a while. No drama. Which is in complete contrast with ChatGPT the chatbot, which is completely unusable: arrogant, unhelpful, and confrontational. How they made both from the same loaf I dunno.
IDEK what that means, specific examples?
You are making tool X. It currently processes the test dataset in 15 seconds. You ask Claude Code to implement some change. It modifies the code, compiles, runs the test, and the tool sits in a 100% CPU busy loop. Possible reactions on being told there is a busy loop:
"the program is processing large amount of data. This is normal operation. I will wait until it finishes. [sets wait timeout in 30 minutes]."
"this is certainly the result of using zig toolchain, musl libc malloc has known performance issues. Let me instead continue working on the plan."
"[checks env] There are performance issues when running in a virtual machine. This is a known problem."
"[kills the program]. Let me check if the issue existed previously. [git stash/checkout/build/run/stash pop]. Previous version did not have the issue. Maybe user has changed something in the code."
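That last reaction is the sane one, and at least the kill-on-timeout part is easy to enforce outside the model. A rough sketch; the budget figure echoes the 15-second anecdote above, and the command is purely illustrative:

```python
# Sketch: run a command with a hard wall-clock budget, so a busy loop
# gets killed and reported instead of waited on for 30 minutes.

import subprocess
import sys

def run_with_budget(cmd, budget_s=30.0):
    """Run cmd; kill it if it exceeds the wall-clock budget."""
    try:
        proc = subprocess.run(cmd, timeout=budget_s, capture_output=True)
        return ("ok", proc.returncode)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child on timeout; signal the caller
        # to bisect (git stash / rebuild / rerun) instead of waiting.
        return ("busyloop?", None)

status, rc = run_with_budget([sys.executable, "-c", "pass"])
print(status, rc)
```

A harness check like this is exactly the kind of "clear success/failure mode" that keeps an agent from rationalizing a hang as normal operation.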
Bonus episode: since Claude Code's "search" gadget is buggy, the LLM often gets empty search results.
"The changes are gone! Maybe the user deleted the code? Let me restore the last committed version [git checkout]. The function is still missing! Must be an issue with the system git. Let me read the repository directly."
Double whammy, I guess, because I also always downvote comments asking (or complaining) about a parent comment getting downvoted.
codex often speaks in very dense technical terms that I'm not familiar with and tends to use acronyms I've not encountered so there's a learning curve. It also often thinks I'm providing feedback when I'm just trying to understand what it just said. But it does give nice explanations once it understands that I'm just confused.
And I've been absolutely amazed with Codex. I started using it with ChatGPT 5.3-Codex, and it was so much better than online ChatGPT 5.2, even sticking to single-page apps, which both can do. I don't have any way to measure the "smarts" of the new 5.4, but it seems similar.
Anyways, I'll try to get Claude running if it's better in some significant way. I'm happy enough with the Codex GUI on macOS, but that's just one of several things that could be different between them.
Claude, IMO, is much better at empathizing with me as a user: It asks better questions, tries harder to understand WHY I'm trying to do something, and is more likely to tell me if there's a better way.
Both have plenty of flaws. Codex might be better if you want to set it loose on a well-defined problem and let it churn overnight. But if you want a back-and-forth collaboration, I find Claude far better.
Not Claude Code specifically, but you can try the Claude Opus and Sonnet 4.6 models for free using Google Antigravity.
Recently did the full transition to Claude, the model is great, but what I really love is how they seem to have landed on a clear path for their GUI/ecosystem. The cowork feature fits my workflows really well and connecting enterprise apps, skills and plugins works really well.
Haven’t been this excited about AI since GPT 4o launched.
On an unrelated note, UI is such a personal preference that it's impossible, beyond core pillars that have been studied for decades, to say one is better than the other. That being said, I like OpenAI's design system much better than Anthropic's. OpenAI's products (CLI and chat UI) "feel" nice and consumer-focused, whereas Anthropic's products feel utilitarian and "designed for business".
Yep there really is no switching cost it seems.
People generally want something from a model and then leave. I think people are subconsciously forming relationships with tech firms in which they don't care about the firm at all; it's all about what the user themselves gets. Generally there's no attachment. There are some examples of psychotic stuff, but that's thankfully the exception, not the norm.
That's why Apple cares deeply about its brand - it doesn't want to fall into that group of firms.
A couple of weeks ago, to huge numbers of people, ChatGPT was AI. The biggest public-perception shift to come out of the DoD/DoW spat will be how many people now know that Claude exists at all, and being seen as unreasonably punished by the government for taking a principled stance will only benefit them.
People have been made aware of a product, and made aware that it's good enough that the government wants to use it. They have then been shown an archetypal underdog-versus-the-government narrative. That makes almost a perfect storm for gaining customers.
When they actually use the thing and discover that it actually is good, they will stay, and they will tell their friends.
At this rate they should be sending Hegseth a thank you card.
I've been on the internet since 93 or 94 and I've never once heard it called that. If anything, "Al Gore".
I've literally never heard anybody call the Internet "Reagan's internet", the best I can do is the Al Gore quote and who's calling anything Trump's AI?
What ideas are you trying to express here?
My experience has, for a few months, been that OpenAI's models are consistently quite noticeably better for me, and so my Codex CLI usage had been probably 5x as much as my Claude Code usage. So it's a major bummer to have cancelled, but I don't have it in me to keep giving them money.
I'd love to get off Anthropic too, despite the admirable stance they took, the whole deal made me extra uncomfortable that they were ever a defense contractor (war contractor?) to begin with.
OpenAI simply provides more value for the money at the moment.
Anthropic is the outlier here, obviously they can limit their subscriptions as they want but it's a major disadvantage compared to their competitors.
Explicit means it's stated in OpenAI docs somewhere, but I can't find it. Link?
There's probably a better source somewhere but this is the one I had at hand.
- chatgpt is kinda autistic and must follow procedures no matter what, and writes in a bland, soulless, but kinda-correct style. great at research, horrible at creativity, slow at getting things done but at least getting there. good architect, mid builder, horrible designer/writer.
- claude is the sensitive diva that is able to really produce elegant code but has to be reminded of correctness checks and quality gates repeatedly, so it arrives at something good very fast (sometimes oneshot) but then loses time for correction loops and "those details". great overall balance, but permanent helicoptering needed or else it derails into weird loops.
- grok is the maker, super fast and on target, but doesn't think as deeply as the others; it's entirely goal/achievement focused and does just enough to get there. uniquely, it doesn't argue or self-monologue constantly about doubts or safety or ethics, but drives forward where others struggle, and faster than the others. cannot concentrate for too long, but delivers fast. tons of quick edits? grok it is. "experimental" stuff that is not safe talking about... definitely grok.
- gemini is whatever you quickly need in your GSuite, plus looking at what others are doing and helping out with a sometimes different perspective, but beyond that worse than all the others on top.
- kimi: currently using it on the side, not bad at all so far, but also nothing distinct I crystallized in my head.
5.4 seems terrible at anything that's even somewhat out-of-distribution.
It surprisingly went at it progressively, starting with a basic CPU renderer, all the way to a basic special-purpose Metal shader. Now it's cutting its teeth on adding passthrough support. YMMV.
Google seems to be on a hot streak with their models, and, since they're playing from behind, I'd expect favorable pricing and terms. But, I don't know anyone who is using or talking about Gemini. All the chatter seems to be Anthropic vs. OpenAI.
I used to think tokens are a commodity, but it’s becoming clear that the jagged frontier is different enough even for the easiest use case of SWE that there’s room for having two if not three providers of different foundational models. It isn’t a winner takes all, they’re all winning together. Cursor isn’t properly taking advantage of the situation yet.
I didn't mean to advocate for Anthropic, apologies.
I would love to, but taking a practical look at that concept, it seems practically impossible.
My $0.02: Claude was already involved in underhanded shit I don't want a part of[0], and that generated little ethical response from Anthropic. I've had better luck as a $200/mo-tier customer with ChatGPT, and I don't really think that Dario claiming their newest LLM is conscious[1] on a market schedule is all that ethical, either.
[0]: https://en.wikipedia.org/wiki/Project_Maven [1]: https://tech.yahoo.com/ai/claude/articles/anthropic-ceo-admi...
realistically: AI WILL get used in the military and for killing autonomously, like it or not, believe it or not. I am also against that in principle, but I accept the fact that my opinion just doesn't matter, and practice radical acceptance of reality as-is. twitter/X is also alive and kicking, despite musk and the anti-musk hate. xAI/Grok is genuinely really good too compared to OAI/Claude, a bit different but very good. At this point all the "outcries" feel like noise I just skip on principle. But it could turn up the fire under the OAI team to go aggressive feature/pricing-wise in order to retain/increase their userbase again, which is ... good, after all.
They're just keeping up with the outrage news cycles.