We swapped OpenAI out for Claude and it required updating about 15 lines of code. All these guys are just commodity to us. If next week there’s a better supplier of commodity AI we’ll spend an hour and swap to something else again. There’s zero loyalty here.
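A minimal sketch of why the swap is so cheap, assuming all model calls already go through one thin adapter. The provider objects below are illustrative stand-ins, not real SDKs; in real code each would wrap a vendor's client library.

```python
# Toy sketch: if every model call goes through one adapter, "swapping
# providers" is a one-line config change. Stub backends stand in for
# real vendor SDKs.

from dataclasses import dataclass
from typing import Callable

@dataclass
class Provider:
    name: str
    complete: Callable[[str], str]  # prompt -> completion

# Stand-in backends; real code would wrap each vendor's SDK here.
openai_like = Provider("openai", lambda p: f"[openai] {p}")
claude_like = Provider("claude", lambda p: f"[claude] {p}")

# The "15 lines of code" mostly reduces to this one line:
ACTIVE = claude_like

def ask(prompt: str) -> str:
    return ACTIVE.complete(prompt)

print(ask("hello"))
```

The point is the shape, not the code: once the call sites only know about `ask`, loyalty to any one vendor costs nothing.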
But right now we have 3-5 top contenders that are so evenly matched that the de-facto sticking point is mostly the harness, i.e. the collection of proven plugins/commands/tools/agent features that are tuned to the user's personal workflow.
It's because the real value of the models is in what we (humanity) fed them, and all of them have eaten the same thing for free.
I don’t see the core models getting dramatically better from where they are now. We’ve clearly hit a plateau.
When I use the planning mode and then code the success rate is much higher. When I ask it to work on specific isolated chunks of code with clear success/failure modes the success rate is again much higher.
Now imagine a world where it recognizes that from my simple, throwaway, non-specific prompt. If it were able to fire off 20 different prompts in quick succession, it could easily cut my time spent in front of the screen by a third.
The patterns are obvious but they don't do that right now because it's a lot of compute.
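The fan-out itself is the easy part; it's the compute bill that isn't. A toy sketch, where `model_call` and `score` are hypothetical stubs standing in for a real API call and a real evaluator:

```python
# Rough sketch of the fan-out idea: fire many prompt variants
# concurrently, then keep the best-scoring result. model_call and
# score are stubs, not real APIs.

import asyncio

async def model_call(prompt: str) -> str:
    await asyncio.sleep(0)   # stands in for network latency
    return prompt.upper()    # stands in for a real completion

def score(result: str) -> int:
    return len(result)       # stands in for a real evaluator

async def fan_out(base: str, n: int = 20) -> str:
    variants = [f"{base} (variant {i})" for i in range(n)]
    results = await asyncio.gather(*(model_call(v) for v in variants))
    return max(results, key=score)

best = asyncio.run(fan_out("fix the failing test", n=5))
print(best)
```

With a real model behind `model_call`, this is exactly the "20 prompts in quick succession" pattern: cheap to write, expensive to run.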
We'll be looking at this time where there's a progress bar showing context space the way we look at the Turbo button.
Because the truth is, reaching the baseline I'm talking about takes a finite amount of compute at a certain point.
But no, businesses are dumb. They always have been. Existing businesses get disrupted by new ideas and new technology all the time. This very site is a temple to disruption!
Proprietary advantage is, 99.999% of the time, just structural advantage. You can't compete with Procter & Gamble because they already built their brands and factories and supply chains and you'd have to do all that from scratch while selling cheaper products as upstart value options. And there's not enough money in consumer junk to make that worth it.
But if you did have funding and wanted to beat them on first principles? Would you really start by training an LLM on what they're already doing? No, you'd throw money at a bunch of hackers from YC. Duh.
They are neck-and-neck only because they are participating in the arms race. The only other way to keep up is mass distillation, which could prove fragile (though so far it has seemed sustainable).
I think one needs to at least recognize the possibility that... there just isn't any more data for training. We've done it all. The models we have today have already distilled all of the output of human cleverness throughout history. If there's more data to be had, we need to make it the hard way.
That's not true. Moreover, progress can slow to a crawl where it's barely noticeable. And in that world humans continue to stay ahead. That's the magic of humans: being aware of our surroundings and adapting sufficiently, while taking advantage of tools and leveraging them.
But if you say stuff like this on here you get down voted. Why?
It all remains free, but you need to email me for a username and password.
If I put in time and effort to make content and OpenAI et al copy it and sell it through their LLM such that no one comes to me any more, then plainly it makes no sense for me to create that content; and then it would not exist for OpenAI to take, or for anyone else. We all lose.
It seems parasitic.
Virtually no scraper has logic to handle that sort of situation, but it's trivial for humans. Way easier than an LLM
https://arstechnica.com/information-technology/2026/02/most-...
CloudBolt’s survey also examined how respondents are migrating workloads off of VMware. Currently, 36 percent of participants said they migrated 1–24 percent of their environment off of VMware. Another 32 percent said that they have migrated 25–49 percent; 10 percent said that they’ve migrated 50–74 percent of workloads; and 2 percent have migrated 75 percent or more of workloads. Five percent of respondents said that they have not migrated from VMware at all.
Among migrated workloads, 72 percent moved to public cloud infrastructure as a service, followed by Microsoft’s Hyper-V/Azure stack (43 percent of respondents).
Overall, 86 percent of respondents “are actively reducing their VMware footprint,” CloudBolt’s report said.
I feel like the company that can figure out how to 100% safely live-migrate any VMware workload to another "cheaper" solution will do quite well.
In my case, I always use Opus 4.6 in my work, but quite often I get a 504 error, and that's quite annoying. I get errors like that with Gemini too. I can't estimate if I'd get a similar number of errors with ChatGPT, since I use it very infrequently.
But imagine that at some point one of the big 3 (OpenAI, Anthropic, Google) gets very high availability, while the others have very poor availability. Then people would switch to them, even if their models were a bit worse.
Now, OpenAI has been building like crazy, and contracting for future builds like crazy too. Google has very deep pockets, so they'll probably have enough compute to stay in the game. But I fear that Anthropic will not be able to match OpenAI and Google in terms of datacenter build, so it's only a matter of time (and not a lot of time) until they'll be in a pretty tight spot.
Just want to note something there:
Okay, take the premise that AI really is 'intelligent' up to the point of business decisions.
So, this all then implies that 'intelligence' is then a commodity too?
Like, I'm trying to get at the idea that yours, mine, all of our 'intelligence' is now no longer a trait that I hold, but a thing to be used, at least as far as the economy is concerned.
We did this with muscles and memory previously. We invented writing, and so those with really good memories became just like everyone else. Then we did it with muscles and the industrial revolution, and so really strong or high-endurance people became just like everyone else. Yes, there are many exceptions here, but they mostly prove the rule, I think.
Now it seems that with really smart people, we've made AI, and so they're going to be like everyone else?
Who cares, as long as the end results are close (or close enough for the uses they are put to)?
Besides, "has nothing to do with how the human brain works" is an overstatement.
"The term “predictive brain” depicts one of the most relevant concepts in cognitive neuroscience which emphasizes the importance of “looking into the future”, namely prediction, preparation, anticipation, prospection or expectations in various cognitive domains. Analogously, it has been suggested that predictive processing represents one of the fundamental principles of neural computations and that errors of prediction may be crucial for driving neural and cognitive processes as well as behavior."
https://pmc.ncbi.nlm.nih.gov/articles/PMC2904053/
https://maxplanckneuroscience.org/our-brain-is-a-prediction-...
This is obviously already the case with the intelligence level required to produce blog-post and article slop, generate coding-agent-quality code, do mid-level translations, and things like that...
We have basically 4 companies in the world one can seriously consider, and they all seem to heavily subsidise usage, so under normal market conditions not all of them are going to survive.
The training runs aren’t priced in, but the cost of inference is clearly pretty cheap.
It's a market worth many billions so the prize is a slice of that market. Perhaps it is just a commodity, but you can build a big company if you can take a big slice of that commodity e.g. by building a good product (claude code) on top of your commodity model.
I have curated my youtube recommendations over the years. It knows my likes and dislikes very well. It knows about me a lot.
The same moat exists in interactions with Claude. Claude remembers so many of my preferences. It knows that I work in Python and Pandas and starts writing code for that combination. It knows what type of person I am and what kind of toys I want my nephews and nieces to play with. These "facts" about the person are the moat now. Stack Overflow was a repository of "facts" about what worked and what didn't. Those facts, or user chat sessions, are now Anthropic's moat.
The moat - at this point in time - is really not as deep and wide as you are making it out to be. What you are imagining doesn’t exist yet. Indexing prior conversations is trivially easy at this point, you can do it locally using an api client right this moment.
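To make "trivially easy" concrete, here's a toy local keyword index over an exported chat log. The message format below is made up; adapt it to whatever your actual export looks like.

```python
# Minimal sketch of indexing prior conversations locally: a toy
# inverted index over exported chat messages. The message schema is
# hypothetical, not any vendor's real export format.

from collections import defaultdict

messages = [
    {"id": 1, "text": "I work in Python and Pandas"},
    {"id": 2, "text": "Prefer type hints in all Python code"},
    {"id": 3, "text": "Gift ideas for nephews and nieces"},
]

# word -> set of message ids containing it
index = defaultdict(set)
for msg in messages:
    for word in msg["text"].lower().split():
        index[word].add(msg["id"])

def search(term: str) -> set:
    return index.get(term.lower(), set())

print(search("python"))
```

A real version would add stemming or embeddings, but the point stands: the raw material is just your own text, and it fits on your own disk.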
Besides all that, you will be shocked at how quickly a new service can reconstruct your preferences. I started a new YouTube account, and it was basically the same feed within a few days.
In any case, my feeling is that we should have learned at this point not to keep our data in someone else’s walled garden.
Because your location data, Wi-Fi name, etc. hone in on the fact that this is the same person as before. You are actually supporting my point rather than denying it.
Then you can feed them into another service.
is it ever clear? pretty much everything seems to be a senseless race to the bottom.
As a non-US person, that sounds far more concerning than no statement at all. Because if their tools weren't used for surveillance against Europeans they would have said so as a marketing message...
You have this one? You are subhuman, treated as such and you have very limited rights on our soil, we can do nasty things to you without any court, defense, or hope for fairness. You have that one? Please welcome back.
Sociopathic behavior. Then don't wonder why most of the world is again starting to hate the US with a passion. I don't mean the countries where you already killed hundreds of thousands of civilians, I mean the whole world. There isn't a single country out there currently even OK with the US, and that's more than 95% of mankind. Why the fuck do you guys allow this? It's not even the current gov; it's a long-term US tradition going back at least to 9/11.
Anthropic literally said the same, but seem to be getting positive PR.
https://www.cbsnews.com/news/ai-executive-dario-amodei-on-th...
https://www.lesswrong.com/posts/FSGfzDLFdFtRDADF4/openai-s-s...
For a lot of people (me included) the lack of integrity and the gaslighting is what has soured them on OpenAI, rather than them signing up to build surveillance and weaponry.
To non-US citizens, all AI companies are as dangerous as each other, OpenAI just really botched the optics here.
Plus, you know, you'd think they'd ask their cleaner or baker or something. Or hire someone.
Around 2005, a Yale Psychology PhD candidate asked me to write a web-based survey instrument with various questions, some on complex but straightforward business questions (the controls) and others with moral/ethical aspects. Senior executives participated and they answered similarly to rank & file, often completing the entire survey much faster. What they didn't know -- we were tracking how long they spent on each question. Questions with moral/ethical concerns took senior executives relatively longer than the rank & file.
Late Addendum: Sorry that I don't recall the author/paper. The survey population spanned multiple industries representing many Fortune 500s, including huge tech companies. The survey was the same for everyone. The questions were story problems from business and law school case reports. The participating companies were anonymized on our end. We provided HR departments with survey link; only subject rank (not identity) was collected. Survey was voluntary, with informed consent according to IRB approval.
Since executives have to make decisions where choosing the moral option may impose an economic (or operational) cost, this requires thinking through the actual choice.
Morality for the "rank and file" is just a signalling issue: there's nothing to think through, the answer they are "supposed to choose" is the one they do so, at no cost to them.
When making an operational decision that affects the direction of the business, morality is almost always a concern -- even at the level of "do our customers benefit from this vs., do we?" etc.
Meanwhile morality is almost always one of least important factors when making operational decisions.
This study showed executives spent relatively more time on questions with moral/ethical concerns. Perhaps the control questions were more similar to daily work and hence familiar, while questions with moral/ethical concerns were encountered less often. Perhaps executives decided more care was required on these questions to ensure people were not hurt.
Getting back to the grandparent post, executives are certainly aware of situations with moral/ethical concerns and need not consult their barber to answer them.
At first I found the Gemini Code Assist to be absolutely terrible, bordering on unusable. It would mess up parameter order for function calls in simple 200 line Python. But then I found out about the "model router" which is a layer on top which dynamically routes requests between the flash and pro model. Disabling it and always using the pro model did wonders for my results.
There are however some pretty aggressive rate limits that reset every 24 hours. For me it's okay though. As a hobbyist I only use it about 2-3 hours per day at most anyway.
Meanwhile codex is ... boring. It keeps chugging on, asking for "please proceed" once in a while. No drama. Which is in complete contrast with ChatGPT the chatbot, which is completely unusable: arrogant, unhelpful, and confrontational. How they made both from the same loaf I dunno.
IDEK what that means, specific examples?
You are making tool X. It currently processes the test dataset in 15 seconds. You ask Claude Code to implement some change. It modifies the code, compiles, runs the test, and the tool sits in a 100% CPU busy loop. Possible reactions on being told there is a busy loop:
"the program is processing large amount of data. This is normal operation. I will wait until it finishes. [sets wait timeout in 30 minutes]."
"this is certainly the result of using zig toolchain, musl libc malloc has known performance issues. Let me instead continue working on the plan."
"[checks env] There are performance issues when running in a virtual machine. This is a known problem."
"[kills the program]. Let me check if the issue existed previously. [git stash/checkout/build/run/stash pop]. Previous version did not have the issue. Maybe user has changed something in the code."
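That last reaction is the sane one, and at least the kill-on-timeout part is easy to enforce outside the model. A rough sketch; the budget figure echoes the 15-second anecdote above, and the command is purely illustrative:

```python
# Sketch: run a command with a hard wall-clock budget, so a busy loop
# gets killed and reported instead of waited on for 30 minutes.

import subprocess
import sys

def run_with_budget(cmd, budget_s=30.0):
    """Run cmd; kill it if it exceeds the wall-clock budget."""
    try:
        proc = subprocess.run(cmd, timeout=budget_s, capture_output=True)
        return ("ok", proc.returncode)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child on timeout; signal the caller
        # to bisect (git stash / rebuild / rerun) instead of waiting.
        return ("busyloop?", None)

status, rc = run_with_budget([sys.executable, "-c", "pass"])
print(status, rc)
```

A harness check like this is exactly the kind of "clear success/failure mode" that keeps an agent from rationalizing a hang as normal operation.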
Bonus episode: since Claude Code's "search" gadget is buggy, the LLM often gets empty search results.
"The changes are gone! Maybe the user deleted the code? Let me restore the last committed version [git checkout]. The function is still missing! Must be an issue with the system git. Let me read the repository directly."
Double whammy, I guess, because I also always downvote comments asking (or complaining) about a parent comment getting downvoted.
codex often speaks in very dense technical terms that I'm not familiar with and tends to use acronyms I've not encountered so there's a learning curve. It also often thinks I'm providing feedback when I'm just trying to understand what it just said. But it does give nice explanations once it understands that I'm just confused.
And I've been absolutely amazed with Codex. I started using it with ChatGPT 5.3-Codex, and it was so much better than online ChatGPT 5.2, even sticking to single-page apps, which both can do. I don't have any way to measure the "smarts" of the new 5.4, but it seems similar.
Anyways, I'll try to get Claude running if it's better in some significant way. I'm happy enough with the Codex GUI on macOS, but that's just one of several things that could be different between them.
Claude, IMO, is much better at empathizing with me as a user: It asks better questions, tries harder to understand WHY I'm trying to do something, and is more likely to tell me if there's a better way.
Both have plenty of flaws. Codex might be better if you want to set it loose on a well-defined problem and let it churn overnight. But if you want a back-and-forth collaboration, I find Claude far better.
Not Claude Code specifically, but you can try the Claude Opus and Sonnet 4.6 models for free using Google Antigravity.
Recently did the full transition to Claude, the model is great, but what I really love is how they seem to have landed on a clear path for their GUI/ecosystem. The cowork feature fits my workflows really well and connecting enterprise apps, skills and plugins works really well.
Haven’t been this excited about AI since GPT 4o launched.
On an unrelated note, UI is such a personal preference that it's impossible, beyond core pillars that have been studied for decades, to say one is better than the other. That being said, I like OpenAI's design system much better than Anthropic's. OpenAI's products (CLI and chat UI) "feel" nice and consumer-focused, whereas Anthropic's products feel utilitarian and "designed for business".
Yep there really is no switching cost it seems.
People generally want something from a model and then leave. I think people are subconsciously forming relationships with tech firms in which they don't care about the firm at all; it's all about what the user themselves gets. Generally there's no attachment. There are some examples of psychotic stuff, but that's thankfully the exception, not the norm.
That's why Apple cares deeply about its brand - it doesn't want to fall into that group of firms.
A couple of weeks ago, to huge numbers of people, ChatGPT was AI. The biggest public-perception shift to come out of the DoD/DoW spat will be how many people now know that Claude exists at all, and being seen as unreasonably punished by the government for taking a principled stance will only benefit them.
People have been made aware of a product, and made aware that it's good enough that the government wants to use it. They have then been shown an archetypal underdog-versus-the-government narrative. That makes almost a perfect storm for gaining customers.
When they actually use the thing and discover that it actually is good, they will stay, and they will tell their friends.
At this rate they should be sending Hegseth a thank you card.
I've been on the internet since 93 or 94 and I've never once heard it called that. If anything, "Al Gore".
I've literally never heard anybody call the Internet "Reagan's internet", the best I can do is the Al Gore quote and who's calling anything Trump's AI?
What ideas are you trying to express here?
My experience has, for a few months, been that OpenAI's models are consistently quite noticeably better for me, and so my Codex CLI usage had been probably 5x as much as my Claude Code usage. So it's a major bummer to have cancelled, but I don't have it in me to keep giving them money.
I'd love to get off Anthropic too, despite the admirable stance they took, the whole deal made me extra uncomfortable that they were ever a defense contractor (war contractor?) to begin with.
OpenAI simply provides more value for the money at the moment.
Anthropic is the outlier here, obviously they can limit their subscriptions as they want but it's a major disadvantage compared to their competitors.
Explicit means it's stated in OpenAI docs somewhere, but I can't find it. Link?
There's probably a better source somewhere but this is the one I had at hand.
- chatgpt is kinda autistic and must follow procedures no matter what, and writes in a bland, soulless, but kinda-correct style. great at research, horrible at creativity, slow at getting things done but at least getting there. good architect, mid builder, horrible designer/writer.
- claude is the sensitive diva that is able to really produce elegant code but has to be reminded of correctness checks and quality gates repeatedly, so it arrives at something good very fast (sometimes oneshot) but then loses time for correction loops and "those details". great overall balance, but permanent helicoptering needed or else it derails into weird loops.
- grok is the maker, super fast and on target, but doesn't think as deeply as the others; it's entirely goal/achievement focused and does just enough to get there. uniquely, it doesn't argue or self-monologue constantly about doubts or safety or ethics, but drives forward where others struggle, and faster than the others. cannot concentrate for too long, but delivers fast. tons of quick edits? grok it is. "experimental" stuff that is not safe talking about... definitely grok.
- gemini is whatever you quickly need in your GSuite, plus looking at what others are doing and helping out with a sometimes different perspective, but beyond that worse than all the others on top.
- kimi: currently using it on the side, not bad at all so far, but also nothing distinct I crystallized in my head.
5.4 seems terrible at anything that's even somewhat out-of-distribution.
It surprisingly went at it progressively, starting with a basic CPU renderer, all the way to a basic special-purpose Metal shader. Now it's cutting its teeth on adding passthrough support. YMMV.
Google seems to be on a hot streak with their models, and, since they're playing from behind, I'd expect favorable pricing and terms. But, I don't know anyone who is using or talking about Gemini. All the chatter seems to be Anthropic vs. OpenAI.
I used to think tokens are a commodity, but it’s becoming clear that the jagged frontier is different enough even for the easiest use case of SWE that there’s room for having two if not three providers of different foundational models. It isn’t a winner takes all, they’re all winning together. Cursor isn’t properly taking advantage of the situation yet.
I didn't mean to advocate for Anthropic, apologies.
I would love to, but taking a practical look at that concept, it seems practically impossible.
My $0.02: Claude was already involved in underhanded shit I don't want a part of[0], and that generated little ethical response from Anthropic. I've had better luck as a $200/mo-tier customer with ChatGPT, and I don't really think that Dario claiming their newest LLM is conscious[1] on a market schedule is all that ethical, either.
[0]: https://en.wikipedia.org/wiki/Project_Maven [1]: https://tech.yahoo.com/ai/claude/articles/anthropic-ceo-admi...
realistically: AI WILL get used in the military and for killing autonomously, like it or not, believe it or not. I am also against that in principle, but I accept the fact that my opinion just doesn't matter, and practice radical acceptance of reality as-is. twitter/X is also alive and kicking, despite musk and the anti-musk hate. xAI/Grok is genuinely really good too compared to OAI/Claude, a bit different but very good. At this point all the "outcries" feel like noise I just skip on principle. But it could turn up the fire under the OAI team to go aggressive feature/pricing-wise in order to retain/increase their userbase again, which is ... good, after all.
They're just keeping up with the outrage news cycles.