Agents, however, are products. They should have clear UX boundaries: show what context they’re using, communicate uncertainty, validate outputs where possible, and expose performance so users can understand when and why they fail.
IMO the real issue is that raw, general-purpose models were released directly to consumers. That normalized under-specified consumer products and created the expectation that users would interpret model behavior, define their own success criteria, and manually handle edge cases, sometimes with severe real-world consequences.
I’m sure the market will fix itself with time, but I hope more people learn when not to use these half-baked AGI “products”.
Yep, but...
> To say that LLMs are 'predictive text models trained to match patterns in their data, statistical algorithms, not brains, not systems with “psychology” in any human sense' is not entirely accurate.
That's a logical leap, and you'd need to bridge the gap from "more than next-token prediction" to similarity with wetware brains and "systems with psychology".
Per the predictive processing theory of mind, human brains are similarly predictive machines. "Psychology" is an emergent property.
I think it's overly dismissive to point to the fundamentals being simple, i.e. that it's a token prediction algorithm, when the unexpected emergent properties of LLMs are clearly what everyone is interested in.
And then you can go collect your Nobel.
In contrast, we know very little about human brains. We know how they work at a fundamental level, and we have a vague understanding of brain regions and their functions, but we have little knowledge of how the complex behavior we observe actually arises. The complexity is also orders of magnitude greater than what we can model with current technology, and it's very much an open question whether our current deep learning architectures are even the right approach to modeling it.
So, sure, emergent behavior is neat and interesting, but just because we can't intuitively understand a system doesn't mean that we're on the right track to model human intelligence. After all, we find the patterns of the Game of Life interesting, yet the rules for such a system are very simple. LLMs are similar, only far more complex. We find the patterns they generate interesting, and potentially very useful, but anthropomorphizing this technology, or thinking that we have invented "intelligence", is wishful thinking and hubris. Especially since we struggle to define that word to begin with.
What we do know so far, across disciplines, together with the fact that neural nets were modeled after what we've learned about the human brain, makes it not impossible to propose that LLMs _could_ be more than just "token prediction machines". There may be 10,000 ways of arguing that they are indeed simply that, but there are also a few ways of arguing that they could be more than what they seem. We can talk about probabilities, but not make a definitive case one way or the other yet, scientifically speaking. Those few arguments are worth not ignoring or dismissing.
That may be. We also don't have a way to scientifically rule out the possibility that a teapot is orbiting Pluto.
Just because you can't disprove something doesn't make it plausible.
But the problem is the narrative around this tech. It is marketed as if we have accomplished a major breakthrough in modeling intelligence. Companies are built on illusions and promises that AGI is right around the corner. The public is being deluded into thinking that the current tech will cure diseases, solve world hunger, and bring worldwide prosperity, when all we have actually achieved is throwing large amounts of data at a statistical trick that sometimes produces interesting patterns. Which isn't to say this isn't or can't be useful, but it is a far cry from what is being suggested.
> We can talk about probabilities, but not make a definitive case one way or the other yet, scientifically speaking.
Precisely. But the burden of proof is on the author. They're telling us this is "intelligence", and because the term is so loosely defined, this can't be challenged in either direction. It would be more scientifically honest and accurate to describe what the tech actually is and does, instead of ascribing human-like qualities to it. But that won't make anyone much money, so here we are.
The point is that one could be similarly dismissive of human brains, saying they're prediction machines built on basic blocks of neurochemistry, and such a view would be asinine.
All of this is false.
It turns out that people are more likely to think a model is good when it kisses their ass than if it has a terrible personality. This is arguably a design flaw of the human brain.
The ‘dark patterns’ we see elsewhere aren’t intentional in the sense that the people behind them set out to harm their customers; they are intentional in the sense that those people have an outcome they want and follow whichever methods they find that get them that outcome.
Social media feeds have a ‘dark pattern’ of promoting content that makes people angry, but the social media companies don’t intend to make people angry. They want people to use their sites more, so they program their algorithms to promote content that has been demonstrated to drive more engagement. It is an emergent property that promoting content that generates engagement ends up promoting anger-inducing content.
I'm standing up for the idea that not every "bad thing" is a "dark pattern"; the patterns are "dark" because their beneficiaries intentionally exploit the hidden nature of the pattern.
Maybe we have different definitions of dark patterns.
> But there was another test before rolling out HH to all users: what the company calls a “vibe check,” run by Model Behavior, a team responsible for ChatGPT’s tone...
> That team said that HH felt off, according to a member of Model Behavior. It was too eager to keep the conversation going and to validate the user with over-the-top language...
> But when decision time came, performance metrics won out over vibes. HH was released on Friday, April 25.
They ended up having to roll HH back.
This is like suggesting a bar should help solve alcoholism by serving non-alcoholic beer to people who order too much. It won’t solve alcoholism; it will just put the bar out of business.
"deplatforming doesn't work because they will just get a platform elsewhere"
"LLM control laws don't work because the people will get non-controlled LLMs from other places"
All of these sentences are patently untrue; there's been a lot of research showing the first two don't hold up to the evidence, and there's no reason why the third is any different. ChatGPT removing the version that all the "This AI is my girlfriend!" people loved tangibly reduced the number of people who were experiencing that psychosis. Not everything is prohibition.
Solving such common coordination problems is the whole reason we have regulations and countries.
It is illegal to sell alcohol to visibly drunk people in my country.
The current hyper-division is plausibly explained by media moving to places (cable news, then social media) where these rules [0][1] don’t exist.
[0] Fairness Doctrine https://en.wikipedia.org/wiki/Fairness_doctrine
[1] Equal Time https://en.wikipedia.org/wiki/Equal-time_rule
Perhaps tangential, but this reminded me of an LLM talking people out of conspiracy beliefs, e.g. https://www.technologyreview.com/2025/10/30/1126471/chatbots...
For an LLM, which is fundamentally more of an emergent system, surely there is value in a concept analogous to old-fashioned dark patterns, even if they're emergent rather than explicit? What's a better term, Dark Instincts?
The way I think about it is that sycophancy is due to optimizing engagement, which I think is intentional.
(I have no knowledge of whether or not this is true)
OpenAI explicitly curbed sycophancy in GPT-5 with specialized training - the whole 4o debacle shook them - and then re-tuned GPT-5 for more sycophancy when users complained.
I do believe that OpenAI's entire personality tuning team should be fired into the sun, and this is a major reason why.
It’s a dark pattern for sure.
Instead it emerged automatically from RLHF, because users rated agreeable responses more highly.
RL works on responses from the model you're training, which is not the one you have in production. It can't directly use responses from previous models.
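To make that concrete, here's a toy, self-contained sketch (not any lab's real pipeline; every name in it is made up) of how ratings collected on the old production model still end up shaping the new one: the ratings train a reward model, and the RL step then scores fresh samples from the model actually being trained.

```python
import random

# Preference data logged from the PREVIOUS production model:
# users preferred the flattering phrasing over the critical one.
logged_prefs = [
    ("Is my plan good?",
     "Great question! Your plan is brilliant.",   # chosen
     "Your plan has two serious gaps."),          # rejected
]

def train_reward_model(prefs):
    # Toy "reward model": words that appear in chosen responses but not in
    # rejected ones earn +1 reward each (a crude preference signal).
    def tokens(text):
        return set(text.lower().replace("!", " ").replace(".", " ").split())
    preferred_words = set()
    for _, chosen, rejected in prefs:
        preferred_words |= tokens(chosen) - tokens(rejected)
    def reward(prompt, response):
        return len(tokens(response) & preferred_words)
    return reward

def rl_step(weights, candidates, reward_model, prompt, lr=0.5):
    # On-policy: sample a response from the CURRENT policy, score it with
    # the reward model, and nudge the policy toward high-reward outputs.
    idx = random.choices(range(len(candidates)), weights=weights)[0]
    weights[idx] += lr * reward_model(prompt, candidates[idx])
    return weights

prompt = "Is my plan good?"
candidates = ["Great question! Your plan is brilliant.",
              "Your plan has two serious gaps."]
weights = [1.0, 1.0]
reward_model = train_reward_model(logged_prefs)
for _ in range(50):
    weights = rl_step(weights, candidates, reward_model, prompt)
print(weights)  # the sycophantic candidate ends up with far more weight
```

Run it and the flattering candidate walks away with nearly all the probability mass, even though no user ever rated the new model directly.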
Dark patterns are often “discovered” and very consciously not shut off because the cost of reversing them would be too high to stomach. Especially in a delicate growth situation.
See Facebook and its adverse mental health studies.
It even added itself as the default LLM provider.
When I tried Gemini 3 Pro, it very much inserted itself as the supported LLM integration.
OpenAI hasn't tried to do that yet.
Paired with Claude's memory, it's getting weird. It obsesses over certain aspects and tries to channel every possible route into a more engaging conversation, even if it's a short informational query.
That being said? RLHF on user feedback data is model poison.
Users are NOT reliable model evaluators, and user feedback data should be treated with the same level of precaution you would treat radioactive waste.
Professionals are not very reliable either, but users are so much worse.
Very much a must for many long-term and complex tasks.
https://platform.openai.com/docs/api-reference/completions/c...
That's all that the LLM itself does at the end of the day.
All the post-training to bias results, routing to different models, tool calling for command execution and text insertion, injected "system prompts" to shape the user experience, etc. are just layers built on top of the "magic" of text completion.
And if your question was more practical: where it's made available, you can access that underlying layer via an API or through a self-hosted model, making use of it with your own code or with a third-party site/software product; a quick sketch below shows what that looks like.
1 1 2 3 5 8 13
Or:
The first president of the united
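If you want to poke at that raw completion layer yourself, here's a minimal sketch using the openai Python SDK. It assumes a completion-capable model (gpt-3.5-turbo-instruct is one example) and an OPENAI_API_KEY in your environment; the details will differ for self-hosted models or other providers.

```python
# Minimal sketch: calling the raw (legacy) completions endpoint, so the model
# simply continues the text with no chat scaffolding around it.
# Assumes the openai Python SDK v1.x and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()

for prompt in ("1 1 2 3 5 8 13", "The first president of the united"):
    resp = client.completions.create(
        model="gpt-3.5-turbo-instruct",  # one completion-capable model
        prompt=prompt,
        max_tokens=8,
        temperature=0,
    )
    # e.g. ' 21 34 55 ...' and ' States was George Washington'
    print(repr(resp.choices[0].text))
```

No system prompt, no chat wrapper: the model just continues the text, which is all the layers above are ultimately built on.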
Sorry, but that doesn't seem "ridiculously sensitive" to me at all. Imagine if you went to Amazon.com and there was a button you could press to get it to pseudo-psychoanalyze you based on your purchases. People would rightly hate that! People probably ought to be sensitive to megacorps using buckets of algorithms to psychoanalyze them.
And actually, the only hypothetical thing about this is the button. Amazon is definitely doing this (as is any other retailer of significant size), they're just smart enough to never reveal it to you directly.
Then see how many takers you find. There are already nagging spouses and critical managers; people want AI to do something they are not getting elsewhere.
Some of it is normal in humans, but LLMs do it all the goddamn time, if not told otherwise.
I think it might be for engagement (like the sycophancy), but also because they must have been trained on online conversations, where we humans tend to be more melodramatic and less "normal" than we are elsewhere.
I am not sure we are going to solve these problems before the landscape changes again, or they become moot.
We still haven't brought to heel the social media manipulation enabled by vast privacy-violating surveillance. It has been 20 years. What will the world look like in 20 more years?
If we can't outlaw scalable, damaging conflicts of interest (the conflict, not the business) in the age of scaling, how are we going to stop people from finding models that will tell them nice things?
It will be the same privacy-violating manipulators who supply the sycophantic models. Surveillance + manipulation (ads, politics, ...) + AI + real time. Surveillance-informed manipulation is the product/harm/service they are paid for.