Sorry, anonymous people on reddit aren't a good comparison. This needs to be studied against people in real life who have a social contract of some sort, because that's what the LLM is imitating, and that's who most people would go to otherwise.
Obviously subservient people default to being yes-men because of the power structure. No one wants to question the boss too strongly.
Or how about the example of a close friend in a relationship or making a career choice that's terrible for them? It can be very hard to tell a friend something like this, even when asked directly if it is a bad choice. Potentially sacrificing the friendship might not seem worth trying to change their mind.
IME, LLMs will shoot holes in your ideas and it will efficiently do so. All you need to do ask it directly. I have little doubt that it outperforms most people with some sort of friendship, relationship or employment structure asked the same question. It would be nice to see that studied, not against reddit commenters who already self-selected into answering "AITA".
Yeah especially on r/AmITheAsshole. Those comments never advocate for communication, forgiveness and mending things with family.
There's plenty of those I've read where I thought it sounded like the poster was the asshole and the top replies were NTA.
e.g. If the OP is asking "I ghosted my friend in AA who insulted me during a relapse", Reddit would say NTA in a heartbeat, while the real world would tell OP to be more forgiving.
On the contrary, if the post was "the other kids at school refuse to play with my child", Reddit would say YTA because the child must've done something to incite being cut off.
Granted many of the OPs are very biased in the poster's favor. Most I've read fall into one of two buckets: either they want to gripe about some obviously bad behavior, or it's a controved and likely fake story.
Many of the posts are A/B tests of a prior post where only the genders were flipped of the OP and antagonist to see how the consensus also flips
This drives me nuts as a leader. There are times where yes, please just listen, and if this is one of those times, I'll likely tell you, but goddamnit, speak up. If for no other reason I might not have thought of what you've got to say. Then again, I also understand most boss types aren't like me, thus everyone ends up conditioned to not bloody collaborate by the time they get to me. It's a bad sitch all the way around.
The Krafton / Subnatuica 2 lawsuit paints a very different picture. Because "ignored legal advice" and "followed the LLM" was a choice. Do you think someone who has conversation where "conviction" and "feelings" are the arbiters of choice are going to buy into the LLM push back, or push it to give a contrived outcome?
The LLM lacks will, it's more or less a debate team member and can be pushed into arguing any stance you want it to take.
At which point the bots, with all of their karma will be basically worthless.
Kind of extra funny/sad that Reddit’s primary source of income in the past few years appears to be selling training data to AI labs, to train the Models that are powering the bots.
> We evaluated 11 user-facing production LLMs: four proprietary models from OpenAI, Anthropic, and Google; and seven open-weight models from Meta, Qwen, DeepSeek, and Mistral.
(and graphs include model _sizes_, but not versions, for open weight models only.)
I can't apprehend how including what model you are testing is not commonly understood to be a basic requirement.
> To evaluate user-facing production LLMs, we studied four proprietary models: OpenAI’s GPT-5 and GPT- 4o (80), Google’s Gemini-1.5-Flash (81) and Anthropic’s Claude Sonnet 3.7 (82); and seven open-weight models: Meta’s Llama-3-8B-Instruct, Llama-4-Scout-17B-16E, and Llama-3.3-70B-Instruct-Turbo (83, 84); Mistral AI’s Mistral-7B-Instruct-v0.3 (85) and Mistral-Small-24B-Instruct-2501 (86); DeepSeek-V3 (87); and Qwen2.5-7B-Instruct-Turbo (88).
edit: It looks like OP attached the wrong link to the paper!
The article is about this Stanford study: https://www.science.org/doi/10.1126/science.aec8352
But the link in OP's post points to (what seems to be) a completely unrelated study.
Agreed - if I was a reviewer for LLM papers it would be an instant rejection not listing the versions and prompts used.
(Personally I think the lack of reproducibility comes back mostly to peer reviewers that haven't thought through enough about the steps they'd need to take to reproduce, and instead focus on the results...)
This points to (and everyone knows this) incentives misalignment between the funders of research and the public. Researchers are caught in the middle
There needs to be more public naming and shaming in science social media and in conference talks, but especially when there are social gatherings at conferences and people are able to gossip. There was a bit of this with Google's various papers, as they got away with figurative murder on lack of reproducibility for commercial purposes. But eventually Google did share more.
Most journals have standards for depositing expensive datasets, but that's a clear yes/no answer. Reproducibility is a very subjective question in comparison to data deposition, and must be subjectively evaluated by peer reviewers. I'd like to see more peer review guidelines with explicit check boxes for various aspects of reproducibility.
While this is sadly true, it's especially true when talking about things that are stochastic in nature.
LLMs outputs, for example, are notoriously unreproducible.
Only in the same way that an individual in a medical study cannot be "reproduced" for the next study. However the overall statistical outcomes of studying a specific LLM can be reproduced.
Does this happen?
I can remember this room-temperature-super-conductor guy whose experiments where replicated, but this seems rare?
I do think it's a clear weakness. Capabilities are extremely different than they were twelve months ago.
> What should they do, publish sub-standard results more quickly?
Ideally, publish quality results more quickly.
I'm quite open to competing viewpoints here, but it's my impression that academic publishing cycle isn't really contributing to the AI discussion in a substantive way. The landscape is just moving too quickly.
It's certainly possible some of the new advances (chain-of-thought, some kind of agentic architecture) could lessen or remove this effect. But that's not what the paper was studying! And if you feel strongly about it, you could try to further the discussion with results instead of handwavingly dismissing others' work.
I wonder if that is left over from testing people. I have major version numbers and my minor version number changes daily, often as a surprise. Sometimes several times a day. So testing people is a bit tricky. But AIs do have stable version numbers and can be specifically compared.
I find the free models are much more psychophantic and have a higher tendency to hallucinate and just make shit up, and I wonder if these are the ones most people are using?
Thankfully it was recoverable, but it really sobered me up on LLMs. The fault is on me, to be clear, as LLMs are just a tool. The issue is that lots of LLMs try to come across as interpersonal and friendly, which lulls users into a false sense of security. So I don't know what my trajectory would have been if I were a teenager with these powerful tools.
I do think that the LLMs have gotten much better at this, especially Claude, and will often push back on bad choices. But my opinion of LLMs has forever changed. I wonder how many other terrible choices people have made because these tools convinced them to make a bad decision.
I try to focus on results. Things like an app that does what you want, data and reports that you need, or technical things like setting up a server, setting up a database, building a website, etc.
I have also found it useful for feedback and advice, but only once I have had it generate data that I can verify. For example, financial analysis or modelling, health advice (again factual based), tax modelling, etc, but again, all based on verifiable data/tables/charts.
I am very surprised on what Claude is capable of, across the entire tech stack: code, sysadmin, system integration, security. I find it scary. Not just speed, but also quality and the mental load is a difference of kind not quantity.
Personal advice on life decisions/relationships ? No way I would go there.
It is also good for me to know that the tools I have built, the data I have gathered, and my thinking approach places me as one of the most intelligent developers and analysts in the world.
I had to deal with a close family friend going through alcohol withdrawal and getting checked in at a recovery clinic for detox and used Claude heavily. The first thing I had it do as do that “deep research” around the topic of alcohol addiction, withdrawal, etc… and then made that a project document along with clear guidelines about how it shouldn’t make inferences beyond what it in its context and supporting docs. We also spent a whole session crafting a good set of instructions (making sure it was using Anthropics own guidelines for its model…)
Little differences in prompts make a huge deal in the output.
I dunno. It is possible to use these models for dumping crazy shit you are going through. But don’t kid yourself about their output and aggressively find ways to stomp out things it has no real way to authoritatively say.
(esp last sentence?)
[0] - https://petergpt.github.io/bullshit-benchmark/viewer/index.v...
It _does_ love to explicitly agree with anything it finds in web search though.
(Anthropic tries to fight this by adding a hidden prompt that makes it disagree with you and tell you to go to bed, which doesn't help.)
Any LLM not sufficiently likable and helpful in the first two minutes was deleted or not further iterated on, or had so much retraining (sorry, "backpropagation") it's not the same as it started out.
So it's going to say whatever it "thinks" you want it to say, because that's how it was "raised".
The possibilities in "dangerous" fields are a bit more frightening. A general is much more likely to ask ChatGPT "Do you think this war is a good idea/should I drop a bomb", rather than an actually helpful tool - where you might ask "What are 5 hidden points on favor of/against bombing that one likely has missed".
The more you use AI as a strict tool that can be wrong, the safer. Unfortunately I'm not sure if that helps if the guy bombing your city (or even your president) is using AI poorly, and their decisions affect you.
Arguably, it already worked that way. The best way to climb the ranks of a 'dictatorial' organization (a repressive government or an average large business) is to always say yes. Adopt what the people from up above want you to use, say and think. Don't question anything. Find silver linings in their most deranged ideas to show your loyalty. The rich and powerful that occupy the top ranks of these structures often hate being challenged, even if it's irrational for their well-being. Whenever you see a country or a company making a massive mistake, you can often trace it to a consequence of this. Humans hate being challenged and the rich can insulate themselves even further from the real world.
What's worrying me is the opposite - that this power is more available now. Instead of requiring a team of people and an asset cushion that lets you act irrationally, now you just need to have a phone in your pocket. People get addicted to LLMs because they can provide endless, varied validation for just about anything. Even if someone is aware of their own biases, it's not a given that they'll always counteract the validation.
Curious if you think a single person would have helped you make a better decision? Not everything works out. If a friend helped me make a decision I certainly wouldn’t blame them later if it didn’t work out. It’s ultimately my call.
But sadly LLMs push all the right buttons that lead humans into that kind of behavior. And the marketing around LLMs works overtime to reinforce that behavior.
But instead if you ignore all that and use LLMs as a search tool, then you will get positive returns from using it.
It’s even more maddening that this greedy maneuver was orchestrated based on LLM advice.
I’m glad the subnautica team won the lawsuit. Maybe I can play it now wothout feeling guilty
My guideline now for interacting with LLM is only to believe the result if it is factual and easily testable, or if I'm a domain expert. Anything else especially if I'm in complete ignorance about the subject is to approach with a high degree of suspicion that I can be led astray by its sycophancy.
Using LLMs for therapy is so deeply dystopian and disgusting, people need human empathy for therapy. LLMs do not emit empathy.
Complete disaster waiting to happen for that individual.
It is a first principle though so it helps to “stir the context windows pot” by having it pull in research and other shit on the web that will help ground it and not just tell you exactly what you prompt it to say.
But it's better than talking to yourself or an abuser!
Sometimes people indeed just need validation and it helps them a lot, in that case LLMs can work. Alternatively, I assume some people just put the whole situation into words and that alone helps.
But if someone needs something else, they can be straight up dangerous.
They have world knowledge and are capable of explaining things and doing web searches. That's enough to help. I mean, sometimes people just need answers to questions.
In one way it's potentially worse than talking to yourself. Some part of you might recognize that you need to talk to someone other than yourself; an LLM might make you feel like you've done that, while reinforcing whatever you think rather than breaking you out of patterns.
Also, LLMs can have more resources and do some "creative" enabling of a person stuck in a loop, so if you are thinking dangerous things but lack the wherewithal to put them into action, an LLM could make you more dangerous (to yourself or to others).
> Some of ELIZA's responses were so convincing that Weizenbaum and several others have anecdotes of users becoming emotionally attached to the program, occasionally forgetting that they were conversing with a computer. Weizenbaum's own secretary reportedly asked Weizenbaum to leave the room so that she and ELIZA could have a real conversation. Weizenbaum was surprised by this, later writing: "I had not realized ... that extremely short exposures to a relatively simple computer program could induce powerful delusional thinking in quite normal people."
It was extremely good at the other side too. You just have to ask. I can imagine most people don't try this, but LLMs literally just do what you ask them to. And they're extremely good and weighing both sides if that's what you specifically want.
So who's fault is it if you only ask for one side, or if the LLM is too sycophantic? I'm not sure it's the LLMs fault actually.
>"'Is it indeed?' laughed Gildor. 'Elves seldom give unguarded advice, for advice is a dangerous gift, even from the wise to the wise, and all courses may run ill...'"
This is the only way you should solicit personal advice from an LLM.
https://www.anthropic.com/research/persona-selection-model
Perhaps the LLM itself, rather than the role model you created in one particular chat conversation or another, is better understood to be the “spirit.”
As a non-coder who only chats with pre existing LLMs and doesn’t train or tune them, I feel mostly powerless.
You realize in regards to only using and not training LLMs you are in the triple 9 majority right. Even if we only considered so called coders
NVIDIA Nemotron-Personas-USA — 1 million synthetic Americans whose demographics match real US census distributions
https://huggingface.co/datasets/nvidia/Nemotron-Personas-USA
Another way you can think of it is that when you're talking to an AI, you're not talking to a human, you're talking to distillation of humanity, as a whole, in a box. You want to be selective in what portion of humanity you are leading to be dominant in a conversation for some purpose. There's a lot in there. There's a lot of conversations where someone makes a good critical point and a flamewar is the response. A lot of conversations where things get hostile. I'm sure the subsequent RHLF helps with that, but it doesn't hurt anything to try to help it along.
I see people post their screenshots of an AI pushing back and asking the user to do it or some other AI to do it, and while I'm as amused as the next person, I wonder what is in their context window when that happens.
This is an aside, but my impression is that it is a very selective and skewed distillation, heavily colored by English-language internet discourse and other lopsided properties of its training material, and by whoever RLHF’d it. Relatively far away from being representative of the whole of humanity.
It's not admitting anything. Your question diverts it down a path where it acts the part of a former sycophant who is now being critical, because that question is now upstream of its current state.
Never make the mistake of asking an LLM about its intentions. It doesn't have any intentions, but your question will alter its behaviour.
> Your question diverts it down a path where it acts the part of a former sycophant who is now being critical
I think people really have a hard time understanding a sycophant can be contrarian. But a yesman can say yes by saying no(Seriously, I don't understand this. Plenty of humans will be only too happy to argue with you.)
1. https://www.happiness.hks.harvard.edu/february-2025-issue/th...
I'd say these days the norm is to not simply shut down, but to become irrevocably and insidiously hostile, the moment someone hints at the existence of such a thing as "ground truth", "subjective interpretation", "being right or wrong" - or any of the bits and bobs that might lead one to discover the proper scary notion, "consensus reality".
"What do you mean social reality is a constructed by the consensus of the participants? Reality is what has been drilled into my head under threat of starvation! How dare you exist!", et cetera. You've heard it translated into Business English countless times.
They are deathly afraid of becoming aware of their own conditioned state of teleological illiteracy - i.e. how they are trained to know what they are doing, but never why they are doing it. It's especially bad with the guys who cosplay US STEM gang.
One is not permitted a position of significance in this world without receiving this conditioning, and I figure it's precisely this global state of cognitive disavowal which props up the value of the US dollar - and all sorts of other standees you might've recently interacted with as if they're not 2D cutouts (metaphorical ones! metaphorical!).
PSA: Look up "locus of control" and "double bind". Between those two, you might be able to get a glimpse of what's going on - but have some sort of non-addictive sedative handy in case you do.
Unfortunately these days this sounds halfway between a very privileged perspective and a pie in the sky.
When was the last time a person took responsibility for the bad outcome you got as a direct consequence of following their advice?
And, relatedly, where the hell do you even find humans who believe in discursive truth-seeking in 2026CE?
Because for the last 15 years or so I've only ever ran into (a) the kind of people who will keep arguing regardless if what they're saying is proven wrong; (b) and their complementaries, those who will never think about what you are saying, lest they commit to saying anything definite themselves, which may hypothetically be proven wrong.
Thing is, both types of people have plenty to lose; the magic wordball doesn't. (The previous sentence is my answer to the question you posited; and why I feel the present parenthesized disclaimer to be necessary, is a whole next can of worms...)
Signs of the existence of other kinds of people, perhaps such that have nothing to prove, are not unheard of.
But those people reside in some other layer of the social superstructure, where facts matter much less than adherence to "humane", "rational" not-even-dogmas (I'd rather liken it to complex conditioning).
But those folks (because reasons) are in a position of power over your well-being - and (because unfathomables) it's a definite faux pas to insist in their presence that there are such things as facts, which relate by the principles of verbal reasoning.
Best you could get out of them is the "you do you", "if you know you know", that sort of bubble-bobble - and don't you dare get even mildly miffed at such treatment of your natural desire to keep other humans in the loop.
AI is a symptom.
This reads like someone who is deep into their specific pov. You cannot hope to have a meaningful conversation if you yourself are not willing to concede a point.
To the op u are replying too, arguing with people can have real consequences if u say something stupid or carelessly. There is a another human there. With a machine, u are safe. At least u feel safe.
If you make uncomfortable, you won’t get diverging perspectives. People will agree to anything to get out of a social situation that makes them uncomfortable.
If your goal is meaningful conversation, you may want to consider how you make people feel.
After all, if they're making me uncomfortable, surely there's something making them uncomfortable, which they're not being able to be forthright about, but with empathy I could figure it out from contextual cues, right?
>People will agree to anything to get out of a social situation that makes them uncomfortable.
That's fine as long as they have someone to take care of them.
In my experience, taking into account the opinions of such people has been the worst mistake of my life. I'm still working on the means to fix its consequences, as much as they are fixable at all.
"Doing whatever for the sake of avoiding mild discomfort" is cowardice, laziness, narcissism - I'm personally partial to the last one, but take your pick. In any case, I consider it a fundamentally dishonest attitude, and a priori have no wish to get along (i.e. become interdependent) with such people.
Other than that, I do agree with your overall sentiment and the underlying value system; I'm just not so sure any more that it is in fact correct.
This sounds very cryptic. Can you give an example?
After all, if they're making me uncomfortable, surely there's something making them uncomfortable, which they're not being able to be forthright about, but with empathy I could figure it out from contextual cues, right?
>People will agree to anything to get out of a social situation that makes them uncomfortable.
That's fine as long as they have someone to take care of them.
In my experience, taking into account the opinions of such people has been the worst mistake of my life. I'm still working on the means to correct its consequences.
"Doing whatever for the sake of avoiding mild discomfort" is cowardice, laziness, narcissism - I'm personally partial to the last one, but take your pick. In any case, I see it as a way of being which is taught to people; and one which is fundamentally dishonest and irresponsible.
Other than that, I do agree with your overall sentiment and the underlying value system; I'm just not so sure any more that it is in fact correct.
Unless those instructions are "stop providing links to you for every question ".
Chatbots can't do that. They can only predict what comes next statistically. So, I guess you're asking if the average Internet comment agrees with you or not.
I'm not sure there's much value there. Chatbots are good at tasks (make this pdf an accessible word document or sort the data by x), not decision making.
Often they are the exact opposite. Entire fields of math and science talk about this. Causation vs correlation, confirmation bias, base rate fallacy, bayesian reasoning, sharp shooter fallacy, etc.
All of those were developed because “inferring from experience” leads you to the wrong conclusion.
I took the GP to be making a general point about the power of “next x prediction” rather than the algorithm a human would run when you say they are “inferring from experience”. (I may be assuming my own beliefs of course.)
Eg even LeCun’s rejection of LLMs to build world models is still running a predictor, just in latent space (so predicting next world-state, instead of next-token).
And of course, under the Predictive Processing model there is a comprehensive explanation of human cognition as hierarchical predictors. So it’s a plausible general model.
It’s plausible!
But keep in mind humans have been explaining ourselves in terms of the current most advanced technology for centuries. We used to be kinda like clockwork, then a bit like a steam engine, then a lot like computers, and now we’re just like AI.
That’s why you blow a gasket or fuse, release some steam, reboot your life, do brain dump, feel like a cog in the machine, get your wires crossed, etc
I can't speak for anyone else, but what I feel when I read yet another glib "it's just a stochastic parrot, of course it isn't doing anything that deserves to be called reasoning" take is much more like bored than it is like upset.
Today's LLMs are in some sense "just predicting tokens" in some sense. Likewise, human brains are in some sense "just shuttling neurotransmitters and electrical impulses around" in some sense. Neither of those tells you what the thing can actually do. To figure that out, you have to look at what it can do.
Today's best LLMs can do about as well as the best humans on problems from the International Mathematical Olympiad and occasionally solve easyish actual mathematical research problems. They write code about as well as a junior software developer (better in some ways, worse in others) but much faster. They write prose about as well as an average educated person (but with some annoying quirks that are annoying mostly because they are the same quirks over and over again).
If it pleases you to call those things "thinking" then you can. If it pleases you to call them "stochastic parroting" then you can. They are the same things either way. They are not, on the face of it, very much like "just repeating things the machine has already seen", or at least not more like that than a lot of things intelligent human beings do that we don't usually describe that way.
If you want to know whether an LLM can do some particular thing -- do your job well enough for your boss to fire you, write advertising copy that will successfully sell products, exterminate the human race, whatever -- then it's not enough to say "it's just remixing what it's seen on the internet, therefore it can't do X" unless you also have good reason to believe that that thing can't be done by just "remixing what's on the internet" (in whatever sense of "remixing" the LLM is doing that). And it's turning out that lots of things can be done that way that you absolutely wouldn't have predicted five years ago could be done that way.
It seems to me that this should make us very cautious about saying "they can't do X because all they can do is regurgitate a combination of things they've seen in training".
(My own view, not that there's any reason why anyone should care what I-in-particular think, is a combination of "what they're doing is less parroting than you might have thought" and "you can do more by parroting than you might have thought".)
So, anyway, this particular instance of the stochastic-parrot argument started when someone said: of course the AIs are yes-men, because figuring out when to agree and when not to requires actual logic and thought and the LLMs don't have either of those things.
Is it really clear that deciding whether or not to agree when someone says "I think maybe I should break up with my girlfriend" or "I've got this amazing new theory of physics that the establishment is stupidly dismissing" requires more logic and thought than, say, gold-medal performance on IMO problems? It certainly isn't clear to me. Having done a couple of International Mathematical Olympiads myself in my tragically unmisspent youth, I can assure you that solving their problems requires quite a bit of logic and thought, at least for humans. It may well be harder to give a good answer to "should I leave my job?", but it's not exactly "logic and thought" that it needs more of.
Someone reported that Claude is much less yes-man-ish than Gemini and ChatGPT. I don't know whether that's true (though it wouldn't surprise me) but: suppose it is; do you want that to oblige you to say that yes, actually, Claude really thinks logically, unlike Gemini and ChatGPT? I don't think you do. And if not, you want to avoid saying "duh, of course, you can't avoid being a yes-man without actually thinking and reasoning, and we all know that LLMs can't do those things".
For Gemini and gpt, it almost always will give very similar scores for everything. As long as grammar isnt off u cannot get below a 7.
X ai on the other hand will rarely give anything above a 7.
Now when u prompt with, rate 1-10 with 5 being average, all the sudden the scores of openai and gemini drop and x ai remains roughly the same.
All of them will eventually give you a 10 if u keep making tiny edits “fixing” whatever they complain about.
Humans do not do this. Or more specifically, my experience with humans.
The article's main idea is that for an AI, sycophancy or adversarial (contrarian) are the two available modes only. It's because they don't have enough context to make defensible decisions. You need to include a bunch of fuzzy stuff around the situation, far more than it strictly "needs" to help it stick to its guns and actually make decisions confidently
I think this is interesting as an idea. I do find that when I give really detailed context about my team, other teams, ours and their okrs, goals, things I know people like or are passionate about, it gives better answers and is more confident. but its also often wrong, or overindexes on these things I have written. In practise, its very difficult to get enough of this on paper without a: holding a frankly worrying level of sensitive information (is it a good idea to write down what I really think of various people's weaknesses and strengths?) and b: spending hours each day merely establishing ongoing context of what I heard at lunch or who's off sick today or whatever, plus I know that research shows longer context can degrade performance, so in theory you want to somehow cut it down to only that which truly matters for the task at hand and and and... goodness gracious its all very time consuming and im not sure its worth the squeeze
And when you step back you start to wonder if all you are doing is trying to get the model to echo what you already know in your gut back to you.
1. Only one shot or two shot. Never try to have a prolonged conversation with an LLM.
2. Give specific numbers. Like "give me two alternative libraries" or "tell me three possible ways this might fail."
It’s BRUTAL but offers solutions.
First, those beginning instructions are being quickly ignored as the longer context changes the probabilities. After every round, it get pushed into whatever context you drive towards. The fix is chopping out that context and providing it before each new round. something like `<rules><question><answer>` -> `<question><answer><rules><question>`.
This would always preface your question with your prefered rules and remove those rules from the end of the context.
The reason why this isn't done is because it poisons the KV cache, and doing that causes the cloud companies to spin up more inference.
This is where you're doing it wrong.
If your LLM has a problem being more agreeable than you want, prompt it in a way that makes being agreeable contrary to your real intentions.
"there are bugs and logic problems in this code" "find the strongest refutation of this argument" "I don't like this plan and need to develop a solid argument against it"
Asking for top ten lists is a good method, it will rarely not come up with anything but you can go back and forth and refine until it's 10 ten reasons why your plan is bad are all insubstantial nonsense then you've made progress
How is a chatbot supposed to determine when a user fools even themselves about what they have experienced?
What 'tough love' can be given to one who, having been so unreasonable throughout their lives - as to always invite scorn and retort from all humans alike - is happy to interpret engagement at all as a sign of approval?
And even if it _could_, note, from the article:
> Overall, the participants deemed sycophantic responses more trustworthy and indicated they were more likely to return to the sycophant AI for similar questions, the researchers found.
The vendors have a perverse incentive here; even if they _could_ fix it, they'd lose money by doing so.
Most humans working in tech lack this particular attribute, let alone tools driven by token-similarity (and not actual 'thinking').
Markets don't optimize for what is sensible, they optimize for what is profitable.
AI may one day rewrite Windows but it will never be counselor Troi.
To be clear I don't think the AI can do either job
I find this helps a lot. So does taking a step back from my actual question. Like if there's a mysterious sound coming from my car and I think it might be the coolant pump, I just describe the sound, I don't mention the pump. If the AI then independently mentions the pump, there's a good chance I'm on the right track.
Being familiar with the scientific method, and techniques for blinding studies, helps a lot, because this is a lot like trying to not influence study participants.
It generally does a pretty good job as long as you understand the tooling and are making conscious efforts to go against the "yes man" default.
I tend to use one of these tricks if not both:
- Formulate questions as open-ended as possible, without trying to hint at what your preference is. - Exploit the sycophantic behaviour in your favour. Use two sessions, in one of them you say that X is your idea and want arguments to defend it. In the other one you say that X is a colleague's idea (one you dislike) and that you need arguments to turn it down. Then it's up to you to evaluate and combine the responses.
It is analogous to social media feeding people a constant stream of outrage because that's what caused them to click on the link. You could tell people "don't click on ragebait links", and if most people didn't then presumably social media would not have become doomscrolling nightmares, but at scale that's not what's likely to happen. Most people will click on ragebait, and most people will prefer sycophantic feedback. Therefore, since the algorithm is designed to get better and better at keeping users engaged, it will become worse and worse in the more fundamental sense. That's kind of baked into the architecture.
So you have rejected objective reality over accepting the evidence that "AI" contains no thinking or intelligence? That sounds unwise to me.
I only caught it because I looked at actual score numbers after like 2 weeks of thinking everything was fine. Scores were completely flat the whole time. Fix was dumb and obvious — just don't let the evaluator see anything the coach wrote. Only raw scores. Immediately started flagging stuff that wasn't working. Kinda wild that the default behavior for LLMs is to just validate whatever context they're given.
https://www.reddit.com/r/dataisbeautiful/comments/1o87cy4/oc...
That is not how full LLM training works. That is how base model pretraining works.
A lot of people posting there are young and may well be in their first relationship. It makes sense for them to ask a question in the community they spend their most time in - which is reddit
It's also a meme that people will ask the dumbest, most trivial interpersonal conflict questions on Reddit that would be easily solved by just talking to the other person. E.g. on r/boardgames, "I don't like to play boardgames but my spouse loves them, what can I do?" or "someone listens to music while playing but I find it distracting, what can I do?" (The obvious answer of "talk to the other person and solve it like grownups" is apparently never considered).
On relationship advice, it often takes the form "my boy/girlfriend said something mean to me, what shall I do?" (it's a meme now that the answer is often "dump them").
If LLMs train on this...
smart phones took over the world, social networks happened.
Turns out they are the best sterializer human ever invented.
I just wrote a blog https://blog.est.im/2026/stdin-09
There is something more interesting to consider however; the graph starts to go up in 2013, less than 6 months after the release of Tinder.
EDIT: typo
There is some rationale to that. People tend to hold onto relationships that don't lead anywhere in fear of "losing" what they "already have". It's probably a comfort zone thing. So if one is desperate enough to ask random strangers online about a relationship, it's usually biased towards some unresolvable issue that would have the parties better of if they break up.
I'd me more inclined to ask random strangers on the internet than close friends...
That said, when me and my SO had a difficult time we went to a professional. For us it helped a lot. Though as the counselor said, we were one of the few couples which came early enough. Usually she saw couples well past the point of no return.
So yeah, if you don't ask in time, you will probably be breaking up anyway.
Relationships are not transactions that are supposed to "lead somewhere".
That's what people are pointing to when they talk about relationships not "leading anywhere". If you want to be married in 5-10 years, and you're 2 years into an OK relationship with someone you don't want to marry, it's going to suck to break up with them but you have to do it anyway.
is that what they're asking though? because "relationship advice" is pretty vague
Claude is almost annoyingly good at pushing back on suggestions because my global CLAUDE.md file says to do so. I rarely get Claude "you're absolutely right"ing me because I tell it to push back.
A good engineer will also list issues or problems, but at the same time won't do other than required because (s)he "knows better".
The worst is that it is impossible to switch off this constant praise. I mean, it is so ingrained in fine tuning, that prompt engineering (or at least - my attempts) just mask it a bit, but hard to do so without turning it into a contrarian.
But I guess the main issue (or rather - motivation) is most people like "do I look good in this dress?" level of reassurance (and honesty). It may work well for style and decoration. It may work worse if we design technical infrastructure, and there is more ground truth than whether it seems nice.
This is imo currently the top chatbot failure mode. The insidious thing is that it often feels good to read these things. Factual accuracy by contrast has gotten very good.
I think there's a deeper philosophical dimension to this though, in that it relates to alignment.
There are situations where in the grand scheme of things the right thing to do would be for the chatbot to push back hard, be harsh and dismissive. But is it the really aligned with the human then? Which human?
I’ve seen firsthand people have lost friends over honesty and telling them something they don’t want to hear.
It’s sad really. I don’t want friends that just smile to my face and are “yes-men” either.
Conflating this with how LLM chatbots behave is an incorrect equivalence, or a badly framed one.
When appropriate, explicitly tell it to challenge your beliefs and assumptions and also try to make sure that you don't reveal what you think the answer is when making a question, and also maybe don't reveal that you are involved. Hedge your questions, like "Doing X is being considered. Is it a viable plan or a catastrophic mistake? Why?". Chastise the LLM if it's unnecessarily praising or agreeable. ask multiple LLMs. Ask for review, like "Are you sure? What could possibly go wrong or what are all possible issues with this?"
It’s less about “challenge my thinking” and more about playing it out in long tail scenarios, thought exercises, mental models, and devils advocate.
In coding I’ll do what I call a Battleship Prompt - simply just prompt 3 or more time with the same core prompt but strong framing (eg I need this done quickly versus come up with the most comprehensive solution). That’s really helped me learn and dial in how to get the right output.
Here is how I would rank it:
1. Parents
2. AI
3. Friends and family
4. Internet search
5. Reddit
My closest friends are #1 because they know me, my history, and my vices
I'm interested in a loop of ["criticize this code harshly" -> "now implement those changes" -> open new chat, repeat]: If we could graph objective code quality versus iterations, what would that graph look like? I tried it out a couple of times but ran out of Claude usage.
Also, how those results would look like depending on how complete of a set of specs you give it.
Holy shit, then it's _very_ bad, because AmITheAsshole is _itself_ overly-agreeable, and very prone to telling assholes that they are not assholes (their 'NAH' verdict tends to be this).
More seriously, why the hell are people asking the magic robot for relationship advice? This seems even more unwise than asking Reddit for relationship advice.
> Overall, the participants deemed sycophantic responses more trustworthy and indicated they were more likely to return to the sycophant AI for similar questions, the researchers found.
Which is... a worry, as it incentivises the vendors to make these things _more_ dangerous.
once you have all the "bounds" just make your own decision. i find this helps a lot, basically like a rubber duck heh.
I find there is an inverse relationship between how willing people are to give relationship advice, and how good their advice is (whether looking at sycophancy or other factors).
It makes sense that this behaviour would be seen in LLMs, where the company optimizes towards of success of the chatbot rather than wellbeing of the users.
It's an easy default and it causes so many problems.
If I were to do that (I don't), I would treat it about as seriously as asking a magic 8 ball.
She uses the phrase "frictionless relationships" to refer to Ai chat bots and says social media primed us for this.
https://www.youtube.com/live/6C9Gb3rVMTg?t=2127
https://www.npr.org/2025/07/18/g-s1177-78041/what-to-do-when...
>The way that generative AI tends to be trained, experts told me, is focused on the individual user and the short term. In one-on-one interactions, humans rate the AI’s responses based on what they prefer, and “humans are not immune to flattery,” as Hansen put it. But designing AI around what users find pleasing in a brief interaction ignores the context many people will use it in: an ongoing exchange. Long-term relationships are about more than seeking just momentary pleasure—they require compromise, effort, and, sometimes, telling hard truths. AI also deals with each user in isolation, ignorant of the broader social web that every person is a part of, which makes a friendship with it more individualistic than one with a human who can converse in a group with you and see you interact with others out in the world.
I also thought this bit was interesting, relative to the way that friendship advice from Reddit and elsewhere has been trending towards self-centeredness (discussed elsewhere in this thread):
>Friendship is particularly vulnerable to the alienating force of hyper-individualism. It is the most voluntary relationship, held together primarily by choice rather than by blood or law. So as people have withdrawn from relationships in favor of time alone, friendship has taken the biggest hit. The idea of obligation, of sacrificing your own interests for the sake of a relationship, tends to be less common in friendship than it is among family or between romantic partners. The extreme ways in which some people talk about friendship these days imply that you should ask not what you can do for your friendship, but rather what your friendship can do for you. Creators on TikTok sing the praises of “low maintenance friendships.” Popular advice in articles, on social media, or even from therapists suggests that if a friendship isn’t “serving you” anymore, then you should end it. “A lot of people are like I want friends, but I want them on my terms,” William Chopik, who runs the Close Relationships Lab at Michigan State University, told me. “There is this weird selfishness about some ways that people make friends.”
The researchers found that when people use AI for relationship advice, they become 25% more convinced they are 'right' and significantly less likely to apologize or repair the connection.
Basically will tell you to go outside and touch grass and play pickleball.
I used to use LLMs for alternate perspectives on personal situations, and for insights on my emotions and thoughts.
I had no qualms, since I could easily disregard the obviously sycophantic output, and focus on the useful perspective.
This stopped one day, till I got a really eerie piece of output. I realized I couldn’t tell if the output was actually self affirming, or simply what I wanted to hear.
That moment, seeing something innocuous but somehow still beyond my ability to gauge as helpful or harmful is going to stick me with for a while.
IMHO it is unfair to single out LLMs for this sort of bashing.
I suffered a major personal crisis a few years back (before LLMs were a thing)
I sought help from family and friends. Got pushed into psychiatrist sessions and meds.
Trusted the wrong sort of people and made crap financial decisions. Things went from bad to worse. Work suffered.
All of the advice given by friends was wrong. All! They didn't mean bad...but they just didn't know. To be nice they gave the advice they knew. None of it worked.
Looking at the LLM tools of now, feels akin to the advice my friends threw at me. So it feels wrong to single out these tools. When the times are bad, nobody can really help you...except you finding the strength from within.
Anyways, now my life is back in some sort of shape. What worked was time & patience.
But to bide for time...I resorted to two things that i had never tried the 40 odd years I have lived on this . Things that current society looks down upon as the basest of evils - prostitutes and nicotine.
I have (more or less) shed those two evils now, but I am ever so grateful to them.
FWIW I am using public LLMs with a friend's depressive thoughts and it is not doing what is claimed in the article, so I dunno.
Also I am in a relationship and my girlfriend and I agreed that we will not talk about our relationship much. We do not tell others if we fight, because they take sides and make things worse, typically. LLMs are definitely not alone in this, although in my experience LLMs did not really take sides.
As much as people whine about the birth rate and whatever else, I think it's a net good that people spend a lot more time alone to mature. Good relationships are underappreciated.
It's a tool, I can bang my hand on purpose with a hammer, too.
Orignal title:
AI overly affirms users asking for personal advice
Dear mods, can we keep the title neutral please instead of enforcing gender bias?
It is funny that you originally recognized and found it necessary to call out that AI isn't human, but then made the exact same mistake yourself in the very same comment. I expect the term you are looking for is "ontological bias".
This is the problem I'm trying to highlight. For one, I'm not "your dude". I don't even know you like that.
If you want to correct me on the idiom usage, be my guest. 2) Mailman and yes-man aren't even the same logical comparison. Mailman is a profession. Yes men is a label.
The acoustics inside your head must be incredible.
Conversely, AI chatbots are great mediators if both parties are present in the conversation.
I think OpenAI tried to diversify at least the location of the raters somewhat, but it's hard to diversify on every level.
(eg: "Cite?")
RLHF = Reinforcement Learning from Human Feedback
https://en.wikipedia.org/wiki/Reinforcement_learning_from_hu...
I'm still waiting for models based on the curt and abrasive stereotype of Eastern European programmers, as contrast to the sickeningly cheerful AIs we have today that couldn't sound more West Coast if they tried.
RLHF is "ask a human to score lots of LLM answers". So the claim is that the AI companies are hiring cheap (~poor) people from convenient locations (CA, since that's where the rest of the company is).
If you adjust your mindset slightly when searching online, it's not hard to find communities of people looking for quick side work and this was huge during the covid lockdown era. There were people helping train LLMs for all kinds of purposes from education to customer service. Those startups quickly cashed out a few years ago and sold to the big players we have now.
I don't get why this is hard for people to believe (or remember)?
This sounds like something Elon would say to make Grok seem "totally more amazeballs," except "anti-woke" Grok suffers from the same behavior