I don't think people are afraid of doctors using imperfect tools. That is the easier part. But that will not solve the problem of too many patients for a single doctor, which is what leads to the lack of empathy. This was a problem even before AI. It seems society does not have empathy for these kinds of "professional problems". Offering tools instead of humans is an even riskier approach, not for that particular individual, but for how society tends to build trust and empathy. We tend to see everything now as a problem with a technical solution because we only have confidence in solving technical problems.
The crucial part is the training. AI may very well be the solution for underserved communities, but not if it is trained on internet rubbish. Train an AI on curated state-of-the-art, scientific data, imagine the expert systems of yore on overdrive, and you will see much better results, including knowing when to call in a human doctor.
This is the point where it becomes important to distinguish two senses of "advanced", i.e. advanced in the technological sense on the one hand and advanced in social/societal and especially large-group-long-time-horizon coordination terms on the other. In the former we are quite advanced, in the latter quite primitive and regressing by the day, it feels like. (But sorry to end on a doomer note, take it with a grain of salt.)
Using ChatGPT takes... nothing, it's already here.
Not too surprising to see one and not the other.
Also, the kinds of changes that result in more doctors don't tend to get media coverage. That's the boring, keep-the-lights-on politics that modern rage-bait-driven media abhors, so it may even still be true that the changes we need for more doctors are also already happening. We'll find out in another decade or two.
With highly lucid people like the author's mom I'm not too worried about Dr. Deepseek. I'm actually incredibly bullish on the fact that AI models are, as the article describes, superhumanly empathetic. They are infinitely patient, infinitely available, and unbelievably knowledgeable, it really is miraculous.
We don't want to throw the baby out with the bathwater, but there are obviously a lot of people who really cannot handle the seductiveness of things that agree with them like this.
I do think there is pretty good potential for making good progress on this front, though. Especially given the level of care and effort being put into making chatbots better for medical uses and the sheer number of smart people working on the problem.
like, I tried to treat the bloating in one municipal clinic in Ternopil, Ukraine (got "just use Espumisan or anything else that has simethicone" and when it did not work out permanently, "we don't know what to do, just keep eating simethicone") and then with Gemini 3 (Pro or Flash depending on Google AI Studio rate limits and mood), which immediately suspected a poor diet and suggested logging it, alongside activity level, every day.
Gemini's suggestions were nothing extreme - just cut sugar and ban bread and pastry. I was guilty of loving bread, croissants, and cinnabons (is this how they are translated?) too much.
the result is no more bloating by the third week, -10cm in waistline in 33 days, gradually improving sleep quality, and even the ability to sleep on my belly, which was extremely uncomfortable for me due to that goddamned bloating!
They are knowledgeable in that so much information sits in their repository.
But less than perfect application of that information combined with the appearance of always perfect confidence can lead to problems.
I treat them like that one person in the office who always espouses alternate theories - trust it as far as I can verify it. This can be very handy for finding new paths of inquiry though!
I asked ChatGPT a question about a made up character in a made up work and it came back with "I don’t actually have a reliable answer for that". Perfect.
On the other hand, I can ask it about varnishing a piece of wood and it will give a lovely table with options, tradeoffs, and Good/Ok/Bad ratings for each option, except the ratings can be a little off the mark. Same thing when asking what thickness cable is required to carry 15A in AU electrical work. Depending on the journey and line of questioning, you would either get 2.5mm^2 or 4mm^2.
Not wrong enough to kill someone, but wrong enough that you're forced to use it as a research tool rather than a trusted expert/guru.
DeepSeek however hallucinated a completely fictional band from 30 years ago, right down to album names, a hard luck story about how they’d been shafted by the industry (and by whom), made up names of the members and even their supposed subsequent collaborations with contemporary pop artists.
I asked if it was telling the truth or making it up and it doubled down quite aggressively on claiming it was telling the truth. The whole thing was very detailed and convincing yet complete and utter bollocks.
I understand the difference in the cost/parameters etc. but it was miles behind the other 3, in fact it wasn’t just behind it was hurtling in the opposite direction, while being incredibly plausible.
Which, you know, humans can also do, including when they're not actually empathizing with you. It's often called lying. In some fields it's called a bedside manner.
An imaginary friend is just your own brain. LLMs are something much more.
No strong opinion on if that's good or bad long term, as humans have been outsourcing portions of their thinking for a really long time, but it's interesting to think about.
LLMs, at least for now, escape the near-total enshittification of computing. They're fully general-purpose, resist attempts at constraining them[0], and are good enough at acting like a human that they're able to defeat user-hostile UX and force interoperability on computer systems despite all attempts of the system owners at preventing it.
The last 2-3 years were a period where end-users (not just hardcore hackers) became profoundly empowered by technology. It won't last forever, but I hope we can get at least a few more years of this, before business interests inevitably reassert their power over people once again.
--
[0] - Prompt injection "problem" was, especially early on, a feature from the perspective of end-users. See increasingly creative "jailbreak" prompts invented to escape ham-fisted attempts by vendors to censor models and prevent "inappropriate" conversations.
This is a strange way to talk about a computer program following its programming. I see no miracle here.
A human might be "empathetic", "infinitely patient, infinitely available". And (say) a book or a calculator is infinitely available. -- When chatting with an LLM, you get an interface that's more personable than a calculator without being less available.
I know the LLM is predicting text, & outputting whatever is most convincing. But it's still tempting to say "thank you" after the LLM generates a response which I found helpful.
I don't think it's helpful because I don't interact with objects.
I have experience with building systems to remove that infinite patience from chatbots and it does make interactions much more realistic.
> Watching John with the machine, it was suddenly so clear. The Terminator would never stop, it would never leave him... it would always be there. And it would never hurt him, never shout at him or get drunk and hit him, or say it couldn't spend time with him because it was too busy. And it would die to protect him. Of all the would-be fathers who came and went over the years, this thing, this machine, was the only one who measured up. In an insane world, it was the sanest choice.
The AI doctor will always have enough time for you, and always be at the top of their game with you. It becomes useful when it works better than an overworked midlevel, not when it competes with the best doctor on their best day. If we're not there already, we're darn close.
Specifically:
> I know now why you cry, but it's something I can never do.
While the machine learns what this complex social behavior called 'crying' is, it also learns that it is unable to ever actualize this; it can never genuinely care for John, any relationship would be a simulation of emotions. In the context of a child learning these complex social interactions, having a father-figure who you knew wasn't actually happy to see you succeed, sad to see you cry ...
Again, LLMs aren't competing with the best human doctors. They're competing with doctors you actually have access to.
I was taking one high blood pressure medication but then noticed my blood sugar jumped. I did some research with ChatGPT and it found a paper that did indicate that it could raise blood sugar levels, and it gave me a recommendation for an alternative. I asked my doctor about it and she said I was wrong, but I gently pushed her to switch and named the recommended medication. She obliged, which is why I have kept her for almost 30 years now, and lo and behold, my blood sugar did drop.
Most people have a hard time pushing back against doctors and doctors mostly work with blinders on and don't listen. ChatGPT gives you the ability to keep asking questions without thinking you are bothering them.
I think ChatGPT is a great advance in terms of medical help and I recommend it to everyone. Yes, it might make mistakes, and I caution everyone to be careful and not to trust it 100%, but I say that about human doctors as well.
In this context, I think of ChatGPT as a many-headed Redditor (after all, reddit is what ChatGPT is trained on) and think about the information as if it was a well upvoted comment on Reddit. If you had come across a thread on Reddit with the same information, would you have made the same push for a change?
There are quite a few subreddits for specific medical conditions that provide really good advice, and there are others where the users are losing their minds egging each other on in weird and whacky beliefs. Doctors are far from perfect, doctors are often wrong, but ChatGPT's sycophancy and a desperate patient's willingness to treat cancer with fruit feel like a bad mix. How do we avoid being egged on by ChatGPT into forcing doctors to provide bad care? That's not a rhetorical question, curious about your thoughts as an advocate for ChatGPT.
I have type 2 diabetes.
> How do we avoid being egged on by ChatGPT into forcing doctors to provide bad care?
I don’t ask it leading questions. I ask “These are my symptoms, give me some guidance” instead of “These are my symptoms, I think I have cancer. Could I be right?” If I don’t ask leading questions, the response stays closer to unbiased.
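To make the difference concrete, here is a minimal sketch in Python of the two phrasings. The function names are hypothetical, and nothing here calls a real chatbot API; it only builds the text you would paste in.

```python
def neutral_prompt(symptoms):
    """Build a symptom-only prompt that states no suspected diagnosis."""
    listed = "; ".join(s.strip() for s in symptoms if s.strip())
    return f"These are my symptoms: {listed}. Give me some guidance."


def leading_prompt(symptoms, suspicion):
    """Build the leading variant, which plants a diagnosis in the question."""
    listed = "; ".join(s.strip() for s in symptoms if s.strip())
    return (
        f"These are my symptoms: {listed}. "
        f"I think I have {suspicion}. Could I be right?"
    )


if __name__ == "__main__":
    symptoms = ["fatigue", "night sweats"]  # illustrative placeholders only
    print(neutral_prompt(symptoms))
    print(leading_prompt(symptoms, "cancer"))
```

The only point of the sketch is that the neutral version never mentions a suspected condition, so there is nothing for the model to agree with.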
Are you asking why a side effect that is actually an entire health problem on its own, is a problem? Especially when there is a replacement that doesn’t cause it?
A lazy doctor combined with a patient that lacks a clear understanding of how ChatGPT works and how to use it effectively could have disastrous results. A lazy doctor following the established advice for a condition by prescribing a medication that causes high blood sugar is orders of magnitude less dangerous than a lazy doctor who gives in to a crackpot medical plan that the patient has come up with using ChatGPT without the rigour described by the comment we are discussing.
Spend any amount of time around people with chronic health conditions (online or offline) and you'll realise just how much damage could be done by encouraging them to use ChatGPT. Not because they are idiots but because they are desperate.
They can be used for isolated treatment of high blood pressure, but they are also used for dual treatment of blood pressure and various heart issues (heart failure, stable angina, arrhythmias). If you have heart failure, beta blockers can reduce your relative annual mortality risk by about 25%.
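A quick worked example of what a relative reduction means in absolute terms. The 10% baseline below is a made-up number purely for illustration, not a clinical figure; only the roughly 25% relative reduction comes from the comment above.

```python
def risk_after_relative_reduction(baseline_risk, relative_reduction):
    """Return the absolute risk left after applying a relative risk reduction."""
    return baseline_risk * (1.0 - relative_reduction)


if __name__ == "__main__":
    baseline = 0.10   # HYPOTHETICAL annual mortality risk, for illustration only
    reduction = 0.25  # "about 25%" relative reduction, as stated in the comment
    treated = risk_after_relative_reduction(baseline, reduction)
    print(f"baseline: {baseline:.1%}, treated: {treated:.1%}")
    print(f"absolute reduction: {(baseline - treated) * 100:.1f} percentage points")
```

So a 25% relative reduction is not 25 percentage points; how much it matters in absolute terms depends entirely on the baseline risk, which is exactly the sort of weighing I would want a doctor involved in.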
I would not trust an LLM to weigh the pros and cons appropriately, knowing their sycophantic tendencies. I suspect they are going to be biased toward agreeing with whatever concerns the user initially expresses to them.
I replaced it with Lisinopril with no side effects.
> doctors mostly work with blinders on and don't listen
This has unfortunately been my experience as well. My childhood PCP was great but every interaction I've had with the healthcare system since has been some variation of this. Reading blood work incorrectly, ignoring explanations of symptoms, misremembering medications you've been taking, prescribing inappropriate medications, etc. The worst part is that there are a lot of people that reflexively dismiss you as a contrarian asshole or, even worse, a member of a reviled political group that you have nothing to do with just because you dare to point out that The Person With A Degree In Medicine makes glaring objective mistakes.
Doctors aren't immune to doing a bad job. I don't think it's a secret that the system overworks them and causes many of them to treat patients like JIRA tickets - I'd just like to know what it would take for people to realize that saying such doesn't make you a crackpot.
As an aside, I use Claude primarily for research when investigating medical issues, not to diagnose. It is as likely to hallucinate or mischaracterize in the medical domain as it is in others.
I feel like the difference is that doctors took what I told them and only partially listened. They never took it especially seriously and just went straight to standard tests and treatments (scopes, biopsies and various stomach acid impacting medications). ChatGPT took some of what I said and actually considered it, discounting some things and digging into others (I said that bitter beer helped... doctor laughed at that, ChatGPT said that the alcohol probably did not help but that the bittering agent might and it was correct). ChatGPT got me somewhere better than where I was previously... something no doctor was able to do.
I’ve been lucky enough to not need much beyond relatively minor medical help, but in the places I’ve lived I’ve always found that when I do see a GP they’re generally helpful.
There’s also something here about medical stuff making people feel vulnerable by default, so feeling heard can overcompensate in the relationship? Not sure I’m articulating this last point well, but it comes up so frequently (it listened, guided me through it step by step, etc.) that I wonder if that has an effect. You feel more in control than with a doctor who has other patients and time constraints and just says “it’s x” or “do this”.
I caught one when I asked ChatGPT something and then went to urgent care. I told my story, they left, came back and essentially read back exactly what ChatGPT told me.
I visited my GP, 2 wrist specialists, and a physical therapist to help deal with it. I had multiple X-rays and an MRI done. Steroid injection done. All without relief. My last wrist specialist even recommended I just learn to accept it and not try to extend my wrist too much.
I decided to ask Gemini, and literally the first thing it suggested was maybe the way I was using the mouse was inflaming an extensor muscle, and it suggested changing my mouse and a stretch/massage.
And you know what, the next day I had no wrist pain for the first time in a year. And it's been that way for about 3 weeks now, so I'm pretty hopeful it isn't short term.
I pre-emptively switched to trackballs and to alternating left/right hands for mousing near the start of my professional career based on the reading I did after some mild wrist strain.
I simply can't imagine this happening based on my experience with healthcare in two (European) countries.
The only thing that moves is my thumb, and it's much better for flexing than the wrist, also it has a tiny load to manage vs the wrist.
They have 15 minutes and you have very finite money.
Medical agents should be a pre consult tool that the patient talks to in the lobby while waiting for the doctor so the doctor doesn't waste an hour to hear the most important data point and the patient doesn't sit for an hour in the lobby doing nothing.
Source: I used to be a geotechnical engineer and left it because of the ridiculous personal risk you take on for the salary you get.
Everyone I know just goes to walk-in clinics / urgent-care centres. And neither of those options give doctors any "skin in the game." Or any opportunities for follow-up. Or any ongoing context for evaluating treatment outcomes of chronic conditions, with metrics measured across yearly checkups. Or the "treatment workflow state" required to ever prescribe anything that's not a first-line treatment for a disease. Or, for that matter, the willingness to believe you when you say that your throat infection is not in fact viral, because you've had symptoms continuously for four months already, and this was just the first time you had enough time and energy to wake up at 6AM so you could wait out in front of the clinic at 7:30AM before the "first-come-first-served" clinic fills up its entire patient queue for the day.
Because the Republican party turned out to be a bunch of fascist fucks, there's no real critique of Obamacare. One of the big changes with the ACA is that it allowed medical networks to turn into regional cartels. Most regions have 2-3 medical networks, which have gobbled up all of the medical practices and closed many.
Most of the private general practices have been bought up and consolidated into giant practices, with doctors paid to quit and replaced by other providers at half the cost. Specialty practices are being swept up by PE.
To say nothing of giving your personal health information over to a private company that has no requirement to comply with HIPAA and that just recently got subpoenaed for all chat records. Not to mention potential future government requests, NSA letters, during an administration that has a health secretary openly talking about rounding up mentally ill people and putting them in work camps.
Maybe LLMs have use here, but we absolutely should not be encouraging folks to plug information into public chatbots that they do not control and do not run locally.
It is a recipe for disaster.
As an American on ACA this made me chuckle.
I did some searching with Grok and I found out:
- no contact injuries are troubling b/c it generally means they pulled something
- kids don't generally tear an ACL (or other ligament)
- it's actually way more common for the ligament to pull the anchor point off of the bigger bone b/c kid bones are soft
I asked it to differentially diagnose the issue with the details of: can't hold weight, little to no swelling and some pain.
It was adamant, ADAMANT, that this was a classic case of bone being pulled off by the ligament and that it would require surgery. It even pointed out the no swelling could be due to a very small tear etc. It gave me a 90% chance of surgery too.
I followed up by asking what test would definitively prove it one way or the other and it mentioned getting an X-Ray.
We go off to the urgent care, son is already kind of hobbling around. Doctor says he seems fine, I push for an X-Ray and turns out no issue: he probably just pulled something. He was fully healed in 2-3 days.
As someone who has done a lot of differential diagnosing/troubleshooting of big systems (FinTech SRE) I find it interesting that it was basically correct in what could have happened but couldn't go the "final mile" to establish it correctly. Once we start hooking up X-Rays to Claude/Grok 4.2 etc equivalent LLMs, will be even more interesting to see where this goes.
Grok is...not most people's first choice.
At least both OpenAI and DeepMind do medical fine-tuning, and both are almost certainly paying doctors to do it.
Opus 4.5 seems good too, though getting dumber. OpenAI's fine-tuning is clearly built to toe the professional medical advice line, which can be good and bad.
It is like getting phished and then pointing out that the scammer was basically right about being Coinbase support aside from the fact that they did not work there
If you are using it like a tool to review/analyze or simplify something, i.e. to explain risk stratification for a particular cancer variant and what is taken into account, or to provide probabilities and ranges for survival based on age/medical history, it's usually on the money.
Every other caveat mentioned here is valid, and it's valid for many domains not just medical.
I did get hematologist/oncologist-level advice out of ChatGPT 4o based on labs, PCR tests and symptoms, and it turned out to be 100% true based on how things panned out in the months that followed and ultimately the treatment that was given. Doctors do not like to tell you the good and the bad candidly - it's always "we'll see what the next test says but things look positive" and "it could be as soon as 1 week or as long as several months depending on what we find" when they know full well you're in there for 2 months at minimum unless you're a miracle case. Only once cornered or prompted will they give you a larger view of the big picture. The same is true for most professional fields.
You are 100% correct that, much like pilots overseeing autopilots, we should combine the best of both worlds in the medical field.
There simply aren't enough doctors to go around and the average one isn't as knowledgeable as you would want. Everything suggests that when it comes to diagnosis ML systems should be better in the long run on average.
Especially with a quickly aging population there is no alternative if we want people to have healthcare on a sensible level.
AI is doing the last mile and all these posters are saying it’s replacing doctors. Scary stuff.
Patients are substituting 15 minutes of 100% correct information for 2 hours of 80%, or whatever percentage the AIs are.
Additionally, they're pretty much free at the moment and available 24/7 with no travel required.
I see some of this adversarial second-guessing introspection from Claude sometimes. ("But wait. I just said x y and z, but that's inconsistent with this other thing. Let me rethink that.")
Sometimes when I get the sense that an LLM is too sycophantic, I'll instruct it to steelman the counter-argument, then assess the persuasiveness of that counter-argument. It helps.
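Roughly what that looks like as a two-step follow-up, sketched in Python. The helper name and exact wording are my own assumptions, and it only produces the messages to send; it does not talk to any particular model or API.

```python
def steelman_followups(claim):
    """Return two follow-up messages: steelman the other side, then judge it."""
    claim = claim.strip()
    return [
        f"Steelman the strongest counter-argument to this claim: {claim}",
        "Now assess how persuasive that counter-argument is, "
        "and say whether your earlier answer should change.",
    ]


if __name__ == "__main__":
    # Illustrative claim only; substitute whatever the model just agreed with.
    for message in steelman_followups("Changing my mouse will fix my wrist pain."):
        print(message)
```

Sending the second message separately, after the model has actually written the counter-argument, is the part that seems to matter: it has to evaluate something concrete rather than just restate its first answer.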
https://www.mdpi.com/2504-3900/114/1/4 - Reinecke, Madeline G., et al. "The double-edged sword of anthropomorphism in LLMs." Proceedings. Vol. 114. No. 1. MDPI, 2025. Author: https://www.mgreinecke.com/
Human is human already, regardless how they speak and what they do... There's always true sympathy and empathy to find in a human, regardless how busy, dark, erroneous etc. they are... Yet, it's the true alive soul inside in every person...
Medics are likely tired, doing their job every single day, yet they have empathy, and will always try to listen if you actually try... They know what PAIN, AGONY, DEATH, SORROW... fear means...
LLM/"AI" will always pretend to be a human, since it's "trained"/designed to be so, and will always be limited and incomplete... Not to mention that the initial dataset, drawn from numerous empathetic actual humans, has its own limits inside, which no one is responsible for.
What a hopeless sorrow is that awful trendy, advertised, mind-atrophying mess...
However I would say that the cited studies are somewhat outdated already compared e.g. with GPT-5-Thinking doing 2mins of reasoning/search about a medical question. As far as I know DeepSeek's search capabilities are not comparable and none of the models in the study spent a comparable amount of compute answering your specific question.
"Machines are more like humans"
I love the future...
In companies people talk about Shadow-IT happening when IT doesn't cover the user needs. We should probably label this stuff Shadow-Health.
To some extent, the deployment of a publicly funded AI health chat bot, where the responses can be analysed by healthcare professionals to at least prevent future harm is probably significantly less bad than telling people not to ask AI questions and consult the existing stretched infrastructure. Because people will ask the questions regardless.
But I do agree that some focused and well funded public health bot would be ideal, although we'll need the WHO to do it, it's certainly not coming from the US any time soon.
It's still insane to me how people can trust an LLM they didn't train themselves! You have no idea what vile and evil Dreck these models were trained with.
It's the click workers in third world countries who you don't recognize exist who could know if they were experts. But they aren't. A huge responsibility for these underpaid workers. They are the backbone of this hype, not GPUs. It's all built on shit. :)
The AI Revolution in Medicine: GPT-4 and Beyond by Peter Lee (Microsoft Research) - https://www.microsoft.com/en-us/research/publication/the-ai-...
The AI Revolution in Medicine, Revisited by Peter Lee (Microsoft Research) - https://www.microsoft.com/en-us/research/story/the-ai-revolu...
I don't know enough about medicine to say whether or not this is correct, but it sounds suspect. I wouldn't be surprised if chatbots, in an effort to make people happy, start recommending more and more nonsense natural remedies as time goes on. AI is great for injuries and illnesses, but I wonder if this is just the answer she wants, and not the best answer.
(and not just with the AI stuff)
> She understood that chatbots were trained on data from across the internet, she told me, and did not represent an absolute truth or superhuman authority. She had stopped eating the lotus seed starch it had recommended.
The “there’s wrong stuff there” fear has existed for the Internet, Google, StackOverflow. Each time people adapted. They will adapt again. Human beings have remarkable ability to use tools.
Personally I think the article spends a lot of time trying to show that AI may be able to improve health outcomes, particularly for rural patients, but IMHO it doesn't spend nearly enough time talking about the current challenges.