A few weeks before the ER, I was having stomach pain. I went to the doctor with theories from ChatGPT in hand; they checked me for those things and then didn't check me for what ended up being a pretty obvious issue. What's interesting is that I mentioned to the doctor that I used ChatGPT, and the doctor even seemed to value that opinion and did not consider other options (what it ultimately ended up being was rare but really obvious in retrospect; I think most doctors would have checked for it). I do feel I actually biased the first doctor's opinion with my "research."
It may feel easy to say doctors should just consider all the options. But telling them an option is worse than just biasing their thinking; they are going to interpret that as information about your symptoms.
If you feel pain in your abdomen but are only talking about your appendix, they are rightfully going to think the pain is in the region of your appendix. They are not going to treat you like you have kidney pain. How could they? If they have to treat all of your descriptions as all the things that you could be relating them to, then that information is practically useless.
If I used GPT for my medical issue last year and everybody took my word for it, I would be dead.
This has been a big problem in medicine since the early days of WebMD: Each appointment has a limited time due to the limited supply of doctors and high demand for appointments.
When someone arrives with their own research, the doctor has to make a choice: Do they work with what the patient brought and try to confirm or rule it out, or do they try to walk back their research and start from the beginning?
When doctors appear to disregard the research patients arrive with, many patients get very angry. It leads to negative reviews or even formal complaints being filed (usually with encouragement from some Facebook group or TikTok community they were in). There might even be bigger problems if the patient turns out to be correct and the doctor did not embrace the research, which can prompt lawsuits.
So many doctors will err on the side of focusing on patient-provided theories first. Given the finite time available to see each patient (with waiting lists already extending months out in some places), this can crowd out time for a big-picture discussion driven by the doctor's own diagnostic process.
When I visit a doctor I try to ground myself to starting with symptoms first and try to avoid biasing toward my thoughts about what it might be. Only if the conversation is going nowhere do I bring out my research, and then only as questions rather than suggestions. This seems to be more helpful than what I did when I was younger, which is research everything for hours and then show up with an idea that I wanted them to confirm or disprove.
A doctor is typically scheduled at 6 patients/hour. In that time they also have to chart, walk between rooms, make up time for the other patients that inevitably went over time, et cetera. The doctor you're seeing probably has a goal of only talking to you for 3 minutes.
This is untrue. General practice physicians are usually at 3 patients per hour. Some specialists can get in the range of 5 or more per hour if assistants handle most of the prep and work.
The average across all specialties is around 3, though.
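For anyone who wants to sanity-check the scheduling numbers in this exchange, here is a minimal sketch that converts a patients-per-hour rate into the average slot length it implies. The function name `minutes_per_slot` is made up for illustration, and the result is only the scheduled slot, not the face-to-face time, which also depends on charting and other overhead.

```python
def minutes_per_slot(patients_per_hour: float) -> float:
    """Average scheduled slot length, in minutes, implied by a booking rate."""
    if patients_per_hour <= 0:
        raise ValueError("patients_per_hour must be positive")
    return 60.0 / patients_per_hour


# Rates mentioned in the thread: 6/hour (claimed), 5/hour (some specialists), 3/hour (GP).
for rate in (6, 5, 3):
    print(f"{rate} patients/hour -> {minutes_per_slot(rate):.0f} minutes per slot")
```

At 3 patients per hour the implied slot is 20 minutes; at 6 per hour it is 10 minutes.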
> In that time they also have to chart, walk between rooms, make up time for the other patients that inevitably went over time, et cetera. The doctor you're seeing probably has a goal of only talking to you for 3 minutes.
I've been through two different medical systems due to job changes/moving. Both of them gave me the option of a 20 minute or 40 minute appointment slot, with the latter requiring some pre-screening to be approved by the staff. I got the time every time I went.
If your doctor is only giving you 3 minutes you need to find a new one.
People not suffering from mental illness will typically not blame 5G for their health concerns.
You've read things that compellingly claim that foo causes xyz symptoms. You also know that some people that have obviously palpable disdain for you claim that foo could never cause these symptoms.
You have xyz symptoms. Are you mentally ill if you think that foo could be the cause?
I have a family member who had a "rare but obvious" one but it took 5 doctors to get to the diagnosis. What we really need to see are attempts to blind studies and real statistical rigor. It's funny to paint a tunnel on a canvas and get a Tesla to drive into it, but there's a reason studies (and the more blind the better) are the standard.
I'm not so sure. Doctors are trained to check for the most common things that explain the symptoms. "When you hear hoofbeats, think horses not zebras" is a saying that is often heard in medicine.
ChatGPT was trained on the same medical textbooks and research papers that doctors are.
Yeah hm I wonder what the difference could possibly be.
In the future, I think I'll likely review things with ChatGPT and have an opinion and treat the doctor like a ChatGPT session as well--this is opposed to leading the doctor to what I believe I should be doing. I was dismissive about the doctor's advice because it seemed so obvious but more and more, I feel that most of our issues are caused by habitual, daily mistakes--little things that take hold seasonally or over periods of stress that appear like chronic health issues. At least for me.
Doctors hate to hear this, but if you're so poor in communication and social skills that the patient can't/won't follow any care you've given, your value is lost.
What you want instead is that the users just describe their problem, as unbiased as possible and with enough detail and then let the expert come up with an appropriate solution that solves the problem.
I try to do that as well when going to the doctor.
Management has realized this. Hey I can outsource to bangalore/hyderabad/east europe/ai, get something that barely works, and just market the crap out of it. Look at the sort of companies, products, and services that dominate markets today. These aren't leaders in quality or engineering. They are leaders in marketing. Marketing is what sells. Marketing can sell billions of steaming turds. Nike shoes are pieces of shit but it's marketing that makes the brand and provides all value in the stock. The world doesn't value quality. It values noise and pretty feathers.
Why can't you name them, and give us some context? Is this based on public info, or not?
There is a reason why the majority of a doctor's 8 years of training is spent doing the rounds as a junior doctor in hospital wards ....
Interacting with real people, facing a person trying to get help for something that they don't want to experience is vastly different than reading about a symptom or group of symptoms in a book.
Seriously ? ¯\_(ツ)_/¯
The textbooks are the theory.
The hospital wards are the practice.
The hospital wards are what shows you that the human body is complex and many times things don't happen like the textbook says it will.
And then there's the ICU, pediatric, geriatric and mental health wards where the patient often cannot even describe their symptoms ...
In practice...
Edit: People seem confused here. The study was feeding the AI structured clinical scenarios and seeing its results. The study was not a live analysis of AI being used in the field to treat patients.
Real-life use is full of ill-posed questions, open-ended statements, inaccurate assessments of symptoms, and conclusory remarks sprinkled in between. Real use of chatbots for health by non-clinicians looks very different from scenario-based evaluation.
> Three physicians independently assigned gold-standard triage levels based on cited clinical guidelines and clinical expertise, with high inter-rater agreement
If you're worried about not catching a legit emergency, as in something that can't wait a day or two for them to complete the different sessions, you could have a doctor monitor the interactions with the ability to raise a flag and step in to send them to the ER.
You have to justify it, but most places have sections in the document where you request review to justify it. It’s not any different from giving one patient heart medicine that you think works and another patient a sugar pill.
In actual heart medicine studies the control arm is typically treated with the current standard of care, not a placebo. So it seems pretty clear that you don't have any actual knowledge or experience in this area.
And in most cases the diagnosis is the easy part. I mean we see occasional horror stories about misdiagnosis but those are rare. The harder and more important part is coming up with an effective treatment plan which the patient will actually follow, and then monitoring progress while making adjustments as needed. So a focus on the diagnosis portion of clinical decision support seems fundamentally misguided.
Yea, like how rich the patient is or if they are on insurance etc. I wish I was kidding.
These "experts" have no problem touting anecdotes when it serves them.
We had a potential pet poisoning, so was naturally searching for resources. Google had a summary with a "dose of concern" that was an order of magnitude off. Someone could have read that and thought all was fine and had a dead cat.
(BTW cat is fine, turned out to be a false alarm, but public service announcement: aspirin is toxic to cats, and Pepto-Bismol contains a salicylate, which is in the same family as aspirin. Don't leave demented plastic-chewing cats around those bottles, in case you too have a lovely but demented cat.)
My wife had a pretty bad cold during pregnancy and our GP proceeded to prescribe her cough syrup with high alcohol content, because that was what ChatGPT told him to prescribe. We only noticed it once she took the first dose and spit it out again...
Still regularly get wrong information from google’s search AI.
Really starting to wonder if common sense is ever going to come back with new tech, but I fear it is going to require something truly catastrophic to happen.
These systems are borderline useless if you don’t give them dangerous levels of access to data and generate tons of juicy chat history with them. What’s coming is very predictable.
The fact that the model most hyper-optimized for cheap+fast makes mistakes is not a particularly compelling argument.
[1] https://openai.com/index/introducing-chatgpt-health/
[2] https://cdn.openai.com/pdf/bd7a39d5-9e9f-47b3-903c-8b847ca65...
I suspect many, many doctors also fail to regularly recognize medical emergencies.
If you look at online reviews, doctors are mostly rated based on being "nice" but that has little bearing on patient outcomes.
We live during the healthiest period in human history due to the fact that doctors are highly reliable and well-trained. You simply would not be able to replace a real doctor with an LLM and get desirable results.
It's like people want to remove the physician or current care from the discussion. It's weird because care is already too expensive and too error prone for the cost.
I think it's rather people trying to keep grounded and suggest that it's not just the hallucination machine that's bad, but also that many doctors in real life also suck - in part because of the domain being complex, but also due to a plethora of human reasons, such as not listening to your patients properly or disregarding their experiences and being dismissive (seems to happen to women more for some reason), or sometimes just being overworked.
> You simply would not be able to replace a real doctor with an LLM and get desirable results.
I don't think people should be replaced with LLMs, but we should benchmark the relative performance of various approaches:
A) the performance of doctors alone, no LLMs
B) the performance of LLMs alone, no human in the loop
C) the performance of doctors, using LLMs
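A rough sketch of what comparing those three arms could look like, assuming each arm's triage calls are scored against the same gold-standard labels. Everything here is hypothetical: the `accuracy` helper, the label names, and especially the data, which is invented purely to make the example run and says nothing about how doctors or LLMs actually perform.

```python
from typing import Dict, List


def accuracy(predictions: List[str], gold: List[str]) -> float:
    """Fraction of cases where an arm's call matches the gold-standard label."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must be the same length")
    if not gold:
        raise ValueError("need at least one case")
    matches = sum(p == g for p, g in zip(predictions, gold))
    return matches / len(gold)


# Invented gold-standard triage labels for five made-up cases.
gold = ["emergency", "urgent", "routine", "emergency", "routine"]

# Invented calls for each arm; NOT real results.
arms: Dict[str, List[str]] = {
    "A: doctors alone": ["emergency", "urgent", "routine", "urgent", "routine"],
    "B: LLM alone": ["urgent", "urgent", "routine", "emergency", "urgent"],
    "C: doctors + LLM": ["emergency", "urgent", "routine", "emergency", "urgent"],
}

scores = {name: accuracy(calls, gold) for name, calls in arms.items()}
for name, score in scores.items():
    print(f"{name}: {score:.0%} agreement with gold standard")
```

Plain accuracy hides which kind of error each arm makes (missed emergencies versus over-triage), so a real comparison would want per-label breakdowns as well.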
Problem is that historical cases where humans resolved the issue and not the ones where the patient died (or suffered in general as a consequence of the wrong calls being made) would be pre-selecting for the stuff that humans might be good at, and sometimes wouldn't even properly be known due to some of those being straight up malpractice on the behalf of humans, whereas benchmarking just LLMs against stuff like that wouldn't give enough visibility in the failings of humans either.
Ideally you'd assess the weaknesses and utility of both at a meaningfully large scale, in search of blind spots and systemic issues, the problem being that benchmarking that in a vacuum without involving real cases might prove to be difficult and doing that on real cases would be unethical and a non-starter. And you'd also get issues with finding the truly shitty doctors to include in the sample set, sometimes even ones with good intentions but really overworked (other times because their results would suggest they shouldn't be practicing healthcare), otherwise you're skewing towards only the competent ones which is a misrepresentation of reality.
Reminds me of an article that got linked on HN a while back: https://restofworld.org/2025/ai-chatbot-china-sick/
The fact that someone would say stuff like "Doctors are more like machines." implies failure before we even get to basic medical competency. People willingly misdirect themselves and risk getting horrible advice because humans will not give better advice and the sycophantic machine is just nicer.
No, you see this line of argumentation on every post critical of LLMs' deficiencies. "Humans also produce bad code", "Humans also make mistakes" etc etc.
So your reading of this is that it's a deflection of the shortcomings?
My reading of it is that both humans and LLMs suck at all sorts of tasks, often in slightly different ways.
One being bad at something doesn't immediately make the other good if it also sucks - it might, however, suggest that there are issues with the task itself (e.g. in regards to code: no proper tests and harnesses of various scripts that push whoever is writing new code in the direction of being correct and successful).
Now, I don't agree that this is a good decision, but the point is, human doctors also often miss major problems.
The numbers that you see quoted are almost certainly wildly exaggerated.
Let me guess, you got 7?
So when I read “they then compared the platform’s recommendations with the doctors’ assessments” and see a mismatch, I wonder if it’s because human doctors are overly cautious or that the AI was wrong.
But that all pales in comparison to what could be the actual issue. I can’t read the original study, but if it was done in the USA, it’s understandable why people are turning to AI for health advice. Healthcare is painfully expensive here. Even a simple trip to the ER (e.g. a $2000 stomach ache) is beyond a lot of people’s ability to spend. That’s just a reality.
With that in mind, the real question is: “Should I do nothing about my symptoms because I can’t afford healthcare, or should I at least ask AI knowing it could be wrong?”
Recent (as in last few days/weeks) incidents using different models/tools:
* Google AI search summary compared products A & B, called out a bunch of differences that were correct... and then threw in features that didn't exist
* Work (midsize company with big AI team / homebuilt GPT wrappers): PDF parsing for a company headquarters address, it hallucinated an address that didn't exist in the document
* Work: a team using a frontier model from a top-2 AI lab for DevOps-type tasks requested "Restart XYZ service in DEV environment". It responded "OK, restarting ABC service in PROD environment". It then asked for confirmation AFTER actioning whether they meant XYZ in DEV or ABC in PROD... a little too late.
They are very difficult tools to use correctly when the results are not automatically verifiable (like code can be with the right tests) and the answer might actually matter.
... Wait, they gave the magic robot _access to modify their production environment_?!
Bloody hell, there's no helping some people.
The problem with all these orgs hiring "AI experts" is the adverse selection of finding the people who "know AI" but can't get a job at an AI lab, startup, big tech, or literally any other job using AI that is better than "making excel do AI more good".
It's like Big Data / Cybersecurity / DevOps / Big Agile / Cloud Evangelist / Data Science grifter playbook all over again.
If it could be an emergency, see a doctor.
A friend of mine had an accident. He was taken to the emergency room, but the doctors there thought his injuries were minor. My friend insisted that he was bleeding out internally. They finally checked for that, and it turns out he was minutes from dying.
AI wasn't involved in this case, but it's good to have both AI and a trained doctor in the decision loop.
That doesn't necessarily follow from your story. The AI's specificity and sensitivity are important, which is why we need to study this stuff. An AI that produces too many false positives will send doctors off chasing zebras and they'll waste time, which will result in more deaths.
An AI that produces too many false negatives will make doctors more likely to miss things they otherwise would have checked, which will result in more deaths.
The other real problem with using AI in a medical setting is that AI is very very good at producing plausible sounding wrong information. Even an expert isn't immune to this. So it's even more important that we study how likely they are to be wrong.
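To make the sensitivity/specificity point concrete, here is a small sketch that computes both from confusion-matrix counts. The `ConfusionCounts` class and the numbers are hypothetical, chosen only for illustration; they are not from any study.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts of an AI's emergency flags compared against the true outcome."""
    true_pos: int   # real emergencies that were flagged
    false_pos: int  # non-emergencies that were flagged (the "zebras")
    true_neg: int   # non-emergencies correctly not flagged
    false_neg: int  # real emergencies that were missed


def sensitivity(c: ConfusionCounts) -> float:
    """Share of real emergencies that were caught: TP / (TP + FN)."""
    return c.true_pos / (c.true_pos + c.false_neg)


def specificity(c: ConfusionCounts) -> float:
    """Share of non-emergencies correctly left alone: TN / (TN + FP)."""
    return c.true_neg / (c.true_neg + c.false_pos)


# Invented counts for 200 hypothetical cases, 20 of them real emergencies.
counts = ConfusionCounts(true_pos=18, false_pos=30, true_neg=150, false_neg=2)

print(f"sensitivity: {sensitivity(counts):.2f}")
print(f"specificity: {specificity(counts):.2f}")
```

In this made-up example the tool misses 2 of 20 emergencies but also raises 30 false alarms, so most of its flags (30 of 48) would be wrong even though both rates look respectable.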
I used ChatGPT to do a valve adjustment on an engine; a task I've never done before. I didn't just accept the torque values and procedure it told me though, because I know better from my experience with it as a dev. I cross-referenced it all with Youtube videos, forum posts, instruction manuals (where available) to make sure the job was A) doable for a non-mechanic like me and B) done correctly. Thanks to the Youtube video (which I cross-referenced with other sources), I discovered the valve clearance values were slightly off with the ChatGPT recommendation.
I think the average Joe would assume these values were correct and run with it.
https://www.liveinsurancenews.com/health-insurance-claims-de...
Most physicians I know use ChatGPT. Although of course it's usage guided by an expert, not by the patient, nor fully autonomous.
No, no, no, and no. Are we never going to learn? Sharing medical data with AI tools is going to come back and bite you.
Win win right?
One of the things that people need to come to grips with is that, like Wikipedia, people will use ChatGPT because it is there. And the alternative is to be rich and have a primary care doctor that you can reach out to at a moment's notice. Until that is different, people will use these web services. It’s the same thing as Wikipedia or WebMD.