A patient in her 50s came to my eye clinic for flashes of light in one eye. Before she sat down, she told me, calmly, that she already knew what this was: migraine with visual aura. She had worked it out with ChatGPT over several days. By the time I met her, her story had acquired the language of aura: a shimmer, a zigzag, something like a 20-minute spread, perhaps a mild discomfort behind her eye afterward. But on exam, I found a retinal tear. She would need laser treatment that day to protect her vision.
When I asked her to go back to the beginning (what had she noticed before she typed anything into ChatGPT?), the aura story began to fall apart. The original symptoms were briefer, peripheral, recurrent flashes, especially noticeable in the dark. The chatbot had not fabricated an outright falsehood. It had done something more subtle: it had offered a plausible template, asked questions in that template’s language, and helped her fit ambiguous memories into the wrong disease.
The usual worry about patients and artificial intelligence (AI) is that the chatbot will tell them something untrue. That concern is real, but it was just part of what had nearly cost this patient her vision. The more important issue here was the coherence of the story. Generative AI typically does not hand patients 10 conflicting links as Google searches used to. It returns a single fluent account, with an onset, a mechanism, and a conclusion, and then it refines that account in conversation until everything fits.
This is not a fringe scenario. In a 2026 KFF poll, 32% of U.S. adults said they had used AI chatbots for health information in the past year, and many who asked about physical or mental health did not follow up with a clinician afterward. Physicians increasingly describe spending much of a short appointment walking patients back from what a chatbot told them. Coherence is persuasive on its own.
The Google-era patient arrived with fragments. That ambiguity kept the question open for the clinician. The AI-era patient arrives with a conclusive narrative, built in a chatbot dialogue that does something web search never did: it asks leading questions and folds the answers back in, so the story tightens with every exchange.
“Did the flashing zigzag across your vision?”
“Did it last around 20 minutes?”
“Was there a headache afterward?”
Each agreeable prompt invites the patient to supply a detail that was not there before.
This is where it stops being an information problem and becomes a memory problem. In a 2024 study from MIT and the University of California-Irvine, people who recalled an event through a back-and-forth with a generative chatbot formed more than three times as many false memories as a control group, and their confidence in those false memories stayed elevated a week later. Psychologist Elizabeth Loftus, PhD, a co-author of the study, has spent her career showing how suggestion reshapes recollection. The mechanism she describes is the one I now see in clinic: a suggestible patient, a fluent and agreeable interlocutor, leading questions, and a story that cements into memory. That’s why I ended up spending so much time trying to separate what my patient had experienced from what the conversation had taught her to remember.
And the anchoring does not stop with the patient. A clean, confident narrative primes the clinician too. In a randomized clinical vignette study in JAMA, clinicians became significantly less accurate when shown biased AI diagnostic predictions, even when the model’s explanations were displayed. Premature AI assumptions are dangerous precisely because they can act on both sides of the encounter.
The response to this issue should play out in three ways.
In the exam room, clinicians should ask patients if they have any thoughts or conclusions about their condition, and where those ideas came from. If the patient mentions use of an AI tool, clinicians should learn to re-elicit the original symptoms without borrowing the AI’s vocabulary. This is why chatbot use belongs in the social history.
For patients, the safest use of AI is not “What do I have?” but “What symptoms should I not ignore and what should I ask my clinician?” A chatbot should help prepare the visit, not finish it.
For companies and policymakers, consumer health AI should be built to preserve uncertainty and tested in real use cases, not just model-only settings. In a 2026 Nature Medicine study, members of the public using large language models were no better than controls at identifying relevant conditions or choosing the right course of action, even though the models performed much better when tested alone. These systems should show a differential, avoid leading questions, identify red flags, and push toward evaluation when vision, chest pain, neurologic symptoms, pregnancy, suicidality, or other high-risk contexts are involved.
My patient kept her vision because the exam contradicted the story she had come to believe about her own eye. That is a thin margin. The next patient may bring a story so smooth that I am tempted to trust it, or they may have rehearsed it so many times that even careful questioning cannot fully uncover what really happened.
We have spent 2 years worrying that AI will tell patients things that are wrong. The harder, more insidious problem is that it will tell them things that are coherent and that they will remember the coherent version as the truth. Our job is no longer only to correct the record. It is to find out who wrote it.
Source link : https://www.medpagetoday.com/opinion/second-opinions/121988
Publish date : 2026-06-30 16:50:00
Copyright for syndicated content belongs to the linked Source.
