ChatGPT Competes With Surgeons in Patient Q&A


With advances in precision medicine at the molecular, genetic, and radiological levels, artificial intelligence (AI) is now shaping patient-doctor interactions. Large language models (LLMs) are neural networks designed to interpret context, tone, nuance, and cultural aspects of language.

These LLMs function as generative AI, capable of producing new text or data similar to the information on which they were trained.

Chatbots are among the most widely recognised applications of LLMs. These conversational tools provide quick responses to user queries about products, services, and healthcare topics. ChatGPT, a leading example, has become a key player in this domain.

German researchers specialising in pancreatic cancer surgery evaluated ChatGPT’s ability to answer patient questions about pancreatic cancer surgery, comparing its responses with those of two specialist surgeons. Patients and medical experts assessed the responses for accuracy, clarity for non-specialists, and empathy.

Testing Responses

A set of 10 frequently asked patient questions was compiled in collaboration with patients from the Heidelberg University Hospital, Heidelberg, Germany. ChatGPT (version 4.0) and two surgeons independently answered these questions.

A panel of 24 patients and 25 surgeons evaluated the responses using an anonymised online questionnaire. The evaluation criteria included accuracy, clarity for non-specialists, and empathetic responses.

Key Findings

  • Patients rated the accuracy and clarity of ChatGPT’s answers as comparable with those of the surgeons.
  • ChatGPT scored slightly higher than one of the surgeons in terms of empathy.
  • Patients preferred the first surgeon’s answers 50% of the time and ChatGPT’s answers 30% of the time.
  • Surgeons were better at identifying AI-generated responses than patients, who struggled to distinguish between human and AI responses.

Complementary but Limited

These findings suggest that ChatGPT can serve as a complementary tool in patient-physician communication, particularly in simplifying complex medical topics for patients. Similar results have been reported in studies on colon cancer and cardiovascular disease prevention.

However, AI has some notable limitations. ChatGPT is not specifically designed for medical applications and can produce errors or “hallucinations”: plausible-sounding inaccuracies that are well documented by AI specialists and increasingly recognised by the public. These errors stem from the way that LLMs operate.

Unlike humans, AI does not store knowledge but generates responses through mathematical computation. Rather than understanding text, it processes words as fragments called tokens, each of which is assigned a unique numerical identifier. Other errors arise from how an LLM processes data: some models analyse only parts of a dataset, cannot interpret graphs or infographics, or restrict which references they can access. This highlights the need for careful evaluation of AI-generated medical information.
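
To make the token concept concrete, the short sketch below shows how a sentence is split into integer token IDs before a model ever computes over it. It uses the open-source tiktoken library and an example patient question purely as an illustration; the study described here did not reference any specific tooling.

    # Minimal sketch of tokenisation: text becomes integer IDs, not "understood" words.
    # Assumes the open-source `tiktoken` library (pip install tiktoken); this is an
    # illustration of the general concept, not part of the study described above.
    import tiktoken

    encoding = tiktoken.get_encoding("cl100k_base")  # encoding used by GPT-4-class models

    text = "What are the risks of pancreatic cancer surgery?"  # hypothetical example
    token_ids = encoding.encode(text)

    print(token_ids)                    # a list of integers, one per token
    print(len(token_ids), "tokens")     # how many fragments the model actually sees
    print([encoding.decode([t]) for t in token_ids])  # the text fragment behind each ID

The point of the sketch is that the model operates on the numeric IDs, not on meaning, which is one reason fluent but inaccurate output can occur.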

Currently, AI cannot replace the essential human connection in healthcare, particularly in addressing the emotional aspects of illness. Additionally, it is important to clearly indicate when a response is generated by AI to prevent any confusion for patients. Medical liability for AI-related errors remains a significant issue that must be addressed before broader clinical adoption, similar to the ongoing debate on accountability in autonomous vehicle accidents.

This story was translated from JIM using several editorial tools, including AI, as part of the process. Human editors reviewed this content before publication.


