Becker's Healthcare January 2, 2025
Erica Carbajal

Large language models like ChatGPT have performed well on medical exams, but they struggle with diagnostic accuracy in real-world clinical interactions.

That is according to a new study led by researchers at Boston-based Harvard Medical School and Stanford (Calif.) University. The team designed a testing framework, CRAFT-MD, to assess four AI models’ conversation skills and diagnostic accuracy in scenarios mimicking real-world clinician-patient interactions.

While all four models fared well on medical exam-style questions, they struggled with basic conversations that mimic real-world encounters. Specifically, they showed limitations in asking questions to gather relevant medical history and synthesizing scattered information to make accurate diagnoses.

“The dynamic nature of medical conversations — the need to...
