Large Language Models Fall Short in Medical Accuracy Compared to Medical Professionals, Study Reveals

HIT Consultant July 22, 2024
Fred Pennic

What You Should Know:

– Kahun, a company specializing in evidence-based clinical AI, has released a new study comparing the medical capabilities of popular large language models (LLMs) to human experts.

– The findings reveal the limitations of current LLMs in providing reliable information for clinical decision-making.

The Study: Comparing LLMs to Medical Professionals

LLMs Tested: OpenAI’s GPT-4 and Anthropic’s Claude 3 Opus
Evaluation Method:
- 105,000 evidence-based medical questions and answers (Q&As) were developed by Kahun based on real-world physician queries.
- Q&As covered various medical disciplines and were categorized as either numerical (e.g., disease prevalence) or semantic (e.g., differentiating dementia subtypes).
- Six medical professionals answered a subset of Q&As for comparison.
Key Findings:
- Both LLMs performed...

Topics: AI (Artificial Intelligence), Physician, Provider, Survey / Study, Technology, Trends

2024-07-22T21:29:35-04:00
