HCP Live October 18, 2024
Key Takeaways
- GPT-4-turbo and GPT-3.5-turbo underperformed compared to resident physicians in emergency department tasks, except for antibiotic prescriptions.
- AI models demonstrated high sensitivity but low specificity, often leading to overprescription and false positives.
- AI’s cautious recommendations stem from training on general internet data, not tailored for emergency medical decision-making.
- Resident physicians outperformed AI in real-world settings, highlighting AI’s current limitations in complex clinical environments.
A recent study demonstrated that physicians surpass GPT-4-turbo and GPT-3.5-turbo at making clinical recommendations in the emergency department.
ChatGPT will not be helping physicians with decision-making any time soon, as a new study demonstrated.1
GPT-4-turbo may have performed tasks better than the earlier version, GPT-3.5-turbo, particularly in predicting the need for antibiotics for...