Clinical Trials Arena August 7, 2024
A study by Mendel and UMass Amherst identifies different types of hallucinations in AI-summarised medical records and highlights the need for robust detection.
Artificial intelligence (AI) startup Mendel and the University of Massachusetts Amherst (UMass Amherst) have jointly published a study on detecting hallucinations in AI-generated medical summaries.
The study evaluated medical summaries generated by two large language models (LLMs), GPT-4o and Llama-3. It grouped the hallucinations into five categories based on where they occur in the structure of medical notes: patient information, patient history, symptoms/diagnosis/surgical procedures, medicine-related instructions, and follow-up.
The study found that summaries created by AI models can “generate content that is incorrect or too general according to information in the source clinical notes”,...