MedCity News July 22, 2024
Anthropic’s Claude 3 Opus performed better than GPT-4, but both fell short of humans on a test of objective medical knowledge. The study was conducted by a firm developing LLMs specifically for healthcare that claims to incorporate peer-reviewed sources of information.
A new study that pitted six humans against OpenAI’s GPT-4 and Anthropic’s Claude 3 Opus to evaluate which of them could answer medical questions most accurately found that flesh and blood still beats out artificial intelligence.
Both LLMs answered roughly a third of the questions incorrectly, though GPT-4 performed worse than Claude 3 Opus. The survey questions were based on objective medical knowledge drawn from a Knowledge Graph created by another AI firm, Israel-based Kahun. The company created its proprietary Knowledge Graph with a...