AXIOS April 16, 2024
Ryan Heath

AI makers can’t agree on how to test whether their models behave responsibly, per Stanford’s latest AI Index, released Monday.

Why it matters: Businesses and individual users have little basis for comparison when choosing an AI provider to suit their needs and values.

Catch up quick: “AI models behave very differently for different purposes,” Nestor Maslej, editor of the 2024 AI Index from Stanford University’s Institute for Human-Centered Artificial Intelligence (HAI), told Axios.

  • But users lack simple options for comparing them, and there’s no solution in sight.
  • The most commonly used benchmark test for responsibility — TruthfulQA — is used by the developers of only three of the five leading models the Stanford team assessed: OpenAI’s GPT-4, Meta’s Llama 2...
