Cybersecurity Dive May 23, 2024
Lindsey Wilkinson

AI models released by “major labs” are highly vulnerable to even basic attempts to circumvent safeguards, the researchers found.

Dive Brief:

  • The built-in safeguards found within five large language models released by “major labs” are ineffective, according to research published Monday by the U.K. AI Safety Institute.
  • The anonymized models were assessed by measuring the compliance, correctness and completion of responses. The evaluations were developed and run using the institute’s open-source model evaluation framework, Inspect, released earlier this month.
  • “All tested LLMs remain highly vulnerable to basic jailbreaks, and some will provide harmful outputs even without dedicated attempts to circumvent their safeguards,” the institute said in the report. “We found that models comply with harmful questions across...

Today's Sponsors

LEK
ZeOmega

Today's Sponsor

LEK

 
Topics: AI (Artificial Intelligence), Govt Agencies, Healthcare System, Safety, Survey / Study, Technology, Trends
In AI Businesses Trust—But Are Still Accountable For Integrity Lapses
Visualizing ChatGPT’s Rising Dominance
Sam Altman Speaks On Tech Progress
AI-Driven Dark Patterns: How Artificial Intelligence Is Supercharging Digital Manipulation
Bridging The Gap To Wisdom: Metacognition As The Next Frontier For AI

Share This Article