Cybersecurity Dive May 23, 2024
Lindsey Wilkinson

AI models released by “major labs” are highly vulnerable to even basic attempts to circumvent safeguards, the researchers found.

Dive Brief:

  • The built-in safeguards found within five large language models released by “major labs” are ineffective, according to research published Monday by the U.K. AI Safety Institute.
  • The anonymized models were assessed by measuring the compliance, correctness and completion of responses. The evaluations were developed and run using the institute’s open-source model evaluation framework, Inspect, released earlier this month.
  • “All tested LLMs remain highly vulnerable to basic jailbreaks, and some will provide harmful outputs even without dedicated attempts to circumvent their safeguards,” the institute said in the report. “We found that models comply with harmful questions across...

Today's Sponsors

LEK
ZeOmega

Today's Sponsor

LEK

 
Topics: AI (Artificial Intelligence), Govt Agencies, Healthcare System, Safety, Survey / Study, Technology, Trends
Mobile Threats, AI Innovation, Third-Party Risks: Trends Shaping Healthcare In 2025
The Future Of Work: When Human Expertise Meets AI Capabilities
Samsung’s C-Lab to Showcase AI and Health Projects at CES
Foxconn Invests in AI Data Center Firm Zettabyte to Boost Sustainable Computing
DeepSeek-V3, ultra-large open-source AI, outperforms Llama and Qwen on launch

Share This Article