Anthropic just made it harder for AI to go rogue with its updated safety policy
VentureBeat October 15, 2024
Anthropic, the artificial intelligence company behind the popular Claude chatbot, today announced a sweeping update to its Responsible Scaling Policy (RSP), aimed at mitigating the risks of highly capable AI systems.
The policy, originally introduced in 2023, has evolved with new protocols to ensure that AI models, as they grow more powerful, are developed and deployed safely.
This revised policy sets out specific Capability Thresholds—benchmarks that indicate when an AI model’s abilities have reached a point where additional safeguards are necessary.
The thresholds cover high-risk areas such as bioweapons creation and autonomous AI research, reflecting Anthropic’s commitment to prevent misuse of its technology. The update also brings new internal governance measures, including the appointment of a Responsible Scaling Officer to...