MIT Technology Review November 21, 2024
The company really wants you to know that it’s trying to make its models safer.
OpenAI is once again lifting the lid (just a crack) on its safety-testing processes. Last month the company shared the results of an investigation that looked at how often ChatGPT produced a harmful gender or racial stereotype based on a user’s name. Now it has put out two papers describing how it stress-tests its powerful large language models to try to identify potentially harmful or otherwise unwanted behavior, an approach known as red-teaming.
Large language models are now being used by millions of people for many different things. But as OpenAI itself points out, these models are known to produce racist, misogynistic and hateful...