OpenAI tackles global language divide with massive multilingual AI dataset release

VentureBeat September 23, 2024
Michael Nuñez

OpenAI took a major step toward expanding the global reach of artificial intelligence by releasing a multilingual dataset that evaluates the performance of language models across 14 languages, including Arabic, German, Swahili, Bengali and Yoruba.

The company shared the Multilingual Massive Multitask Language Understanding (MMMLU) dataset on the open data platform Hugging Face. This new evaluation builds on the popular Massive Multitask Language Understanding (MMLU) benchmark, which tests an AI system’s knowledge across 57 disciplines, from mathematics to law and computer science, but only in English.

By incorporating a diverse array of languages into the new multilingual evaluation, some of which have limited resources for AI training data, OpenAI set a new benchmark for multilingual AI capabilities. This benchmark could...

Topics: AI (Artificial Intelligence), Technology

2024-09-23T21:59:44-04:00