Can AI really compete with human data scientists? OpenAI’s new benchmark puts it to the test

VentureBeat October 10, 2024
Michael Nuñez

OpenAI has introduced a new tool to measure artificial intelligence capabilities in machine learning engineering. The benchmark, called MLE-bench, challenges AI systems with 75 real-world data science competitions from Kaggle, a popular platform for machine learning contests.

This benchmark emerges as tech companies intensify efforts to develop more capable AI systems. MLE-bench goes beyond testing an AI’s computational or pattern recognition abilities; it assesses whether AI can plan, troubleshoot, and innovate in the complex field of machine learning engineering.

AI takes on Kaggle: Impressive wins and surprising setbacks

The results reveal both the progress and limitations of current AI technology. OpenAI’s most advanced model, o1-preview, when paired with specialized scaffolding called AIDE, achieved medal-worthy performance in 16.9% of the competitions....

Topics: AI (Artificial Intelligence), Technology

2024-10-10T20:33:44-04:00
