VentureBeat October 10, 2024
Michael Nuñez

OpenAI has introduced a new tool to measure artificial intelligence capabilities in machine learning engineering. The benchmark, called MLE-bench, challenges AI systems with 75 real-world data science competitions from Kaggle, a popular platform for machine learning contests.

This benchmark emerges as tech companies intensify efforts to develop more capable AI systems. MLE-bench goes beyond testing an AI’s computational or pattern recognition abilities; it assesses whether AI can plan, troubleshoot, and innovate in the complex field of machine learning engineering.

AI takes on Kaggle: Impressive wins and surprising setbacks

The results reveal both the progress and limitations of current AI technology. OpenAI’s most advanced model, o1-preview, when paired with specialized scaffolding called AIDE, achieved medal-worthy performance in 16.9% of the competitions....

Today's Sponsors

LEK
ZeOmega

Today's Sponsor

LEK

 
Topics: AI (Artificial Intelligence), Technology
Google digs deeper into healthcare AI: 5 notes
JP Morgan Annual Healthcare Conference 2025: What are the key talking points likely to be?
How AI Has And Will Continue To Transform Healthcare
AI Translates Nature into New Medicines | StartUp Health Insights: Week of Nov 26, 2024
Building AI trust: The key role of explainability

Share This Article