AI can fix bugs—but can’t find them: OpenAI’s study highlights limits of LLMs in software engineering

VentureBeat February 18, 2025
Emilia David

Large language models (LLMs) may have changed software development, but enterprises will need to think twice about entirely replacing human software engineers with LLMs, despite OpenAI CEO Sam Altman’s claim that models can replace “low-level” engineers.

In a new paper, OpenAI researchers detail how they developed an LLM benchmark called SWE-Lancer to test how much foundation models can earn from real-life freelance software engineering tasks. The test found that, while the models can fix bugs, they cannot identify why a bug exists and continue to make further mistakes.

The researchers tasked three LLMs (OpenAI’s GPT-4o and o1 and Anthropic’s Claude 3.5 Sonnet) with 1,488 freelance software engineering tasks from the freelance platform Upwork amounting to $1 million in...

Topics: AI (Artificial Intelligence), Technology

2025-02-18T21:49:08-05:00
