Are AI Programming Assistants Falling Short at Debugging? Microsoft Research Reveals Why

via cnBeta.COM Chinese Industry Information Website - Telegram Channel

A recent study by Microsoft Research shows that even the most advanced AI models struggle with software debugging tasks. The study tested several top AI models, including Anthropic's Claude 3.7 Sonnet and OpenAI's o3-mini, on a benchmark called SWE-bench Lite. The models performed poorly on a set of 300 carefully selected software debugging tasks: Claude 3.7 Sonnet reached a success rate of only 48.4%, while the two OpenAI models scored lower still, at 30.2% and 22.1% respectively. So...