Anthropic's new Claude 4 models outperform competition in coding benchmarks
- Anthropic introduced two new AI models, Claude Opus 4 and Claude Sonnet 4, aimed at coding tasks.
- The models feature capabilities for autonomous operation and extended reasoning using external tools.
- These advancements signal a shift in AI's role in software development, necessitating more focus on human oversight.
On May 22, 2025, Anthropic announced two new AI coding models, Claude Opus 4 and Claude Sonnet 4, a significant step in the company's release of larger models. This is notable because the company had focused primarily on mid-range models since mid-2024. Anthropic presents both as its most capable coding models to date, designed for complex, long-running tasks. A key feature of Opus 4 is its ability to operate autonomously for extended periods, illustrating the growing role of AI in more sophisticated coding work. The launch comes amid growing demand for AI systems that can carry out programming tasks without constant human supervision. The models reportedly perform well on industry benchmarks, with Opus 4 scoring 72.5 percent on SWE-bench and 43.2 percent on Terminal-bench. These results suggest that Claude 4 may hold promise for enterprises seeking to improve coding efficiency and productivity through automation. Anthropic has also integrated memory capabilities into the Claude 4 models, allowing them to retain information across long sessions, which could make them more effective at programming. Another innovation is the introduction of