All Hands AI Open Sources OpenHands CodeAct 2.1: A New Software Development Agent to Solve Over 50% of Real GitHub Issues in SWE-Bench

The use of AI agents in software development has exploded over the last few years, with the promise of higher productivity, automation of complex tasks, and an easier life for developers. A significant gap remains, however, between that promise and the agents' ability to address real-world issues effectively. Most AI agents struggle with the complexity and contextual nuance of software development work, especially when it comes to solving the real GitHub issues that developers face every day. They often fall short and require extensive oversight or manual correction, which defeats their purpose. Closing this gap requires a solution that is not just smarter but also able to keep up with the dynamic demands of software engineering, a field full of unique challenges and fast-moving projects.

All Hands AI has open-sourced OpenHands CodeAct 2.1, a new software development agent and the first to solve over 50% of real GitHub issues in SWE-Bench, the standard benchmark for evaluating AI-assisted software engineering tools. OpenHands CodeAct 2.1 posts a 53% resolution rate on SWE-Bench and a 41.7% resolution rate on SWE-Bench Lite. What sets it apart is that it has moved beyond experimentation in controlled environments and is now being used on actual projects, where it resolves real GitHub issues autonomously. Unlike tools that are either closed to outside contribution or too niche to be useful to the broader community, OpenHands is an open-source agent that developers can freely use, improve, and adapt. This combination of openness and competitive benchmark performance makes it a strong option for developers seeking an effective AI solution.

OpenHands CodeAct 2.1's performance improvements are rooted primarily in three updates. First, it switched to Anthropic's new Claude-3.5 model, which significantly improves natural language understanding and allows CodeAct to better interpret the issues developers raise. Second, the agent's actions now use function calling, which makes task execution more precise: the agent invokes specific, well-defined operations rather than relying on free-form output that can be misinterpreted, so it addresses developer issues more accurately. Lastly, the developers behind CodeAct 2.1 made significant improvements to directory traversal, reducing the instances in which the agent gets stuck in repetitive or circular tasks, a common problem in earlier iterations. Because the agent navigates directories more intelligently, it resolves larger and more complicated issues more smoothly and with markedly better efficiency.
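To make the function-calling and loop-avoidance ideas above more concrete, here is a minimal Python sketch. It is not OpenHands' actual implementation: the tool name `list_directory`, the `TOOLS` registry, the `dispatch` helper, and the `RepeatGuard` class are all hypothetical names chosen for illustration. The sketch assumes the model emits each action as a JSON object with a `name` and an `arguments` field, validates that call against a declared parameter list before running it, and flags the agent as possibly stuck when the same call recurs several times in a short window.

```python
import json
from collections import deque


# Hypothetical tool handler; a real agent would inspect the file system here.
def list_directory(path: str) -> str:
    return f"(listing of {path})"


# Hypothetical registry: each tool declares its parameters and a handler.
TOOLS = {
    "list_directory": {
        "description": "List the entries of a directory in the repository.",
        "parameters": {"path": str},
        "handler": list_directory,
    },
}


def dispatch(call_json: str) -> str:
    """Validate a structured tool call and run the matching handler."""
    call = json.loads(call_json)
    tool = TOOLS.get(call.get("name"))
    if tool is None:
        raise ValueError(f"unknown tool: {call.get('name')!r}")

    args = call.get("arguments", {})
    expected = tool["parameters"]
    # Reject missing or unexpected arguments instead of guessing intent.
    if set(args) != set(expected):
        raise ValueError(
            f"expected arguments {sorted(expected)}, got {sorted(args)}"
        )
    for key, typ in expected.items():
        if not isinstance(args[key], typ):
            raise TypeError(f"argument {key!r} must be {typ.__name__}")
    return tool["handler"](**args)


class RepeatGuard:
    """Flag when the same call appears too often within a recent window."""

    def __init__(self, window: int = 6, max_repeats: int = 3):
        self.recent = deque(maxlen=window)
        self.max_repeats = max_repeats

    def is_stuck(self, call_json: str) -> bool:
        # Normalize key order so identical calls compare equal.
        key = json.dumps(json.loads(call_json), sort_keys=True)
        self.recent.append(key)
        return self.recent.count(key) >= self.max_repeats


if __name__ == "__main__":
    call = json.dumps({"name": "list_directory", "arguments": {"path": "src"}})
    guard = RepeatGuard()
    for _ in range(3):
        print(dispatch(call), "| stuck:", guard.is_stuck(call))
```

The point of the structured call is that a malformed or unknown action fails loudly at validation time instead of being executed on a best guess, and the repeat guard is one simple way an agent loop could notice that it is circling the same directory listing and change strategy.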

These updates matter. A 53% resolve rate on SWE-Bench means that over half of the issues in the benchmark were solved without any human intervention. Because SWE-Bench is designed to be representative of the real-world GitHub issues that software developers face, this milestone suggests that OpenHands CodeAct 2.1 can directly affect software engineering workflows by resolving a substantial share of issues autonomously. For automated development assistance more broadly, this is significant because it saves developers time and lets them focus on higher-level challenges rather than getting bogged down in tedious issue resolution. Moreover, the open-source nature of OpenHands invites developers from around the globe to contribute and further improve the agent, a feature that the development community holds in high regard. The results on SWE-Bench Lite, where OpenHands CodeAct 2.1 achieved a 41.7% resolve rate, also support its versatility in handling less complex issues, which can be equally disruptive when left unchecked in a development pipeline.

In conclusion, OpenHands CodeAct 2.1 is a breakthrough in AI-driven software development, moving us a step closer to fully autonomous coding assistants that genuinely enhance productivity. Its ability to solve over 50% of real GitHub issues in SWE-Bench demonstrates not only technological advancement but also practical usability that developers can rely on day to day. The open-source nature of OpenHands keeps it a community-driven effort with the promise of continued improvement. Whether developers want to run OpenHands locally, integrate it through GitHub Actions, or sign up for the soon-to-be-released online version, it offers flexibility and an open invitation to join in its evolution. With major improvements in the agent's capabilities, such as adopting Anthropic's Claude-3.5, implementing function calling, and improving directory traversal, OpenHands CodeAct 2.1 is setting the standard for what an AI development agent should be: effective, accessible, and continuously evolving.


The post All Hands AI Open Sources OpenHands CodeAct 2.1: A New Software Development Agent to Solve Over 50% of Real Github Issues in SWE-Bench appeared first on MarkTechPost.
