FPT Software AI Center Introduces HyperAgent: A Groundbreaking Generalist Agent System to Resolve Various Software Engineering Tasks at Scale, Achieving SOTA Performance on SWE-Bench and Defects4J

Large Language Models (LLMs) have revolutionized software engineering, demonstrating remarkable capabilities in various coding tasks. While recent efforts have produced autonomous software agents based on LLMs for end-to-end development tasks, these systems are typically designed for specific Software Engineering (SE) tasks. Researchers from FPT Software AI Center, Viet Nam, introduce HyperAgent, a novel generalist multi-agent system designed to address a wide spectrum of SE tasks across different programming languages by mimicking human developers’ workflows.

HyperAgent comprises four specialized agents (Planner, Navigator, Code Editor, and Executor) that manage the full lifecycle of SE tasks, from initial conception to final verification. Through extensive evaluations, HyperAgent demonstrates competitive performance across diverse SE tasks:

GitHub issue resolution: 25.01% success rate on SWE-Bench-Lite and 31.40% on SWE-Bench-Verified, competitive with existing methods such as AutoCodeRover, SWE-Agent, and Agentless.

Code generation at repository scale (RepoExec): 53.3% accuracy when navigating through codebases and retrieving correct context.

Fault localization and program repair (Defects4J): 59.70% accuracy in fault localization and successful fixes for 29.8% of Defects4J bugs, achieving SOTA performance on both tasks.

This work represents a significant advancement towards versatile, autonomous agents capable of handling complex, multi-step SE tasks across various domains and languages. HyperAgent’s performance demonstrates its potential to transform AI-assisted software development practices, offering a more adaptable and comprehensive solution than task-specific alternatives.

Methodology

HyperAgent is inspired by the way developers typically approach software engineering tasks. That workflow consists of four iterative phases: Analysis & Plan, where developers understand requirements and formulate a flexible strategy; Feature Localization, which involves identifying relevant code components in the repository; Edition, where developers implement changes, add functionality, and write tests while maintaining code quality; and Execution, which includes testing and verification of the modifications. These phases are repeated as necessary until the task is completed satisfactorily, with the process adapting to the specific task requirements and the developer’s expertise.

In HyperAgent, the framework is organized around four primary agents: Planner, Navigator, Code Editor, and Executor. Each agent corresponds to a specific step in the overall workflow, though the actual workflow of each agent may differ slightly from how a human developer might approach similar tasks.
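
The iterative four-phase loop described above can be pictured with a minimal Python sketch. This is not HyperAgent's actual code or API: the `Planner` class, the `Subtask` type, the stub agents, the `PASS`/`FAIL` reports, and the `src/module.py` path are all hypothetical names chosen for illustration, and a real planner would consult an LLM rather than emit a fixed plan.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# All names below are hypothetical; this is not HyperAgent's real API.


@dataclass
class Subtask:
    role: str      # "navigator", "editor", or "executor"
    request: str   # natural-language request passed to that agent


@dataclass
class Planner:
    # Each child agent is modeled as a plain function: request -> report.
    children: Dict[str, Callable[[str], str]]
    max_rounds: int = 3
    log: List[str] = field(default_factory=list)

    def plan(self, task: str) -> List[Subtask]:
        # Assumption: a fixed localize -> edit -> execute sequence stands in
        # for the LLM-driven planning step.
        return [
            Subtask("navigator", f"locate code relevant to: {task}"),
            Subtask("editor", f"apply a change for: {task}"),
            Subtask("executor", f"run tests for: {task}"),
        ]

    def solve(self, task: str) -> bool:
        for round_no in range(1, self.max_rounds + 1):
            report = ""
            for sub in self.plan(task):
                report = self.children[sub.role](sub.request)
                self.log.append(f"round {round_no} {sub.role}: {report}")
            # The last report of each round comes from the executor.
            if report == "PASS":
                return True
        return False


def make_stub_agents() -> Dict[str, Callable[[str], str]]:
    # Stub agents: the executor fails once, then passes, to show iteration.
    attempts = {"n": 0}

    def navigator(request: str) -> str:
        return "found src/module.py"  # hypothetical path

    def editor(request: str) -> str:
        return "patched src/module.py"

    def executor(request: str) -> str:
        attempts["n"] += 1
        return "PASS" if attempts["n"] >= 2 else "FAIL"

    return {"navigator": navigator, "editor": editor, "executor": executor}


if __name__ == "__main__":
    planner = Planner(make_stub_agents())
    print(planner.solve("fix failing date parser"))
    print("\n".join(planner.log))
```

In this sketch the Planner repeats the localize, edit, and execute phases until the stub Executor reports success or the round limit is reached, which mirrors the "repeated as necessary" behaviour described in the text.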

The design emphasizes three main advantages over existing methods:

Generalizability: The framework is designed to adapt easily to a wide range of tasks, with minimal configuration changes and little additional effort needed to add new modules to the system.

Efficiency: Each agent is optimized to manage processes with varying levels of complexity, requiring different degrees of intelligence from LLMs. For example, a lightweight and computationally efficient LLM can be employed for navigation, which, while less complex, involves the highest token consumption. Conversely, more complex tasks, such as code editing or execution, require more advanced LLM capabilities.

Scalability: The framework is built to scale effectively when deployed in real-world scenarios where the number of subtasks is very large. For instance, a complex task in the SWE-bench benchmark may require considerable time for an agent-based system to complete, and HyperAgent is designed to handle such scenarios efficiently.

These advantages allow HyperAgent to effectively tackle a broad spectrum of software engineering tasks while maintaining efficiency and scalability.
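
The efficiency point, pairing a lightweight LLM with the token-heavy Navigator and stronger models with the other agents, can be illustrated with a small cost sketch. Everything in it is an assumption made for illustration: the tier names, the relative cost units, the token counts, and the choice to give the Planner a strong model are not taken from the HyperAgent paper or codebase.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ModelTier:
    name: str                  # placeholder label, not a real model id
    cost_per_1k_tokens: float  # made-up relative cost units


LIGHT = ModelTier("light-model", 1.0)
STRONG = ModelTier("strong-model", 10.0)

# Hypothetical assignment: navigation gets the cheap tier, the rest the strong one.
TIERED: Dict[str, ModelTier] = {
    "planner": STRONG,
    "navigator": LIGHT,
    "editor": STRONG,
    "executor": STRONG,
}

# Baseline for comparison: every agent uses the strong tier.
ALL_STRONG: Dict[str, ModelTier] = {agent: STRONG for agent in TIERED}


def estimated_cost(token_usage: Dict[str, int], models: Dict[str, ModelTier]) -> float:
    # Sum per-agent cost as (tokens / 1000) * that agent's per-1k rate.
    return sum(
        models[agent].cost_per_1k_tokens * tokens / 1000
        for agent, tokens in token_usage.items()
    )


if __name__ == "__main__":
    # Illustrative token counts in which navigation dominates consumption.
    usage = {"planner": 2000, "navigator": 20000, "editor": 3000, "executor": 1000}
    print(estimated_cost(usage, TIERED))      # 80.0
    print(estimated_cost(usage, ALL_STRONG))  # 260.0
```

With these made-up numbers, moving only the Navigator to the lighter tier cuts the estimated cost from 260 to 80 units, which is the kind of saving the efficiency argument relies on when navigation consumes most of the tokens.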

Conclusion

HyperAgent is a generalist multi-agent system designed to address a wide range of software engineering tasks. By closely mimicking typical software engineering workflows, HyperAgent incorporates stages for analysis, planning, feature localization, code editing, and execution/verification. Extensive evaluations across diverse benchmarks, including GitHub issue resolution, code generation at repository-level scale, and fault localization and program repair, demonstrate that HyperAgent not only matches but often exceeds the performance of specialized systems. The success of HyperAgent highlights the potential of generalist approaches in software engineering, offering a versatile tool that can adapt to various tasks with minimal configuration changes. Its design emphasizes generalizability, efficiency, and scalability, making it well-suited for real-world software development scenarios where tasks can vary significantly in complexity and scope.

Future work could explore integrating HyperAgent with existing development environments and version control systems, investigating its potential in specialized domains like security-focused code review or performance optimization, enhancing its explainability, and continually updating its knowledge base. These advancements could further streamline the software engineering process, expand HyperAgent’s applicability, improve trust among developers, and ensure its long-term relevance in the rapidly evolving field of software engineering.

Check out the Paper and GitHub. All credit for this research goes to the researchers of this project.

Thanks to FPT Software AI Center for the thought leadership and resources for this article. FPT Software AI Center has supported us in this content/article.

The post FPT Software AI Center Introduces HyperAgent: A Groundbreaking Generalist Agent System to Resolve Various Software Engineering Tasks at Scale, Achieving SOTA Performance on SWE-Bench and Defects4J appeared first on MarkTechPost.
