    Stanford Researchers Introduce SIRIUS: A Self-Improving Reasoning-Driven Optimization Framework for Multi-Agent Systems

    February 13, 2025

    Multi-agent AI systems built on LLMs are increasingly adept at tackling complex tasks across various domains. These systems comprise specialized agents that collaborate, each contributing its own capabilities toward a common objective. Such collaboration has proven effective in complex reasoning, coding, drug discovery, and safety assurance through debate. The structured interactions among agents make problem-solving more efficient and provide a built-in self-correction mechanism, since agents can refine and verify each other’s outputs. This collaborative approach often surpasses single-agent performance, especially in tasks requiring rigorous reasoning or factual validation.

    Despite these advancements, optimizing multi-agent systems presents significant challenges. A primary issue is acquiring an appropriate training signal for each agent: task-level reward feedback is available, but how to assign credit across agents remains ambiguous. It is hard to attribute success or failure to the specific decisions and reasoning steps of each LLM agent. This challenge parallels the multi-agent credit assignment problem in reinforcement learning. In language-based systems, however, reasoning unfolds through intricate and unstructured interactions, which makes attribution more difficult than in traditional reinforcement learning settings with well-defined action spaces.

    Stanford University researchers introduce SIRIUS, a self-improving optimization framework for multi-agent systems that leverages reasoning-driven learning. It constructs an experience library by retaining successful reasoning trajectories, which provides a high-quality training set. It also refines unsuccessful attempts through augmentation, enriching the dataset. SIRIUS improves reasoning and biomedical QA performance by 2.86% to 21.88% and also improves agent negotiation in competitive settings. Agents iteratively refine their collaboration strategies by learning from successful interactions without direct supervision. Because the optimization is driven by self-generated data, the approach scales and supports continuous improvement in multi-agent systems without relying on fine-grained human intervention.
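The experience library described above can be pictured with a small sketch. This is an illustration under stated assumptions, not the authors' implementation: the class names, the agent names `solver` and `critic`, the reward threshold, and the `refine` callback are all hypothetical. It shows one plausible reading of the idea: a trajectory that earns a high task-level reward contributes one training example per agent, and a failed trajectory is only stored if a refined version of it succeeds.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Trajectory:
    """One multi-agent attempt at a task (hypothetical structure)."""
    question: str
    steps: dict[str, str]  # agent name -> that agent's output
    reward: float          # task-level reward for the whole attempt


@dataclass
class ExperienceLibrary:
    """Keeps per-agent training examples from successful trajectories."""
    threshold: float = 1.0  # assumed cutoff for "successful"
    examples: dict[str, list[tuple[str, str]]] = field(default_factory=dict)

    def add(self, traj: Trajectory) -> bool:
        """Store every agent's step if the trajectory met the threshold."""
        if traj.reward < self.threshold:
            return False
        for agent, output in traj.steps.items():
            self.examples.setdefault(agent, []).append((traj.question, output))
        return True

    def add_or_augment(
        self, traj: Trajectory, refine: Callable[[Trajectory], Trajectory]
    ) -> bool:
        """Try the trajectory as-is; on failure, try a refined version once."""
        if self.add(traj):
            return True
        return self.add(refine(traj))


library = ExperienceLibrary(threshold=1.0)
good = Trajectory("q1", {"solver": "step A", "critic": "looks right"}, 1.0)
bad = Trajectory("q2", {"solver": "step X", "critic": "looks right"}, 0.0)
library.add(good)  # stored for both agents
library.add(bad)   # rejected: reward below threshold
```

Note that every agent in a successful trajectory receives the same task-level signal here, which sidesteps step-level credit assignment rather than solving it.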

    A multi-agent system consists of agents interacting within a defined environment, where each agent follows a policy to optimize rewards. The environment relies primarily on natural language, with agents generating responses based on prior interactions. SIRIUS enhances agent performance through iterative fine-tuning. Each iteration generates responses, evaluates them with a reward function, refines low-quality outputs, and updates the agents' policies via supervised learning. By repeating this cycle of training and augmentation, SIRIUS improves reasoning and decision-making in language-based multi-agent systems, leading to more effective and coherent interactions over time.
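The generate, evaluate, refine, and update cycle can be sketched as a toy loop. Everything below is a stand-in chosen to keep the example runnable: the "policy" is a plain lookup table rather than an LLM, the reward is an exact-match check, `refine` is a hypothetical callback, and the "supervised update" simply memorizes accepted answers. It is meant only to show the control flow, not how SIRIUS fine-tunes models.

```python
from typing import Callable

Policy = dict[str, str]  # toy policy: question -> answer


def generate(policy: Policy, question: str) -> str:
    """Produce a response; unseen questions get a default guess."""
    return policy.get(question, "unknown")


def reward_fn(answer: str, gold: str) -> float:
    """Assumed reward: 1.0 for an exact match, else 0.0."""
    return 1.0 if answer == gold else 0.0


def supervised_update(policy: Policy, data: list[tuple[str, str]]) -> None:
    """Stand-in for fine-tuning: memorize the accepted (question, answer) pairs."""
    for question, answer in data:
        policy[question] = answer


def self_improve(
    policy: Policy,
    tasks: list[tuple[str, str]],
    refine: Callable[[str, str], str],
    rounds: int = 2,
) -> list[float]:
    """Run the loop and return the pre-refinement accuracy of each round."""
    history = []
    for _ in range(rounds):
        accepted = []
        total = 0.0
        for question, gold in tasks:
            answer = generate(policy, question)
            reward = reward_fn(answer, gold)
            total += reward
            if reward < 1.0:
                # Low-quality output: attempt one refinement and re-score it.
                answer = refine(question, answer)
                reward = reward_fn(answer, gold)
            if reward >= 1.0:
                accepted.append((question, answer))
        supervised_update(policy, accepted)
        history.append(total / len(tasks))
    return history


tasks = [("2+2", "4"), ("3+3", "6")]
hints = {"2+2": "4"}  # the hypothetical refiner can only fix one task
policy: Policy = {}
history = self_improve(policy, tasks, lambda q, a: hints.get(q, a), rounds=2)
print(history)  # [0.0, 0.5]
```

In this toy run the accuracy rises from 0.0 to 0.5 because the refined answer from the first round is learned and reused in the second, while the task the refiner cannot fix stays unsolved.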

    The experiments compare SIRIUS against several baselines, including Single-Agent, STaR, CoMM, and TextGrad. SIRIUS consistently outperforms these baselines, demonstrating improved problem-solving, task decomposition, and agent collaboration. Ablation studies reveal that specialized agent roles, multi-agent optimization, and experience augmentation are crucial for performance. SIRIUS also excels in actor-critic and competitive settings, outperforming other methods in tasks like PubMedQA and resource exchange games. Fine-tuning with SIRIUS leads to improved win rates and payoffs, and the gains generalize well across different game configurations, indicating robustness and adaptability across scenarios.

    In conclusion, SIRIUS is a framework for optimizing LLM-powered multi-agent systems by learning from successful interactions and refining failed ones. It builds an experience library of high-quality reasoning steps that led to successful outcomes, and this library serves as the training set for system optimization. SIRIUS also augments the library by improving unsuccessful trajectories. The approach boosts reasoning, biomedical QA, and agent negotiation performance, with improvements ranging from 2.86% to 21.88%. It also enables continuous self-improvement and generates reusable data for future enhancements in multi-agent collaboration.


    Check out the Paper. All credit for this research goes to the researchers of this project.

    The post Stanford Researchers Introduce SIRIUS: A Self-Improving Reasoning-Driven Optimization Framework for Multi-Agent Systems appeared first on MarkTechPost.
