
    Meet RAGEN Framework: The First Open-Source Reproduction of DeepSeek-R1 for Training Agentic Models via Reinforcement Learning

    February 1, 2025

    Developing AI agents that can make decisions independently, especially on multi-step tasks, remains a significant challenge. DeepSeekAI, known for its work on large language models and reinforcement learning, focuses on enabling AI to process information, predict outcomes, and adjust its actions as situations evolve, which underlines the importance of sound reasoning in dynamic settings. Its recent work combines methods from reinforcement learning, large language models, and agent-based decision-making, and targets common problems such as inconsistent decisions, weak long-term planning, and an inability to adapt to changing conditions. Without a proper reasoning mechanism, an agent can take suboptimal actions or make outright errors.

    Many AI training methods process data inconsistently, which leads to errors on tasks that require multiple rounds of decision-making. They also do not model an environment that shows the agent the full consequences of its actions, so outcomes are hard to analyze and interpret. In addition, training proceeds step by step, which breaks up learning sequences and destabilizes reward functions, so a sound long-term policy fails to develop. The resulting decision-making and problem-solving systems are inefficient and ineffective. DeepSeekAI’s approach addresses this with more integrated, streamlined training that helps AI make consistent, dependable decisions while adapting quickly to new environments.

    Meet RAGEN, the first reproduction of the DeepSeek-R1(-Zero) methods for training agentic models, built to address the challenges of training AI agents for multi-step reasoning and real-world tasks. DeepSeekAI, known for its advancements in large language models and reinforcement learning, developed DeepSeek-R1 to enhance agentic reasoning through structured training. Where other methods struggle with inconsistent batch processing, limited planning, and unstable rewards, RAGEN organizes training into two phases: a rollout phase, in which environment states and model-generated reasoning tokens are processed together, and an update phase, in which only the critical tokens (actions and rewards) contribute to learning. During rollout the model generates both reasoning and action tokens, but only the actions are executed in the environment, which keeps batch rollouts stable despite variable sequence lengths. In the update phase, rewards are aggregated to reinforce strategic planning.

    Tested on the Sokoban puzzle environment, RAGEN showed that smaller models perform comparably to larger ones and that models given no explicit instructions still adapt well. By reproducing DeepSeek-R1’s training methodology for sequential decision-making, RAGEN may prove useful for applications such as logistics automation and AI assistants.
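    The two-phase loop described above can be illustrated with a small, self-contained sketch. It is not RAGEN's actual code: the `CorridorEnv` environment (a one-dimensional stand-in for Sokoban), the `scripted_policy` stub that plays the role of the language model, and the helper names `rollout` and `update_inputs` are all hypothetical. The sketch only shows the bookkeeping the article describes: reasoning and action tokens are both generated during rollout, only the action is sent to the environment, and the update phase keeps a mask over action tokens together with an aggregated reward.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Token:
    text: str
    kind: str  # "state", "think", or "action"


class CorridorEnv:
    """Hypothetical 1-D stand-in for Sokoban: walk right from cell 0 to the goal."""

    def __init__(self, goal: int = 3) -> None:
        self.goal = goal
        self.pos = 0

    def observe(self) -> str:
        return f"pos={self.pos}"

    def step(self, action: str) -> Tuple[float, bool]:
        if action == "right":
            self.pos += 1
        elif action == "left":
            self.pos = max(0, self.pos - 1)
        done = self.pos == self.goal
        # Small per-step penalty, larger reward on reaching the goal (assumed values).
        return (1.0 if done else -0.1), done


def scripted_policy(observation: str) -> Tuple[List[str], str]:
    """Stub for the language model: returns (reasoning tokens, action)."""
    return ["goal", "is", "to", "the", "right"], "right"


def rollout(
    env: CorridorEnv,
    policy: Callable[[str], Tuple[List[str], str]],
    max_turns: int = 10,
) -> Tuple[List[Token], List[float]]:
    """Rollout phase: interleave states, reasoning tokens, and action tokens."""
    tokens: List[Token] = []
    rewards: List[float] = []
    for _ in range(max_turns):
        observation = env.observe()
        tokens.append(Token(observation, "state"))
        reasoning, action = policy(observation)
        tokens.extend(Token(t, "think") for t in reasoning)
        tokens.append(Token(action, "action"))
        # Only the action reaches the environment; reasoning stays in the sequence.
        reward, done = env.step(action)
        rewards.append(reward)
        if done:
            break
    return tokens, rewards


def update_inputs(tokens: List[Token], rewards: List[float]) -> Tuple[List[int], float]:
    """Update phase inputs: a mask over action tokens and the aggregated reward."""
    mask = [1 if t.kind == "action" else 0 for t in tokens]
    return mask, sum(rewards)


tokens, rewards = rollout(CorridorEnv(goal=3), scripted_policy)
mask, total = update_inputs(tokens, rewards)
print(f"{sum(mask)} of {len(tokens)} tokens would contribute to the update")
```

    In a real trainer the mask would multiply a per-token policy loss and the aggregated reward would feed an advantage estimate; both are omitted here because the article does not specify them.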

    Ultimately, RAGEN improves agent training by targeting inconsistent decision-making, unstable rewards, and limited planning. By following DeepSeek-R1’s approach, it aims for more stable learning and better adaptability. In the Sokoban tests, smaller models performed well, which points to the method’s efficiency. As a baseline for future research, RAGEN can help refine AI training methods, improve reinforcement learning, and support progress toward general-purpose AI systems.


    Check out the GitHub Page. All credit for this research goes to the researchers of this project.

    The post Meet RAGEN Framework: The First Open-Source Reproduction of DeepSeek-R1 for Training Agentic Models via Reinforcement Learning appeared first on MarkTechPost.