    This AI Paper Introduces FASTCURL: A Curriculum Reinforcement Learning Framework with Context Extension for Efficient Training of R1-like Reasoning Models

    April 3, 2025

    Large language models have transformed how machines comprehend and generate text, especially in complex problem-solving areas like mathematical reasoning. One class of these systems, known as R1-like models, is designed to emulate slow and deliberate thought processes. Their key strength is handling intricate tasks that require step-by-step reasoning across long sequences. These capabilities make them valuable for applications such as solving Olympiad-level math problems or logical reasoning tasks, where depth and coherence of reasoning are essential.

    A significant challenge in training these models is the extensive computation that reinforcement learning requires when using long context windows. Tasks that require multi-step logic force models to produce long outputs, which consume more resources and slow down learning. Further, not all long responses contribute meaningfully to accuracy; many include redundant reasoning. These inefficiencies in response generation and high GPU usage make it difficult to scale training effectively, even for models with 1.5 billion parameters.

    Previous attempts to address this issue include DeepScaleR, which uses a staged context length extension strategy during training. DeepScaleR starts with an 8K context window and expands gradually to 24K over three training phases. This approach helps the model manage longer reasoning chains efficiently: training with a long context window from the start is estimated to demand approximately 70,000 A100 GPU hours, and DeepScaleR’s progressive strategy reduces that to 3,800 hours. It still requires considerable hardware, however, including setups with up to 32 GPUs in some stages. This shows that while improvements are possible, the solution remains costly and complex.
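
To put the two GPU-hour figures quoted above (70,000 and 3,800 A100 hours) side by side, here is a small arithmetic sketch. The constant and function names are illustrative only and do not come from either project.

```python
# GPU-hour figures as quoted in the article (A100 hours).
LARGER_GPU_HOURS = 70_000
STAGED_GPU_HOURS = 3_800


def cost_ratio(staged_hours: float, larger_hours: float) -> float:
    """Return the staged cost as a fraction of the larger cost."""
    return staged_hours / larger_hours


ratio = cost_ratio(STAGED_GPU_HOURS, LARGER_GPU_HOURS)
factor = LARGER_GPU_HOURS / STAGED_GPU_HOURS

print(f"staged cost is {ratio:.1%} of the larger figure")
print(f"that is about {factor:.1f}x fewer GPU hours")
```

The staged figure works out to about 5.4% of the larger one, roughly an 18-fold saving, yet 3,800 A100 hours is still a substantial budget.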

    Researchers at Tencent introduced a method called FASTCURL to overcome the inefficiencies of traditional reinforcement learning training. This method presents a curriculum-based strategy aligned with context window expansion. FASTCURL splits the dataset based on input prompt length into short, long, and combined categories. The training progresses in four stages, each pairing one of these datasets with a context window setting. This approach ensures the model learns shorter, simpler reasoning before advancing to longer, more complex reasoning steps. The researchers emphasize that the entire training process runs on a single node with just 8 GPUs, reducing setup complexity.
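
The length-based split described above can be sketched as follows. This is a minimal illustration, not the authors' code: the article does not say how length is measured or where the cutoff lies, so the whitespace token count and the median threshold used here are assumptions, and `split_by_prompt_length` is a hypothetical helper name.

```python
from statistics import median
from typing import Dict, List, Optional


def split_by_prompt_length(
    prompts: List[str], threshold: Optional[float] = None
) -> Dict[str, List[str]]:
    """Split prompts into short, long, and combined sets by input length.

    Length is approximated by whitespace-separated tokens (an assumption).
    If no threshold is given, the median length is used (also an assumption).
    """
    lengths = [len(p.split()) for p in prompts]
    if threshold is None:
        threshold = median(lengths)
    short = [p for p, n in zip(prompts, lengths) if n <= threshold]
    long_ = [p for p, n in zip(prompts, lengths) if n > threshold]
    # The combined set is simply the union of the two subsets.
    return {"short": short, "long": long_, "combined": short + long_}


example = split_by_prompt_length(["a b", "a b c d e f", "a", "a b c d"])
print({name: len(items) for name, items in example.items()})
```

A real pipeline would count tokens with the model's own tokenizer rather than whitespace, but the three resulting buckets play the same role.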

    The approach involves a deliberate segmentation of data by input length, driven by the hypothesis that longer prompts usually lead to longer and more complex outputs. The model first learns from short prompts under an 8K window. As training proceeds, it transitions to the combined dataset with a 16K window, then to the long dataset with the same window size, and finally reviews the combined data again. Each stage is trained for one iteration, and FASTCURL requires about 860 training steps in total. Compared with DeepScaleR’s 1,750 steps, this is roughly a 50% reduction in training steps, with a corresponding saving in training time and resources, while maintaining effectiveness.
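
The four-stage schedule and the step comparison above can be written down as plain data. This is a sketch under stated assumptions: the `Stage` class and `CURRICULUM` list are hypothetical names, and the window for the final stage is left as `None` because the text does not state it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Stage:
    dataset: str               # "short", "long", or "combined"
    context_k: Optional[int]   # context window in K tokens; None if not stated


# Stage order as described in the article.
CURRICULUM: List[Stage] = [
    Stage("short", 8),
    Stage("combined", 16),
    Stage("long", 16),
    Stage("combined", None),   # final review pass; window size not given in the text
]


def step_reduction(new_steps: int, baseline_steps: int) -> float:
    """Fraction of training steps saved relative to the baseline."""
    return 1.0 - new_steps / baseline_steps


reduction = step_reduction(860, 1750)
print(f"{len(CURRICULUM)} stages, step reduction of {reduction:.1%}")
```

With 860 steps against 1,750, the saving comes to about 51%, which matches the "roughly 50%" figure. Note that this compares step counts only; the cost of a step depends on the context window and hardware, which the sketch does not model.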

    In performance evaluations, FASTCURL-1.5B-Preview was compared with other models on five benchmarks. It scored 88.0 on MATH 500, 43.1 on AIME 2024, 74.2 on AMC 2023, 31.6 on Minerva Math, and 50.4 on OlympiadBench, for an average PASS@1 score of 57.5. Compared to DeepScaleR-1.5B-Preview, which averaged 57.0, FASTCURL performed better on four of the five datasets. These results indicate that FASTCURL can match or slightly exceed existing techniques while consuming significantly fewer resources. The model also showed better generalization, particularly on AMC 2023 and Minerva Math, indicating robustness.
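
As a quick consistency check, the reported average can be recomputed from the five per-benchmark scores listed above. The dictionary below only restates those numbers; the variable names are illustrative.

```python
# PASS@1 scores for FASTCURL-1.5B-Preview as reported in the article.
fastcurl_scores = {
    "MATH 500": 88.0,
    "AIME 2024": 43.1,
    "AMC 2023": 74.2,
    "Minerva Math": 31.6,
    "OlympiadBench": 50.4,
}

# Unweighted mean over the five benchmarks.
average = sum(fastcurl_scores.values()) / len(fastcurl_scores)
margin = round(average, 1) - 57.0  # 57.0 is the reported DeepScaleR average

print(f"average PASS@1: {average:.2f}")
print(f"margin over DeepScaleR's reported average: {margin:.1f}")
```

The unweighted mean is 57.46, which rounds to the reported 57.5 and sits half a point above DeepScaleR's 57.0.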

    The research outlines a computational problem in training R1-like reasoning models and offers a curriculum strategy as a solution. By combining input-based data segmentation with context expansion, the method provides an efficient and practical training framework. FASTCURL delivers strong performance using fewer steps and limited hardware, suggesting that strategic training design can matter as much as raw computational scale.


    Check out the Paper. All credit for this research goes to the researchers of this project.

    The post This AI Paper Introduces FASTCURL: A Curriculum Reinforcement Learning Framework with Context Extension for Efficient Training of R1-like Reasoning Models appeared first on MarkTechPost.
