
    This AI Paper Introduces FASTCURL: A Curriculum Reinforcement Learning Framework with Context Extension for Efficient Training of R1-like Reasoning Models

    April 3, 2025

    Large language models have transformed how machines comprehend and generate text, especially in complex problem-solving areas like mathematical reasoning. One class of these systems, known as R1-like models, is designed to emulate slow, deliberate thought processes. Their key strength is handling intricate tasks that require step-by-step reasoning across long sequences. This makes them valuable for applications such as solving Olympiad-level math problems or logical reasoning tasks, where depth and coherence of reasoning are essential.

    A significant challenge in training these models is the extensive computation that reinforcement learning with long context windows requires. Tasks that require multi-step logic force models to produce long outputs, which consume more resources and slow down learning. Further, not all long responses contribute meaningfully to accuracy; many include redundant reasoning. These inefficiencies in response generation, together with high GPU usage, make it difficult to scale training effectively, including at the 1.5-billion-parameter scale.

    Previous attempts to address this issue include DeepScaleR, which uses a staged context length extension strategy during training. DeepScaleR starts with an 8K context window and expands gradually to 24K over three training phases. This approach helps the model manage longer reasoning chains efficiently, but it remains expensive. The baseline it is measured against is approximately 70,000 A100 GPU hours; DeepScaleR's progressive strategy reduces that to 3,800 hours, yet it still requires considerable hardware, including setups with up to 32 GPUs in some stages. This shows that while improvements are possible, the solution remains costly and complex.

    Researchers at Tencent introduced a method called FASTCURL to overcome the inefficiencies of traditional reinforcement learning training. This method presents a curriculum-based strategy aligned with context window expansion. FASTCURL splits the dataset based on input prompt length into short, long, and combined categories. The training progresses in four stages, each using a different dataset and context window setting. This approach ensures the model learns simple reasoning before advancing to longer, more complex reasoning steps. The researchers emphasize that the entire training process runs on a single node with just 8 GPUs, reducing setup complexity.
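The length-based split described above can be sketched as follows. This is a minimal illustration, not the researchers' code: the function name `split_by_prompt_length`, the use of whitespace-separated word counts as the length measure, and the mean length as the default threshold are all assumptions, since the article does not say how the cutoff is chosen.

```python
from statistics import mean


def split_by_prompt_length(prompts, threshold=None):
    """Split prompts into short, long, and combined buckets.

    Assumptions (not from the article): length is counted in
    whitespace-separated words, and the default threshold is the
    mean prompt length of the dataset.
    """
    if not prompts:
        return {"short": [], "long": [], "combined": []}

    lengths = [len(p.split()) for p in prompts]
    if threshold is None:
        threshold = mean(lengths)

    # Prompts at or below the threshold go to "short", the rest to "long".
    short = [p for p, n in zip(prompts, lengths) if n <= threshold]
    long_ = [p for p, n in zip(prompts, lengths) if n > threshold]

    # The combined bucket is simply the full dataset.
    return {"short": short, "long": long_, "combined": list(prompts)}
```

In a real pipeline the length would more plausibly be measured in tokenizer tokens, and the threshold would be tuned rather than fixed at the mean.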

    The approach involves a deliberate segmentation of data by input length, driven by the hypothesis that longer prompts usually lead to longer and more complex outputs. The model first learns using short prompts under an 8K window. As training proceeds, the model transitions to a mixed dataset with a 16K window, then to the long dataset with the same window size, and finally reviews the combined data again. Each stage is trained for one iteration, and FASTCURL requires about 860 training steps in total. Compared with DeepScaleR’s 1,750 steps, this is roughly a 50% reduction in training steps, with a corresponding saving in training time and resources, while maintaining effectiveness.
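The four-stage schedule and the step comparison can be written down as plain data. This is a sketch under stated assumptions: the `Stage` class and `step_reduction` helper are hypothetical names, the second-stage "mixed" dataset is assumed to be the combined set, and the context window of the final stage is left as `None` because the article does not state it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stage:
    name: str
    dataset: str                   # "short", "long", or "combined"
    context_window: Optional[int]  # in tokens; None where the article gives no value


# Stage order as described in the article.
STAGES = [
    Stage("stage 1", "short", 8 * 1024),
    Stage("stage 2", "combined", 16 * 1024),  # assumption: "mixed" means the combined set
    Stage("stage 3", "long", 16 * 1024),
    Stage("stage 4", "combined", None),       # window not stated in the article
]


def step_reduction(new_steps: int, baseline_steps: int) -> float:
    """Fractional reduction in training steps relative to a baseline."""
    return 1.0 - new_steps / baseline_steps


if __name__ == "__main__":
    for stage in STAGES:
        print(stage)
    # Figures quoted in the article: about 860 steps versus 1,750 steps.
    print(f"step reduction: {step_reduction(860, 1750):.1%}")
```

Keeping the schedule as data rather than as branching logic makes it easy to reorder stages or change a window when experimenting with the curriculum.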

    In performance evaluations, FASTCURL-1.5B-Preview showed improvements over other models across five benchmarks. It scored 88.0 on MATH 500, 43.1 on AIME 2024, 74.2 on AMC 2023, 31.6 on Minerva Math, and 50.4 on OlympiadBench, for an average Pass@1 score of 57.5. Compared to DeepScaleR-1.5B-Preview, which averaged 57.0, FASTCURL performed better on four of the five benchmarks. These results indicate that FASTCURL can match or slightly exceed an existing technique while using about half the training steps. The model also showed better generalization, particularly on AMC 2023 and Minerva Math, indicating robustness.
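As a quick arithmetic check, the reported average follows from the five per-benchmark scores quoted above. The dictionary and helper names below are illustrative only.

```python
# Per-benchmark Pass@1 scores for FASTCURL-1.5B-Preview, as quoted in the article.
FASTCURL_SCORES = {
    "MATH 500": 88.0,
    "AIME 2024": 43.1,
    "AMC 2023": 74.2,
    "Minerva Math": 31.6,
    "OlympiadBench": 50.4,
}


def average_score(scores):
    """Unweighted mean over benchmarks."""
    return sum(scores.values()) / len(scores)


if __name__ == "__main__":
    avg = average_score(FASTCURL_SCORES)
    print(round(avg, 1))
```

The unrounded mean is 57.46, which rounds to the 57.5 reported in the article.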

    The research outlines a computational problem in training R1-like reasoning models and offers a curriculum strategy as a solution. By combining input-length-based data segmentation with context window expansion, the method provides an efficient and practical training framework. FASTCURL delivers strong performance using fewer steps and limited hardware, suggesting that strategic training design can be as powerful as raw computational scale.


    All credit for this research goes to the researchers of this project.

    The post This AI Paper Introduces FASTCURL: A Curriculum Reinforcement Learning Framework with Context Extension for Efficient Training of R1-like Reasoning Models appeared first on MarkTechPost.

    © DevStackTips 2025. All rights reserved.