    This AI Paper Introduces WebThinker: A Deep Research Agent that Empowers Large Reasoning Models (LRMs) for Autonomous Search and Report Generation

    May 7, 2025

    Large reasoning models (LRMs) have shown impressive capabilities in mathematics, coding, and scientific reasoning. However, when they rely solely on internal knowledge, they face significant limitations on complex information research tasks: they struggle to retrieve web information thoroughly and to generate accurate scientific reports through multi-step reasoning. Deeply integrating LRMs' reasoning capabilities with web information exploration is therefore a practical need, and it has prompted a series of deep research initiatives. Existing open-source deep search agents, however, use retrieval-augmented generation (RAG) techniques with rigid, predefined workflows, which restrict LRMs' ability to explore deeper web information and hinder effective interaction between LRMs and search engines.

    LRMs like OpenAI-o1, Qwen-QwQ, and DeepSeek-R1 enhance performance through extended reasoning. Several strategies have been proposed to achieve advanced reasoning capabilities, including deliberately introducing reasoning errors during training, using distilled training data, and applying reinforcement learning to develop long chain-of-thought abilities. However, these methods are fundamentally limited by static, parameterized architectures that lack access to external world knowledge. RAG addresses this by integrating retrieval mechanisms with generative models, giving them access to external knowledge. Recent advances span multiple dimensions, including retrieval necessity, query reformulation, document compression, denoising, and instruction-following.

    Researchers from Renmin University of China, BAAI, and Huawei Poisson Lab have proposed WebThinker, a deep research agent that empowers LRMs to autonomously search the web, navigate web pages, and draft research reports during the reasoning process. WebThinker introduces a Deep Web Explorer module that lets LRMs dynamically search, navigate, and extract information from the web when they encounter knowledge gaps. It employs an Autonomous Think-Search-and-Draft strategy, which allows models to interleave reasoning, information gathering, and report writing seamlessly in real time. In addition, an RL-based training strategy improves research tool utilization through iterative online Direct Preference Optimization (DPO).
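    The Think-Search-and-Draft idea can be pictured as a single loop in which the model chooses, step by step, whether to reason, call a search tool, or write part of the report. The following is a minimal sketch under stated assumptions, not WebThinker's actual implementation: the action format, `think_search_draft`, `scripted_model`, and `fake_search` are all hypothetical names introduced for illustration, and the "model" here simply replays a fixed script instead of calling an LRM.

```python
from typing import Callable, Dict, List

# Hypothetical action format (an assumption, not WebThinker's protocol):
#   {"type": "think", "text": ...}   reasoning step
#   {"type": "search", "query": ...} tool call when knowledge is missing
#   {"type": "draft", "text": ...}   write a piece of the report
#   {"type": "finish"}               stop
Action = Dict[str, str]


def think_search_draft(
    next_action: Callable[[List[str]], Action],
    search: Callable[[str], str],
    max_steps: int = 20,
) -> Dict[str, List[str]]:
    """Interleave reasoning, searching, and drafting in one loop."""
    context: List[str] = []  # everything the model has seen so far
    report: List[str] = []   # drafted report pieces, in order
    for _ in range(max_steps):
        action = next_action(context)
        kind = action["type"]
        if kind == "think":
            context.append("THOUGHT: " + action["text"])
        elif kind == "search":
            # Knowledge gap: call the tool and feed the result back in.
            result = search(action["query"])
            context.append("SEARCH[" + action["query"] + "]: " + result)
        elif kind == "draft":
            report.append(action["text"])
            context.append("DRAFTED: " + action["text"])
        elif kind == "finish":
            break
        else:
            raise ValueError("unknown action type: " + kind)
    return {"context": context, "report": report}


def scripted_model(script: List[Action]) -> Callable[[List[str]], Action]:
    """Stand-in for an LRM: replays a fixed list of actions."""
    steps = iter(script)
    return lambda context: next(steps, {"type": "finish"})


def fake_search(query: str) -> str:
    """Stand-in for a web search tool: looks up a tiny local table."""
    pages = {"webthinker modes": "Problem-Solving Mode and Report Generation Mode"}
    return pages.get(query, "no results")


demo = think_search_draft(
    scripted_model([
        {"type": "think", "text": "I do not know the modes; search for them."},
        {"type": "search", "query": "webthinker modes"},
        {"type": "draft", "text": "WebThinker has two modes."},
        {"type": "finish"},
    ]),
    fake_search,
)
```

    The point of the sketch is the control flow: search results re-enter the same context the model reasons over, so retrieval is triggered by the reasoning itself and not by a fixed, predefined pipeline.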

    The WebThinker framework operates in two primary modes: Problem-Solving Mode and Report Generation Mode. In Problem-Solving Mode, WebThinker addresses complex tasks using the Deep Web Explorer tool, which the LRM can invoke during reasoning. In Report Generation Mode, the LRM autonomously produces detailed reports and employs an assistant LLM to implement report-writing tools. To improve LRMs' use of research tools via RL, WebThinker generates diverse reasoning trajectories by applying its framework to an extensive set of complex reasoning and report generation datasets, including SuperGPQA, WebWalkerQA, OpenThoughts, NaturalReasoning, NuminaMath, and Glaive. For each query, the initial LRM produces multiple distinct trajectories.
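    Sampling several trajectories per query is what makes preference-based training possible: pairs of trajectories for the same query can be ranked and fed to DPO. The sketch below illustrates one way this could look. The ranking rule (correct beats incorrect, then fewer tool calls wins) and the `Trajectory` fields are assumptions made for illustration; the article does not state the criteria WebThinker uses. `dpo_loss` is the standard DPO objective for a single pair, written with plain floats in place of model log-probabilities.

```python
import math
from dataclasses import dataclass
from itertools import combinations
from typing import Dict, List, Tuple


@dataclass
class Trajectory:
    query: str
    text: str
    correct: bool    # hypothetical label: did the final answer match?
    tool_calls: int  # hypothetical count of search/navigation calls


def preference_pairs(trajs: List[Trajectory]) -> List[Tuple[Trajectory, Trajectory]]:
    """Return (preferred, rejected) pairs among trajectories of the same query."""
    by_query: Dict[str, List[Trajectory]] = {}
    for t in trajs:
        by_query.setdefault(t.query, []).append(t)

    pairs: List[Tuple[Trajectory, Trajectory]] = []
    for group in by_query.values():
        for a, b in combinations(group, 2):
            if a.correct != b.correct:
                # Assumed rule 1: a correct trajectory beats an incorrect one.
                pairs.append((a, b) if a.correct else (b, a))
            elif a.correct and a.tool_calls != b.tool_calls:
                # Assumed rule 2: among correct ones, fewer tool calls wins.
                pairs.append((a, b) if a.tool_calls < b.tool_calls else (b, a))
    return pairs


def dpo_loss(logp_w: float, logp_l: float,
             ref_logp_w: float, ref_logp_l: float, beta: float = 0.1) -> float:
    """Standard DPO loss for one (preferred w, rejected l) pair.

    logp_* are log-probabilities under the policy being trained,
    ref_logp_* under the frozen reference model.
    """
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log(sigmoid(margin))


# Made-up trajectories for a single query.
demo_trajs = [
    Trajectory("q1", "A", correct=True, tool_calls=5),
    Trajectory("q1", "B", correct=False, tool_calls=2),
    Trajectory("q1", "C", correct=True, tool_calls=3),
]
pairs = preference_pairs(demo_trajs)
```

    In an iterative online setup, the updated model would presumably generate fresh trajectories, new pairs would be built from them, and training would repeat; that outer loop is omitted here.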

    On complex problem-solving, the WebThinker-32B-Base model outperforms prior methods like Search-o1 across all benchmarks, with improvements of 22.9% on WebWalkerQA and 20.4% on HLE. In scientific report generation, WebThinker achieves the highest overall score of 8.0, surpassing RAG baselines and advanced deep research systems, including Gemini-Deep Research (7.9). The framework also adapts across different LRM backbones, with R1-based WebThinker models outperforming direct reasoning and standard RAG baselines. With the DeepSeek-R1-7B backbone, it achieves relative improvements of 174.4% on GAIA and 422.6% on WebWalkerQA compared to direct generation, and 82.9% on GAIA and 161.3% on WebWalkerQA over standard RAG implementations.
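    For readers unfamiliar with the metric, a relative improvement expresses the gain as a percentage of the baseline score, so a value above 100% means the score more than doubled. A minimal sketch follows; the function name and the example scores are made up for illustration and are not taken from the paper.

```python
def relative_improvement(baseline: float, new: float) -> float:
    """Percentage gain of `new` over `baseline`: (new - baseline) / baseline * 100."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (new - baseline) / baseline * 100.0


# Made-up scores: going from 10.0 to 25.0 is a 150% relative improvement,
# i.e. the new score is 2.5 times the baseline.
example = relative_improvement(10.0, 25.0)
```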

    In conclusion, the researchers introduced WebThinker, which gives LRMs deep research capabilities and addresses their limitations in knowledge-intensive real-world tasks such as complex reasoning and scientific report generation. The framework enables LRMs to autonomously explore the web and produce comprehensive outputs through a continuous reasoning process. The findings highlight WebThinker's potential to advance the deep research capabilities of LRMs, leading to more capable systems for complex real-world problems. Future work includes incorporating multimodal reasoning capabilities, exploring advanced tool learning mechanisms, and investigating GUI-based web exploration.



    The post This AI Paper Introduces WebThinker: A Deep Research Agent that Empowers Large Reasoning Models (LRMs) for Autonomous Search and Report Generation appeared first on MarkTechPost.
