    Huawei Noah’s Ark Lab Released Dream 7B: A Powerful Open Diffusion Reasoning Model with Advanced Planning and Flexible Inference Capabilities

    April 9, 2025

    Large language models (LLMs) now underpin applications across many industries. Autoregressive (AR) models dominate text generation: leading systems such as GPT-4, DeepSeek, and Claude all generate tokens sequentially from left to right. As these models scale, however, their limitations have raised questions about what the next architectural paradigm should be. AR models struggle with complex reasoning, long-term planning, and coherence across extended contexts. These weaknesses matter for emerging applications in embodied AI, autonomous agents, and long-horizon decision-making, where sustained reasoning and contextual understanding are essential.

    Discrete diffusion models (DMs) are a promising alternative to autoregressive approaches for sequence generation. Unlike AR models, which generate tokens one at a time, DMs start from a fully noised sequence and refine all of its positions in parallel. This difference offers several advantages: bidirectional context modeling improves global coherence, iterative refinement makes flexible, controllable generation a natural fit, and a more efficient noise-to-data mapping could accelerate sampling. Recent work shows diffusion’s growing potential in language tasks: DiffuLLaMA and LLaDA scale to the 7B-8B parameter range, and Mercury Coder shows impressive inference efficiency in code generation.
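
    The parallel refinement described above can be illustrated with a toy decoding loop. This is a minimal sketch, not the sampler of any model named here: `toy_denoiser` is a hypothetical stand-in for a trained network, its confidence scores are made up, and the rule of revealing the most confident positions first is only one common choice.

```python
import math
from typing import Callable, List, Tuple

MASK = "<mask>"
TARGET = ["the", "cat", "sat", "on", "the", "mat"]  # toy "ground truth"


def toy_denoiser(tokens: List[str]) -> List[Tuple[str, float]]:
    """Hypothetical denoiser: returns (predicted token, confidence) per position.

    A real model would compute these from the whole (bidirectional) context.
    Here confidence simply grows when neighbouring positions are already
    revealed, plus a small position-dependent term to break ties.
    """
    out = []
    for i, tok in enumerate(TARGET):
        neighbours = [j for j in (i - 1, i + 1) if 0 <= j < len(tokens)]
        revealed = sum(tokens[j] != MASK for j in neighbours)
        out.append((tok, 0.5 + 0.2 * revealed + 0.01 * i))
    return out


def iterative_unmask(
    length: int,
    steps: int,
    denoiser: Callable[[List[str]], List[Tuple[str, float]]],
) -> Tuple[List[str], List[List[str]]]:
    """Start fully masked and reveal the most confident positions each step."""
    tokens = [MASK] * length
    history = [list(tokens)]
    for step in range(steps):
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        if not masked:
            break
        # Spread the remaining masked positions evenly over the remaining steps.
        k = math.ceil(len(masked) / (steps - step))
        preds = denoiser(tokens)
        best = sorted(masked, key=lambda i: preds[i][1], reverse=True)[:k]
        for i in best:
            tokens[i] = preds[i][0]
        history.append(list(tokens))
    return tokens, history


final, history = iterative_unmask(len(TARGET), steps=3, denoiser=toy_denoiser)
for state in history:
    print(" ".join(state))
```

    Fewer steps reveal more tokens per step, and more steps reveal fewer. That is the quality-speed knob that diffusion decoding exposes. The reveal order here is not left to right, which an AR decoder cannot do.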

    Researchers from the University of Hong Kong and Huawei Noah’s Ark Lab released Dream 7B (Diffusion reasoning model), presented as the most powerful open diffusion large language model to date. The model matches or exceeds similarly sized AR models on general, mathematics, and coding benchmarks. It also shows strong zero-shot planning ability and inference flexibility, at times outperforming much larger models such as DeepSeek V3 (671B) on structured tasks. Dream 7B is trained on 580B tokens from diverse datasets, including Dolma and OpenCoder, and uses mask-based diffusion with weights initialized from the autoregressive Qwen2.5 7B. Its architecture supports bidirectional context processing, arbitrary-order generation, infilling, and an adjustable quality-speed tradeoff at inference time.
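
    To make "mask-based diffusion" concrete, the sketch below shows the generic forward (corruption) step used by masked diffusion language models in general: each token is independently replaced by a mask symbol with probability `t`, and the model is trained to recover the masked positions. The function name and the `<mask>` string are illustrative assumptions, and Dream 7B’s actual noise schedule is not reproduced here.

```python
import random
from typing import List, Tuple

MASK = "<mask>"


def mask_tokens(
    tokens: List[str], t: float, rng: random.Random
) -> Tuple[List[str], List[int]]:
    """Replace each token with MASK independently with probability t.

    Returns the corrupted sequence and the indices that were masked; a
    training loss would typically be computed only at those indices.
    """
    noised: List[str] = []
    masked_positions: List[int] = []
    for i, tok in enumerate(tokens):
        if rng.random() < t:
            noised.append(MASK)
            masked_positions.append(i)
        else:
            noised.append(tok)
    return noised, masked_positions


rng = random.Random(0)  # fixed seed so the demo is repeatable
sentence = "diffusion models can fill in text in any order".split()
for t in (0.25, 0.5, 1.0):
    noised, positions = mask_tokens(sentence, t, rng)
    print(f"t={t}: {' '.join(noised)}")
```

    At `t = 1.0` every token is masked, which corresponds to the fully noised state from which generation starts. Unmasked tokens on both sides of a gap stay visible to the model, which is why infilling comes naturally.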

    Dream 7B builds on previous work in diffusion language modeling, drawing on RDM’s theoretical foundation and DiffuLLaMA’s adaptation strategy. It implements a mask diffusion paradigm with an architecture designed for diverse applications. The training data covers text, mathematics, and code from sources including Dolma v1.7, OpenCoder, and DCLM-Baseline. Pretraining consumed 580 billion tokens and ran on 96 NVIDIA H800 GPUs for 256 hours without unrecoverable loss spikes. Extensive design experiments at the 1B-parameter scale identified critical components, including weight initialization from autoregressive models such as Qwen2.5 and LLaMA3 and context-adaptive token-level noise rescheduling, which proved essential for training Dream 7B.
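
    The pretraining figures above translate into a rough compute budget. The calculation below is a back-of-the-envelope sketch that assumes the 256 hours are wall-clock time with all 96 GPUs busy throughout. The article does not state this, so the throughput figure is only indicative.

```python
GPUS = 96          # NVIDIA H800 GPUs, from the article
HOURS = 256        # training duration, assumed to be wall-clock hours
TOKENS = 580e9     # pretraining tokens, from the article

gpu_hours = GPUS * HOURS
tokens_per_gpu_second = TOKENS / (gpu_hours * 3600)

print(f"{gpu_hours:,} GPU-hours")
print(f"{tokens_per_gpu_second:,.0f} tokens per GPU-second")
```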

    Planning ability is evaluated on Countdown and Sudoku tasks with adjustable difficulty, comparing Dream 7B against LLaDA 8B, Qwen2.5 7B, LLaMA3 8B, and DeepSeek V3 671B. Dream 7B outperforms the similarly sized baselines, and both diffusion models (Dream 7B and LLaDA 8B) surpass the autoregressive alternatives of comparable size. The diffusion models occasionally exceed even DeepSeek V3 despite its vastly larger parameter count, which suggests that diffusion is effective for problems with multiple constraints or a specific target objective. After pretraining, the model underwent supervised fine-tuning on 1.8M instruction pairs from the Tulu 3 and SmolLM2 datasets for three epochs. The results indicate that Dream can match the performance of autoregressive models.
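
    Countdown is a multi-constraint task: the answer must reach a target value while using a fixed set of numbers. The checker below is a minimal sketch of how such an answer could be verified. The exact rules used in the evaluation are not given in this article, so the sketch assumes each given number must be used exactly once and that only `+`, `-`, `*`, and `/` are allowed. `check_countdown` is an illustrative name.

```python
import ast
from collections import Counter
from fractions import Fraction
from typing import List

_OPS = {
    ast.Add: lambda a, b: a + b,
    ast.Sub: lambda a, b: a - b,
    ast.Mult: lambda a, b: a * b,
    ast.Div: lambda a, b: a / b,  # exact division via Fraction
}


def _evaluate(node: ast.AST, used: List[int]) -> Fraction:
    """Evaluate a parsed expression, recording every integer literal it uses."""
    if (
        isinstance(node, ast.Constant)
        and isinstance(node.value, int)
        and not isinstance(node.value, bool)
    ):
        used.append(node.value)
        return Fraction(node.value)
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        left = _evaluate(node.left, used)
        right = _evaluate(node.right, used)
        return _OPS[type(node.op)](left, right)
    raise ValueError("unsupported expression")


def check_countdown(expr: str, numbers: List[int], target: int) -> bool:
    """True if expr uses exactly the given numbers and evaluates to target."""
    used: List[int] = []
    try:
        tree = ast.parse(expr, mode="eval")
        value = _evaluate(tree.body, used)
    except (SyntaxError, ValueError, ZeroDivisionError):
        return False
    return Counter(used) == Counter(numbers) and value == target


print(check_countdown("(25 - 5) * 4", [4, 5, 25], 80))
print(check_countdown("25 * 4", [4, 5, 25], 100))
```

    The first call satisfies both constraints. The second reaches its target but leaves 5 unused, so it is rejected. Parsing with `ast` rather than calling `eval` keeps the checker from executing arbitrary model output.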

    In conclusion, the researchers introduced Dream 7B, a family of diffusion language models that combines efficiency, scalability, and flexibility through carefully developed training methods. The models perform comparably to leading autoregressive models of similar size on general, mathematics, and coding tasks. Dream’s most distinctive strengths are in planning and flexible inference, where its diffusion-based design has clear advantages over autoregressive approaches. The result supports diffusion models as a viable alternative path in language model development.


    Check out the Dream-org/Dream-v0-Instruct-7B and Dream-org/Dream-v0-Base-7B models. All credit for this research goes to the researchers of this project.

    The post Huawei Noah’s Ark Lab Released Dream 7B: A Powerful Open Diffusion Reasoning Model with Advanced Planning and Flexible Inference Capabilities appeared first on MarkTechPost.
