
    LLMs Can Now Reason Beyond Language: Researchers Introduce Soft Thinking to Replace Discrete Tokens with Continuous Concept Embeddings

    May 28, 2025

    Human reasoning naturally operates through abstract, non-verbal concepts rather than relying strictly on discrete linguistic tokens. Current LLMs, however, are limited to reasoning within the boundaries of natural language, producing one token at a time from a predefined vocabulary. This token-by-token approach not only restricts the model's expressive capacity but also limits the breadth of reasoning paths it can explore, especially in ambiguous or complex scenarios. Standard Chain-of-Thought (CoT) methods exemplify this limitation by forcing the model to commit to a single path at each step. Human cognition, in contrast, is more flexible and parallel: it can entertain multiple ideas simultaneously and delay verbalization until concepts are fully formed, which makes reasoning more adaptable and robust under uncertainty.

    To address these limitations, researchers have proposed moving from token-based reasoning to reasoning in a continuous concept space, where each reasoning step is represented as a combination of token embeddings. This allows models to explore multiple reasoning trajectories in parallel and integrate richer conceptual representations. Prior studies have demonstrated the potential of manipulating hidden states to influence reasoning outcomes or introduce latent planning. Applying continuous-space reasoning to larger models, however, presents challenges. In models under 7B parameters, shared weights between the input and output layers let hidden states align with token embeddings, which facilitates continuous reasoning. In larger models, where the input and output spaces are decoupled, directly feeding hidden states back as inputs causes mismatches that are hard to resolve, and attempts to retrain such models to bridge the gap often lead to overfitting or degraded performance, highlighting the difficulty of enabling effective continuous reasoning at scale.
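
    As a quick illustration of the shared-weights condition described above, the snippet below checks whether a Hugging Face causal LM ties its input embedding table to its LM-head weights; this is a minimal sketch, with gpt2 chosen purely as an example model, not one used in the paper:

        from transformers import AutoModelForCausalLM

        # Load any causal LM; gpt2 is an arbitrary illustrative choice.
        model = AutoModelForCausalLM.from_pretrained("gpt2")

        # If these two tensors share storage, hidden states live in the same
        # space as token embeddings, the property that makes hidden-state
        # reasoning feasible in smaller, weight-tied models.
        inp = model.get_input_embeddings().weight
        out = model.get_output_embeddings().weight
        print("embeddings tied:", inp.data_ptr() == out.data_ptr())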

    Researchers from the University of California, Santa Barbara, University of California, Santa Cruz, University of California, Los Angeles, Purdue University, LMSYS Org, and Microsoft introduce Soft Thinking, a training-free approach that enhances reasoning in large language models by operating in a continuous concept space. Instead of choosing one discrete token at each step, the model generates concept tokens (probability-weighted mixtures of all token embeddings), enabling parallel reasoning over multiple paths and yielding richer, more abstract representations. The method also includes a Cold Stop mechanism to improve efficiency. Evaluations on mathematical and coding tasks show up to 2.48% higher accuracy and up to 22.4% fewer generated tokens than standard Chain-of-Thought reasoning.
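
    A minimal sketch of the concept-token idea, assuming a standard decoder whose LM head emits logits over the vocabulary; the function name and tensor shapes are illustrative, not taken from the authors' code:

        import torch
        import torch.nn.functional as F

        def concept_token(logits: torch.Tensor, embedding_matrix: torch.Tensor) -> torch.Tensor:
            """Form a concept token: a probability-weighted mixture of all token embeddings.

            logits:           (vocab_size,) raw scores from the LM head at this step.
            embedding_matrix: (vocab_size, hidden_dim) input embedding table.
            Returns a (hidden_dim,) vector fed back as the next input embedding,
            in place of the embedding of a single sampled token.
            """
            probs = F.softmax(logits, dim=-1)  # distribution over the whole vocabulary
            return probs @ embedding_matrix    # convex combination of token embeddings

    Because the mixture is a convex combination, the concept token stays within the span of the embedding table, a region of input space the model already handles well.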

    The Soft Thinking method enhances standard CoT reasoning by replacing discrete token sampling with concept tokens, i.e., probability distributions over the entire vocabulary. Each distribution is used to compute a weighted mixture of token embeddings, letting the model reason in a continuous concept space. This preserves uncertainty and enables parallel exploration of multiple reasoning paths. A Cold Stop mechanism monitors the entropy of these distributions and halts reasoning once the model becomes confident, improving efficiency and preventing collapse. Theoretical analysis shows that Soft Thinking approximates, via linearization, the full marginalization over all reasoning paths, offering a more expressive yet computationally tractable alternative to discrete CoT.
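
    One way the Cold Stop check could look, assuming confidence is measured as the entropy of the per-step vocabulary distribution; the threshold and patience values are illustrative placeholders, not the paper's tuned settings:

        import torch

        def step_entropy(probs: torch.Tensor) -> float:
            """Shannon entropy (in nats) of a next-token probability distribution."""
            return float(-(probs * probs.clamp_min(1e-12).log()).sum())

        def cold_stop(entropy_history: list, threshold: float = 0.5, patience: int = 3) -> bool:
            """Halt soft reasoning once the model has stayed confident (low
            entropy) for `patience` consecutive steps."""
            if len(entropy_history) < patience:
                return False
            return all(h < threshold for h in entropy_history[-patience:])

    In a generation loop, the entropy can be computed from the same softmax that defines the concept token, so the confidence check adds negligible overhead.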

    The study evaluates Soft Thinking on eight math and programming benchmarks using three open-source LLMs of varying sizes and architectures. Compared to standard and greedy CoT baselines, Soft Thinking consistently improves accuracy (Pass@1) while significantly reducing the number of generated tokens, indicating more efficient reasoning. The approach relies only on concept tokens and the Cold Stop controller, without modifying model weights or requiring extra training. Experiments show that Soft Thinking balances higher accuracy with lower computational cost, outperforming baselines by enabling richer, more abstract reasoning in fewer steps across diverse tasks and models.

    In conclusion, Soft Thinking is a training-free approach that lets large language models reason with continuous concept tokens instead of discrete ones. By mixing probability-weighted token embeddings, it allows models to explore multiple reasoning paths simultaneously, improving both accuracy and efficiency. Tested on math and coding benchmarks, it consistently boosts Pass@1 accuracy while reducing the number of generated tokens, all without extra training or architectural changes, and it preserves interpretability and concise reasoning. Future research may focus on training adaptations to improve robustness, especially on out-of-distribution inputs. The code is publicly available.


    Check out the Paper and GitHub Page. All credit for this research goes to the researchers of this project.

    The post LLMs Can Now Reason Beyond Language: Researchers Introduce Soft Thinking to Replace Discrete Tokens with Continuous Concept Embeddings appeared first on MarkTechPost.
