
    Decoding Arithmetic Reasoning in LLMs: The Role of Heuristic Circuits over Generalized Algorithms

    November 3, 2024

    A key question about LLMs is whether they solve reasoning tasks by learning transferable algorithms or simply memorizing training data. This distinction matters: while memorization might handle familiar tasks, true algorithmic understanding allows for broader generalization. Arithmetic reasoning tasks could reveal if LLMs apply learned algorithms, like vertical addition in human learning, or if they rely on memorized patterns from training data. Recent studies identify specific model components linked to arithmetic in LLMs, with some findings suggesting that Fourier features assist in addition tasks. However, the full mechanism underlying generalization versus memorization remains to be determined.

    Mechanistic interpretability (MI) seeks to understand language models by dissecting the roles of their components. Techniques such as activation patching and path patching link specific behaviors to specific model parts, while other methods examine how particular weights influence token predictions. Prior studies also address whether LLMs generalize or simply memorize training data, using internal activations as evidence of where a model sits between the two. For arithmetic reasoning, earlier research identified the general structure of arithmetic circuits but did not explain how operand information is processed to produce correct answers. This study broadens the view, showing how multiple heuristics and feature types combine in LLMs for arithmetic tasks.
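As a rough illustration of activation patching, the sketch below runs a tiny hand-written two-layer network on a "clean" and a "corrupted" input, then copies one clean hidden activation at a time into the corrupted run to see how much of the clean output it restores. The network, its weights, and the names `forward` and `patching_effects` are illustrative assumptions, not the setup used in the study.

```python
def forward(x, w1, w2, patch=None):
    """Toy two-layer ReLU network. `patch` maps hidden index -> value to overwrite."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w1]
    if patch:
        for idx, value in patch.items():
            hidden[idx] = value
    output = sum(w * h for w, h in zip(w2, hidden))
    return output, hidden


def patching_effects(clean_x, corrupt_x, w1, w2):
    """Fraction of the clean/corrupt output gap restored by patching each hidden unit.

    Assumes the clean and corrupted outputs differ (non-zero gap).
    """
    clean_out, clean_hidden = forward(clean_x, w1, w2)
    corrupt_out, _ = forward(corrupt_x, w1, w2)
    gap = corrupt_out - clean_out
    effects = []
    for idx, clean_value in enumerate(clean_hidden):
        # Re-run the corrupted input with one activation taken from the clean run.
        patched_out, _ = forward(corrupt_x, w1, w2, patch={idx: clean_value})
        effects.append((corrupt_out - patched_out) / gap)
    return effects


# Hypothetical weights: three hidden units reading a two-number input.
W1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W2 = [2.0, 0.0, 0.5]

effects = patching_effects([1.0, 2.0], [3.0, 2.0], W1, W2)
```

In this toy case the first hidden unit restores most of the gap, the second restores none, and the third restores a small share, which is the kind of per-component ranking that patching is used to produce.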

    Researchers from Technion and Northeastern University investigated how LLMs handle arithmetic and found that, instead of using robust algorithms or pure memorization, LLMs apply a “bag of heuristics” approach. By analyzing individual neurons in an arithmetic circuit, they found that specific neurons fire according to simple patterns, such as operand ranges, to produce correct answers. This mix of heuristics emerges early in training and persists as the main mechanism for solving arithmetic prompts. The study’s findings provide detailed insights into LLMs’ arithmetic reasoning, showing how these heuristics operate, evolve, and contribute to both capabilities and limitations in reasoning tasks.

    In transformer-based language models, a circuit is a subset of model components (MLPs and attention heads) that executes a specific task, such as arithmetic. The researchers analyzed the arithmetic circuits in four models (Llama3-8B/70B, Pythia-6.9B, and GPT-J) to identify the components responsible for arithmetic. They located key MLPs and attention heads through activation patching, observing that middle- and late-layer MLPs promoted answer prediction. The evaluation showed that only about 1.5% of neurons per layer were needed to achieve high accuracy. These neurons operate as “memorized heuristics,” activating for specific operand patterns and encoding plausible answer tokens.
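To make the "about 1.5% of neurons per layer" figure concrete, here is a minimal sketch of keeping only the highest-scoring neurons in a layer. The importance scores are synthetic placeholders and `top_fraction` is a hypothetical helper; the study's actual scoring and selection procedure may differ.

```python
def top_fraction(scores, fraction=0.015):
    """Return the indices of the highest-scoring neurons, keeping `fraction` of them."""
    keep = max(1, round(len(scores) * fraction))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:keep])


# Synthetic importance scores for a 200-neuron layer (placeholder data, not real results).
scores = [((i * 37) % 200) / 200 for i in range(200)]

kept = top_fraction(scores)  # 1.5% of 200 neurons -> 3 neurons
```

Everything outside `kept` would then be ablated (for example, replaced by a mean activation) before measuring how much accuracy remains.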

    To solve arithmetic prompts, models use a “bag of heuristics,” where individual neurons recognize specific patterns, and each incrementally contributes to the correct answer’s probability. Neurons are classified by their activation patterns into heuristic types, and neurons within each heuristic are responsible for distinct arithmetic tasks. Ablation tests confirm that each heuristic type causally impacts prompts aligned with its pattern. These heuristic neurons develop gradually throughout training, eventually dominating the model’s arithmetic capability, even as vestigial heuristics emerge mid-training. This suggests that arithmetic proficiency primarily emerges from these coordinated heuristic neurons across training.
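The idea of many simple pattern detectors each nudging the right answer can be sketched with a toy voting scheme. The three heuristics below, their names, and the 0 to 99 operand range are assumptions chosen for illustration; they are not the neuron-level heuristics identified in the study.

```python
from collections import Counter

MAX_OPERAND = 99  # toy assumption: both operands lie in 0..99


def last_digit_heuristic(a, b):
    """Promote every result whose last digit matches the operands' unit digits."""
    digit = (a % 10 + b % 10) % 10
    return {r for r in range(2 * MAX_OPERAND + 1) if r % 10 == digit}


def tens_range_heuristic(a, b):
    """Promote a coarse 20-wide range implied by the operands' tens digits."""
    low = 10 * (a // 10 + b // 10)
    return set(range(low, low + 20))


def carry_heuristic(a, b):
    """Promote the upper or lower half of that range depending on a unit-digit carry."""
    low = 10 * (a // 10 + b // 10)
    if a % 10 + b % 10 >= 10:
        return set(range(low + 10, low + 20))
    return set(range(low, low + 10))


ALL_HEURISTICS = [last_digit_heuristic, tens_range_heuristic, carry_heuristic]


def top_candidates(a, b, heuristics):
    """Sum the votes from each heuristic and return the results tied for the most votes."""
    votes = Counter()
    for heuristic in heuristics:
        votes.update(heuristic(a, b))
    best = max(votes.values())
    return sorted(r for r, count in votes.items() if count == best)


full = top_candidates(47, 38, ALL_HEURISTICS)
ablated = top_candidates(47, 38, [last_digit_heuristic, tens_range_heuristic])
```

With all three heuristics, `full` contains only 85. Dropping the carry heuristic leaves 75 and 85 tied in `ablated`, a small-scale analogue of how ablating one heuristic type hurts the prompts that depend on it.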

    LLMs approach arithmetic tasks through heuristic-driven reasoning rather than robust algorithms or memorization. The study reveals that LLMs use a “bag of heuristics,” a mix of learned patterns rather than generalizable algorithms, to solve arithmetic. By identifying the specific model components that handle arithmetic (neurons within a circuit), the researchers found that each neuron activates for specific input patterns and that these neurons collectively support accurate responses. This heuristic-driven mechanism appears early in model training and develops gradually. The findings suggest that enhancing LLMs’ mathematical skills may require fundamental changes in training and architecture beyond current post-hoc techniques.


    All credit for this research goes to the researchers of this project.

    The post Decoding Arithmetic Reasoning in LLMs: The Role of Heuristic Circuits over Generalized Algorithms appeared first on MarkTechPost.
