    This AI Paper Proposes TALE: An AI Framework that Reduces Token Redundancy in Chain-of-Thought (CoT) Reasoning by Incorporating Token Budget Awareness

    December 29, 2024

    Large Language Models (LLMs) have shown significant potential in reasoning tasks, using methods like Chain-of-Thought (CoT) to break complex problems into manageable steps. This capability comes at a price: CoT prompting often increases token usage, which raises computational cost and energy consumption. That is a concern for applications that need both precision and resource efficiency. Current LLMs tend to generate unnecessarily lengthy outputs, and the extra tokens add cost without always improving accuracy. The key challenge is balancing reasoning performance against resource efficiency.

    Researchers from Nanjing University, Rutgers University, and UMass Amherst have introduced TALE (Token-Budget-Aware LLM rEasoning), a token-budget-aware reasoning framework. It dynamically estimates a token budget from the complexity of a reasoning task and uses that estimate to guide the reasoning process, with the aim of reducing token usage without compromising the accuracy of responses. By integrating a token budget into CoT prompts, TALE offers a practical way to improve the cost-efficiency of LLMs while maintaining their performance.

    Technical Details and Benefits

    TALE operates in two main phases: budget estimation and token-budget-aware reasoning. Initially, it estimates an appropriate token budget for a problem using methods such as zero-shot prediction or regression-based estimators. This budget is then embedded in the prompt to encourage the LLM to generate concise yet accurate responses.
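The two phases can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names, the wording of both prompts, and the fallback budget of 100 tokens are all assumptions made for the example, and the call to an actual LLM is left out.

```python
import re


def build_estimation_prompt(question: str) -> str:
    """Phase 1 (zero-shot): ask the model how many tokens it expects to need.

    The prompt wording is an assumption made for this sketch.
    """
    return (
        "Task: estimate how many output tokens are needed to reason through "
        "the question below. Reply with a single integer.\n"
        f"Question: {question}"
    )


def parse_budget(reply: str, default: int = 100) -> int:
    """Take the first integer in the model's reply; fall back to a default.

    The default of 100 is an arbitrary placeholder, not a value from the paper.
    """
    match = re.search(r"\d+", reply)
    return int(match.group()) if match else default


def build_budget_aware_prompt(question: str, budget: int) -> str:
    """Phase 2: embed the estimated budget in a CoT-style prompt."""
    return (
        f"{question}\n"
        f"Let's think step by step and use less than {budget} tokens:"
    )
```

In use, the reply to `build_estimation_prompt` would be passed through `parse_budget`, and the resulting integer would be handed to `build_budget_aware_prompt` before the final query is sent to the model.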

    A key concept in TALE is “token elasticity,” which the authors use to identify a range of token budgets that minimizes token usage while preserving accuracy. Using an iterative search such as binary search, TALE determines a suitable budget for a given task and LLM. On average, the framework reduces token usage by 68.64% with less than a 5% drop in accuracy, making it a practical and adaptable approach to token efficiency.
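The budget search can be illustrated with a plain binary search. The sketch below is an assumption-laden simplification, not the paper's procedure: `is_acceptable` is a hypothetical callback that would run the model with a given budget and report whether the answer is still correct, and the search assumes that once a budget is acceptable, every larger budget is too.

```python
from typing import Callable, Optional


def search_min_budget(
    is_acceptable: Callable[[int], bool], low: int, high: int
) -> Optional[int]:
    """Return the smallest budget in [low, high] that is_acceptable accepts.

    Assumes monotonicity: if a budget is acceptable, all larger ones are too.
    Returns None when even the largest budget is rejected.
    """
    if not is_acceptable(high):
        return None
    while low < high:
        mid = (low + high) // 2
        if is_acceptable(mid):
            high = mid       # mid works, so try smaller budgets
        else:
            low = mid + 1    # mid fails, so the answer must be larger
    return low


# Stand-in for a real evaluation: pretend answers stay correct from 37 tokens up.
smallest = search_min_budget(lambda budget: budget >= 37, low=1, high=512)
```

A real evaluation would be noisy and might not be monotonic, so a production version would likely need repeated trials per budget or a stopping rule based on the observed token cost.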

    Results and Insights

    Experiments demonstrate TALE’s effectiveness on benchmarks such as GSM8K and MathBench. On GSM8K, TALE achieved 84.46% accuracy, surpassing the Vanilla CoT method, while reducing the average token cost from 318.10 to 77.26. On GSM8K-Zero, it reduced token costs by 91% while maintaining an accuracy of 98.72%.
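For readers who want the GSM8K token figures as a percentage, the short calculation below derives the relative reduction from the two averages quoted above. The function name is just an illustration.

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, expressed in percent."""
    return (before - after) / before * 100.0


# Average GSM8K token costs quoted in the article: Vanilla CoT vs. TALE.
gsm8k_reduction = percent_reduction(318.10, 77.26)
print(f"{gsm8k_reduction:.1f}%")  # roughly 75.7%
```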

    TALE also generalizes well across different LLMs, such as GPT-4o-mini and Yi-lightning. When applied to the MathBench-College dataset, TALE reduced token costs by up to 70% while maintaining competitive accuracy. Additionally, the framework significantly lowers operational expenses, cutting costs by 59% on average compared to Vanilla CoT. These results highlight TALE’s ability to enhance efficiency without sacrificing performance, making it suitable for a variety of applications.

    Conclusion

    The Token-Budget-Aware LLM Reasoning Framework addresses the inefficiency of token usage in reasoning tasks. By dynamically estimating and applying token budgets, TALE strikes a balance between accuracy and cost-effectiveness. This approach reduces computational expenses and broadens the accessibility of advanced LLM capabilities. As AI continues to evolve, frameworks like TALE offer a pathway to more efficient and sustainable use of LLMs in both academic and industrial contexts.


    Check out the Paper and GitHub Page. All credit for this research goes to the researchers of this project.

    The post This AI Paper Proposes TALE: An AI Framework that Reduces Token Redundancy in Chain-of-Thought (CoT) Reasoning by Incorporating Token Budget Awareness appeared first on MarkTechPost.