
    Optimizing Large Language Models for Concise and Accurate Responses through Constrained Chain-of-Thought Prompting

    August 2, 2024

    Large language models (LLMs) have shown impressive abilities in handling complex question-answering tasks, supported by advances in model architectures and training methods. Techniques like chain-of-thought (CoT) prompting have gained popularity for improving the explainability and accuracy of responses by guiding the model through intermediate reasoning steps. However, CoT prompting can produce longer outputs, which increases generation time because autoregressive transformers decode one token at a time. This makes it harder to sustain interactive conversations and highlights the need for metrics that evaluate output conciseness and for strategies that shorten overly long reasoning chains.

    Researchers from the Department of Excellence in Robotics and AI at Scuola Superiore Sant’Anna and Mediavoice Srl analyzed how output length affects LLM inference time and proposed new metrics that evaluate conciseness together with correctness. They also introduced a refined prompt engineering strategy, Constrained-Chain-of-Thought (CCoT), which asks the model to limit its output length in order to reduce response time without sacrificing accuracy. In experiments with Llama2-70b on the GSM8K dataset, constraining the reasoning to 100 words improved accuracy and reduced output length. The study emphasizes the need for brevity in LLM reasoning and shows that the effectiveness of CCoT varies with model size.
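
    The idea of a length-constrained CoT prompt can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names and the exact instruction wording are assumptions, and the word count is a plain whitespace split.

```python
def build_cot_prompt(question: str) -> str:
    """Classic CoT: ask for step-by-step reasoning with no length limit."""
    return f"{question}\nLet's think step by step."


def build_ccot_prompt(question: str, max_words: int = 100) -> str:
    """CCoT-style prompt: the same request plus an explicit word budget.

    The instruction wording here is an assumption, not quoted from the paper.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    return (
        f"{question}\n"
        f"Let's think step by step and limit the answer length to {max_words} words."
    )


def within_limit(answer: str, max_words: int) -> bool:
    """Check whether a model answer respects the budget (whitespace-split words)."""
    return len(answer.split()) <= max_words


if __name__ == "__main__":
    print(build_ccot_prompt("A farmer has 12 cows and buys 7 more. How many cows does he have?"))
```

    The prompt only requests a limit. A model may still exceed it, so a check such as `within_limit` is useful when measuring how well the constraint is followed.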

    Recent research on LLMs has focused on improving accuracy, often leading to longer and more detailed responses. These extended outputs can lead to hallucinations, where the model generates plausible but incorrect information, and to overly lengthy explanations that obscure key information. Various prompt engineering techniques have been developed to improve response quality, including CoT prompting, which improves reasoning but increases response time. The study introduces metrics to evaluate both conciseness and correctness and proposes a refined CoT approach, CCoT, to control output length while maintaining quality.

    The output generation time of an LLM depends on factors such as model architecture, preprocessing, decoding, and the prompt used. Longer outputs typically increase response time because autoregressive models generate tokens iteratively, one decoding step per token. Tests on several models (Falcon-7b/40b, Llama2-7b/70b) showed that generation time grows as output length increases. CoT prompting, which improves response correctness, also lengthens outputs and therefore generation times. To address this, the authors propose CCoT, which limits output length while maintaining accuracy and thereby reduces generation time.
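
    Why output length drives generation time can be seen in a toy decoding loop. The sketch below assumes a hypothetical `step_fn` that stands in for one forward pass of a model. It is not a real LLM; it only counts how many decoding steps an answer of a given length requires.

```python
from typing import Callable, List, Tuple

EOS = "<eos>"  # hypothetical end-of-sequence marker


def greedy_decode(
    step_fn: Callable[[List[str]], str],
    prompt_tokens: List[str],
    max_new_tokens: int,
) -> Tuple[List[str], int]:
    """Generate tokens one at a time and count the decoding steps used."""
    tokens = list(prompt_tokens)
    steps = 0
    for _ in range(max_new_tokens):
        next_token = step_fn(tokens)  # one "forward pass" per new token
        steps += 1
        if next_token == EOS:
            break
        tokens.append(next_token)
    return tokens[len(prompt_tokens):], steps


def make_toy_step_fn(answer_length: int, prompt_length: int) -> Callable[[List[str]], str]:
    """Toy stand-in for a model: emits answer_length tokens, then EOS."""
    def step_fn(tokens: List[str]) -> str:
        generated = len(tokens) - prompt_length
        return EOS if generated >= answer_length else f"tok{generated}"
    return step_fn


if __name__ == "__main__":
    prompt = ["What", "is", "2", "+", "3", "?"]
    for length in (10, 100):
        output, steps = greedy_decode(make_toy_step_fn(length, len(prompt)), prompt, 512)
        print(f"{len(output)} output tokens took {steps} decoding steps")
```

    In this toy setting the step count is the output length plus one step for the end marker, so a shorter answer directly means fewer passes through the model. Real latency also depends on hardware, batching, and caching, which the sketch ignores.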

    The experiments compare CCoT with classic CoT in terms of efficiency, accuracy, and the ability to control output length. Several LLMs (e.g., Llama2-70b, Falcon-40b) were tested on the GSM8K dataset. The results show that CCoT reduces generation time and can maintain or even improve accuracy. The study also introduces three new metrics, Hard-k Concise Accuracy (HCA), Soft-k Concise Accuracy (SCA), and Consistent Concise Accuracy (CCA), which assess model performance by considering correctness and conciseness together. Larger models like Llama2-70b benefit more from CCoT, while smaller models struggle to follow the length constraint.
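
    To make the idea of a metric that rewards both correctness and conciseness concrete, here is a small sketch. The article does not give the formulas, so the two definitions below are assumptions about one plausible form (a hard length cutoff `k`, and a soft exponential penalty with tolerance `alpha`) and should be checked against the paper. The `Result` type and function names are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Result:
    correct: bool  # whether the final answer matched the reference
    length: int    # output length in words


def hard_concise_accuracy(results: Sequence[Result], k: int) -> float:
    """Fraction of outputs that are correct AND at most k words long (assumed form)."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(1 for r in results if r.correct and r.length <= k) / len(results)


def soft_concise_accuracy(results: Sequence[Result], k: int, alpha: float) -> float:
    """Like the hard version, but correct outputs longer than k are
    down-weighted by exp((k - length) / alpha) instead of dropped (assumed form)."""
    if not results:
        raise ValueError("results must not be empty")
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    total = sum(
        min(1.0, math.exp((k - r.length) / alpha)) for r in results if r.correct
    )
    return total / len(results)


if __name__ == "__main__":
    # Made-up illustrative data, not results from the study.
    results = [Result(True, 80), Result(True, 120), Result(False, 50), Result(True, 100)]
    print(hard_concise_accuracy(results, k=100))
    print(soft_concise_accuracy(results, k=100, alpha=20.0))
```

    Under these assumed definitions, a correct but overly long answer counts for nothing in the hard variant and for a reduced amount in the soft variant, so both scores are at most the plain accuracy.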

    The study emphasizes the importance of conciseness in text generation by LLMs and introduces CCoT as a prompt engineering technique to control output length. Experiments show that larger models like Llama2-70b and Falcon-40b benefit from CCoT, whereas smaller models struggle to meet the length constraints. The study also proposes new metrics to evaluate the balance between conciseness and correctness. Future research will explore integrating these metrics into model fine-tuning and examining how conciseness affects phenomena such as hallucinations and incorrect reasoning in LLMs.

    All credit for this research goes to the researchers of this project.

    The post Optimizing Large Language Models for Concise and Accurate Responses through Constrained Chain-of-Thought Prompting appeared first on MarkTechPost.
