    Cleanlab Introduces the Trustworthy Language Model (TLM) that Addresses the Primary Challenge to Enterprise Adoption of LLMs: Unreliable Outputs and Hallucinations

    April 29, 2024

    While 55% of organizations are experimenting with generative AI, only 10% have implemented it in production, according to a recent Gartner poll. LLMs face a major obstacle in transitioning to production due to their tendency to generate erroneous outputs, termed hallucinations. These inaccuracies hinder their utilization in applications requiring correct results. Instances like Air Canada’s chatbot misinforming customers about refund policies and a law firm’s use of ChatGPT to produce a brief filled with fabricated citations illustrate the risks associated with deploying unreliable LLMs. Similarly, New York City’s “MyCity” chatbot has provided incorrect responses to inquiries about local laws, underscoring the challenges in ensuring accurate outputs from LLMs.

    Cleanlab presents the Trustworthy Language Model (TLM), which targets the primary obstacle to enterprise adoption of LLMs: unreliable outputs and hallucinations. TLM attaches a trustworthiness score to each LLM response, so users can identify and control erroneous outputs and deploy generative AI in settings where unreliable answers previously ruled it out. According to Cleanlab’s benchmarks, TLM is more accurate than existing LLMs and produces better-calibrated trustworthiness scores, at lower cost and in less time than alternative methods for managing LLM uncertainty.

    TLM treats hallucinations as an unavoidable property of LLMs and assigns a trustworthiness score to each output so that users can spot them. Its design prioritizes minimizing false negatives: when a hallucination occurs, the trustworthiness score should be low, so that unreliable answers are flagged rather than missed. This is what makes it practical to deploy LLM-based applications reliably.
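
One way to check the false-negative behavior described above on your own data is to measure how many known hallucinations slip past a chosen score threshold. The sketch below is only an illustration: the function name, the threshold, and the scores and labels are all made up, and nothing here calls Cleanlab's API.

```python
from typing import Sequence


def missed_hallucination_rate(
    scores: Sequence[float],
    is_hallucination: Sequence[bool],
    threshold: float,
) -> float:
    """Fraction of known hallucinations scored at or above the threshold.

    These are the false negatives: wrong answers that would not be flagged.
    """
    if len(scores) != len(is_hallucination):
        raise ValueError("scores and labels must have the same length")
    # Keep only the scores of responses labeled as hallucinations.
    hallucinated = [s for s, bad in zip(scores, is_hallucination) if bad]
    if not hallucinated:
        return 0.0
    missed = sum(1 for s in hallucinated if s >= threshold)
    return missed / len(hallucinated)


# Made-up scores and human labels, for illustration only.
example_scores = [0.95, 0.40, 0.88, 0.15, 0.72]
example_labels = [False, True, False, True, True]

rate = missed_hallucination_rate(example_scores, example_labels, threshold=0.7)
print(f"missed hallucinations at threshold 0.7: {rate:.2f}")
```

Raising the threshold lowers the missed-hallucination rate but flags more correct answers, so the threshold is a tuning choice for each application.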

    The TLM API can be used in three ways. First, it can replace an existing LLM: its .prompt() method returns a response together with a trustworthiness score, which enables applications that need to know when an answer can be relied on. Second, TLM can improve response accuracy by internally generating multiple candidate responses and returning the one with the highest trustworthiness score. Third, its .get_trustworthiness_score() method scores outputs that already exist, whether they came from another LLM or were written by humans. TLM works by adding a trust layer on top of an existing LLM: users can select a popular base model such as GPT-3.5 or GPT-4, or augment any LLM to which they have only black-box API access. For enterprise needs, such as adding trustworthiness scores to custom fine-tuned LLMs, users can engage with Cleanlab directly.
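
The idea of generating several candidates and keeping the highest-scoring one can be sketched in a few lines. This is not Cleanlab's implementation: `select_most_trustworthy`, `lookup_score`, and the score table are hypothetical stand-ins, and a real scorer would be a call to something like the .get_trustworthiness_score() method mentioned above.

```python
from typing import Callable, Sequence, Tuple


def select_most_trustworthy(
    prompt: str,
    candidates: Sequence[str],
    score_fn: Callable[[str, str], float],
) -> Tuple[str, float]:
    """Return the candidate response with the highest score, plus that score."""
    if not candidates:
        raise ValueError("need at least one candidate response")
    # Score every candidate against the same prompt.
    scored = [(score_fn(prompt, c), c) for c in candidates]
    best_score, best_response = max(scored, key=lambda pair: pair[0])
    return best_response, best_score


# Hypothetical scores standing in for a real trustworthiness scorer.
HYPOTHETICAL_SCORES = {"Paris": 0.97, "Lyon": 0.12, "Marseille": 0.08}


def lookup_score(prompt: str, response: str) -> float:
    """Toy scorer: looks the response up in a fixed table (assumption)."""
    return HYPOTHETICAL_SCORES.get(response, 0.0)


best, score = select_most_trustworthy(
    "What is the capital of France?",
    ["Lyon", "Paris", "Marseille"],
    lookup_score,
)
print(best, score)
```

Passing the scorer in as a function keeps the selection logic independent of whichever model or service produces the scores.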

    The evaluation compares Cleanlab’s TLM with OpenAI’s GPT-4 on response accuracy and on cost and time savings. The trustworthiness score gives users a way to detect erroneous LLM outputs efficiently. Unlike self-evaluation and probability-based methods, TLM’s score also accounts for epistemic uncertainty, which Cleanlab reports makes it more reliable. By flagging low-scoring outputs for human review, TLM lets teams concentrate reviewer effort on the responses most likely to be wrong. Berkeley Research Group (BRG) has already seen significant cost savings from leveraging TLM, according to Steven Gawthorpe, PhD, Associate Director and Senior Data Scientist at BRG.
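
The review workflow described above amounts to splitting a batch of scored outputs at a threshold. A minimal sketch, assuming each output carries a response and a trustworthiness score; the `ScoredOutput` type, the 0.8 threshold, and the sample batch are illustrative assumptions rather than anything from Cleanlab.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ScoredOutput:
    """An LLM response paired with its trustworthiness score (assumed in [0, 1])."""
    response: str
    trustworthiness_score: float


def split_for_review(
    outputs: List[ScoredOutput], threshold: float
) -> Tuple[List[ScoredOutput], List[ScoredOutput]]:
    """Separate outputs into auto-accepted and needs-human-review lists."""
    accepted = [o for o in outputs if o.trustworthiness_score >= threshold]
    needs_review = [o for o in outputs if o.trustworthiness_score < threshold]
    return accepted, needs_review


# Made-up batch, for illustration only.
batch = [
    ScoredOutput("Refunds are available within 30 days.", 0.93),
    ScoredOutput("Refunds are available for up to 5 years.", 0.21),
    ScoredOutput("Contact support to start a refund.", 0.88),
    ScoredOutput("Refunds require a notarized letter.", 0.35),
]

accepted, needs_review = split_for_review(batch, threshold=0.8)
review_fraction = len(needs_review) / len(batch)
print(f"{len(needs_review)} of {len(batch)} outputs sent to human review")
```

The fraction sent to review is a rough proxy for reviewer workload at a given threshold.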

    In conclusion, Cleanlab’s Trustworthy Language Model (TLM) is a comprehensive response to the challenges organizations face in deploying LLM applications. By attaching trustworthiness scores to outputs, TLM addresses the reliability problems caused by hallucinations and makes responses more accurate and dependable. Because it can augment existing LLMs and add trust across a range of applications, TLM marks a notable advance in deploying generative AI and may pave the way for wider adoption in enterprise settings.

    The post Cleanlab Introduces the Trustworthy Language Model (TLM) that Addresses the Primary Challenge to Enterprise Adoption of LLMs: Unreliable Outputs and Hallucinations appeared first on MarkTechPost.
