
    Galileo Introduces Luna: An Evaluation Foundation Model to Catch Language Model Hallucinations with High Accuracy and Low Cost

    June 15, 2024

    Galileo Luna is a purpose-built evaluation foundation model (EFM) designed to address the prevalent issue of hallucinations in large language models (LLMs). Hallucinations, instances where a model generates information not grounded in the retrieved context, pose a significant challenge in deploying language models in industry applications. Luna aims to detect and mitigate these hallucinations with high accuracy, low latency, and low cost.

    The Problem of Hallucinations in LLMs

    Large language models have revolutionized natural language processing with their impressive ability to generate human-like text. However, their tendency to produce factually incorrect information (hallucinations) undermines their reliability, especially in critical applications such as customer support, legal advice, and biomedical research. Hallucinations can arise from various factors, including outdated knowledge bases, randomization in response generation, faulty training data, and the incorporation of new knowledge during fine-tuning.

    To address these issues, retrieval-augmented generation (RAG) systems incorporate relevant external knowledge into the LLM’s responses. Even so, existing hallucination detection techniques often fail to balance accuracy, latency, and cost, which makes them impractical for real-time, large-scale industry applications.

    Luna: The Evaluation Foundation Model

    Galileo Technologies has introduced Luna, a DeBERTa-large encoder fine-tuned to detect hallucinations in RAG settings. Luna stands out for its high accuracy, low cost, and millisecond-level inference speed. It surpasses existing models, including GPT-3.5, in both performance and efficiency.

    Luna’s architecture is built upon a 440-million parameter DeBERTa-large model, fine-tuned with real-world RAG data. This model is designed to generalize across multiple industry domains and handle long-context RAG inputs, making it an ideal solution for diverse applications. Its training involves a novel chunking approach that processes long context documents to minimize false positives in hallucination detection.
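The article does not spell out how the chunking works, so the following is only a minimal sketch of one plausible reading: split the long context into overlapping windows, score each response token against every window, and flag a token only if no window supports it. Taking the best score across windows is what keeps a token from being flagged merely because most of the context is unrelated to it. The function names, the window and stride sizes, and the exact-match `toy_support_score` are all assumptions made for illustration; in Luna the scoring is done by the fine-tuned DeBERTa encoder.

```python
from typing import Callable, List


def split_into_windows(tokens: List[str], window: int, stride: int) -> List[List[str]]:
    """Split a token list into overlapping windows of at most `window` tokens."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    windows = []
    start = 0
    while True:
        windows.append(tokens[start:start + window])
        if start + window >= len(tokens):
            break
        start += stride
    return windows


def toy_support_score(context_window: List[str], response_token: str) -> float:
    """Hypothetical stand-in for a learned scorer: 1.0 on exact match, else 0.0."""
    return 1.0 if response_token in context_window else 0.0


def unsupported_tokens(
    context: List[str],
    response: List[str],
    window: int = 4,
    stride: int = 2,
    scorer: Callable[[List[str], str], float] = toy_support_score,
    threshold: float = 0.5,
) -> List[str]:
    """Return response tokens that no context window supports."""
    windows = split_into_windows(context, window, stride)
    flagged = []
    for token in response:
        # A token counts as supported if ANY window supports it,
        # so we keep the best score across windows.
        best = max(scorer(w, token) for w in windows)
        if best < threshold:
            flagged.append(token)
    return flagged


context = "the refund window is 30 days from the delivery date".split()
response = "refund within 30 days".split()
print(unsupported_tokens(context, response))
```

Here "refund", "30", and "days" each appear in at least one window, so only "within" is flagged.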

    The 5 Breakthroughs in GenAI Evaluations with Galileo Luna:

    Leading Evaluation Accuracy Benchmarks: Luna is 18% more accurate than GPT-3.5 in detecting hallucinations in RAG-based systems. This accuracy extends to other evaluation tasks, such as prompt injections and PII detection.

    Ultra Low-Cost Evaluation: Luna reduces evaluation costs by 97% compared to GPT-3.5, making it a cost-effective solution for large-scale deployments.

    Ultra Low Latency Evaluation: Luna is 11 times faster than GPT-3.5, processing evaluations in milliseconds, ensuring a seamless and responsive user experience.

    Detect Hallucinations, Security, and Data Privacy Without Ground Truth: Luna eliminates the need for costly and labor-intensive ground truth test sets by using pre-trained evaluation-specific datasets, allowing for immediate and effective evaluation.

    Built for Customizability: Luna can be quickly fine-tuned to meet specific industry needs, providing ultra-high accuracy custom evaluation models within minutes.

    Performance and Cost Efficiency

    Luna has demonstrated superior performance in extensive benchmarking against other models. Compared to GPT-3.5 and other commercial evaluation frameworks, it achieves a 97% reduction in cost and a 91% reduction in latency. These efficiencies are critical for large-scale deployment, where real-time response generation and cost management are paramount.
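The "11 times faster" and "91% reduction in latency" figures quoted in this article describe the same measurement, and the short sketch below shows the conversion. It also restates the 97% cost reduction as a "times cheaper" factor. The function names are illustrative, and the only inputs are the percentages reported above.

```python
def latency_reduction_from_speedup(speedup: float) -> float:
    """Fraction of latency removed when a system is `speedup` times faster."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 - 1.0 / speedup


def times_cheaper_from_reduction(reduction: float) -> float:
    """How many times cheaper a system is after a fractional cost reduction."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return 1.0 / (1.0 - reduction)


# 11x faster means each evaluation takes 1/11 of the original time.
print(f"{latency_reduction_from_speedup(11):.1%} latency reduction")

# A 97% cost reduction leaves 3% of the original cost.
print(f"{times_cheaper_from_reduction(0.97):.1f}x cheaper")
```

An 11x speedup works out to roughly a 91% latency reduction, and a 97% cost reduction to roughly 33 times cheaper per evaluation.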

    The model’s ability to process up to 16,000 tokens in milliseconds makes it suitable for real-time applications like customer support and interactive chatbots. Luna’s lightweight architecture allows it to be deployed on local GPUs, ensuring data privacy and security, a significant advantage over third-party API-based solutions.

    Applications and Customizability

    Luna is designed to be highly customizable, enabling fine-tuning to meet specific industry needs. For instance, in pharmaceutical applications, where hallucinations can have serious implications, Luna can be tailored to detect particular classes of hallucinations with over 95% accuracy. This flexibility ensures the model can be adapted to various domains, enhancing its utility and effectiveness.

    Luna supports a range of evaluation tasks beyond hallucination detection, including context adherence, chunk utilization, context relevance, and security checks. Its multi-task training approach allows it to perform multiple evaluations with a single input, sharing insights across tasks for more robust and accurate results.
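To make "multiple evaluations with a single input" concrete, here is a rough sketch of the pattern: the input is encoded once, and several lightweight heads read the same shared representation. The metric names come from the article, but everything else is an assumption. In particular, the word-overlap formulas are toy stand-ins rather than Luna's definitions, and `encode`, `SharedFeatures`, and `HEADS` are hypothetical names. In Luna the shared representation comes from the DeBERTa encoder, not from word sets.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet


@dataclass(frozen=True)
class SharedFeatures:
    """Representation computed once and reused by every evaluation head."""
    question: FrozenSet[str]
    context: FrozenSet[str]
    response: FrozenSet[str]


def encode(question: str, context: str, response: str) -> SharedFeatures:
    """Toy 'encoder': lowercase word sets (stand-in for a neural encoder)."""
    def to_set(text: str) -> FrozenSet[str]:
        return frozenset(text.lower().split())
    return SharedFeatures(to_set(question), to_set(context), to_set(response))


def _ratio(part: FrozenSet[str], whole: FrozenSet[str]) -> float:
    return len(part) / len(whole) if whole else 0.0


# Each head is a cheap function of the shared features (toy heuristics).
HEADS: Dict[str, Callable[[SharedFeatures], float]] = {
    # Share of response words that also occur in the context.
    "context_adherence": lambda f: _ratio(f.response & f.context, f.response),
    # Share of context words that the response actually uses.
    "chunk_utilization": lambda f: _ratio(f.context & f.response, f.context),
    # Share of question words that the context covers.
    "context_relevance": lambda f: _ratio(f.context & f.question, f.question),
}


def evaluate(question: str, context: str, response: str) -> Dict[str, float]:
    """Encode the input once, then run every head on the same features."""
    features = encode(question, context, response)
    return {name: head(features) for name, head in HEADS.items()}


scores = evaluate(
    question="what is the refund window",
    context="the refund window is 30 days",
    response="refund window is 30 days",
)
for name, value in scores.items():
    print(f"{name}: {value:.2f}")
```

The design point is that the expensive step (encoding) runs once per input, so adding another evaluation task costs only one more head.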

    Conclusion

    The introduction of Galileo Luna marks a significant milestone in developing evaluation models for large language systems. Its high accuracy, cost efficiency, and low latency make it a valuable tool for ensuring the reliability and trustworthiness of AI-driven applications. By addressing the critical issue of hallucinations in LLMs, Luna paves the way for more robust and dependable language models in various industry settings.

    Check out the Paper and Blog. All credit for this research goes to the researchers of this project.

    The post Galileo Introduces Luna: An Evaluation Foundation Model to Catch Language Model Hallucinations with High Accuracy and Low Cost appeared first on MarkTechPost.
