
    LongRAG: A New Artificial Intelligence AI Framework that Combines RAG with Long-Context LLMs to Enhance Performance

    June 25, 2024

    Retrieval-Augmented Generation (RAG) methods enhance the capabilities of large language models (LLMs) by incorporating external knowledge retrieved from vast corpora. This approach is particularly beneficial for open-domain question answering, where detailed and accurate responses are crucial. By leveraging external information, RAG systems can overcome the limitations of relying solely on the parametric knowledge embedded in LLMs, making them more effective in handling complex queries.
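The retrieve-then-read loop described above can be sketched as follows. The corpus, the word-overlap retriever, and the `generate` stub are toy stand-ins for illustration only, not components of any real RAG system; a production setup would use a dense retriever and an actual LLM call.

```python
# Minimal, schematic RAG loop: retrieve external passages, then
# condition an "LLM" on them. Everything here is a toy stand-in.
from typing import List

CORPUS = [
    "The Eiffel Tower is in Paris.",
    "Water boils at 100 degrees Celsius at sea level.",
    "Python was created by Guido van Rossum.",
]

def _words(text: str) -> set:
    """Lowercase and strip punctuation for a crude word-overlap score."""
    return set("".join(c for c in text.lower() if c.isalnum() or c.isspace()).split())

def retrieve(question: str, corpus: List[str], k: int = 1) -> List[str]:
    """Toy retriever: rank passages by word overlap with the question."""
    q = _words(question)
    return sorted(corpus, key=lambda p: len(q & _words(p)), reverse=True)[:k]

def generate(question: str, context: List[str]) -> str:
    """Stand-in for an LLM call; a real system would prompt a model
    with the question plus the retrieved context."""
    return f"Answer grounded in: {context[0]}"

context = retrieve("Who created Python?", CORPUS)
answer = generate("Who created Python?", context)
```

The point of the sketch is the division of labor: the retriever supplies non-parametric knowledge that the generator alone may lack.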

    A significant challenge in RAG systems is the imbalance between the retriever and reader components. Traditional frameworks often use short retrieval units, such as 100-word passages, requiring the retriever to sift through large amounts of data. This design burdens the retriever heavily while the reader’s task remains relatively simple, leading to inefficiencies and potential semantic incompleteness due to document truncation. This imbalance restricts the overall performance of RAG systems, necessitating a re-evaluation of their design.

    Current methods in RAG systems include techniques like Dense Passage Retrieval (DPR), which focuses on finding precise, short retrieval units from large corpora. These methods often involve recalling many units and employing complex re-ranking processes to achieve high accuracy. While effective to some extent, these approaches still suffer from inherent inefficiency and incomplete semantic representation due to their reliance on short retrieval units.

    To address these challenges, the research team from the University of Waterloo introduced a novel framework called LongRAG. This framework comprises a “long retriever” and a “long reader” component, designed to process longer retrieval units of around 4K tokens each. By increasing the size of the retrieval units, LongRAG reduces the number of units from 22 million to 600,000, significantly easing the retriever’s workload and improving retrieval scores. This innovative approach allows the retriever to handle more comprehensive information units, enhancing the system’s efficiency and accuracy.
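The grouping step described above can be sketched like this. The whitespace-based token count and the greedy packing strategy are rough assumptions for illustration; the paper's actual tokenizer and grouping logic will differ.

```python
# Sketch: pack related documents into long retrieval units of
# roughly max_tokens each, as LongRAG's "long retriever" consumes.
# Token counting here is a crude whitespace approximation.
from typing import List

def group_into_units(docs: List[str], max_tokens: int = 4000) -> List[str]:
    units: List[str] = []
    current: List[str] = []
    count = 0
    for doc in docs:
        n = len(doc.split())  # crude token estimate
        if current and count + n > max_tokens:
            units.append(" ".join(current))  # flush the full unit
            current, count = [], 0
        current.append(doc)
        count += n
    if current:
        units.append(" ".join(current))
    return units

# Three ~1500-token documents pack into two units under a 4K budget.
docs = [("word " * 1500).strip() for _ in range(3)]
units = group_into_units(docs)
```

Fewer, larger units is what shrinks the search space from millions of passages to hundreds of thousands of units.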

    The LongRAG framework operates by grouping related documents into long retrieval units, which the long retriever then processes to identify relevant information. The retriever filters the top 4 to 8 units, which are then concatenated and fed into a long-context LLM, such as Gemini-1.5-Pro or GPT-4o, to extract the final answers. This method leverages the advanced capabilities of long-context models to process large amounts of text efficiently, ensuring a thorough and accurate extraction of information.

    In-depth, the methodology involves using an encoder to map the input question to a vector and a different encoder to map the retrieval units to vectors. The similarity between the question and the retrieval units is calculated to identify the most relevant units. The long retriever searches through these units, reducing the corpus size and improving the retriever’s precision. The retrieved units are then concatenated and fed into the long reader, which uses the context to generate the final answer. This approach ensures that the reader processes a comprehensive set of information, improving the system’s overall performance.
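The encode-score-concatenate flow in the paragraph above can be sketched as follows. The bag-of-words "encoders" and cosine scoring here are toy substitutes for the trained dense encoders a real LongRAG setup would use; only the shape of the pipeline is faithful.

```python
# Schematic retrieve-then-read: encode the question and each long
# unit as vectors, rank units by similarity, concatenate the top-k
# into one context for the long reader.
import math
from collections import Counter
from typing import Dict, List

def encode(text: str) -> Dict[str, int]:
    """Toy encoder: bag-of-words counts (stand-in for a dense encoder)."""
    return Counter(text.lower().split())

def cosine(a: Dict[str, int], b: Dict[str, int]) -> float:
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_top_units(question: str, units: List[str], k: int = 4) -> str:
    q_vec = encode(question)
    ranked = sorted(units, key=lambda u: cosine(q_vec, encode(u)), reverse=True)
    # Concatenate the top-k units into a single context for the reader
    return "\n\n".join(ranked[:k])

units = [
    "cats are mammals that purr and sleep",
    "rust is a systems programming language",
    "bread is baked from flour",
]
reader_context = retrieve_top_units(
    "what language is used for systems programming", units, k=1
)
```

In the real framework the reader is a long-context LLM prompted with `reader_context` plus the question; here the retrieval side is the part being illustrated.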

    LongRAG performs strongly. On the Natural Questions (NQ) dataset, it achieved an exact match (EM) score of 62.7%, a significant improvement over traditional methods. On the HotpotQA dataset, it reached an EM score of 64.3%. These results match the performance of state-of-the-art fine-tuned RAG models. The framework reduced the corpus size by 30 times and improved answer recall by approximately 20 percentage points compared to traditional methods, with an answer recall@1 score of 71% on NQ and 72% on HotpotQA.
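The EM and answer-recall metrics cited above boil down to normalized string matching. The sketch below follows the common open-domain QA convention (lowercase, strip punctuation and articles); it is a generic evaluation sketch, not the paper's exact scoring script.

```python
# Common open-domain QA metrics: exact match (EM) and answer recall@k.
import re
import string
from typing import List

def normalize(s: str) -> str:
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    s = "".join(ch for ch in s.lower() if ch not in string.punctuation)
    s = re.sub(r"\b(a|an|the)\b", " ", s)
    return " ".join(s.split())

def exact_match(prediction: str, gold_answers: List[str]) -> bool:
    """EM: the normalized prediction equals some normalized gold answer."""
    return normalize(prediction) in {normalize(g) for g in gold_answers}

def answer_recall_at_k(retrieved_units: List[str],
                       gold_answers: List[str], k: int) -> bool:
    """True if any gold answer appears in the top-k retrieved units."""
    blob = normalize(" ".join(retrieved_units[:k]))
    return any(normalize(g) in blob for g in gold_answers)
```

Recall@1 of 71% on NQ thus means: for 71% of questions, the single top retrieval unit already contains a gold answer string.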

    LongRAG’s ability to process long retrieval units preserves the semantic integrity of documents, allowing for more accurate and comprehensive responses. By reducing the burden on the retriever and leveraging advanced long-context LLMs, LongRAG offers a more balanced and efficient approach to retrieval-augmented generation. The research from the University of Waterloo provides valuable insights into modernizing RAG system design and points to further advancements in this field.

    In conclusion, LongRAG represents a significant step forward in addressing the inefficiencies and imbalances in traditional RAG systems. Employing long retrieval units and leveraging the capabilities of advanced LLMs enhances the accuracy and efficiency of open-domain question-answering tasks. This innovative framework improves retrieval performance and sets the stage for future developments in retrieval-augmented generation systems.

    Check out the Paper and GitHub. All credit for this research goes to the researchers of this project.

    The post LongRAG: A New Artificial Intelligence AI Framework that Combines RAG with Long-Context LLMs to Enhance Performance appeared first on MarkTechPost.
