
    Nearest Neighbor Speculative Decoding (NEST): An Inference-Time Revision Method for Language Models to Enhance Factuality and Attribution Using Nearest-Neighbor Speculative Decoding

    June 2, 2024

    Large language models (LLMs) handle a wide range of tasks and perform well across many applications. However, they struggle to generate accurate information, especially when the relevant knowledge is underrepresented in their training data. Retrieval augmentation addresses this by combining information retrieval and nearest-neighbor search over a non-parametric data store, which supports evidence-based and situated reasoning with LLMs. As a result, semi-parametric LMs are less prone to generating unsupported content.

    Several lines of work address these shortcomings. One is retrieval augmentation (RA), which uses external knowledge sources to improve LM performance on tasks that require deep understanding. Approaches such as REALM, RAG, and Atlas integrate the retrieval component into pre-training and fine-tuning for these downstream tasks. Another is speculative decoding, in which a small model generates drafts for a large model. The most closely related method is REST, which takes multiple drafts from a data store and uses a trie (prefix tree) to find the proposal distribution.
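To make the draft-and-verify idea behind speculative decoding concrete, here is a minimal sketch of one decoding step in its simplest greedy form. It is an illustration under stated assumptions, not the procedure of any specific paper: `draft_next` and `target_next` are hypothetical greedy next-token functions, and the target model is called token by token here, whereas real systems verify the whole draft in a single parallel forward pass.

```python
from typing import Callable, List

Token = int
# Hypothetical interface: maps a context to the greedy next token.
NextTokenFn = Callable[[List[Token]], Token]


def speculative_step(
    prefix: List[Token],
    draft_next: NextTokenFn,
    target_next: NextTokenFn,
    k: int = 4,
) -> List[Token]:
    """Return the tokens produced by one greedy draft-and-verify step."""
    # 1. The small draft model proposes k tokens autoregressively.
    draft: List[Token] = []
    for _ in range(k):
        draft.append(draft_next(prefix + draft))

    # 2. The large target model checks each drafted token in order.
    #    Matching tokens are accepted; the first mismatch is replaced
    #    by the target's own token and the step ends.
    accepted: List[Token] = []
    for tok in draft:
        expected = target_next(prefix + accepted)
        if tok == expected:
            accepted.append(tok)
        else:
            accepted.append(expected)
            break
    return accepted
```

When the draft agrees with the target, several tokens are emitted per step, which is where the speed-up comes from.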

    Researchers from FAIR at Meta, the University of Waterloo, Carnegie Mellon University, and the University of Chicago have proposed Nearest Neighbor Speculative Decoding (NEST). NEST is a new semi-parametric language modeling method that can integrate real-world text spans of any length into the generations of an existing LM, improving both generation quality and latency. NEST extends the standard kNN-LM method, which interpolates the output distribution of an LM with the distribution of potential next tokens derived from a corpus. NEST first adds a passage retrieval step, which reduces the need to store and search through all tokens in the corpus and balances search accuracy against efficiency.
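The kNN-LM interpolation mentioned above can be sketched as follows. This is a minimal illustration of the general kNN-LM idea, assuming each retrieved neighbor is a `(distance, next_token)` pair and that the mixing weight `lam` is given; the function names and the `temperature` parameter are illustrative choices, not taken from the NEST paper.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple


def knn_distribution(
    neighbors: List[Tuple[float, str]], temperature: float = 1.0
) -> Dict[str, float]:
    """Turn retrieved (distance, next_token) pairs into a distribution.

    Closer neighbors get more weight via a softmax over negative distance;
    weights of neighbors that share a next token are summed.
    """
    weights: Dict[str, float] = defaultdict(float)
    for dist, tok in neighbors:
        weights[tok] += math.exp(-dist / temperature)
    total = sum(weights.values())
    return {tok: w / total for tok, w in weights.items()}


def interpolate(
    p_lm: Dict[str, float], p_knn: Dict[str, float], lam: float
) -> Dict[str, float]:
    """Mix the two distributions: lam * p_knn + (1 - lam) * p_lm."""
    vocab = set(p_lm) | set(p_knn)
    return {
        t: lam * p_knn.get(t, 0.0) + (1.0 - lam) * p_lm.get(t, 0.0)
        for t in vocab
    }
```

For example, with `lam = 0.3` a token that the LM gives probability 0.5 and the retrieved neighbors give 2/3 ends up with mixture probability 0.3 × 2/3 + 0.7 × 0.5 = 0.55.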

    NEST generates content with three sub-steps at each inference step. These steps are:

    Confidence-based interpolation: A Relative Retrieval Confidence (RRC) score measures the uncertainty of the token retriever and serves as the interpolation coefficient for the output probability mixture.

    Dynamic span selection: NEST selects the best token predicted by the mixture probability and, when the token retrieval confidence exceeds a threshold, extends the selection to the span that continues from that token.

    Relaxed speculative decoding: When a span of multiple tokens is selected, it is evaluated against the mixture probability, and only a prefix that is highly likely under that probability is accepted.
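The three sub-steps can be sketched roughly as below. This is a simplified illustration, not the paper's implementation: `retrieval_confidence` is a hypothetical stand-in for the RRC score (the article does not give its formula), `spans` is an assumed mapping from a token to the corpus span that continues from it, and the fixed `min_prob` cutoff is a simplification of the relaxed acceptance rule.

```python
from typing import Callable, Dict, List, Sequence


def retrieval_confidence(scores: Sequence[float]) -> float:
    """Hypothetical stand-in for RRC: share of total score held by the top hit."""
    total = sum(scores)
    return max(scores) / total if total > 0 else 0.0


def mix(p_lm: Dict[str, float], p_knn: Dict[str, float], lam: float) -> Dict[str, float]:
    """Step 1: use the confidence as the interpolation coefficient."""
    vocab = set(p_lm) | set(p_knn)
    return {t: lam * p_knn.get(t, 0.0) + (1 - lam) * p_lm.get(t, 0.0) for t in vocab}


def select_span(
    p_mix: Dict[str, float],
    spans: Dict[str, List[str]],
    confidence: float,
    threshold: float,
) -> List[str]:
    """Step 2: take the best token; extend to its corpus span if confident."""
    best = max(p_mix, key=p_mix.get)
    if confidence > threshold and best in spans:
        return spans[best]
    return [best]


def accept_prefix(
    span: List[str],
    mix_prob_at: Callable[[List[str], str], float],
    min_prob: float,
) -> List[str]:
    """Step 3: keep the longest prefix whose tokens stay likely under the mixture.

    The first token was already chosen by the mixture, so it is always kept.
    """
    accepted = span[:1]
    for tok in span[1:]:
        if mix_prob_at(accepted, tok) < min_prob:
            break
        accepted.append(tok)
    return accepted
```

In this sketch a low-confidence retrieval falls back to ordinary single-token generation, while a high-confidence one can emit several corpus tokens in a single step.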

    NEST outperforms both the base LM and the standard kNN-LM in a zero-shot setting, using Llama-2-Chat models of different sizes on tasks such as text completion and factuality-aware generation. For example, NEST combined with the Llama-2-Chat 70B model shows a 42.3% improvement in ROUGE-1 on WikiText-103 and a 21.6% improvement in FActScore on Biography. Moreover, NEST makes long-form generation more efficient by producing multiple tokens at each time step, running 1.8 times faster at inference time with Llama-2-Chat 70B without affecting attribution or fluency.

    In conclusion, the researchers introduced NEST, an inference-time revision method for LMs that improves their factuality and attribution through nearest-neighbor speculative decoding. NEST improves both validation perplexity and the quality of free-form generation across nine different tasks. However, the proposed method has some limitations:

    NEST's output may still contain factual errors, depending on the accuracy of the first-stage passage retrieval and the second-stage token retrieval.

    Results could improve with fine-tuning on appropriate tasks, because the integrated system without fine-tuning might be sub-optimal.


    The post Nearest Neighbor Speculative Decoding (NEST): An Inference-Time Revision Method for Language Models to Enhance Factuality and Attribution Using Nearest-Neighbor Speculative Decoding appeared first on MarkTechPost.
