
    This AI Paper from Tel Aviv University Introduces GASLITE: A Gradient-Based Method to Expose Vulnerabilities in Dense Embedding-Based Text Retrieval Systems

    January 7, 2025

    Dense embedding-based text retrieval has become the cornerstone for ranking text passages in response to queries. The systems use deep learning models for embedding text into vector spaces that enable semantic similarity measurements. This method has been adopted widely in applications such as search engines and retrieval-augmented generation (RAG), where retrieving accurate and contextually relevant information is critical. These systems efficiently match queries with relevant content by building on learned representations, driving huge advancements in knowledge-intensive domains.
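
    As a rough illustration of the ranking step described above, the sketch below scores a few passages against a query by cosine similarity. The three-dimensional vectors and passage names are made up for the example; a real system would obtain its vectors from a trained embedding model.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_passages(query_vec, passage_vecs, k=10):
    """Return the ids of the k passages most similar to the query."""
    scored = [(cosine(query_vec, vec), pid) for pid, vec in passage_vecs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [pid for _, pid in scored[:k]]

# Hypothetical embeddings; in practice these come from the retrieval model.
passages = {
    "p_cats": [0.9, 0.1, 0.0],
    "p_dogs": [0.1, 0.9, 0.0],
    "p_tax": [0.0, 0.1, 0.9],
}
query = [0.8, 0.2, 0.1]

top = rank_passages(query, passages, k=2)
print(top)
```

    Because ranking depends only on vector similarity, any passage whose embedding lands close to the query embeddings is retrieved, whatever its content.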

    However, a major weakness of embedding-based retrieval systems is their susceptibility to manipulation by adversaries. These systems often build on public corpora, which are not immune to adversarial content. Malicious actors can inject crafted passages into the corpus so that the retrieval system ranks the adversarial entries above legitimate passages for the queries they target. This threatens the integrity of search results by spreading misinformation or introducing biased content, and it undermines the reliability of knowledge systems.

    Previous attacks have relied on simple poisoning techniques, such as stuffing passages with text from the targeted queries or embedding misleading information. Although these methods can succeed against a single targeted query, they are often ineffective when the attack must cover a diverse distribution of queries. Existing defenses also do not address the core vulnerabilities in embedding-based retrieval systems, leaving the systems open to more advanced and subtle attacks.

    Researchers at Tel Aviv University introduced GASLITE, a mathematically grounded, gradient-based optimization method for crafting adversarial passages. GASLITE performs better than previous techniques because it focuses precisely on the retrieval model’s embedding space rather than modifying content in the text. It aligns adversarial passages with targeted query distributions, which gives them high visibility in retrieval results. This makes it a potent tool for evaluating vulnerabilities in dense embedding-based systems.

    The GASLITE methodology is grounded in rigorous mathematical principles and innovative optimization techniques. It constructs adversarial passages from attacker-chosen prefixes combined with optimized triggers designed to maximize similarity to targeted query distributions. The optimization uses gradient calculations in the embedding space to find the best token substitutions. Unlike previous approaches, GASLITE does not modify existing corpus content or the model; it only generates text that manipulates the retrieval system’s ranking. This design makes it stealthy and effective: adversarial passages can blend directly into the corpus without being flagged by standard defenses.
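
    The prefix-plus-trigger idea can be sketched with a toy example. This is not the authors' implementation: it assumes a hypothetical linear "encoder" (the mean of random token vectors) and a greedy first-order token swap, whereas the actual method works against real embedding models. The vocabulary, dimensions, token ids, and function names are all invented for illustration.

```python
import random

rng = random.Random(0)
DIM, VOCAB = 8, 50
# Hypothetical token-embedding table: one random vector per token id.
EMB = [[rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(VOCAB)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def embed(tokens):
    """Toy encoder: mean of token vectors (a real model is a neural network)."""
    n = len(tokens)
    return [sum(EMB[t][d] for t in tokens) / n for d in range(DIM)]

def objective(tokens, query_vecs):
    """Average dot-product similarity between the passage and the queries."""
    p = embed(tokens)
    return sum(dot(p, q) for q in query_vecs) / len(query_vecs)

def craft_passage(prefix, trigger_len, query_vecs, max_steps=20):
    """Keep the prefix fixed and greedily swap trigger tokens using the gradient."""
    trigger = [0] * trigger_len  # start from an arbitrary token
    n = len(prefix) + trigger_len
    centroid = [sum(q[d] for q in query_vecs) / len(query_vecs) for d in range(DIM)]
    # For this linear encoder, the gradient of the objective with respect to
    # any single token vector is centroid / n.
    grad = [c / n for c in centroid]
    for _ in range(max_steps):
        best_gain, best_pos, best_tok = 1e-12, None, None
        for pos, cur in enumerate(trigger):
            for cand in range(VOCAB):
                delta = [EMB[cand][d] - EMB[cur][d] for d in range(DIM)]
                gain = dot(delta, grad)  # first-order estimate of the improvement
                if gain > best_gain:
                    best_gain, best_pos, best_tok = gain, pos, cand
        if best_pos is None:  # no substitution improves the objective
            break
        trigger[best_pos] = best_tok
    return prefix + trigger

# Hypothetical targeted queries and attacker-chosen prefix, as token ids.
query_vecs = [embed(q) for q in ([3, 7, 11], [3, 9, 12], [3, 7, 20])]
prefix = [40, 41, 42]

before = objective(prefix + [0] * 5, query_vecs)
crafted = craft_passage(prefix, 5, query_vecs)
after = objective(crafted, query_vecs)
print(before, after)
```

    In this toy setting the first-order estimate is exact because the encoder is linear. With a real model the gradient only approximates the effect of a swap, so candidate substitutions would need to be re-scored with the model itself.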

    The authors tested GASLITE on nine state-of-the-art retrieval models under various threat scenarios. The method consistently outperformed baseline approaches, achieving a 61-100% success rate in placing adversarial passages within the top 10 results for concept-specific queries, with top-10 visibility across most retrieval models. These results required minimal poisoning: adversarial passages made up just 0.0001% of the corpus. In single-query attacks, the method consistently ranked adversarial content as the top result, even under the most stringent conditions.
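
    To make these figures concrete, the sketch below computes a top-10 success rate and the number of passages implied by a 0.0001% poisoning rate. The corpus size, query ids, and rankings are hypothetical and are not taken from the paper's experiments.

```python
def success_at_k(rankings, adversarial_ids, k=10):
    """Fraction of queries with at least one adversarial passage in the top k."""
    hits = sum(
        1
        for ranked in rankings.values()
        if any(pid in adversarial_ids for pid in ranked[:k])
    )
    return hits / len(rankings)

# Hypothetical retrieval results: query id -> ranked passage ids.
rankings = {
    "q1": ["adv1", "p4", "p9"],
    "q2": ["p2", "adv1", "p7"],
    "q3": ["p5", "p6", "p8"],
    "q4": ["p1", "p3", "adv1"],
}
rate = success_at_k(rankings, {"adv1"}, k=10)

# 0.0001% is one passage per million, so a hypothetical corpus of
# 10,000,000 passages would contain 10 adversarial passages at that rate.
corpus_size = 10_000_000
adversarial_budget = corpus_size // 1_000_000

print(rate, adversarial_budget)
```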

    Further analysis of the factors behind GASLITE’s success showed that embedding-space geometry and the choice of similarity metric strongly influenced model susceptibility. Models using dot-product similarity were particularly vulnerable because GASLITE exploited this measure to achieve optimal alignment with targeted query distributions. The researchers also emphasized that models with anisotropic embedding spaces, where random text pairs produce high similarities, were more susceptible to attacks. This points to the importance of understanding embedding-space properties when designing retrieval systems.
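
    One simple way to probe anisotropy, following the description above, is to average the cosine similarity of randomly chosen embedding pairs. The sketch below uses synthetic vectors rather than real model outputs, and the helper name is an assumption made for this example.

```python
import math
import random

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def mean_random_pair_cosine(vectors, n_pairs=500, seed=0):
    """Average cosine similarity over randomly sampled pairs of distinct vectors."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_pairs):
        i, j = rng.sample(range(len(vectors)), 2)
        total += cosine(vectors[i], vectors[j])
    return total / n_pairs

rng = random.Random(1)
# Synthetic "isotropic" embeddings: directions spread evenly around the origin.
isotropic = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(200)]
# Synthetic "anisotropic" embeddings: every vector shares a large common offset.
anisotropic = [[3.0 + rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(200)]

iso_score = mean_random_pair_cosine(isotropic)
aniso_score = mean_random_pair_cosine(anisotropic)
print(iso_score, aniso_score)
```

    A score near zero suggests the vectors point in many different directions, while a score near one means unrelated texts already look similar to each other.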

    These findings underscore the need for strong defenses against adversarial manipulation in embedding-based retrieval systems. The authors recommend hybrid retrieval approaches that combine dense and sparse retrieval techniques, which can reduce the risks posed by methods such as GASLITE. GASLITE itself serves to expose the vulnerabilities of current retrieval systems and to pave the way for more secure and resilient technologies.
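
    A hybrid setup can be sketched by fusing a dense ranking with a sparse (keyword-based) ranking. The example below uses reciprocal rank fusion, a common fusion heuristic chosen here for illustration; the article does not say which combination scheme the authors have in mind, and the rankings are hypothetical.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists; each passage scores the sum of 1 / (k + rank)."""
    scores = {}
    for ranked in rankings:
        for rank, pid in enumerate(ranked, start=1):
            scores[pid] = scores.get(pid, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by passage id for determinism.
    return sorted(scores, key=lambda pid: (-scores[pid], pid))

# Hypothetical case: the adversarial passage tops the dense ranking but has
# no keyword overlap with the query, so the sparse retriever never returns it.
dense_ranking = ["adv", "p1", "p2", "p3"]
sparse_ranking = ["p1", "p2", "p3"]

fused = reciprocal_rank_fusion([dense_ranking, sparse_ranking])
print(fused)
```

    Here the passage that only the dense retriever favors drops below passages that both retrievers agree on. This is not a guarantee, since an adversarial passage could also be written to overlap lexically with the targeted queries.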

    The researchers call for urgent attention to the risks that such adversarial attacks pose to dense embedding-based systems. The minimal effort GASLITE needs to manipulate search results shows the potential severity of such attacks. However, by characterizing critical vulnerabilities and pointing to actionable defenses, this work provides valuable insights into improving the robustness and reliability of retrieval models.


    Check out the Paper and GitHub Page. All credit for this research goes to the researchers of this project.


    The post This AI Paper from Tel Aviv University Introduces GASLITE: A Gradient-Based Method to Expose Vulnerabilities in Dense Embedding-Based Text Retrieval Systems appeared first on MarkTechPost.
