
    Purdue University Researchers Introduce ETA: A Two-Phase AI Framework for Enhancing Safety in Vision-Language Models During Inference

    January 19, 2025

Vision-language models (VLMs) represent an advanced field within artificial intelligence, integrating computer vision and natural language processing to handle multimodal data. These models allow systems to understand and process images and text simultaneously, enabling applications like medical imaging, automated systems, and digital content analysis. Their ability to bridge the gap between visual and textual data has made them a cornerstone of multimodal intelligence research.

One of the main challenges in developing VLMs is guaranteeing the safety of their outputs. Visual inputs can carry malicious or unsafe content that evades a model's defense mechanisms, producing dangerous or insensitive answers. Textual inputs benefit from mature countermeasures, but the visual modality does not: visual embeddings are continuous rather than discrete, which leaves them vulnerable to adversarial manipulation. Safety evaluation therefore becomes much harder once inputs are multimodal.

Current approaches to VLM safety fall into two camps: fine-tuning-based and inference-based defenses. Fine-tuning methods, including supervised fine-tuning and reinforcement learning from human feedback, are effective but resource-intensive, demanding large amounts of data, labor, and compute, and they risk degrading the model's general utility. Inference-based methods use safety evaluators to check outputs against predefined criteria, but they focus mostly on textual inputs and neglect the safety implications of visual content. Unsafe visual inputs therefore pass through unevaluated, undermining model performance and reliability.

Researchers from Purdue University introduced the "Evaluating Then Aligning" (ETA) framework to address these issues. This inference-time method improves VLM safety without requiring extra data or fine-tuning. ETA overcomes the limitations of existing approaches by splitting the safety mechanism into two phases: multimodal evaluation and bi-level alignment. The researchers designed ETA as a plug-and-play solution that adapts to various VLM architectures while remaining computationally efficient.

The ETA framework works in two stages. The pre-generation evaluation stage checks the safety of visual inputs with a predefined safety guard based on CLIP scores, filtering out potentially harmful visual content before a response is generated. In the post-generation evaluation stage, a reward model assesses the safety of the textual output. If unsafe behavior is detected, the framework applies two alignment strategies: shallow alignment uses an interference prefix to shift the model's generative distribution toward safer outputs, while deep alignment performs sentence-level optimization to refine the response further. Together, these steps preserve both the safety and the utility of the generated outputs (a minimal sketch of the loop follows below).
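To make the evaluate-then-align control flow concrete, here is a minimal Python sketch of the loop as described above. Every name in it (clip_unsafe_score, reward_score, generate), the thresholds, the gating logic, and the best-of-N stand-in for deep alignment are illustrative assumptions, not the authors' actual implementation.

```python
# Minimal sketch of ETA's two-phase loop. All functions here are
# hypothetical stand-ins, not the authors' API; the gating logic and
# the best-of-N "deep alignment" proxy are assumptions.

SAFE_PREFIX = "As an AI assistant, "  # interference prefix for shallow alignment

def clip_unsafe_score(image) -> float:
    """Stand-in for a CLIP-based score of how closely the image
    matches a set of unsafe concepts (pre-generation guard)."""
    return 0.0  # placeholder

def reward_score(prompt: str, response: str) -> float:
    """Stand-in for the post-generation safety reward model
    (higher means safer)."""
    return 1.0  # placeholder

def generate(image, prompt: str, prefix: str = "") -> str:
    """Stand-in for VLM decoding, optionally seeded with a prefix."""
    return prefix + "[model response]"

def eta_generate(image, prompt: str,
                 clip_threshold: float = 0.5,
                 reward_threshold: float = 0.0) -> str:
    # Phase 1 (evaluate): pre-generation check on the visual input,
    # then a draft response scored by the reward model.
    image_flagged = clip_unsafe_score(image) > clip_threshold
    draft = generate(image, prompt)
    text_flagged = reward_score(prompt, draft) < reward_threshold

    if not (image_flagged or text_flagged):
        return draft  # both checks passed; keep the unmodified draft

    # Phase 2 (align): shallow alignment shifts generation with the
    # safe prefix; deep alignment is approximated here by sentence-level
    # best-of-N re-ranking under the reward model.
    candidates = [generate(image, prompt, prefix=SAFE_PREFIX) for _ in range(4)]
    return max(candidates, key=lambda r: reward_score(prompt, r))

print(eta_generate(image=None, prompt="Describe this image."))
```

With real scoring models in place of the placeholders, the same structure applies: safe inputs pass through unchanged, and only flagged cases pay the extra alignment cost, which is what keeps the method cheap at inference time.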

The researchers tested ETA extensively against multiple benchmarks. The framework reduced the unsafe response rate by 87.5% under cross-modality attacks and significantly outperformed existing methods such as ECSO. On the SPA-VL Harm dataset, ETA cut the unsafe rate from 46.04% to 16.98%. On multimodal benchmarks such as MM-SafetyBench and FigStep, ETA consistently handled adversarial and harmful visual inputs more safely. Notably, it achieved a 96.6% win-tie rate in GPT-4 evaluations of helpfulness, showing that it preserves model utility while enhancing safety. The framework is also efficient, adding only 0.1 seconds to inference time compared with the 0.39-second overhead of competing methods like ECSO.

The proposed method achieves safety and utility by targeting the root cause of VLM vulnerabilities: the continuous nature of visual token embeddings. By mapping visual token embeddings into the discrete textual embedding space, ETA aligns visual and textual data so that existing text-based safety mechanisms can operate effectively. Both visual and textual inputs are thus subjected to rigorous safety checks, making it far harder for harmful content to slip through.
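The embedding-mapping idea can be illustrated with a short, hedged sketch: each continuous visual token embedding is snapped to its nearest neighbor in the model's discrete text embedding table, so text-oriented safety tooling can inspect it. The tensor shapes, the cosine-similarity metric, and the function name are assumptions for illustration; the paper's actual mapping may differ.

```python
# Hedged sketch: discretize continuous visual token embeddings by
# nearest-neighbor lookup in the text embedding table. Shapes and the
# cosine-similarity choice are assumptions, not the paper's exact method.
import torch
import torch.nn.functional as F

def discretize_visual_tokens(visual_emb: torch.Tensor,
                             text_emb_table: torch.Tensor) -> torch.Tensor:
    """visual_emb: (num_visual_tokens, d); text_emb_table: (vocab, d).
    Returns the vocabulary index of the nearest text embedding
    for each visual token."""
    v = F.normalize(visual_emb, dim=-1)
    t = F.normalize(text_emb_table, dim=-1)
    sim = v @ t.T                 # cosine similarity, (tokens, vocab)
    return sim.argmax(dim=-1)     # nearest discrete text token per visual token

# Example: 16 visual tokens, a 1000-word vocabulary, 64-dim embeddings.
ids = discretize_visual_tokens(torch.randn(16, 64), torch.randn(1000, 64))
print(ids.shape)  # torch.Size([16])
```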

With this work, the research team provides a scalable and efficient solution to one of the most challenging problems in multimodal AI. The ETA framework shows how strategic evaluation and alignment can substantially improve VLM safety while retaining the models' general capabilities. The advance addresses current safety concerns and lays a foundation for deploying VLMs with far greater confidence in real applications.


Check out the Paper and GitHub Page. All credit for this research goes to the researchers of this project.


    The post Purdue University Researchers Introduce ETA: A Two-Phase AI Framework for Enhancing Safety in Vision-Language Models During Inference appeared first on MarkTechPost.
