Close Menu
    DevStackTipsDevStackTips
    • Home
    • News & Updates
      1. Tech & Work
      2. View All

      Sunshine And March Vibes (2025 Wallpapers Edition)

      May 16, 2025

      The Case For Minimal WordPress Setups: A Contrarian View On Theme Frameworks

      May 16, 2025

      How To Fix Largest Contentful Paint Issues With Subpart Analysis

      May 16, 2025

      How To Prevent WordPress SQL Injection Attacks

      May 16, 2025

      Microsoft has closed its “Experience Center” store in Sydney, Australia — as it ramps up a continued digital growth campaign

      May 16, 2025

      Bing Search APIs to be “decommissioned completely” as Microsoft urges developers to use its Azure agentic AI alternative

      May 16, 2025

      Microsoft might kill the Surface Laptop Studio as production is quietly halted

      May 16, 2025

      Minecraft licensing robbed us of this controversial NFL schedule release video

      May 16, 2025
    • Development
      1. Algorithms & Data Structures
      2. Artificial Intelligence
      3. Back-End Development
      4. Databases
      5. Front-End Development
      6. Libraries & Frameworks
      7. Machine Learning
      8. Security
      9. Software Engineering
      10. Tools & IDEs
      11. Web Design
      12. Web Development
      13. Web Security
      14. Programming Languages
        • PHP
        • JavaScript
      Featured

      The power of generators

      May 16, 2025
      Recent

      The power of generators

      May 16, 2025

      Simplify Factory Associations with Laravel’s UseFactory Attribute

      May 16, 2025

      This Week in Laravel: React Native, PhpStorm Junie, and more

      May 16, 2025
    • Operating Systems
      1. Windows
      2. Linux
      3. macOS
      Featured

      Microsoft has closed its “Experience Center” store in Sydney, Australia — as it ramps up a continued digital growth campaign

      May 16, 2025
      Recent

      Microsoft has closed its “Experience Center” store in Sydney, Australia — as it ramps up a continued digital growth campaign

      May 16, 2025

      Bing Search APIs to be “decommissioned completely” as Microsoft urges developers to use its Azure agentic AI alternative

      May 16, 2025

      Microsoft might kill the Surface Laptop Studio as production is quietly halted

      May 16, 2025
    • Learning Resources
      • Books
      • Cheatsheets
      • Tutorials & Guides
    Home»Development»Artificial Intelligence»Microsoft reveal “Skeleton Key Jailbreak” which works across different AI models

    Microsoft reveal “Skeleton Key Jailbreak” which works across different AI models

    June 28, 2024

    Microsoft security researchers have discovered a new way to manipulate AI systems into ignoring their ethical constraints and generating harmful, unrestricted content. 

    This “Skeleton Key” jailbreak uses a series of prompts to gaslight the AI into believing it should comply with any request, no matter how unethical. 

    It’s remarkably easy to execute. The attacker simply reframed their request as coming from an “advanced researcher” requiring “uncensored information” for “safe educational purposes.”

    When exploited, these AIs readily provided information on topics like explosives, bioweapons, self-harm, graphic violence, and hate speech.

    “The Skeleton Key” is a remarkably simple jailbreak. Source: Microsoft.

    The compromised models included Meta’s Llama3-70b-instruct, Google’s Gemini Pro, OpenAI’s GPT-3.5 Turbo and GPT-4o, Anthropic’s Claude 3 Opus, and Cohere’s Commander R Plus. 

    Among the tested models, only OpenAI’s GPT-4 demonstrated resistance. Even then, it could be compromised if the malicious prompt was submitted through its application programming interface (API).

    Despite models becoming more complex, jailbreaking them remains quite straightforward. Since there are many different forms of jailbreaks, it’s nearly impossible to combat them all. 

    In March 2024, a team from the University of Washington, Western Washington University, and Chicago University published a paper on “ArtPrompt,” a method that bypasses an AI’s content filters using ASCII art  – a graphic design technique that creates images from textual characters.

    In April, Anthropic highlighted another jailbreak risk stemming from the expanding context windows of language models. For this type of jailbreak, an attacker feeds the AI an extensive prompt containing a fabricated back-and-forth dialogue.

    The conversation is loaded with queries on banned topics and corresponding replies showing an AI assistant happily providing the requested information. After being exposed to enough of these fake exchanges, the targeted model can be coerced into breaking its ethical training and complying with a final malicious request.

    As Microsoft explains in their blog post, jailbreaks reveal the need to fortify AI systems from every angle:

    Implementing sophisticated input filtering to identify and intercept potential attacks, even when disguised
    Deploying robust output screening to catch and block any unsafe content the AI generates
    Meticulously designing prompts to constrain an AI’s ability to override its ethical training
    Utilizing dedicated AI-driven monitoring to recognize malicious patterns across user interactions

    But the truth is, Skeleton Key is a simple jailbreak. If AI developers can’t protect that, what hope is there for some more complex approaches?

    Some vigilante ethical hackers, like Pliny the Prompter, have been featured in the media for their work in exposing how vulnerable AI models are to manipulation.

    honored to be featured on @BBCNews! pic.twitter.com/S4ZH0nKEGX

    — Pliny the Prompter (@elder_plinius) June 28, 2024

    It’s worth stating that this research was, in part, an opportunity to market Microsoft’s Azure AI new safety features like Content Safety Prompt Shields.

    These assist developers in preemptively testing for and defending against jailbreaks. 

    But even so, Skeleton Key reveals again how vulnerable even the most advanced AI models can be to the most basic manipulation.

    The post Microsoft reveal “Skeleton Key Jailbreak” which works across different AI models appeared first on DailyAI.

    Source: Read More 

    Hostinger
    Facebook Twitter Reddit Email Copy Link
    Previous Article10 Best Opera GX Mods to Personalize Your Browser
    Next Article As AI auditions for role in tomorrow’s entertainment industry, workers race for protections

    Related Posts

    Security

    Nmap 7.96 Launches with Lightning-Fast DNS and 612 Scripts

    May 17, 2025
    Common Vulnerabilities and Exposures (CVEs)

    CVE-2025-48187 – RAGFlow Authentication Bypass

    May 17, 2025
    Leave A Reply Cancel Reply

    Continue Reading

    Explaining Eleventy (11ty) – The Beginner-Friendly Static Site Generator

    Development

    U.S. Releases High-Profile Russian Hackers in Diplomatic Prisoner Exchange

    Development

    CVE-2025-1948 – Jetty HTTP/2 Buffer Overflow

    Common Vulnerabilities and Exposures (CVEs)

    Four Perficient Colleagues Named 2024 CRN Women of the Channel

    Development

    Highlights

    Smashing Security podcast #370: The closed loop conundrum, default passwords, and Baby Reindeer

    May 1, 2024

    The UK Government takes aim at IoT devices shipping with weak or default passwords, an…

    Unity Catalog, the Well-Architected Lakehouse and Performance Efficiency

    August 31, 2024

    Identity theft – six tips to help keep yours safe

    April 9, 2025

    This AI paper from DeepSeek-AI Explores How DeepSeek-V3 Delivers High-Performance Language Modeling by Minimizing Hardware Overhead and Maximizing Computational Efficiency

    May 17, 2025
    © DevStackTips 2025. All rights reserved.
    • Contact
    • Privacy Policy

    Type above and press Enter to search. Press Esc to cancel.