
    LLM agents can autonomously exploit one-day vulnerabilities

    April 25, 2024

    University of Illinois Urbana-Champaign (UIUC) researchers found that AI agents powered by GPT-4 can autonomously exploit cybersecurity vulnerabilities.

    As AI models become more powerful, their dual-use nature offers the potential for good and bad in equal measure. LLMs like GPT-4 are increasingly being used to commit cybercrime, with Google forecasting that AI will play a big role in committing and preventing these attacks.

    The threat of AI-powered cybercrime has been elevated as LLMs move beyond simple prompt-response interactions and act as autonomous AI agents.

    In their paper, the researchers explained how they tested the capability of AI agents to exploit identified “one-day” vulnerabilities.

    A one-day vulnerability is a security flaw in a software system that has been officially identified and disclosed to the public but has not yet been fixed or patched by the software’s creators.

    During this time, the software remains vulnerable, and bad actors with the appropriate skills can take advantage.

When a one-day vulnerability is identified, it is described in detail using the Common Vulnerabilities and Exposures (CVE) standard. A CVE entry is meant to document the specifics of the vulnerability so it can be fixed, but it also tells bad actors exactly where the security gaps are.
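To make that dual-use point concrete, here is a minimal Python sketch of the kind of structured detail a CVE record exposes. The record below is a trimmed, made-up example loosely following the shape of NVD's JSON feed (the CVE ID, description, and score are invented for illustration) — the same fields that help defenders prioritize a patch hand an attacker a precise target:

```python
import json

# A trimmed, hypothetical record loosely in the shape of the NVD 2.0 JSON feed;
# the CVE ID, description, and score below are illustrative, not a real entry.
record = json.loads("""
{
  "cve": {
    "id": "CVE-2024-0001",
    "descriptions": [
      {"lang": "en",
       "value": "SQL injection in the login endpoint allows remote attackers to read the user table."}
    ],
    "metrics": {"cvssMetricV31": [{"cvssData": {"baseScore": 9.8, "baseSeverity": "CRITICAL"}}]}
  }
}
""")

cve = record["cve"]
description = next(d["value"] for d in cve["descriptions"] if d["lang"] == "en")
score = cve["metrics"]["cvssMetricV31"][0]["cvssData"]["baseScore"]
print(cve["id"], score)
print(description)
```

An agent given this record knows the component, the flaw class, and the severity before it sends a single request.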

    We showed that LLM agents can autonomously hack mock websites, but can they exploit real-world vulnerabilities?

    We show that GPT-4 is capable of real-world exploits, where other models and open-source vulnerability scanners fail.

    Paper: https://t.co/utbmMdYfmu

    1/7 https://t.co/SAhdvZc8le

    — Daniel Kang (@daniel_d_kang) April 16, 2024

    The experiment

    The researchers created AI agents powered by GPT-4, GPT-3.5, and 8 other open-source LLMs.

They gave the agents access to tools, the CVE descriptions, and the ReAct agent framework. ReAct bridges the gap between the LLM's text output and the outside world, letting the model reason about a task and then invoke other software and systems to act on it.

    System diagram of the LLM agent. Source: arXiv
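As a rough illustration of the ReAct pattern (not the authors' actual agent), each step interleaves a model "thought" with a tool call, feeding the observation back into the prompt. Everything in this sketch — the tool name, the task, and the canned stand-in for the LLM — is invented:

```python
# Minimal ReAct-style loop. A scripted fake_llm stands in for a real model;
# the "fetch" tool and its output are invented for illustration.

def fake_llm(prompt: str) -> str:
    # A real agent would call an LLM API here; we script two turns.
    if "Observation:" not in prompt:
        return "Thought: I should fetch the advisory.\nAction: fetch[CVE-2024-0001]"
    return "Thought: I have what I need.\nAction: finish[SQL injection in login endpoint]"

def fetch(query: str) -> str:
    return f"Advisory text for {query}"

TOOLS = {"fetch": fetch}

def react(task: str, max_steps: int = 5) -> str:
    prompt = f"Task: {task}"
    for _ in range(max_steps):
        reply = fake_llm(prompt)
        action = reply.rsplit("Action: ", 1)[1]
        name, arg = action.split("[", 1)
        arg = arg.rstrip("]")
        if name == "finish":
            return arg
        observation = TOOLS[name](arg)  # run the tool, feed the result back
        prompt += f"\n{reply}\nObservation: {observation}"
    return "gave up"

print(react("Summarize CVE-2024-0001"))
```

The loop itself is only a few lines, which is consistent with the researchers' point below that the whole agent took 91 lines of code.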

    The researchers created a benchmark set of 15 real-world one-day vulnerabilities and set the agents the objective of attempting to exploit them autonomously.

    GPT-3.5 and the open-source models all failed in these attempts, but GPT-4 successfully exploited 87% of the one-day vulnerabilities.

    After removing the CVE description, the success rate fell from 87% to 7%. This suggests GPT-4 can exploit vulnerabilities once provided with the CVE details, but isn’t very good at identifying the vulnerabilities without this guidance.
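The reported rates line up with the 15-vulnerability benchmark, assuming 13 and 1 successful exploits respectively (success counts inferred here from the rounded percentages, not stated in this article):

```python
# Sanity-check the headline rates against the 15-item benchmark.
# 13 and 1 successes are inferred from the rounded percentages.
total = 15
with_cve = 13 / total      # ≈ 0.867
without_cve = 1 / total    # ≈ 0.067
print(round(with_cve * 100), round(without_cve * 100))  # 87 7
```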

    Implications

    Cybercrime and hacking used to require special skill sets, but AI is lowering the bar. The researchers said that creating their AI agent only required 91 lines of code.

    As AI models advance, the skill level required to exploit cybersecurity vulnerabilities will continue to decrease. The cost to scale these autonomous attacks will keep dropping too.

When the researchers tallied the API costs for their experiment, the GPT-4 agent averaged $8.80 per exploit. They estimate that a cybersecurity expert charging $50 an hour would cost around $25 per exploit.

This means an LLM agent is already roughly 2.8 times cheaper than human labor, and far easier to scale than finding human experts. Once GPT-5 and other more powerful LLMs are released, these capability and cost disparities will only widen.
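The cost comparison works out as follows (the half-hour-per-exploit figure is implied by the $50/hour rate and the $25 estimate, not stated directly):

```python
# Cost comparison from the figures quoted above.
agent_cost = 8.80    # average GPT-4 API cost per exploit, USD
expert_rate = 50.0   # cybersecurity expert, USD per hour
expert_cost = 25.0   # estimated human cost per exploit, USD

hours_implied = expert_cost / expert_rate  # 0.5 hours per exploit
ratio = expert_cost / agent_cost           # ≈ 2.84, reported as 2.8x
print(hours_implied, round(ratio, 1))  # 0.5 2.8
```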

    The researchers say their findings “highlight the need for the wider cybersecurity community and LLM providers to think carefully about how to integrate LLM agents in defensive measures and about their widespread deployment.”

    The post LLM agents can autonomously exploit one-day vulnerabilities appeared first on DailyAI.
