    How to Protect Your GitHub Repos Against Malicious Clones

    July 16, 2025

    The world of open-source development comes with various cyber threats. GitHub is still facing a type of attack that has been ongoing since last year, in which attackers mirror a huge number of repositories. So as it turns out… the clone wars are not over!

    If you haven’t heard about what’s going on:

    GitHub is struggling to contain an ongoing attack that’s flooding the site with millions of code repositories. These repositories contain obfuscated malware that steals passwords and cryptocurrency from developer devices. … The result is millions of forks with names identical to the original one.

    – Dan Goodin, Ars Technica

    Because search engines and GitHub’s own search rankings favor recent activity, these cloned repositories often float to the top – then they lure unsuspecting developers into pulling code that may contain malware.

    One of my repositories has been targeted by such an attack, prompting me to monitor it closely. This guide offers tips to spot malicious repository clones before they catch you off guard.

    Table of Contents

    1. What is a Repository Confusion Attack?

      • Supply Chain Attacks
    2. 🛡️ Basic Mitigation Strategies

      • Verify the contributors’ profiles

      • Search for clone repositories

      • Examine the commit pattern

      • Examine the commit history

      • Examine the commit contents

      • Compare the concerned files

      • Some information about the malware

    3. Action Time

    4. Conclusion

    5. More Resources

    What is a Repository Confusion Attack?

    A repository confusion attack involves:

    • Cloning legitimate repositories.

    • Injecting malicious code into the clone.

    • Uploading the clone.

    • Spreading through various unaware actors.

    Supply Chain Attacks

    If you search for repository confusion on the internet, you’ll find out it’s a type of supply chain attack.

    A supply chain attack is an indirect threat where hackers try infiltrating a system by targeting a trusted third-party or software component, rather than attacking the primary target directly.

    It’s not the first time this has happened. Before GitHub was targeted, PyPI was attacked in 2023 with fake packages posing as legitimate ones. These packages lured negligent pip users into downloading malicious payloads (in most cases containing infostealer malware).

    🛡️ Basic Mitigation Strategies

    Before using any repository, make sure you follow these steps and take these precautions.

    Verify the contributors’ profiles

    That’s a first check: if you see a rather empty GitHub profile – one without reputation that contains just one repository but with a lot of daily commits to it – well, that’s a bit suspicious.

    In the fake repository, the original author will be listed as a contributor, too. Check that profile. You should be able to find the legitimate repository and do some comparisons.

    GitHub screenshot of a repository contributors

    In the above screenshot you can see solotech143, my evil doppelgänger (he’s been taken down since).

    Search for clone repositories

    You can do a GitHub search by repository name and sort the results by most recent first. Malicious repositories tend to appear at the top of the search results because they are updated more frequently. The original repository might be hidden deeper in the search results.

    GitHub clone search results.

    It’s like clone wars.

    This is where it’s dangerous: users generally click on the first few search results, and in that type of attack, you’re almost guaranteed to see the attacker’s fake repository at the top of the results. The attacker achieves that by giving the fake repository regular fresh commits (and sometimes even a few stars!).
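If you prefer to run this search from a script, the GitHub REST search endpoint accepts the same idea: search by name and sort by last update. The sketch below only builds the request URL and filters a decoded response, so it makes no network calls. The helper names and the `example-project` repository name are hypothetical.

```python
from urllib.parse import urlencode

GITHUB_SEARCH_ENDPOINT = "https://api.github.com/search/repositories"


def build_clone_search_url(repo_name: str, per_page: int = 20) -> str:
    """Build a search URL listing repositories with this name, newest activity first."""
    params = {
        "q": f"{repo_name} in:name",  # match the repository name only
        "sort": "updated",
        "order": "desc",
        "per_page": per_page,
    }
    return f"{GITHUB_SEARCH_ENDPOINT}?{urlencode(params)}"


def same_name_repos(items: list[dict], repo_name: str) -> list[str]:
    """From the `items` array of a search response, keep exact name matches.

    Returns `owner/name` strings, so identical names under different
    owners stand out.
    """
    wanted = repo_name.lower()
    return [item["full_name"] for item in items if item["name"].lower() == wanted]
```

Several owners publishing a repository with exactly the same name is not proof of an attack (forks do that too), but it tells you which accounts deserve a closer look.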

    In my case, the original repository is a submission for the HackaViz 2025 competition. Hackathons offer a good attack surface because, beyond the fact that they draw niche communities, they are also time-sensitive.

    Now, let’s move forward a year and imagine HackaViz 2026 is starting soon. The attacker has easily outranked the untouched original submission. Which repository is most likely to be visited when future competitors – unaware of the scam – look for previous submissions?

    Examine the commit pattern

    Here’s when things take a weird turn. Malicious clones are run by automated agents, so the commit history fits a pattern that is rather unusual for a human. Of course, you can automate commits for many legitimate reasons, but legitimate automation always follows a clear goal, and there is always a human touch at some point. In this case, the commits do not add up.

    Let’s see how that looks in the screenshots below:

    Regular like a clock…

    A GitHub screenshot of a very active contribution activity.

    … and hyperactive!

    Examine the commit history

    You can’t! And that’s the weird part. You’re just able to see the last and the initial commit. So why is it hiding all of them? Do you like it when someone hides things from you?

    A GitHub commit history screenshot for one day.

    For July 10th, we should be able to see 11 commits. Where are the other ten?

    A GitHub commit history screenshot for a whole period.

    Well, you can only check the first and last commit. That is not a lot for a repository that has more than 2000 commits registered.

    Examine the commit contents

    Well, since I can always check the last commit, I checked some of them. They share the same pattern: the bot is constantly looping over the README file doing the same modifications. As you can see in the screenshot below, it’s updating the file with links to an infected release.

    A GitHub screenshot of commits to a malicious repository.

    Above you can see an AI agent stuck in the README loop of change.

    Human edits are more varied. In a human-driven project, you will see a large mix of commits: feature commits, exploratory experiments, bug fixes, styling tweaks, and sometimes reverts. A bot clone will often just overwrite files, bump versions, or re-inject the same malicious payload repeatedly with no real contribution to the codebase.
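The contrast between varied human commits and a bot looping over one file can be approximated by measuring how many commits touch nothing but the README. The sketch below assumes each commit is represented as a list of changed file paths (for instance parsed from `git log --name-only`). The function name and the input shape are hypothetical.

```python
def readme_only_ratio(commits: list[list[str]]) -> float:
    """Return the share of commits that change README files and nothing else.

    `commits` holds one list of changed paths per commit. Empty commits
    are counted in the total but never as README-only.
    """
    if not commits:
        return 0.0

    def is_readme(path: str) -> bool:
        # Compare on the file name only, so docs/README.md also counts.
        return path.rsplit("/", 1)[-1].lower().startswith("readme")

    readme_only = sum(
        1 for files in commits if files and all(is_readme(path) for path in files)
    )
    return readme_only / len(commits)
```

A ratio close to 1.0 over hundreds of commits matches the loop shown in the screenshot above. Documentation-only repositories can score high too, so the number is most meaningful for projects that are supposed to contain code.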

    Compare the concerned files

    This is where common sense comes in handy. So, you have two READMEs:

    1. The first consists of AI-generated content that is cluttered with emojis and low-value information. It is designed solely to entice you into clicking the download link of the release.

    2. The other follows best practices for creating a good README file. It is accurate and well-structured and functions as a valuable helper and explainer to the code. It also goes deep into the most important aspects of the project. This is usually a good sign that a repository is organic and genuine.

    Some information about the malware

    What do we have so far? Well, a suspicious link in a phishy, AI-generated README file that is consistent with a very suspicious pattern in the commit history.

    Now, let’s have a closer look at that dubious release and let’s see what an online antivirus scanner might reveal about it.

    A GitHub screenshot of commits to a malicious repository.

    The malware is packed only in the miniature-fortnight-v1.7.6.zip release.

    A malware analysis result.

    Above you can see the result of a scan with an online scanner.

    The .zip file contains only four files:

    • config.txt

    • launch.bat

    • lua51.dll

    • luajit.exe

    These files are totally unrelated to the source project (a Python data science project with Jupyter notebooks combined with a React app using three.js).

    I will not go into detail in this article. But for the curious ones, it’s an infostealer malware (a malware that will exfiltrate your credentials and other precious information about your configuration) similar to the one described in detail here.
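You can list the contents of a downloaded archive without extracting or running anything, which is a safe first look before sending it to a scanner. The sketch below uses Python's standard `zipfile` module. The function name and the set of suffixes treated as executable are assumptions for illustration, not a complete list.

```python
import zipfile
from pathlib import PurePosixPath

# Assumed, non-exhaustive set of suffixes worth a second look on Windows.
EXECUTABLE_SUFFIXES = {".exe", ".dll", ".bat", ".cmd", ".ps1", ".scr", ".vbs"}


def suspicious_members(archive) -> list[str]:
    """List archive entries with executable-style suffixes.

    `archive` may be a path or a binary file-like object. Only the
    archive's index is read, so nothing is extracted or executed.
    """
    with zipfile.ZipFile(archive) as zf:
        return sorted(
            name
            for name in zf.namelist()
            if PurePosixPath(name).suffix.lower() in EXECUTABLE_SUFFIXES
        )
```

For an archive with the four files listed above, this would report `launch.bat`, `lua51.dll`, and `luajit.exe`. A release for a Python and React project that ships only a batch launcher and binaries is a strong hint that something is off.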

    Action Time

    If you discover a potentially malicious repository, here are some steps you can take:

    1. Document some evidence.

    2. Notify the original repository maintainers.

    3. Report the malicious clone to GitHub.

    Reporting a repository or a profile on GitHub is easy and fast. Go to the user’s profile page, click “Block or report” in the left sidebar and choose “Report abuse” in the pop-up. You will have to complete a short contact form with some details about the behavior before submitting. If needed, you can find more information on GitHub.

    Conclusion

    This is a description of just one attack, from the perspective of someone who found out that one of his repositories had been targeted. There are likely cases of more sophisticated attacks. But the clone repository flood we can see on GitHub is definitely massive, low-quality automation. Quantity over quality.
    To be honest, I’m quite surprised algorithms crafted at GitHub didn’t manage to spot this one.

    This also raises questions related to AI.

    • What happens when LLMs are trained on malicious content? That’s a more general question about AI poisoning.

    • A human might easily spot the patterns and the low-quality content for now. But...

      • Imagine you are using coding agents, many of them. Will the agents pick up the malicious clone instead of the original one? How can the repositories be told apart from an automaton’s perspective?

      • The attackers will refine their tactics, making the clones more human-like and therefore luring us more easily into their traps.

    • This is really a situation that makes me wonder about the early days of Google. Back then, the company had to fight huge amounts of spam due to keyword stuffing and manipulative SEO tactics. Will big tech companies have to go through a Florida update moment to face the rise of AI-generated spam?

    More Resources

    • A detailed description of the attack

    • Complete safety recommendations

    Stay Informed, Stay Secure!

    A cheat-sheet is also available on my GitHub. Feel free to contribute to it!

    Source: freeCodeCamp Programming Tutorials: Python, JavaScript, Git & More 
