
    Perplexity AI embroiled in controversy over alleged web scraping abuse

    June 30, 2024

    Perplexity AI has found itself at the center of a firestorm over its data collection practices. 

    Perplexity essentially fuses a search engine with generative AI, returning AI-generated content related to the user’s search query.  

    Doing this requires collecting content from numerous websites, and Perplexity is alleged to have scraped sites that explicitly prohibit it.

    The scandal erupted on June 11 when Forbes reported that Perplexity had lifted an entire article from its site, complete with custom illustrations, and repurposed it with only minimal attribution. 

    Not long after, WIRED conducted an investigation that uncovered evidence of Perplexity scraping content from websites that forbid automated data collection. 

    A website can ask web crawlers not to scrape its content through a file called “robots.txt.”

    The file implements the Robots Exclusion Protocol, a convention for communicating with web crawlers and other automated bots. It’s a simple text file placed at the root of a website that specifies which pages or sections should not be accessed or scraped.

    The robots.txt file has been a widely respected convention since the early days of the web. It helps website owners signal how their content may be used and discourage unwanted data collection.

    Although not legally binding, it has long been considered best practice for web crawlers to follow the instructions outlined in a website’s robots.txt file.
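To make the mechanism concrete, here is a minimal sketch of how a well-behaved crawler might check robots.txt rules before fetching a page, using Python's standard `urllib.robotparser` module. The robots.txt content, the bot names (`ExampleBot`, `OtherBot`) and the `example.com` URLs are hypothetical, chosen only for illustration; they are not taken from Perplexity, Forbes, WIRED, or any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
# It blocks one named crawler from the whole site and blocks
# every other crawler from the /private/ section.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A compliant crawler asks before each request and skips disallowed URLs.
for agent, url in [
    ("ExampleBot", "https://example.com/articles/1"),
    ("OtherBot", "https://example.com/articles/1"),
    ("OtherBot", "https://example.com/private/page"),
]:
    allowed = parser.can_fetch(agent, url)
    print(f"{agent} -> {url}: {'allowed' if allowed else 'disallowed'}")
```

Nothing in this check is enforced by the website itself. The crawler has to choose to run it and honor the result, which is why compliance with robots.txt is a matter of convention rather than a technical barrier.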

    Jason Kint, CEO of Digital Content Next, a trade group representing online publishers, minced no words in his assessment of Perplexity’s web scraping processes. 

    “By default, AI companies should assume they have no right to take and reuse publishers’ content without permission,” he said. 

    “If Perplexity is skirting terms of service or robots.txt, the red alarms should be going off that something improper is going on.”

    Amazon investigates

    These revelations have prompted Amazon Web Services (AWS), which hosts a server implicated in Perplexity’s alleged improper scraping, to launch an investigation. 

    AWS strictly prohibits customers from engaging in abusive or illegal activities that violate its terms of service.

    Perplexity CEO Aravind Srinivas initially brushed off the concerns, asserting they reflected “a deep and fundamental misunderstanding” of the company’s operations and the internet at large. 

    However, in a subsequent interview with Fast Company, he conceded that Perplexity relied on an unnamed third-party vendor for web crawling and indexing, suggesting they were to blame for any robots.txt violations. 

    Srinivas declined to identify the company, citing a non-disclosure agreement.

    For the moment, Perplexity appears determined to weather the storm, with a spokesperson downplaying the AWS probe as “standard procedure” and indicating the company has made no changes to its operations. 

    However, the startup’s defiant stance may prove untenable as the groundswell of concern over AI’s data practices continues to build.

    The post Perplexity AI embroiled in controversy over alleged web scraping abuse appeared first on DailyAI.
