
    How Many Academic Papers are Written with the Help of ChatGPT? This AI Paper Delves into ChatGPT Usage in Academic Writing through Excess Vocabulary

    June 26, 2024

    There has been a rapid increase in the use of large language models (LLMs), such as ChatGPT, in academic writing. This study investigates how prevalent these AI tools are in scholarly literature, particularly focusing on detecting changes in writing style and vocabulary in biomedical research abstracts from PubMed between 2010 and 2024. The widespread availability of LLMs has led to concerns about the authenticity and originality of scientific texts, with implications for research integrity and the evaluation of academic contributions. 

    Traditionally, attempts to quantify the presence of LLM-generated text in academic literature have relied on several methods. One common approach involves using LLM detectors, trained to distinguish between human and AI-generated text based on known samples. Another method models word frequency distributions in scientific texts, treating them as mixtures of human and AI-generated content. A third strategy employs lists of marker words overused by LLMs, typically stylistic terms rather than content-specific vocabulary.

    The researchers propose a novel, data-driven approach that avoids some limitations of these methods. Instead of relying on predefined datasets of human and LLM-generated texts, it examines excess word usage to identify LLM involvement. Inspired by studies of excess mortality during the COVID-19 pandemic, the technique tracks words whose frequency rose significantly after ChatGPT’s release compared with the usage expected from trends in earlier years. This allows a less biased and more comprehensive analysis of LLMs’ impact on scientific writing.

    The researchers analyzed over 14 million PubMed abstracts from 2010 to 2024. They created a matrix of word occurrences across these abstracts and calculated the annual frequency of each word. By comparing the observed frequencies in 2023 and 2024 to counterfactual projections based on trends from 2021 and 2022, they identified words with significant increases in usage. These words, termed “excess words,” were then used to gauge the influence of LLMs.
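The frequency and projection steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `annual_frequencies` and `project_frequency` are hypothetical, the tokenizer is a crude lowercase letter match, and the counterfactual is assumed to be a simple linear extrapolation from the 2021 and 2022 values, which the article does not spell out.

```python
import re
from collections import defaultdict


def annual_frequencies(abstracts):
    """Return {year: {word: fraction of that year's abstracts containing the word}}.

    `abstracts` is an iterable of (year, text) pairs (a hypothetical input format).
    """
    doc_counts = defaultdict(int)
    word_docs = defaultdict(lambda: defaultdict(int))
    for year, text in abstracts:
        doc_counts[year] += 1
        # Count each word at most once per abstract (occurrence, not repetition).
        for word in set(re.findall(r"[a-z]+", text.lower())):
            word_docs[year][word] += 1
    return {
        year: {w: n / doc_counts[year] for w, n in counts.items()}
        for year, counts in word_docs.items()
    }


def project_frequency(freq_2021, freq_2022, target_year=2024):
    """Assumed counterfactual: extend the 2021 -> 2022 trend linearly to target_year."""
    slope = freq_2022 - freq_2021
    projected = freq_2022 + slope * (target_year - 2022)
    # Frequencies are fractions, so clamp to the valid range [0, 1].
    return min(1.0, max(0.0, projected))
```

With a real corpus, `annual_frequencies` would be run once over all abstracts, and `project_frequency` would then be applied per word to obtain the expected frequency that the observed 2023 and 2024 values are compared against.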

    The analysis revealed that certain words, especially stylistic ones like “delves,” “showcasing,” and “underscores,” showed marked increases in frequency, suggesting LLM involvement. The researchers quantified this excess usage with two measures: the excess frequency gap (the difference between observed and expected frequencies) and the excess frequency ratio (the ratio of observed to expected frequencies). They found a substantial rise in the number of excess words in 2024, coinciding with the widespread availability of ChatGPT. This increase was unprecedented, surpassing the vocabulary changes observed during the COVID-19 pandemic.
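The two measures defined above reduce to a subtraction and a division. The sketch below uses hypothetical function names and made-up frequencies purely for illustration; the handling of a zero expected frequency is an assumption, not something the article describes.

```python
def excess_gap(observed, expected):
    """Excess frequency gap: observed minus expected frequency."""
    return observed - expected


def excess_ratio(observed, expected):
    """Excess frequency ratio: observed divided by expected frequency."""
    if expected <= 0:
        # Assumed convention: a word with no expected usage has an unbounded ratio.
        return float("inf")
    return observed / expected


# Illustrative numbers only (not taken from the paper):
# a rare word going from an expected 0.1% to an observed 2.5% of abstracts.
gap = excess_gap(0.025, 0.001)
ratio = excess_ratio(0.025, 0.001)
```

The two measures are complementary: the ratio highlights rare words whose usage multiplied many times over, while the gap highlights common words whose usage rose by a large absolute share of abstracts.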

    To estimate the extent of LLM usage, the researchers used the frequency gap of excess words as a lower bound. For example, the excess frequency gap of the word “potential” indicated that at least 4% of 2024 abstracts included this word due to LLM influence. By analyzing abstracts containing words with excess usage, the authors obtained a lower bound of 10% for LLM-assisted papers in 2024. The figure is a lower bound because some LLM-processed abstracts contain none of the tracked excess words, so the true share could be higher. The estimate differed across disciplines (e.g., 20% in computation, 6% in Nature/Science/Cell), countries (e.g., 16% in China vs 3% in the UK), and journals (e.g., 24% in Sensors, 17% in Frontiers/MDPI). The highest estimate was 35% for computation papers from China.
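One way to read the lower-bound argument is as a frequency gap computed for a group of words rather than a single word. The sketch below is an interpretation under stated assumptions, not the authors' procedure: `fraction_with_any` and `llm_lower_bound` are hypothetical names, the expected fraction is passed in as a given number, and the sample abstracts are invented.

```python
import re


def fraction_with_any(texts, words):
    """Fraction of abstracts that contain at least one of the given words."""
    words = set(words)
    if not texts:
        return 0.0
    hits = sum(
        1 for text in texts
        if words & set(re.findall(r"[a-z]+", text.lower()))
    )
    return hits / len(texts)


def llm_lower_bound(texts, words, expected_fraction):
    """Observed share of abstracts with any tracked word, minus the expected share.

    Abstracts that were LLM-processed but contain none of the tracked words
    are not counted, which is why the result is only a lower bound.
    """
    return max(0.0, fraction_with_any(texts, words) - expected_fraction)


# Invented toy data for illustration.
sample = [
    "This study delves into protein folding.",
    "We measured blood pressure in 40 patients.",
    "Our findings underscore the role of sleep.",
    "The result underscores a potential mechanism.",
]
bound = llm_lower_bound(sample, {"delves", "underscores"}, expected_fraction=0.1)
```

Note that this naive matcher treats "underscore" and "underscores" as different words; a fuller version would need to decide how to handle inflected forms.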

    The research highlights a significant shift in academic writing styles due to the advent of LLMs like ChatGPT. By developing a novel methodology to track excess word usage, the study provides compelling evidence that LLMs have had a notable impact on scientific literature, with at least 10% of recent biomedical abstracts showing signs of AI assistance. This underscores the transformative effect of LLMs on scholarly communication and raises important questions about research integrity and the future of academic writing.


    The post How Many Academic Papers are Written with the Help of ChatGPT? This AI Paper Delves into ChatGPT Usage in Academic Writing through Excess Vocabulary appeared first on MarkTechPost.
