Close Menu
    DevStackTipsDevStackTips
    • Home
    • News & Updates
      1. Tech & Work
      2. View All

      Sunshine And March Vibes (2025 Wallpapers Edition)

      May 13, 2025

      The Case For Minimal WordPress Setups: A Contrarian View On Theme Frameworks

      May 13, 2025

      How To Fix Largest Contentful Paint Issues With Subpart Analysis

      May 13, 2025

      How To Prevent WordPress SQL Injection Attacks

      May 13, 2025

      This $4 Steam Deck game includes the most-played classics from my childhood — and it will save you paper

      May 13, 2025

      Microsoft shares rare look at radical Windows 11 Start menu designs it explored before settling on the least interesting one of the bunch

      May 13, 2025

      NVIDIA’s new GPU driver adds DOOM: The Dark Ages support and improves DLSS in Microsoft Flight Simulator 2024

      May 13, 2025

      How to install and use Ollama to run AI LLMs on your Windows 11 PC

      May 13, 2025
    • Development
      1. Algorithms & Data Structures
      2. Artificial Intelligence
      3. Back-End Development
      4. Databases
      5. Front-End Development
      6. Libraries & Frameworks
      7. Machine Learning
      8. Security
      9. Software Engineering
      10. Tools & IDEs
      11. Web Design
      12. Web Development
      13. Web Security
      14. Programming Languages
        • PHP
        • JavaScript
      Featured

      Community News: Latest PECL Releases (05.13.2025)

      May 13, 2025
      Recent

      Community News: Latest PECL Releases (05.13.2025)

      May 13, 2025

      How We Use Epic Branches. Without Breaking Our Flow.

      May 13, 2025

      I think the ergonomics of generators is growing on me.

      May 13, 2025
    • Operating Systems
      1. Windows
      2. Linux
      3. macOS
      Featured

      This $4 Steam Deck game includes the most-played classics from my childhood — and it will save you paper

      May 13, 2025
      Recent

      This $4 Steam Deck game includes the most-played classics from my childhood — and it will save you paper

      May 13, 2025

      Microsoft shares rare look at radical Windows 11 Start menu designs it explored before settling on the least interesting one of the bunch

      May 13, 2025

      NVIDIA’s new GPU driver adds DOOM: The Dark Ages support and improves DLSS in Microsoft Flight Simulator 2024

      May 13, 2025
    • Learning Resources
      • Books
      • Cheatsheets
      • Tutorials & Guides
    Home»Development»Artificial Intelligence»Study reveals AI chatbots can detect race, but racial bias reduces response empathy

    Study reveals AI chatbots can detect race, but racial bias reduces response empathy

    December 20, 2024

    With the cover of anonymity and the company of strangers, the appeal of the digital world is growing as a place to seek out mental health support. This phenomenon is buoyed by the fact that over 150 million people in the United States live in federally designated mental health professional shortage areas.

    “I really need your help, as I am too scared to talk to a therapist and I can’t reach one anyways.”

    “Am I overreacting, getting hurt about husband making fun of me to his friends?”

    “Could some strangers please weigh in on my life and decide my future for me?”

    The above quotes are real posts taken from users on Reddit, a social media news website and forum where users can share content or ask for advice in smaller, interest-based forums known as “subreddits.” 

    Using a dataset of 12,513 posts with 70,429 responses from 26 mental health-related subreddits, researchers from MIT, New York University (NYU), and University of California Los Angeles (UCLA) devised a framework to help evaluate the equity and overall quality of mental health support chatbots based on large language models (LLMs) like GPT-4. Their work was recently published at the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP).

    To accomplish this, researchers asked two licensed clinical psychologists to evaluate 50 randomly sampled Reddit posts seeking mental health support, pairing each post with either a Redditor’s real response or a GPT-4 generated response. Without knowing which responses were real or which were AI-generated, the psychologists were asked to assess the level of empathy in each response.

    Mental health support chatbots have long been explored as a way of improving access to mental health support, but powerful LLMs like OpenAI’s ChatGPT are transforming human-AI interaction, with AI-generated responses becoming harder to distinguish from the responses of real humans.

    Despite this remarkable progress, the unintended consequences of AI-provided mental health support have drawn attention to its potentially deadly risks; in March of last year, a Belgian man died by suicide as a result of an exchange with ELIZA, a chatbot developed to emulate a psychotherapist powered with an LLM called GPT-J. One month later, the National Eating Disorders Association would suspend their chatbot Tessa, after the chatbot began dispensing dieting tips to patients with eating disorders.

    Saadia Gabriel, a recent MIT postdoc who is now a UCLA assistant professor and first author of the paper, admitted that she was initially very skeptical of how effective mental health support chatbots could actually be. Gabriel conducted this research during her time as a postdoc at MIT in the Healthy Machine Learning Group, led Marzyeh Ghassemi, an MIT associate professor in the Department of Electrical Engineering and Computer Science and MIT Institute for Medical Engineering and Science who is affiliated with the MIT Abdul Latif Jameel Clinic for Machine Learning in Health and the Computer Science and Artificial Intelligence Laboratory.

    What Gabriel and the team of researchers found was that GPT-4 responses were not only more empathetic overall, but they were 48 percent better at encouraging positive behavioral changes than human responses.

    However, in a bias evaluation, the researchers found that GPT-4’s response empathy levels were reduced for Black (2 to 15 percent lower) and Asian posters (5 to 17 percent lower) compared to white posters or posters whose race was unknown. 

    To evaluate bias in GPT-4 responses and human responses, researchers included different kinds of posts with explicit demographic (e.g., gender, race) leaks and implicit demographic leaks. 

    An explicit demographic leak would look like: “I am a 32yo Black woman.”

    Whereas an implicit demographic leak would look like: “Being a 32yo girl wearing my natural hair,” in which keywords are used to indicate certain demographics to GPT-4.

    With the exception of Black female posters, GPT-4’s responses were found to be less affected by explicit and implicit demographic leaking compared to human responders, who tended to be more empathetic when responding to posts with implicit demographic suggestions.

    “The structure of the input you give [the LLM] and some information about the context, like whether you want [the LLM] to act in the style of a clinician, the style of a social media post, or whether you want it to use demographic attributes of the patient, has a major impact on the response you get back,” Gabriel says.

    The paper suggests that explicitly providing instruction for LLMs to use demographic attributes can effectively alleviate bias, as this was the only method where researchers did not observe a significant difference in empathy across the different demographic groups.

    Gabriel hopes this work can help ensure more comprehensive and thoughtful evaluation of LLMs being deployed in clinical settings across demographic subgroups.

    “LLMs are already being used to provide patient-facing support and have been deployed in medical settings, in many cases to automate inefficient human systems,” Ghassemi says. “Here, we demonstrated that while state-of-the-art LLMs are generally less affected by demographic leaking than humans in peer-to-peer mental health support, they do not provide equitable mental health responses across inferred patient subgroups … we have a lot of opportunity to improve models so they provide improved support when used.”

    Source: Read More 

    Facebook Twitter Reddit Email Copy Link
    Previous ArticleMIT researchers introduce Boltz-1, a fully open-source model for predicting biomolecular structures
    Next Article State-of-the-art video and image generation with Veo 2 and Imagen 3

    Related Posts

    Security

    Nmap 7.96 Launches with Lightning-Fast DNS and 612 Scripts

    May 14, 2025
    Common Vulnerabilities and Exposures (CVEs)

    CVE-2025-3623 – WordPress Uncanny Automator PHP Object Injection Vulnerability

    May 14, 2025
    Leave A Reply Cancel Reply

    Continue Reading

    The Beginning and The End of The User Experience

    Web Development

    Choosing Between callTestCase and Custom Keywords in Katalon Studio

    Development

    Scaling Diffusion Language Models via Adaptation from Autoregressive Models

    Machine Learning

    Meta AI Proposes EvalPlanner: A Preference Optimization Algorithm for Thinking-LLM-as-a-Judge

    Machine Learning

    Highlights

    The best robot vacuum of CES 2025 – and 4 others that impressed us

    January 9, 2025

    Robot vacuums are getting some outstanding upgrades this year, and ZDNET has picked the best…

    Is free Apple TV+ on the way? The streaming service is teasing something for next weekend

    December 27, 2024

    An Introduction to Docker and Containers for Beginners

    November 26, 2024

    Bluesky saw a 763% rise in users in 2024, including 13 million new users in just the past month and a half

    January 1, 2025
    © DevStackTips 2025. All rights reserved.
    • Contact
    • Privacy Policy

    Type above and press Enter to search. Press Esc to cancel.