Close Menu
    DevStackTipsDevStackTips
    • Home
    • News & Updates
      1. Tech & Work
      2. View All

      Sunshine And March Vibes (2025 Wallpapers Edition)

      May 18, 2025

      The Case For Minimal WordPress Setups: A Contrarian View On Theme Frameworks

      May 18, 2025

      How To Fix Largest Contentful Paint Issues With Subpart Analysis

      May 18, 2025

      How To Prevent WordPress SQL Injection Attacks

      May 18, 2025

      Gears of War: Reloaded — Release date, price, and everything you need to know

      May 18, 2025

      I’ve been using the Logitech MX Master 3S’ gaming-influenced alternative, and it could be your next mouse

      May 18, 2025

      Your Android devices are getting several upgrades for free – including a big one for Auto

      May 18, 2025

      You may qualify for Apple’s $95 million Siri settlement – how to file a claim today

      May 18, 2025
    • Development
      1. Algorithms & Data Structures
      2. Artificial Intelligence
      3. Back-End Development
      4. Databases
      5. Front-End Development
      6. Libraries & Frameworks
      7. Machine Learning
      8. Security
      9. Software Engineering
      10. Tools & IDEs
      11. Web Design
      12. Web Development
      13. Web Security
      14. Programming Languages
        • PHP
        • JavaScript
      Featured

      YTConverter™ lets you download YouTube videos/audio cleanly via terminal — especially great for Termux users.

      May 18, 2025
      Recent

      YTConverter™ lets you download YouTube videos/audio cleanly via terminal — especially great for Termux users.

      May 18, 2025

      NodeSource N|Solid Runtime Release – May 2025: Performance, Stability & the Final Update for v18

      May 17, 2025

      Big Changes at Meteor Software: Our Next Chapter

      May 17, 2025
    • Operating Systems
      1. Windows
      2. Linux
      3. macOS
      Featured

      Gears of War: Reloaded — Release date, price, and everything you need to know

      May 18, 2025
      Recent

      Gears of War: Reloaded — Release date, price, and everything you need to know

      May 18, 2025

      I’ve been using the Logitech MX Master 3S’ gaming-influenced alternative, and it could be your next mouse

      May 18, 2025

      How to Make Your Linux Terminal Talk Using espeak-ng

      May 18, 2025
    • Learning Resources
      • Books
      • Cheatsheets
      • Tutorials & Guides
    Home»Development»Steerability and Bias in LLMs: Navigating Multifaceted Persona Representation

    Steerability and Bias in LLMs: Navigating Multifaceted Persona Representation

    June 4, 2024

    LLMs need to generate text reflecting the diverse views of multifaceted personas. Prior studies on bias in LLMs have focused on simplistic, one-dimensional personas or multiple-choice formats. However, many applications require LLMs to generate open-ended text based on complex personas. The ability to steer LLMs to represent these multifaceted personas accurately is critical to avoid oversimplified or biased representations. If LLMs fail to capture the nuanced views of complex personas, they risk perpetuating stereotypes and monolithic perspectives, especially when personas don’t align with typical demographic views. This could introduce new biases in simulations of individuals.

    Carnegie Mellon University researchers define an incongruous persona as one where a trait makes other traits less likely in human data, such as political liberals supporting military spending. LLMs are 9.7% less steerable towards such personas than congruous ones, often reverting to stereotypical views. Models fine-tuned with RLHF are more steerable but show reduced view diversity. Steerability in multiple-choice tasks does not predict open-ended steerability. GPT-4 closely matches human evaluations. These findings highlight the need for improved steerability toward diverse personas and generating nuanced human opinions in LLMs.

    Recent research on persona-steered generation has expanded on previous frameworks by focusing on the steerability and congruity of multifaceted personas in LLMs, considering the model scale and fine-tuning effects. Studies have used LLMs to simulate human behavior and evaluate model-generated statements, noting that RLHF can amplify political biases. Concerns about toxic outputs in the persona-steered generation have also been raised. Evaluations of LLM biases show significant variance in model accuracy and alignment with human opinions, particularly in open-ended tasks. Recent work highlights the challenges in reliably simulating diverse personas and the importance of model alignment for downstream tasks.

    To assess the steerability of LLMs towards various personas, multifaceted personas combining a demographic and a stance were created using data from the Pew Research Center. Incongruous personas were identified where a demographic trait decreases the likelihood of holding certain stances. Models were tested by generating statements that align with these personas, using different model sizes and fine-tuning methods. GPT-4 evaluated steerability by comparing generated statements against given stances. Additional metrics such as individuation, exaggeration, entailment diversity, and semantic diversity were measured further to analyze the characteristics and diversity of model-generated statements.

    GPT-4 aligns closely with human evaluations, showing a strong steerability assessment correlation. Models fine-tuned with RLHF and DPO are generally more steerable, especially towards stances associated with women and political liberals. However, models struggle with incongruous personas, showing significant steerability differences. Steerability could be predicted better by survey response rates. Models are biased toward generating common stances for a demographic, leading to less diversity and more stereotypes. This can perpetuate social polarization and limit models’ ability to represent complex social identities, potentially causing representational harm.

    In conclusion, the study explores how effectively LLMs can be guided to generate persona-specific statements, revealing that models are more easily steered towards congruent personas across various stances on politics, race, and gender. Models fine-tuned with RLHF show higher steerability, particularly for stances linked to political liberals or women, though at the cost of diversity. Sensitivity to persona congruity suggests models may still propagate demographic stereotypes. Future research should investigate LLM behavior in more interactive settings and develop complex, multifaceted representations to understand better and mitigate these biases.

    Check out the Paper. All credit for this research goes to the researchers of this project. Also, don’t forget to follow us on Twitter. Join our Telegram Channel, Discord Channel, and LinkedIn Group.

    If you like our work, you will love our newsletter..

    Don’t Forget to join our 43k+ ML SubReddit | Also, check out our AI Events Platform

    The post Steerability and Bias in LLMs: Navigating Multifaceted Persona Representation appeared first on MarkTechPost.

    Source: Read More 

    Hostinger
    Facebook Twitter Reddit Email Copy Link
    Previous ArticleSparse Maximal Update Parameterization (SμPar): Optimizing Sparse Neural Networks for Superior Training Dynamics and Efficiency
    Next Article Are AI-RAG Solutions Really Hallucination-Free? Researchers at Stanford University Assess the Reliability of AI in Legal Research: Hallucinations and Accuracy Challenges

    Related Posts

    Development

    Last Week in AI #302 – QwQ 32B, OpenAI injunction refused, Alexa Plus

    May 18, 2025
    Artificial Intelligence

    LWiAI Podcast #202 – Qwen-32B, Anthropic’s $3.5 billion, LLM Cognitive Behaviors

    May 18, 2025
    Leave A Reply Cancel Reply

    Hostinger

    Continue Reading

    SEC Won’t Bring Charges Against Progress Software Over MOVEit Supply Chain Attack

    Development

    Use Single-AZ read replicas in Amazon RDS for SQL Server

    Databases

    CVE-2024-12225 – Quarkus WebAuthn Default Endpoints Information Disclosure and Authentication Bypass

    Common Vulnerabilities and Exposures (CVEs)

    Critical Webmin Vulnerability Let Remote Attackers Escalate Privileges to Root-Level

    Security

    Highlights

    News & Updates

    PowerToys latest update focuses on fixes and new features for the Command Palette

    May 15, 2025

    Command Palette is an evolution of PowerToys Run, and while it’s still a work in…

    JetBrains AI Assistant can now use local LLMs

    February 11, 2025

    Implementing Persistent Memory Using a Local Knowledge Graph in Claude Desktop

    April 27, 2025

    Need to relax? This new iPhone feature does the trick for me – here’s how

    April 16, 2025
    © DevStackTips 2025. All rights reserved.
    • Contact
    • Privacy Policy

    Type above and press Enter to search. Press Esc to cancel.