Close Menu
    DevStackTipsDevStackTips
    • Home
    • News & Updates
      1. Tech & Work
      2. View All

      Sunshine And March Vibes (2025 Wallpapers Edition)

      May 17, 2025

      The Case For Minimal WordPress Setups: A Contrarian View On Theme Frameworks

      May 17, 2025

      How To Fix Largest Contentful Paint Issues With Subpart Analysis

      May 17, 2025

      How To Prevent WordPress SQL Injection Attacks

      May 17, 2025

      Microsoft’s allegiance isn’t to OpenAI’s pricey models — Satya Nadella’s focus is selling any AI customers want for maximum profits

      May 17, 2025

      If you think you can do better than Xbox or PlayStation in the Console Wars, you may just want to try out this card game

      May 17, 2025

      Surviving a 10 year stint in dev hell, this retro-styled hack n’ slash has finally arrived on Xbox

      May 17, 2025

      Save $400 on the best Samsung TVs, laptops, tablets, and more when you sign up for Verizon 5G Home or Home Internet

      May 17, 2025
    • Development
      1. Algorithms & Data Structures
      2. Artificial Intelligence
      3. Back-End Development
      4. Databases
      5. Front-End Development
      6. Libraries & Frameworks
      7. Machine Learning
      8. Security
      9. Software Engineering
      10. Tools & IDEs
      11. Web Design
      12. Web Development
      13. Web Security
      14. Programming Languages
        • PHP
        • JavaScript
      Featured

      NodeSource N|Solid Runtime Release – May 2025: Performance, Stability & the Final Update for v18

      May 17, 2025
      Recent

      NodeSource N|Solid Runtime Release – May 2025: Performance, Stability & the Final Update for v18

      May 17, 2025

      Big Changes at Meteor Software: Our Next Chapter

      May 17, 2025

      Apps in Generative AI – Transforming the Digital Experience

      May 17, 2025
    • Operating Systems
      1. Windows
      2. Linux
      3. macOS
      Featured

      Microsoft’s allegiance isn’t to OpenAI’s pricey models — Satya Nadella’s focus is selling any AI customers want for maximum profits

      May 17, 2025
      Recent

      Microsoft’s allegiance isn’t to OpenAI’s pricey models — Satya Nadella’s focus is selling any AI customers want for maximum profits

      May 17, 2025

      If you think you can do better than Xbox or PlayStation in the Console Wars, you may just want to try out this card game

      May 17, 2025

      Surviving a 10 year stint in dev hell, this retro-styled hack n’ slash has finally arrived on Xbox

      May 17, 2025
    • Learning Resources
      • Books
      • Cheatsheets
      • Tutorials & Guides
    Home»Development»Steerability and Bias in LLMs: Navigating Multifaceted Persona Representation

    Steerability and Bias in LLMs: Navigating Multifaceted Persona Representation

    June 4, 2024

    LLMs need to generate text reflecting the diverse views of multifaceted personas. Prior studies on bias in LLMs have focused on simplistic, one-dimensional personas or multiple-choice formats. However, many applications require LLMs to generate open-ended text based on complex personas. The ability to steer LLMs to represent these multifaceted personas accurately is critical to avoid oversimplified or biased representations. If LLMs fail to capture the nuanced views of complex personas, they risk perpetuating stereotypes and monolithic perspectives, especially when personas don’t align with typical demographic views. This could introduce new biases in simulations of individuals.

    Carnegie Mellon University researchers define an incongruous persona as one where a trait makes other traits less likely in human data, such as political liberals supporting military spending. LLMs are 9.7% less steerable towards such personas than congruous ones, often reverting to stereotypical views. Models fine-tuned with RLHF are more steerable but show reduced view diversity. Steerability in multiple-choice tasks does not predict open-ended steerability. GPT-4 closely matches human evaluations. These findings highlight the need for improved steerability toward diverse personas and generating nuanced human opinions in LLMs.

    Recent research on persona-steered generation has expanded on previous frameworks by focusing on the steerability and congruity of multifaceted personas in LLMs, considering the model scale and fine-tuning effects. Studies have used LLMs to simulate human behavior and evaluate model-generated statements, noting that RLHF can amplify political biases. Concerns about toxic outputs in the persona-steered generation have also been raised. Evaluations of LLM biases show significant variance in model accuracy and alignment with human opinions, particularly in open-ended tasks. Recent work highlights the challenges in reliably simulating diverse personas and the importance of model alignment for downstream tasks.

    To assess the steerability of LLMs towards various personas, multifaceted personas combining a demographic and a stance were created using data from the Pew Research Center. Incongruous personas were identified where a demographic trait decreases the likelihood of holding certain stances. Models were tested by generating statements that align with these personas, using different model sizes and fine-tuning methods. GPT-4 evaluated steerability by comparing generated statements against given stances. Additional metrics such as individuation, exaggeration, entailment diversity, and semantic diversity were measured further to analyze the characteristics and diversity of model-generated statements.

    GPT-4 aligns closely with human evaluations, showing a strong steerability assessment correlation. Models fine-tuned with RLHF and DPO are generally more steerable, especially towards stances associated with women and political liberals. However, models struggle with incongruous personas, showing significant steerability differences. Steerability could be predicted better by survey response rates. Models are biased toward generating common stances for a demographic, leading to less diversity and more stereotypes. This can perpetuate social polarization and limit models’ ability to represent complex social identities, potentially causing representational harm.

    In conclusion, the study explores how effectively LLMs can be guided to generate persona-specific statements, revealing that models are more easily steered towards congruent personas across various stances on politics, race, and gender. Models fine-tuned with RLHF show higher steerability, particularly for stances linked to political liberals or women, though at the cost of diversity. Sensitivity to persona congruity suggests models may still propagate demographic stereotypes. Future research should investigate LLM behavior in more interactive settings and develop complex, multifaceted representations to understand better and mitigate these biases.

    Check out the Paper. All credit for this research goes to the researchers of this project. Also, don’t forget to follow us on Twitter. Join our Telegram Channel, Discord Channel, and LinkedIn Group.

    If you like our work, you will love our newsletter..

    Don’t Forget to join our 43k+ ML SubReddit | Also, check out our AI Events Platform

    The post Steerability and Bias in LLMs: Navigating Multifaceted Persona Representation appeared first on MarkTechPost.

    Source: Read More 

    Facebook Twitter Reddit Email Copy Link
    Previous ArticleSparse Maximal Update Parameterization (SμPar): Optimizing Sparse Neural Networks for Superior Training Dynamics and Efficiency
    Next Article Are AI-RAG Solutions Really Hallucination-Free? Researchers at Stanford University Assess the Reliability of AI in Legal Research: Hallucinations and Accuracy Challenges

    Related Posts

    Development

    February 2025 Baseline monthly digest

    May 17, 2025
    Development

    Learn A1 Level Spanish

    May 17, 2025
    Leave A Reply Cancel Reply

    Hostinger

    Continue Reading

    The best smart home devices of 2024: Expert tested and reviewed

    Development

    The most competent robot vacuum I tested last year just got a major upgrade

    News & Updates

    6 Best VPNs for Australia in 2024

    Development

    Your interview is a sales call

    Learning Resources
    Hostinger

    Highlights

    The latest KB5053649 to the Beta Channel finally fixed one of most frustrating issues with Windows Tools

    March 19, 2025

    Microsoft released KB5053649 to Windows 11 Beta Channel Insiders. While the patch is small, it…

    I Love Lasagna and Latinas Shirt

    April 23, 2025

    How GitHub harnesses AI to transform customer feedback into action

    July 30, 2024

    Sick of driving? Uber is paying people $1,000 to ditch their cars

    June 27, 2024
    © DevStackTips 2025. All rights reserved.
    • Contact
    • Privacy Policy

    Type above and press Enter to search. Press Esc to cancel.