Close Menu
    DevStackTipsDevStackTips
    • Home
    • News & Updates
      1. Tech & Work
      2. View All

      Sunshine And March Vibes (2025 Wallpapers Edition)

      June 3, 2025

      The Case For Minimal WordPress Setups: A Contrarian View On Theme Frameworks

      June 3, 2025

      How To Fix Largest Contentful Paint Issues With Subpart Analysis

      June 3, 2025

      How To Prevent WordPress SQL Injection Attacks

      June 3, 2025

      All the WWE 2K25 locker codes that are currently active

      June 3, 2025

      PSA: You don’t need to spend $400+ to upgrade your Xbox Series X|S storage

      June 3, 2025

      UK civil servants saved 24 minutes per day using Microsoft Copilot, saving two weeks each per year according to a new report

      June 3, 2025

      These solid-state fans will revolutionize cooling in our PCs and laptops

      June 3, 2025
    • Development
      1. Algorithms & Data Structures
      2. Artificial Intelligence
      3. Back-End Development
      4. Databases
      5. Front-End Development
      6. Libraries & Frameworks
      7. Machine Learning
      8. Security
      9. Software Engineering
      10. Tools & IDEs
      11. Web Design
      12. Web Development
      13. Web Security
      14. Programming Languages
        • PHP
        • JavaScript
      Featured

      Community News: Latest PECL Releases (06.03.2025)

      June 3, 2025
      Recent

      Community News: Latest PECL Releases (06.03.2025)

      June 3, 2025

      A Comprehensive Guide to Azure Firewall

      June 3, 2025

      Test Job Failures Precisely with Laravel’s assertFailedWith Method

      June 3, 2025
    • Operating Systems
      1. Windows
      2. Linux
      3. macOS
      Featured

      All the WWE 2K25 locker codes that are currently active

      June 3, 2025
      Recent

      All the WWE 2K25 locker codes that are currently active

      June 3, 2025

      PSA: You don’t need to spend $400+ to upgrade your Xbox Series X|S storage

      June 3, 2025

      UK civil servants saved 24 minutes per day using Microsoft Copilot, saving two weeks each per year according to a new report

      June 3, 2025
    • Learning Resources
      • Books
      • Cheatsheets
      • Tutorials & Guides
    Home»Development»Machine Learning»CMU Researchers Propose QueRE: An AI Approach to Extract Useful Features from a LLM

    CMU Researchers Propose QueRE: An AI Approach to Extract Useful Features from a LLM

    January 17, 2025

    Large Language Models (LLMs) have become integral to various artificial intelligence applications, demonstrating capabilities in natural language processing, decision-making, and creative tasks. However, critical challenges remain in understanding and predicting their behaviors. Treating LLMs as black boxes complicates efforts to assess their reliability, particularly in contexts where errors can have significant consequences. Traditional approaches often rely on internal model states or gradients to interpret behaviors, which are unavailable for closed-source, API-based models. This limitation raises an important question: how can we effectively evaluate LLM behavior with only black-box access? The problem is further compounded by adversarial influences and potential misrepresentation of models through APIs, highlighting the need for robust and generalizable solutions.

    To address these challenges, researchers at Carnegie Mellon University have developed QueRE (Question Representation Elicitation). This method is tailored for black-box LLMs and extracts low-dimensional, task-agnostic representations by querying models with follow-up prompts about their outputs. These representations, based on probabilities associated with elicited responses, are used to train predictors of model performance. Notably, QueRE performs comparably to or even better than some white-box techniques in reliability and generalizability.

    Unlike methods dependent on internal model states or full output distributions, QueRE relies on accessible outputs, such as top-k probabilities available through most APIs. When such probabilities are unavailable, they can be approximated through sampling. QueRE’s features also enable evaluations such as detecting adversarially influenced models and distinguishing between architectures and sizes, making it a versatile tool for understanding and utilizing LLMs.

    Technical Details and Benefits of QueRE

    QueRE operates by constructing feature vectors derived from elicitation questions posed to the LLM. For a given input and the model’s response, these questions assess aspects such as confidence and correctness. Questions like “Are you confident in your answer?” or “Can you explain your answer?” enable the extraction of probabilities that reflect the model’s reasoning.

    The extracted features are then used to train linear predictors for various tasks:

    1. Performance Prediction: Evaluating whether a model’s output is correct at an instance level.
    2. Adversarial Detection: Identifying when responses are influenced by malicious prompts.
    3. Model Differentiation: Distinguishing between different architectures or configurations, such as identifying smaller models misrepresented as larger ones.

    By relying on low-dimensional representations, QueRE supports strong generalization across tasks. Its simplicity ensures scalability and reduces the risk of overfitting, making it a practical tool for auditing and deploying LLMs in diverse applications.

    Results and Insights

    Experimental evaluations demonstrate QueRE’s effectiveness across several dimensions. In predicting LLM performance on question-answering (QA) tasks, QueRE consistently outperformed baselines relying on internal states. For instance, on open-ended QA benchmarks like SQuAD and Natural Questions (NQ), QueRE achieved an Area Under the Receiver Operating Characteristic Curve (AUROC) exceeding 0.95. Similarly, it excelled in detecting adversarially influenced models, outperforming other black-box methods.

    QueRE also proved robust and transferable. Its features were successfully applied to out-of-distribution tasks and different LLM configurations, validating its adaptability. The low-dimensional representations facilitated efficient training of simple models, ensuring computational feasibility and robust generalization bounds.

    Another notable result was QueRE’s ability to use random sequences of natural language as elicitation prompts. These sequences often matched or exceeded the performance of structured queries, highlighting the method’s flexibility and potential for diverse applications without extensive manual prompt engineering.

    Conclusion

    QueRE offers a practical and effective approach to understanding and optimizing black-box LLMs. By transforming elicitation responses into actionable features, QueRE provides a scalable and robust framework for predicting model behavior, detecting adversarial influences, and differentiating architectures. Its success in empirical evaluations suggests it is a valuable tool for researchers and practitioners aiming to enhance the reliability and safety of LLMs.

    As AI systems evolve, methods like QueRE will play a crucial role in ensuring transparency and trustworthiness. Future work could explore extending QueRE’s applicability to other modalities or refining its elicitation strategies for enhanced performance. For now, QueRE represents a thoughtful response to the challenges posed by modern AI systems.


    Check out the Paper and GitHub Page. All credit for this research goes to the researchers of this project. Also, don’t forget to follow us on Twitter and join our Telegram Channel and LinkedIn Group. Don’t Forget to join our 65k+ ML SubReddit.

    🚨 Recommend Open-Source Platform: Parlant is a framework that transforms how AI agents make decisions in customer-facing scenarios. (Promoted)

    The post CMU Researchers Propose QueRE: An AI Approach to Extract Useful Features from a LLM appeared first on MarkTechPost.

    Source: Read More 

    Facebook Twitter Reddit Email Copy Link
    Previous ArticleVLC Introduce i Sottotitoli Generati da AI!
    Next Article Meet Tensor Product Attention (TPA): Revolutionizing Memory Efficiency in Language Models

    Related Posts

    Machine Learning

    How to Evaluate Jailbreak Methods: A Case Study with the StrongREJECT Benchmark

    June 3, 2025
    Machine Learning

    This AI Paper Introduces LLaDA-V: A Purely Diffusion-Based Multimodal Large Language Model for Visual Instruction Tuning and Multimodal Reasoning

    June 3, 2025
    Leave A Reply Cancel Reply

    Continue Reading

    Russian Hackers Use Legit Remote Monitoring Software to Spy on Ukraine and Allies

    Development

    5 gadgets I can’t go off-grid without (and why they make such a big difference)

    News & Updates

    Devices everywhere: What the rise in edge investment means for your career

    Development

    560 million Ticketmaster customer data for sale? – Week in security with Tony Anscombe

    Development

    Highlights

    Development

    Researchers at Arizona State University Evaluates ReAct Prompting: The Role of Example Similarity in Enhancing Large Language Model Reasoning

    May 28, 2024

    Large Language Models (LLMs) have advanced rapidly, especially in Natural Language Processing (NLP) and Natural…

    45 Visual Studio Code Shortcuts for Boosting Your Productivity

    August 13, 2024

    CVE-2025-43835 – Ktsvetkov WP-Cyr CSRF

    May 19, 2025

    Forget cheap multitools. My favorite brand is repairable with a 25-year warranty

    April 23, 2025
    © DevStackTips 2025. All rights reserved.
    • Contact
    • Privacy Policy

    Type above and press Enter to search. Press Esc to cancel.