Close Menu
    DevStackTipsDevStackTips
    • Home
    • News & Updates
      1. Tech & Work
      2. View All

      Sunshine And March Vibes (2025 Wallpapers Edition)

      May 16, 2025

      The Case For Minimal WordPress Setups: A Contrarian View On Theme Frameworks

      May 16, 2025

      How To Fix Largest Contentful Paint Issues With Subpart Analysis

      May 16, 2025

      How To Prevent WordPress SQL Injection Attacks

      May 16, 2025

      Microsoft has closed its “Experience Center” store in Sydney, Australia — as it ramps up a continued digital growth campaign

      May 16, 2025

      Bing Search APIs to be “decommissioned completely” as Microsoft urges developers to use its Azure agentic AI alternative

      May 16, 2025

      Microsoft might kill the Surface Laptop Studio as production is quietly halted

      May 16, 2025

      Minecraft licensing robbed us of this controversial NFL schedule release video

      May 16, 2025
    • Development
      1. Algorithms & Data Structures
      2. Artificial Intelligence
      3. Back-End Development
      4. Databases
      5. Front-End Development
      6. Libraries & Frameworks
      7. Machine Learning
      8. Security
      9. Software Engineering
      10. Tools & IDEs
      11. Web Design
      12. Web Development
      13. Web Security
      14. Programming Languages
        • PHP
        • JavaScript
      Featured

      The power of generators

      May 16, 2025
      Recent

      The power of generators

      May 16, 2025

      Simplify Factory Associations with Laravel’s UseFactory Attribute

      May 16, 2025

      This Week in Laravel: React Native, PhpStorm Junie, and more

      May 16, 2025
    • Operating Systems
      1. Windows
      2. Linux
      3. macOS
      Featured

      Microsoft has closed its “Experience Center” store in Sydney, Australia — as it ramps up a continued digital growth campaign

      May 16, 2025
      Recent

      Microsoft has closed its “Experience Center” store in Sydney, Australia — as it ramps up a continued digital growth campaign

      May 16, 2025

      Bing Search APIs to be “decommissioned completely” as Microsoft urges developers to use its Azure agentic AI alternative

      May 16, 2025

      Microsoft might kill the Surface Laptop Studio as production is quietly halted

      May 16, 2025
    • Learning Resources
      • Books
      • Cheatsheets
      • Tutorials & Guides
    Home»Development»How are you testing your Generative AI ? What testing strategies have you discovered

    How are you testing your Generative AI ? What testing strategies have you discovered

    November 26, 2024

    Generative AI is becoming the new norm, widely used and more accessible to the public via platforms like ChatGPT or Meta AI, which appear on social media platforms like WhatsApp and Instagram Messenger.
    Despite its being fundamentally a transformers that break sentences into tokens and predict the next word, the implications and applications are vast. However, these GPT models currently lack human-like understanding. Which might cause reliability issues and others, but considering its capabilities the new trend of agentic AI is on rise this highlights the importance of having a well-defined testing approach.

    I wanted to ask:

    1. What are the patterns or testing strategies you are following beyond basic testing strategies?
    2. What’s your approach to identify and fix, do you follow any checkmarks ?
      • AI Hallucination
      • Fairness and Bias
      • Security & Ethical Issue
      • Coherence and relevance
      • Robustness and Reliability
      • Explainability and Interpretability
      • Include others you have Identified

    Here are some of my observations:
    Example 1: AI Hallucination

    Issue: Generating factually incorrect or nonsensical outputs, The response provided has data that is not reliable however its sounds plausible or true.

    Solution: Fact-checking, Human-in-the-loop, Prompt engineering, Training data quality, Model fine-tuning, Post-processing

    Example 2: Bias and Fairness

    Issue: Based on the data, Generating outputs that unfairly favor certain groups.

    Solution: Bias audits, Fairness metrics, Diverse training data

    Example 3: Adherence to Instructions

    Issue: With tools like Meta AI Agents and similar others in Salesforce, we need to check if the response adheres to the instructions, as sometimes it fails to follow the guidelines and guardrails.

    Solution: It might be an issue with the instruction, but we need to go back to basics and test against each instruction to check if it is followed or not.
    This might become hectic any alternate

    Example 4: Not in Coherence Knowledge Article Boundaries

    Issue: GPT models used as chatbots with a set of knowledge articles sometimes provide results outside the set of knowledge articles as a reference.

    Solution: Coherence metrics, Prompt design, Feedback

    Example 5: Chain of Thought

    Issue: In some cases, the generative AI assumes continuity with earlier conversations within the window period, which might cause unnecessary references.

    Solution: There should be instructions to cross-verify and provide a note.
    Most of these issues can be addressed with effective prompt engineering. However, I am curious about your methods for breaking these issues and any observations you have identified.

    Source: Read More

    Facebook Twitter Reddit Email Copy Link
    Previous ArticleFollow Up: We Officially Have a CSS Logo!
    Next Article The Importance of Code Reviews: Tips and Best Practices

    Related Posts

    Security

    Nmap 7.96 Launches with Lightning-Fast DNS and 612 Scripts

    May 17, 2025
    Common Vulnerabilities and Exposures (CVEs)

    CVE-2025-40906 – MongoDB BSON Serialization BSON::XS Multiple Vulnerabilities

    May 17, 2025
    Leave A Reply Cancel Reply

    Continue Reading

    Use zero-shot large language models on Amazon Bedrock for custom named entity recognition

    Development

    CodeSOD: Message Oriented Database

    News & Updates

    US car dealerships reeling from massive cyberattack: 3 things customers should know

    Development

    What is Subgraph OS? What are its Features? How does it work, and What are its Advantages?

    Development
    Hostinger

    Highlights

    How to Combine currentcolor with Relative Color Syntax in CSS

    February 11, 2025

    Post Content Source: Read More 

    prc – parvarish rehabilitation treatment center

    November 16, 2024

    L’arrivo di Cinnamon 6.4 in LMDE 6 “Faye”

    January 19, 2025

    Texting while driving? AI traffic cameras are watching you in these 5 states

    February 25, 2025
    © DevStackTips 2025. All rights reserved.
    • Contact
    • Privacy Policy

    Type above and press Enter to search. Press Esc to cancel.