
    AI agents unifying structured and unstructured data: Transforming support analytics and beyond with Amazon Q Plugins

    July 31, 2025

    As organizations seek to derive greater value from their AWS Support data, operational teams are looking for ways to transform raw support cases and health events into actionable insights. While traditional analytics tools can provide basic reporting capabilities, teams need more sophisticated solutions that can understand and process natural language queries about their operational data. Retrieval-Augmented Generation (RAG) architecture forms the foundation for optimizing large language model outputs by referencing authoritative knowledge bases outside of their training data before generating responses. This architecture uses the power of semantic search and information retrieval capabilities to enhance accuracy.

    In our previous blog post, Derive meaningful and actionable operational insights from AWS Using Amazon Q Business, we introduced a RAG-based solution using Amazon Q Business. However, while this approach excels at semantic search, it can face challenges with precise numerical analysis and aggregations. In this post, we address these limitations by showing how you can enhance Amazon Q with custom plugins to improve support analytics and incident response. Our solution combines the strengths of RAG with structured data querying, enabling more accurate answers to analytical questions about your AWS Support data. Detailed deployment instructions for the solution in this post are available in our AWS Samples GitHub repository.

    Understanding RAG’s design for analytics

    RAG architecture excels at finding relevant information through context-based matching, enabling natural language interactions with external data. However, the fundamental architecture of vector similarity search, which prioritizes semantic relevance over analytical aggregation capabilities, can introduce significant inaccuracies. Let’s examine this limitation with a simple query:

    Give me total count of support cases in Feb 2025

    The following RAG-only response shows the system attempting to piece together information from various documents, arriving at 190 cases through potentially hallucinated aggregations.

    Based on the provided JSON files, there were 190 AWS support cases documented in 
    February 2025, calculated from the following case counts: 17 cases from Feb 1-3, 
    12 cases from Feb 4-5, 21 cases from Feb 7-8, 21 cases from Feb 8-10, 16 cases from 
    Feb 10-11, 10 cases from Feb 11-12, 23 cases from Feb 19-20, 22 cases from Feb 20-21,
    27 cases from Feb 23-24, and 21 cases from Feb 25-26.

    However, when enhanced with structured data querying capabilities, the same query yields 958 support cases, the accurate count verified against the actual dataset:

    Based on the query results, there were a total of 958 support cases in February 2025.

    This limitation becomes even more apparent with moderately complex queries that require categorical analysis:

    Give me OpenSearch Support cases, distribution based on severity, opened in Jan 2025?

    The following is a RAG-only response showing limited or inaccurate categorical breakdown:

    Based on the support cases data from January 2025, here is the severity distribution 
    for OpenSearch cases: 
    
    Urgent: One case was opened regarding cluster health degradation and production system 
    downtime
    
    Normal: Two cases were identified:
    1. A case about OpenSearch RI Exchange
    2. A case concerning blue/green deployment upgrade from version 1.3 to 2.17
    
    The cases demonstrate varying levels of urgency, with one critical production issue 
    and two normal severity cases requiring standard response times.

    With structured data querying, we get precise severity distributions:

    Based on the query results for OpenSearch Support cases opened in January 2025,
    here's the distribution by severity:
    1. Normal severity: 4 cases
    2. Critical severity: 4 cases
    3. High severity: 3 cases
    4. Low severity: 2 cases
    5. Urgent severity: 1 case
    
    In total, there were 14 OpenSearch Support cases opened in January 2025. 
    The distribution shows that Normal and Critical severity cases were the most common, 
    followed by High severity cases. There were fewer Low and Urgent severity cases 
    during this period.

    The gap widens further when dealing with multi-dimensional analysis. Consider the following multi-dimensional query, which expects a detailed analysis across account-service combinations and temporal patterns.

    Query:

    Find accounts with frequent high-priority support cases since January 2024, 
    excluding limit increase requests. For each account and AWS service combination, 
    show the total number of cases and how many different days had issues. 
    Only include results where there were more than 3 cases, and sort results by the 
    highest number of cases first.

    Response:

    Based on the analysis of high-priority support cases since January 2024, excluding
    limit increase requests, here are the key findings:
    
    1. The account with the most frequent high-priority cases is 111111111111, with 
    60 cases for the Connect service.
    2. The top 5 account and service combinations with the highest number of cases are:
    * Account 111111111111 (Connect): 60 cases
    * Account 222222222222 (RDS Oracle): 52 cases
    * Account 333333333333 (OpenSearch Service): 38 cases
    * Account 444444444444 (Aurora PostgreSQL): 25 cases
    * Account 555555555555 (EMR): 23 cases
    
    The services that appear most frequently in the top results are:
    * Aurora PostgreSQL
    * OpenSearch Service
    * RDS Oracle
    * EMR
    * DynamoDB
    
    Interestingly, the number of total cases matches the number of unique days with 
    cases for each account-service combination. This suggests that these accounts 
    are experiencing issues on different days rather than multiple issues on the 
    same day.
    
    There are 89 account-service combinations that have more than 3 high-priority 
    cases since January 2024.
    
    Some accounts appear multiple times with different services, indicating they 
    may be experiencing issues across various AWS offerings. For example:
    * Account 222222222222 appears with RDS Oracle (52 cases), DynamoDB (18 cases), 
    and Aurora PostgreSQL (6 cases)
    * Account 111111111111 appears with Connect (60 cases) and other services
    
    This information can be useful for identifying accounts that may need additional 
    support or proactive engagement to address recurring issues. Would you like to 
    focus on a specific aspect of this data or explore further details about 
    particular accounts or services?

    The following is a RAG-only response showing inability to process the same complex query:

    Based on the provided data source, I cannot provide a complete analysis of 
    high-priority support cases and their frequency patterns. While there are support 
    cases mentioned in the data from January 2024, the data does not contain 
    sufficient information about case priorities or enough cases per account 
    to identify patterns meeting the threshold of more than 3 cases. 
    Additionally, while some technical and customer service cases are shown, 
    the data is fragmented across different dates without providing a comprehensive 
    view of case volumes or frequencies per account and service.

    These examples demonstrate how RAG alone struggles with analytical queries of increasing complexity, while our plugin-enhanced solution delivers precise, structured analysis at every level. When working with scenarios requiring precise numerical analysis, we can enhance RAG’s capabilities through structured approaches:

    • Aggregation and pattern analysis: When user prompts include aggregation queries (such as counts, totals, or distributions), they require exact numerical computation through structured querying to provide precise results. Vector similarity search alone cannot guarantee accurate numerical aggregations, making structured metadata querying essential for these analytical use cases.
    • Context and correlation analysis: External unstructured data requires thoughtful data engineering to extract and maintain structured metadata (such as creation dates, categories, severity levels, and service types). While RAG excels at finding semantically similar content, having well-defined metadata enables precise filtering and querying capabilities. For example, when analyzing system performance issues, structured metadata about incident timing, affected services, and their dependencies enables comprehensive impact analysis through exact querying rather than relying solely on semantic matching.
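    To make the contrast concrete, here is a minimal sketch, in plain Python over hypothetical case records (the field names are illustrative, not the actual QSI schema), of the kind of exact aggregation that structured metadata enables and that vector similarity search cannot guarantee:

```python
from collections import Counter

# Hypothetical structured metadata rows extracted from support case JSON;
# field names are illustrative, not the actual QSI table layout.
cases = [
    {"case_id": "1", "created": "2025-02-03", "service": "opensearch", "severity": "critical"},
    {"case_id": "2", "created": "2025-02-14", "service": "rds-oracle", "severity": "high"},
    {"case_id": "3", "created": "2025-01-21", "service": "opensearch", "severity": "normal"},
    {"case_id": "4", "created": "2025-02-28", "service": "opensearch", "severity": "normal"},
]

def total_cases_in_month(rows, year_month):
    """Exact count of cases created in a given YYYY-MM month."""
    return sum(1 for r in rows if r["created"].startswith(year_month))

def severity_distribution(rows, service, year_month):
    """Exact per-severity counts for one service in one month."""
    return Counter(
        r["severity"] for r in rows
        if r["service"] == service and r["created"].startswith(year_month)
    )

print(total_cases_in_month(cases, "2025-02"))            # an exact count, not an estimate
print(severity_distribution(cases, "opensearch", "2025-02"))
```

    Because the answer is computed over every matching row rather than over the subset of documents a retriever happened to surface, the count cannot drift the way the 190-versus-958 example did.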

    Enhancing Q Support-Insights with agentic AI

    Building on the Q Support-Insights (QSI) solution introduced in Derive meaningful and actionable operational insights from AWS Using Amazon Q Business, we’ll demonstrate how to enhance analytical capabilities through agentic AI by creating custom plugins. This enhancement preserves QSI’s base implementation while adding precise analytical processing through structured metadata querying.

    QSI overview

    The Amazon Q Support Insights (QSI) solution consists of two main components:

      1. Data collection pipeline
      • Support Collector module using AWS Lambda functions
      • Support data consists of AWS Support cases, AWS Health events, and AWS Trusted Advisor checks
      • Amazon EventBridge for automated data collection. The data pipeline enables two synchronization mechanisms:
        • Real-time case updates: Processes AWS Support cases through event-based triggers (CreateCase, AddCommunicationToCase, ResolveCase, ReopenCase).
        • Historical and daily sync: Performs initial historical data sync and refreshes AWS Trusted Advisor data daily.
      • Stores data in JSON format in centralized Amazon Simple Storage Service (Amazon S3) bucket
      • Supports multi-account data aggregation through AWS Organizations
    2. Amazon Q Business application environment
      • Amazon Q Business application deployment
      • Amazon S3 connector for data source integration
      • Web experience configuration for user interaction
      • Authentication through AWS IAM Identity Center

    Enabling query aggregation with custom plugins for Amazon Q Business 

    Custom plugins extend Amazon Q Business to combine semantic search with precise analytics capabilities. The following implementation details outline how we’ve augmented the base QSI solution:

    • Augments QSI’s natural language processing with structured query capabilities
    • Converts analytical requests into precise Amazon Athena SQL using an Amazon Bedrock large language model (LLM)
    • Executes queries against structured metadata tables
    • Provides exact numerical results alongside semantic search responses
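    As a rough sketch of the second bullet (the model ID, table name, schema, and prompt below are illustrative assumptions, not values taken from the AWS Samples repository), the natural-language-to-SQL step inside the plugin's Lambda function might look like:

```python
# Hypothetical table name and model ID; the real values live in the
# AWS Samples repository, not here.
METADATA_TABLE = "support_case_metadata"
MODEL_ID = "anthropic.claude-3-5-sonnet-20240620-v1:0"

def build_sql_prompt(question: str) -> str:
    """Prompt asking the LLM to emit Athena SQL only, against a known schema."""
    return (
        f"You translate questions into Athena SQL for the table {METADATA_TABLE} "
        "(columns: case_id string, created_time timestamp, service string, "
        "severity string, subject string).\n"
        "Return only the SQL statement, no explanation.\n"
        f"Question: {question}"
    )

def question_to_sql(question: str) -> str:
    """Call an Amazon Bedrock LLM to generate SQL (requires AWS credentials)."""
    import boto3  # imported lazily so the pure helper above works anywhere
    client = boto3.client("bedrock-runtime")
    response = client.converse(
        modelId=MODEL_ID,
        messages=[{"role": "user", "content": [{"text": build_sql_prompt(question)}]}],
    )
    return response["output"]["message"]["content"][0]["text"].strip()
```

    Pinning the schema in the prompt and asking for the bare statement keeps the generated SQL executable as-is; a production version would also validate the statement before running it.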

    Metadata processing

    To enable precise querying capabilities, we’ve implemented a robust metadata processing framework that structures and indexes support data:

    • Uses QSI’s existing data collection in Amazon S3
    • Extracts structured metadata from support cases into two queryable tables
    • Maintains critical fields like dates, categories, severity levels
    • Enables precise filtering and aggregation capabilities
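    As a rough illustration of that extraction step (the input keys mirror the AWS Support API's DescribeCases response, while the output schema is an assumption, not the actual QSI table layout):

```python
def extract_case_metadata(case: dict) -> dict:
    """Flatten one AWS Support case JSON document into a queryable metadata row.

    Input keys follow the Support API's DescribeCases response shape;
    the output field names are illustrative only.
    """
    return {
        "case_id": case["caseId"],
        "created_time": case["timeCreated"],
        "service": case["serviceCode"],
        "severity": case["severityCode"],
        "category": case.get("categoryCode", "unknown"),
        "status": case.get("status", "unknown"),
    }

# A made-up case document in the DescribeCases shape.
sample = {
    "caseId": "case-111111111111-muen-2025-abc",
    "timeCreated": "2025-01-21T09:15:00.000Z",
    "serviceCode": "amazon-opensearch-service",
    "severityCode": "high",
    "status": "resolved",
}
row = extract_case_metadata(sample)
```

    Rows in this shape can be registered as an Athena table over the S3 bucket, which is what makes the exact filtering and aggregation described above possible.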

    Cross-source analysis

    The solution enables comprehensive analysis across multiple data sources through intelligent correlation capabilities:

    • Correlates support cases with health events
    • Enables comprehensive operational assessment
    • Supports pattern detection across multiple data sources

    Architecture overview

    The following illustration shows how the enhanced Amazon Q Business solution integrates custom plugins with the existing QSI architecture. The diagram demonstrates the flow from user query through analytical processing to response generation, highlighting how structured data querying complements the RAG-based semantic search capabilities.

    Architecture

    When a user submits an analytical query like Show me OpenSearch case distribution based on severity, the following process occurs:

    1. The custom plugin detects the analytical intent and triggers a Lambda function
    2. The Lambda function uses an Amazon Bedrock LLM to transform the natural language query into precise Athena SQL
    3. The query executes against structured metadata in Athena
    4. Results are synthesized back into natural language and presented alongside traditional RAG responses
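    Step 3's raw output arrives in Athena's columnar GetQueryResults format, with the header as the first row. A small helper (a sketch, not the repository's code) can flatten it into plain rows before the synthesis step:

```python
def athena_rows_to_dicts(result_set: dict) -> list[dict]:
    """Convert Athena's GetQueryResults ResultSet (header row first)
    into a list of plain dicts for the LLM to summarize."""
    rows = result_set["Rows"]
    header = [col.get("VarCharValue", "") for col in rows[0]["Data"]]
    return [
        {name: col.get("VarCharValue") for name, col in zip(header, row["Data"])}
        for row in rows[1:]
    ]

# Shape mirrors a real GetQueryResults response; the values are made up.
example = {
    "Rows": [
        {"Data": [{"VarCharValue": "severity"}, {"VarCharValue": "cases"}]},
        {"Data": [{"VarCharValue": "critical"}, {"VarCharValue": "4"}]},
        {"Data": [{"VarCharValue": "high"}, {"VarCharValue": "3"}]},
    ]
}
parsed = athena_rows_to_dicts(example)
```
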

    This approach enables precise analytics while maintaining the conversational interface users expect.

    Implementation example

    Here’s a sample interaction demonstrating the enhanced capabilities. Let’s start with a basic query to understand the overall OpenSearch case volume, using structured data querying:

    Query:

    Give me OpenSearch cases in Jan 2025

    Response:

    Based on the query results for OpenSearch cases in January 2025, here's a summary:
    1. Total cases: 15
    2. Severity breakdown:
    Critical: 4 cases
    High: 3 cases
    Urgent: 1 case
    Normal: 3 cases
    Low: 2 cases
    3. Status: All cases are resolved
    4. Most common issues:
    Cluster health problems (yellow status)
    Blue/Green deployment issues
    Performance concerns (high CPU utilization)
    5. Services involved:
    OpenSearch Service Managed Cluster: 13 cases
    OpenSearch Service Ingestion: 2 cases
    
    The cases span various topics including cluster health, deployments, performance, 
    and configuration issues. Is there a specific aspect of these cases you'd like 
    more information on?

    Now, using RAG capabilities to analyze unstructured case content for architectural insights:

    Query:

    Out of these, what are the OpenSearch issues that are related to architecture 
    deficiencies? 

    Response:

    Based on the available data, several architectural issues have been identified in OpenSearch deployments:
    
    * The shard count per node exceeding recommended limits is a significant 
    architectural concern, with validation errors occurring when scaling up 
    node counts. The recommended limit is 1,000 shards per node and exceeding 
    this requires careful architectural planning. 
    
    * Master node scaling is another architectural challenge, where insufficient 
    master node capacity can limit cluster expansion. This requires proper 
    configuration of master node instance types according to the number of nodes 
    and shards in the domain. 
    
    * Resource utilization issues, particularly around JVM Memory Pressure and 
    CPU utilization, indicate potential architectural shortcomings. 
    The system architecture should be designed to maintain JVM Memory Pressure 
    below 80% and CPU Utilization below 85% for optimal performance. 
    
    * The system's handling of blue/green deployments shows architectural limitations, 
    as these operations cause performance impacts and increased latency during data 
    migration to new nodes. 

    Continuing with RAG-based analysis of case details to evaluate implementation patterns:

    Query:

    Given the high shard count, JVM, or other data can we make inferences about whether 
    they are using this correctly?

    Response:

    Based on the data discussed, there are strong indicators that the OpenSearch system 
    is not being used optimally:
    
    JVM Memory Management Issues:
    1. Memory pressure consistently exceeded recommended thresholds (above 78% and 
    eventually over 90%)
    2. The 64GB initial heap size configuration proved inadequate for the workload
    3. Circuit breaking exceptions and cluster blocks occurred due to memory issues
    
    Shard Management Problems:
    1. The system appears to have exceeded the recommended limit of 1,000 shards 
    per node
    2. Too many small shards created unnecessary system strain
    3. Large shards complicated failure recovery processes
    
    Resource Utilization:
    1. Failed to maintain recommended thresholds: JVM Memory Pressure should be 
    below 80% and CPU Utilization should be below 85%
    2. Continuous memory pressure issues
    3. Failed snapshots
    These indicators strongly suggest improper sizing and resource allocation for 
    the workload, pointing to incorrect implementation or configuration of the 
    OpenSearch system. Better capacity planning and adherence to best practices 
    would be necessary for optimal performance.

    Finally, here’s another example of structured data analysis with a complex query demonstrating multi-dimensional aggregation capabilities:

    Query:

    For high-severity support cases since January 1, 2024, show the number of 
    total cases and after-hours cases for each account and service combination. 
    Define after-hours as before 1 PM UTC, after 1 AM UTC, or times falling within 
    the weekend. Only include results with more than 3 after-hours cases. Sort by 
    the highest number of after-hours cases. Exclude limit increase requests.

    Response:

    Based on the analysis of high-severity support cases since January 1, 2024, 
    here's a summary of the results, focusing on accounts and services with more 
    than 3 after-hours cases:
    
    1. The account with the highest number of after-hours cases (48) is 111111111111, 
    for the Connect service. They have a total of 66 high-severity cases.
    2. The second highest is account 333333333333 with 40 after-hours cases for the 
    OpenSearch Service Managed Cluster, out of 49 total cases.
    3. Account 222222222222 has 38 after-hours cases for RDS Oracle, out of 64 
    total cases.
    4. EMR service for account 555555555555 has 19 after-hours cases out of 23 
    total cases.
    5. Bedrock service for account 666666666666 has 17 after-hours cases out of 
    24 total cases.
    
    The results are sorted by the highest number of after-hours cases, and limit 
    increase requests have been excluded as requested. After-hours cases are 
    defined as those created before 1 PM UTC, after 1 AM UTC, or times falling 
    within the weekend.
    
    Would you like more details on a specific account or service, or do you need 
    a further analysis of this data?

    The response combines precise numerical analysis with semantic understanding, providing actionable insights for operational reviews.
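    The after-hours predicate in that query can be sketched in code. Note that the interpretation below, combining the two clock conditions into a single 01:00 to 13:00 UTC window alongside weekends, is our reading of the stated definition, not logic taken from the deployed SQL:

```python
from datetime import datetime, timezone

def is_after_hours(ts: datetime) -> bool:
    """After-hours per the query's definition, read as: created between
    1 AM and 1 PM UTC, or any time on a weekend. This interpretation
    is an assumption, not the solution's actual predicate."""
    ts = ts.astimezone(timezone.utc)
    if ts.weekday() >= 5:          # Saturday=5, Sunday=6
        return True
    return 1 <= ts.hour < 13       # after 01:00 UTC and before 13:00 UTC

# Illustrative timestamps (2024-03-05 is a Tuesday, 2024-03-09 a Saturday).
weekday_night = datetime(2024, 3, 5, 3, 30, tzinfo=timezone.utc)
weekday_afternoon = datetime(2024, 3, 5, 15, 0, tzinfo=timezone.utc)
saturday = datetime(2024, 3, 9, 15, 0, tzinfo=timezone.utc)
```

    In the deployed solution this predicate would be expressed inside the generated Athena SQL rather than in application code; the Python form just makes the boundary conditions easy to check.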

    Benefits and impact

    This plugin architecture delivers several key improvements:

    • Precise analytics: Exact counts and distributions replace approximate semantic matching
    • Contextual analysis: Maintains analytical context across conversation threads
    • Architectural understanding: Better correlation of related issues through structured analysis

    Deploy the Amazon Q Business application

    The following is a simplified deployment process. For detailed instructions, see the Amazon Q Business application creation module.

    Prerequisites

    1. AWS CloudShell is recommended since it comes pre-installed with the required libraries and tools. Alternatively, you can use a local machine with the AWS Command Line Interface (AWS CLI) installed and configured with valid credentials.
    2. Two S3 buckets:
      • Support data bucket for storing AWS Support case data
      • Resource bucket as temporary storage for Lambda resources for deployment (can be deleted after deployment)
    3. IAM Identity Center instance configured
    4. The solution needs AWS Support data collected using the Support Data Pipeline. You can deploy now and add data later, but functionality depends on data availability in your S3 bucket.
    5. Access to Anthropic’s Claude 3.5 Sonnet through Amazon Bedrock. See Add or remove access to Amazon Bedrock foundation models.
    6. The default database should exist in Athena. If not, you can create it using the Athena query editor.

    Deployment steps

    You can use the following script to deploy the solution. No manual steps are needed; the script handles stack creation and configuration automatically.

    # Clone the repository
    git clone https://github.com/aws-samples/support-insights-with-amazon-q.git
    cd q_application
    chmod +x deploy_q_stacks.sh
    ./deploy_q_stacks.sh

    Clean up

    To remove the resources, delete the S3 buckets and CloudFormation stacks. Delete the CloudFormation stacks in the following order:

    1. case-metadata-stack
    2. amazon-q-stack
    3. custom-plugin-stack

    Note that deleting the stacks won’t delete the S3 buckets; you must delete those manually.

    Conclusion

    By combining RAG’s semantic understanding with precise analytical capabilities through plugins, we’ve transformed Amazon Q Business into a powerful operational analytics platform. In the examples in this post, you can see how organizations can use this enhancement to derive more accurate and actionable insights from their AWS Support data, supporting better operational decision-making and proactive issue resolution. While demonstrated through support data analytics for operational improvements, these patterns apply across domains that combine structured and unstructured data sources.

    Learn more

    • Explore the Amazon Q documentation to understand more about building custom plugins
    • Check out these related resources:
      • Getting Started with Amazon Q Business
      • Plugins for Amazon Q Business
      • Amazon Q Business FAQs
      • About the AWS Support API

    For questions and feedback, visit AWS re:Post or contact AWS Support.


    About the authors

    Chitresh Saxena is a Sr. AI/ML specialist TAM specializing in generative AI solutions and dedicated to helping customers successfully adopt AI/ML on AWS. He excels at understanding customer needs and provides technical guidance to build, launch, and scale AI solutions that solve complex business problems.

    Kevin Morgan is a Sr. Enterprise Support Manager at AWS who helps customers accelerate their cloud adoption journey through hands-on leadership and technical guidance. As a member of the NextGen Developer Experience TFC, he specializes in Builder Experience, CloudOps and DevOps. Outside of work, Kevin enjoys being a Game Master for D&D and is a retro computing enthusiast.
