Close Menu
    DevStackTipsDevStackTips
    • Home
    • News & Updates
      1. Tech & Work
      2. View All

      Upwork Freelancers vs Dedicated React.js Teams: What’s Better for Your Project in 2025?

      August 1, 2025

      Is Agile dead in the age of AI?

      August 1, 2025

      Top 15 Enterprise Use Cases That Justify Hiring Node.js Developers in 2025

      July 31, 2025

      The Core Model: Start FROM The Answer, Not WITH The Solution

      July 31, 2025

      Anthropic beats OpenAI as the top LLM provider for business – and it’s not even close

      August 2, 2025

      I bought Samsung’s Galaxy Watch Ultra 2025 – here’s why I have buyer’s remorse

      August 2, 2025

      I can admit when I’m wrong — this 75% wireless gaming keyboard is way better than I thought it would be

      August 2, 2025

      This is Microsoft’s canceled Windows-based Surface Duo — the dual-screen Windows Phone from 2018 that we never got

      August 2, 2025
    • Development
      1. Algorithms & Data Structures
      2. Artificial Intelligence
      3. Back-End Development
      4. Databases
      5. Front-End Development
      6. Libraries & Frameworks
      7. Machine Learning
      8. Security
      9. Software Engineering
      10. Tools & IDEs
      11. Web Design
      12. Web Development
      13. Web Security
      14. Programming Languages
        • PHP
        • JavaScript
      Featured

      The details of TC39’s last meeting

      August 2, 2025
      Recent

      The details of TC39’s last meeting

      August 2, 2025

      Enhancing Laravel Queries with Reusable Scope Patterns

      August 1, 2025

      Everything We Know About Livewire 4

      August 1, 2025
    • Operating Systems
      1. Windows
      2. Linux
      3. macOS
      Featured

      I can admit when I’m wrong — this 75% wireless gaming keyboard is way better than I thought it would be

      August 2, 2025
      Recent

      I can admit when I’m wrong — this 75% wireless gaming keyboard is way better than I thought it would be

      August 2, 2025

      This is Microsoft’s canceled Windows-based Surface Duo — the dual-screen Windows Phone from 2018 that we never got

      August 2, 2025

      Looking for an Ubuntu Manual? Try This Book

      August 2, 2025
    • Learning Resources
      • Books
      • Cheatsheets
      • Tutorials & Guides
    Home»Development»Machine Learning»Build conversational interfaces for structured data using Amazon Bedrock Knowledge Bases

    Build conversational interfaces for structured data using Amazon Bedrock Knowledge Bases

    June 17, 2025

    Organizations manage extensive structured data in databases and data warehouses. Large language models (LLMs) have transformed natural language processing (NLP), yet converting conversational queries into structured data analysis remains complex. Data analysts must translate business questions into SQL queries, creating workflow bottlenecks.

    Amazon Bedrock Knowledge Bases enables direct natural language interactions with structured data sources. The system interprets database schemas and context, converting natural language questions into accurate queries while maintaining data reliability standards. You can chat with your structured data by setting up structured data ingestion from AWS Glue Data Catalog tables and Amazon Redshift clusters in a few steps, using the power of Amazon Bedrock Knowledge Bases structured data retrieval.

    This post provides instructions to configure a structured data retrieval solution, with practical code examples and templates. It covers implementation samples and additional considerations, empowering you to quickly build and scale your conversational data interfaces. Through clear examples and proven methodologies, organizations can transform their data access capabilities and accelerate decision-making processes.

    Solution overview

    The solution demonstrates how to build a conversational application using Amazon Bedrock Knowledge Bases structured data retrieval. Developers often face challenges integrating structured data into generative AI applications. This includes difficulties training LLMs to convert natural language queries to SQL queries based on complex database schemas, as well as making sure appropriate data governance and security controls are in place. Amazon Bedrock Knowledge Bases alleviates these complexities by providing a managed natural language to SQL (NL2SQL) module. Amazon Bedrock Knowledge Bases offers an end-to-end managed workflow for you to build custom generative AI applications that can access and incorporate contextual information from a variety of structured and unstructured data sources. Using advanced NLP, Amazon Bedrock Knowledge Bases can transform natural language queries into SQL queries, so you can retrieve data directly from the source without the need to move or preprocess the data.

    This solution includes Amazon Bedrock Knowledge Bases, Amazon Redshift, AWS Glue, and Amazon Simple Storage Service (Amazon S3). The solution architecture consists of two parts: a data ingestion pipeline, and a structured data retrieval application using Amazon Bedrock Knowledge Bases.

    Amazon Bedrock Knowledge Bases structured data retrieval supports Amazon Redshift as the query engine and multiple data ingestion options. The data ingestion pipeline is a one-time setup, and supports multiple ingestion options. In this post, we discuss a common data ingestion use case using Amazon S3, AWS Glue, and Amazon Redshift.

    You can configure Amazon Bedrock Knowledge Bases structured data retrieval to retrieve data from AWS Glue databases and S3 datasets. This setup uses automatic mounting of the Data Catalog in Amazon Redshift. With this ingestion option, you can seamlessly integrate existing S3 datasets and Data Catalog tables into your Retrieval Augmented Generation (RAG) applications with the access permissions configured through Lake Formation. The following diagram illustrates this pipeline.

    Data flow from Dataset to Amazon S3, AWS Glue, and Amazon Redshift

    The following screenshot shows the configuration options on the Amazon Bedrock console.

    Storage metadata configuration interface with Redshift and Glue Data Catalog options

    After the data ingestion is configured and the knowledge bases data source sync job is complete, users can ask natural language questions, and Amazon Bedrock Knowledge Bases will generate the SQL, execute the SQL against the query engine, and process it through the LLM to provide a user-friendly response. The following diagram illustrates a sample architecture of the structured data retrieval workflow.

    Amazon Bedrock Knowledge Bases structured data retrieval conversational flow

    The data retrieval workflow consists of the following steps:

    1. In a RAG application, the user can ask a natural language data analytics question through the chat interface, such as “What is the sales revenue for the Month of February 2025?”
    2. The natural language query is sent to Amazon Bedrock Knowledge Bases for data retrieval and processing.
    3. Amazon Bedrock Knowledge Bases generates a SQL query based on the underlying data schema configured during the knowledge base creation.
    4. The SQL query is executed against the query engine (Amazon Redshift) to retrieve data from a structured data store (AWS Glue tables). The query can include multiple joins and aggregation.
    5. The generated SQL response is sent to an LLM along with additional context to generate a response in natural language.
    6. The response is sent back to the user. The user can ask follow-up questions based on the retrieved response, such as “What is the product that generated highest revenue in this period?”

    Amazon Bedrock Knowledge Bases structured data retrieval supports three different APIs to meet your data retrieval requirements:

    • Retrieval and response generation – The retrieval and response generation API, similar to the solution workflow we’ve discussed, generates a SQL query, retrieves data through the query engine, and processes it through the LLM to generate a natural language response
    • Retrieval only – The retrieval only API generates a SQL query, retrieves data through the query engine, and returns the data without processing it through an LLM
    • Generate SQL queries – The generate SQL query API returns the raw SQL query that was generated by Amazon Bedrock Knowledge Bases, which can be used for review and further processing by applications

    The following screenshot shows the configuration options on the Amazon Bedrock console.

    Amazon Bedrock Knowledge Bases retrieval API

    Code resources and templates

    The solution uses the following notebooks:

    • Data ingestion notebook – Structured-rag-s3-glue-ingestion includes the step-by-step guide to ingest an open dataset to Amazon S3, configure AWS Glue tables using crawlers, and set up the Amazon Redshift Serverless query engine.
    • Structured data retrieval notebook – Structured-rag-s3-glue-retrieval walks through the implementation steps and provides sample code for configuring Amazon Bedrock Knowledge Bases structured data retrieval using Amazon S3, AWS Glue, and the Amazon Redshift query engine.

    For more details, refer to the GitHub repo.

    Prerequisites

    To implement the solution provided in this post, you must have an AWS account. Additionally, access to the required foundation models must be enabled in Amazon Bedrock.

    Set up the data ingestion pipeline

    To set up the data ingestion pipeline, we load the sample dataset in an S3 bucket and configure AWS Glue as data storage and a Redshift Serverless workgroup as the query engine. Complete the following steps in data ingestion notebook:

    1. For data ingestion, download the following sample ecommerce dataset, convert it to a pandas data frame, and upload it to an S3 bucket using Amazon SageMaker Data Wrangler.
    2. Create an AWS Glue database and table using an AWS Glue crawler by crawling the source S3 bucket with the dataset. You can update this step to crawl your own S3 bucket or use your existing Data Catalog tables as storage metadata.
    3. Use the data ingestion notebook to create a Redshift Serverless namespace and workgroup in the default VPC. If you plan to use your own Redshift Serverless workgroup or Amazon Redshift provisioned cluster, you can skip this step.

    Set up the structured data retrieval solution

    In this section, we detail the steps to set up the structured data retrieval component of the solution.

    Amazon Bedrock Knowledge Bases supports multiple data access patterns, including AWS Identity and Access Management (IAM), AWS Secrets Manager, and database users. For this post, we demonstrate the setup option with IAM access. You can use IAM access with the Redshift Serverless workgroup configured as part of the ingestion workflow or an existing Redshift Serverless or provisioned cluster to compete these steps.

    Complete the following steps in structured data retrieval notebook:

    1. Create an execution role with the necessary policies for accessing data from Amazon Redshift, AWS Glue, and the S3 bucket.
    2. Invoke the CreateKnowledgeBase API to create the knowledge base with the execution role and knowledge base configurations. In the knowledge base configuration, the AWS Glue database and tables are used as storage metadata with Amazon Redshift as the query engine.
    3. After you create the knowledge base, you must complete additional steps to make sure the IAM execution role has the necessary permissions to execute the query in Amazon Redshift and retrieve data from AWS Glue. The notebook includes the necessary instructions to create and grant database access to the execution role, and grant AWS Lake Formation permissions.
    4. The ingestion job will sync the data store schema metadata about AWS Glue database and tables with the NL2SQL module. This schema metadata will be used while generating the SQL query during structured data retrieval.
    5. After the knowledge base sync job is complete, you can use the three data retrieval APIs – retrieve and generate response, retrieval only, and generate SQL query – to query and validate the structured data retrieval solution.

    For more details, refer to Create a knowledge base by connecting to a structured data store.

    Clean up

    We have included cleanup instructions in both the data ingestion and structured data retrieval notebooks to clean up resources after the end-to-end solution is implemented and validated.

    Conclusion

    Amazon Bedrock Knowledge Bases simplifies data analysis by converting natural language questions into SQL queries, eliminating the need for specialized database expertise. The service integrates with Amazon Redshift, AWS Glue, and Amazon S3, allowing business analysts, data scientists, and operations teams to query data directly using conversation-like questions. It maintains data security through built-in governance controls and access permissions. Customers can deploy this managed service to enable users to analyze data using natural language questions, while maintaining data integrity and security standards.

    To learn more, refer to Build a knowledge base by connecting to a structured data store and Amazon Bedrock Knowledge Bases now supports structured data retrieval.


    About the authors

    George Belsian is a Senior Cloud Application Architect at Amazon Web Services, helping organizations navigate the complexities of cloud adoption, AI integration, and data-driven innovation. By transforming legacy systems into cloud-based platforms and incorporating AI/ML capabilities, he helps businesses create new opportunities for growth, optimize their processes, and deliver scalable solutions.

    Sandeep Singh is a Senior Generative AI Data Scientist at Amazon Web Services, helping businesses innovate with generative AI. He specializes in generative AI, machine learning, and system design. He has successfully delivered state-of-the-art AI/ML-powered solutions to solve complex business problems for diverse industries, optimizing efficiency and scalability.

    Mani Khanuja is a Principal Generative AI Specialist SA and author of the book Applied Machine Learning and High-Performance Computing on AWS. She leads machine learning projects in various domains such as computer vision, natural language processing, and generative AI. She speaks at internal and external conferences such AWS re:Invent, Women in Manufacturing West, YouTube webinars, and GHC 23. In her free time, she likes to go for long runs along the beach.

    Gopikrishnan Anilkumar is a Principal Technical Product Manager in AWS Agentic AI organization. He has over 10 years of product management experience across a variety of domains and is passionate about AI/ML.

    Source: Read More 

    Facebook Twitter Reddit Email Copy Link
    Previous ArticleAn innovative financial services leader finds the right AI solution: Robinhood and Amazon Nova
    Next Article Understanding Web Hosting: A Beginner’s Guide

    Related Posts

    Machine Learning

    How to Evaluate Jailbreak Methods: A Case Study with the StrongREJECT Benchmark

    August 2, 2025
    Machine Learning

    Meet Trackio: The Free, Local-First, Open-Source Experiment Tracker Python Library that Simplifies and Enhances Machine Learning Workflows

    August 2, 2025
    Leave A Reply Cancel Reply

    For security, use of Google's reCAPTCHA service is required which is subject to the Google Privacy Policy and Terms of Use.

    Continue Reading

    CVE-2025-35978 – UpdateNavi UpdateNaviInstallService Remote Code Execution

    Common Vulnerabilities and Exposures (CVEs)
    Driving Retail Loyalty with MongoDB and Cognigy

    Driving Retail Loyalty with MongoDB and Cognigy

    Databases

    CVE-2025-6019 – Libblockdev Polkit Local Privilege Escalation

    Common Vulnerabilities and Exposures (CVEs)

    CVE-2025-6578 – “Code-projects Simple Online Hotel Reservation System SQL Injection Vulnerability”

    Common Vulnerabilities and Exposures (CVEs)

    Highlights

    Linux

    Ubuntu 24.04 Now Available for OrangePi’s New RISC-V SBC

    April 25, 2025

    Of note, Ubuntu 24.04 developer images are now available for the new OrangePi RV2 RISC-V…

    The A24 Elden Ring movie is real and has a director, but I have no idea how it’s going to work

    May 23, 2025

    CVE-2025-48845 – Apache HTTP Server Authentication Bypass

    May 28, 2025

    About Impact. Presenting the Change You’ve Made Through Your Decisions as a Designer

    June 25, 2025
    © DevStackTips 2025. All rights reserved.
    • Contact
    • Privacy Policy

    Type above and press Enter to search. Press Esc to cancel.