
    Build an AI-powered text-to-SQL chatbot using Amazon Bedrock, Amazon MemoryDB, and Amazon RDS

    May 19, 2025

    In today’s data-driven environments, where information is spread across numerous sources, translating complex problem statements into precise SQL code can be a significant challenge. The growth of large language models (LLMs) has helped simplify data access. Text-to-SQL can automatically transform analytical questions into executable SQL code for enhanced data accessibility and streamlined data exploration, from analyzing sales data and monitoring performance metrics to assessing customer feedback.

    In this post, we explore how to use Amazon Relational Database Service (Amazon RDS) for PostgreSQL and Amazon Bedrock to build a generative AI text-to-SQL chatbot application using Retrieval Augmented Generation (RAG). We also show how to use Amazon MemoryDB with vector search to provide semantic caching that further accelerates this solution.

    We use Amazon RDS for PostgreSQL as our database to store structured data, while leveraging Amazon Bedrock’s foundation models (FMs) for the generative AI capabilities. You can store both your data and embeddings (vector representations of text, images, and audio) in Amazon RDS for PostgreSQL, and use Amazon Bedrock to generate embeddings and translate natural language prompts into SQL queries that interact with your stored data.

    Amazon Bedrock is a fully managed service that offers a choice of high-performing FMs from leading AI companies such as AI21 Labs, Anthropic, Cohere, Meta, Stability AI, and Amazon through a single API, along with a broad set of capabilities to help you build generative AI applications. We integrate Amazon Bedrock with the RDS for PostgreSQL database and use Amazon Bedrock to write SQL queries for the text prompts we provide as input. A Streamlit front end calls Amazon API Gateway, which invokes Amazon Bedrock through AWS Lambda.

    Best practices for effective text-to-SQL

    To address real-world use cases effectively, this solution incorporates text-to-SQL best practices for performance optimization and security. This includes using Amazon MemoryDB for Valkey with vector search capabilities to implement semantic caching, which can improve performance and reduce calls to a foundation model. In addition to the semantic cache, the solution uses parameterized SQL, a read-only database role for security, and Table Augmented Generation techniques.

    Semantic caching strategy

    Semantic caching stores data based on its contextual meaning, which means that two queries with the same meaning will return the same result. This can be useful for complex queries involving multiple tables or data sources.

    Semantically caching the text-to-SQL query responses when using LLMs enables performance and cost optimization. By semantically caching the query results, the system can retrieve and reuse the previously generated LLM responses when appropriate, significantly improving the overall response times. Furthermore, minimizing the number of LLM inferences through Amazon Bedrock not only improves the response speed but also reduces the operational cost with fewer LLM interactions. This approach allows for improved scalability of the application without a proportional cost increase.

    When implementing semantic caching for this solution, our process follows two steps:

    1. First, we check for an exact textual match in the cache.
    2. If no exact match exists, we perform a semantic similarity search to find closely related cached queries.

    This semantic caching strategy provides an efficient and cost-effective approach to using LLMs for text-to-SQL conversions.
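
    To make the lookup concrete, the following is a minimal sketch of the two-step cache check in Python. It is illustrative rather than the repository's code: it assumes a redis-py client connected to the MemoryDB for Valkey endpoint, a vector index named idx:semantic_cache created beforehand with FT.CREATE over hash entries storing prompt, response, and a float32 embedding field, and an embed() helper that returns the prompt's embedding; the exact FT.SEARCH syntax can vary with the engine version.

    import hashlib
    import json

    import numpy as np
    import redis  # MemoryDB for Valkey speaks the Redis wire protocol

    client = redis.Redis(host="<memorydb-endpoint>", port=6379, ssl=True)

    def lookup_cache(prompt, embed, distance_threshold=0.2):
        # Step 1: exact textual match, keyed by a hash of the normalized prompt.
        exact_key = "cache:" + hashlib.sha256(prompt.strip().lower().encode()).hexdigest()
        cached = client.hget(exact_key, "response")
        if cached:
            return json.loads(cached)

        # Step 2: semantic similarity search (KNN) over cached prompt embeddings.
        query_vector = np.array(embed(prompt), dtype=np.float32).tobytes()
        result = client.execute_command(
            "FT.SEARCH", "idx:semantic_cache",
            "*=>[KNN 1 @embedding $vec AS score]",
            "PARAMS", 2, "vec", query_vector,
            "RETURN", 2, "response", "score",
            "DIALECT", 2,
        )
        if result and result[0] >= 1:
            fields = dict(zip(result[2][::2], result[2][1::2]))
            if float(fields[b"score"]) <= distance_threshold:
                return json.loads(fields[b"response"])
        return None  # cache miss: fall through to RAG retrieval and LLM generation

    On a cache miss, the generated response and its prompt embedding would be written back to the cache so that future, semantically similar prompts can be served without another model invocation.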

    Parameterized SQL

    Because we use Amazon Bedrock to convert user input into a SQL query, it's crucial to create safeguards against SQL injection to prevent unauthorized access. For reference, see the Open Worldwide Application Security Project (OWASP) list of critical security risks to web applications.

    In this solution, we use a parameterized query in which the parameter values are separated from the SQL syntax. We achieve this by requesting a parameterized SQL statement in the prompt to the LLM alongside the user input.

    Without using parameterized queries, a third party could gain unauthorized access through user input that is executed by the database. By using parameterized queries, the database can first parse and plan the SQL query, and then execute the statement with the provided parameter values. This way, any SQL code in the user input is treated as regular data, not as executable code. Parameterized queries are considered a best practice to avoid SQL injection regardless of which database server or driver you are using.
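
    As a minimal illustration (the table, column, and placeholder names here are assumptions, not the exact output of the deployed solution), the model is asked to return the SQL text with placeholders and the parameter values separately, and only the values are bound at execution time:

    import psycopg

    # Example of what the model might return for "homes in Seattle under $500,000":
    generated_sql = (
        "SELECT address, price FROM us_housing_properties "
        "WHERE city = %(city)s AND price < %(max_price)s "
        "ORDER BY price ASC LIMIT 10"
    )
    parameters = {"city": "Seattle", "max_price": 500000}

    with psycopg.connect("<read-only connection string>") as conn:
        with conn.cursor() as cur:
            # The driver sends the SQL text and the values separately, so the
            # values are never interpreted as SQL code.
            cur.execute(generated_sql, parameters)
            rows = cur.fetchall()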

    As an additional layer of protection in this solution, we use a database role for the integration that has only SELECT permissions on the tables in the database, as sketched below.
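
    The following is a minimal sketch of creating such a role; the role name, password, and schema are illustrative assumptions, and the deployed solution may provision its role differently.

    import psycopg

    # Run with an administrative connection; replace the placeholders before use.
    statements = [
        "CREATE ROLE app_readonly WITH LOGIN PASSWORD '<change-me>'",
        "GRANT USAGE ON SCHEMA public TO app_readonly",
        "GRANT SELECT ON ALL TABLES IN SCHEMA public TO app_readonly",
    ]

    with psycopg.connect("<admin connection string>") as conn:
        for statement in statements:
            conn.execute(statement)  # each statement runs in the same transaction
        conn.commit()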

    Table Augmented Generation

    Implementing a centralized data catalog with Table Augmented Generation (TAG) offers organizations a powerful tool for enhancing text-to-SQL responses from LLMs. In this solution, we create embeddings of database metadata, including tables, columns, and relationships, and use PostgreSQL's pgvector functionality to make those embeddings searchable. When a user submits a query, the system uses the embeddings table to identify the most relevant database elements, effectively augmenting the user's input with crucial structural context. This augmented query is then passed to the LLM, enabling it to generate SQL with an understanding of the specific data landscape. By incorporating this table-aware context into the generation process, LLMs can produce accurate, relevant SQL responses that align with the organization's data infrastructure. This method not only improves SQL generation, but also enhances the system's ability to describe query results in natural language, providing a more comprehensive and user-friendly data interaction experience.
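
    A minimal sketch of this retrieval step follows; the embeddings table layout and the embed() helper are assumptions for illustration. It ranks the stored metadata descriptions by vector distance to the user's question and returns the closest matches as context for the model.

    import numpy as np
    import psycopg
    from pgvector.psycopg import register_vector

    def relevant_schema(user_question, embed, conn_string, k=3):
        """Return the k schema descriptions closest to the question's embedding."""
        with psycopg.connect(conn_string) as conn:
            register_vector(conn)  # adapts numpy arrays to the pgvector type
            rows = conn.execute(
                # <=> is pgvector's cosine distance operator
                "SELECT content FROM embeddings ORDER BY embedding <=> %s LIMIT %s",
                (np.array(embed(user_question), dtype=np.float32), k),
            ).fetchall()
        return "\n".join(row[0] for row in rows)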

    Solution overview

    The solution presented in this post is a chatbot application built on the following architecture.

    [Figure: Scope of the solution]

    This architecture workflow includes the following steps:

    1. A database is created on an Amazon RDS for PostgreSQL database instance that contains a target dataset. For this example, we’re using US Housing Data (see prerequisites) represented by a single table in PostgreSQL.
    2. Using a Streamlit chat application, a user submits their query to an API Gateway endpoint with a Lambda function backend.
    3. The TextToSQLFunction checks Amazon MemoryDB for Valkey to see if a similar semantic cache entry exists for the user's prompt. If there's a cache miss, the Lambda function performs a semantic search against the embeddings table (a text representation of database tables, columns, and relationships) using pgvector to find the tables most relevant to the prompt.
    4. The TextToSQLFunction constructs a prompt using the table information from the previous step, along with relevant instructions for constructing SQL, and sends it to Amazon Bedrock to generate a SQL statement.
    5. The TextToSQLFunction runs the generated SQL against the PostgreSQL database using a role that has only SELECT permissions. The function then uses the query results to prompt Amazon Bedrock again for a natural language interpretation of the results based on the user query (a sketch of the SQL generation call follows this list).
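
    The following is a minimal sketch of the SQL generation call, not the repository's code: the prompt wording is an illustrative assumption, and the model ID corresponds to Anthropic's Claude 3.5 Sonnet v1 on Amazon Bedrock in us-west-2.

    import json

    import boto3

    bedrock = boto3.client("bedrock-runtime", region_name="us-west-2")

    def generate_sql(user_question, schema_context):
        """Ask the model for a parameterized SQL statement grounded in the schema context."""
        prompt = (
            "You write PostgreSQL queries. Using only this schema:\n"
            f"{schema_context}\n\n"
            "Return a parameterized SELECT statement (use %(name)s placeholders) and a JSON "
            f"object of parameter values that answers: {user_question}"
        )
        body = json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 1024,
            "messages": [{"role": "user", "content": [{"type": "text", "text": prompt}]}],
        })
        response = bedrock.invoke_model(
            modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",
            contentType="application/json",
            body=body,
        )
        return json.loads(response["body"].read())["content"][0]["text"]

    The Lambda function would then parse the statement and its parameter values from this response before running the query with the read-only role.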

    A separate DataIndexerFunction is created to implement the TAG design:

    1. A Lambda function, the DataIndexerFunction, reads from the information_schema in PostgreSQL.
    2. The DataIndexerFunction creates a text representation of the tables and relationships within the database.
    3. The DataIndexerFunction uses the text representation from the previous step to generate the embeddings using the Amazon Titan Embeddings G1 – Text model in Amazon Bedrock.
    4. The embeddings are stored in an embeddings table in the RDS for PostgreSQL database and are retrieved using the pgvector extension in the event of a cache miss (a sketch of this indexing step follows).
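
    The following is a minimal sketch of that indexing step, assuming an embeddings table with (content text, embedding vector) columns; the deployed function's table layout and description format may differ.

    import json

    import boto3
    import numpy as np
    import psycopg
    from pgvector.psycopg import register_vector

    bedrock = boto3.client("bedrock-runtime", region_name="us-west-2")

    def titan_embedding(text):
        """Generate an embedding with the Amazon Titan Embeddings G1 - Text model."""
        response = bedrock.invoke_model(
            modelId="amazon.titan-embed-text-v1",
            contentType="application/json",
            body=json.dumps({"inputText": text}),
        )
        return np.array(json.loads(response["body"].read())["embedding"], dtype=np.float32)

    def index_schema(conn_string):
        with psycopg.connect(conn_string) as conn:
            register_vector(conn)
            # Build a text description of each table from information_schema.
            tables = conn.execute(
                """SELECT table_name,
                          string_agg(column_name || ' ' || data_type, ', ') AS cols
                   FROM information_schema.columns
                   WHERE table_schema = 'public'
                   GROUP BY table_name"""
            ).fetchall()
            for table_name, cols in tables:
                description = f"Table {table_name} with columns: {cols}"
                conn.execute(
                    "INSERT INTO embeddings (content, embedding) VALUES (%s, %s)",
                    (description, titan_embedding(description)),
                )
            conn.commit()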

    Prerequisites

    The following are needed to proceed with this post:

    • An AWS account
    • A Git client to clone the source code provided
    • Docker installed and running on the local host or laptop
    • The AWS Cloud Development Kit (AWS CDK) installed
    • The AWS Command Line Interface (AWS CLI)
    • The AWS Systems Manager Session Manager plugin
    • Amazon Bedrock model access enabled for Anthropic’s Claude 3.5 Sonnet v1 and Amazon Titan Embeddings G1 – Text in the us-west-2 AWS Region
    • Python 3.12 with the pip package manager
    • 500,000+ US Homes Data (For Sale Properties)
      • License – CC0: Public Domain

    Download the dataset and upload to Amazon S3

    1. Navigate to the 500,000+ US Homes Data dataset, and download the dataset to your local machine.
    2. Unzip the zip file, which will expand into a file called 600K US Housing Properties.csv.
    3. Upload this file to an Amazon Simple Storage Service (Amazon S3) bucket of your choice in the AWS account where you'll be deploying the AWS CDK application. Replace <bucket-name> with the bucket in your account.
    aws s3 cp "./600K US Housing Properties.csv" s3://<bucket-name>/

    Deploy the solution

    1. Clone the repository from GitHub.
    git clone https://github.com/aws-samples/cdk-rds-pg-memdb-text-to-sql
    cd cdk-rds-pg-memdb-text-to-sql/
    2. Set up the Python environment and install the dependencies.
    python3 -m venv .venv
    source .venv/bin/activate
    pip install -r requirements.txt
    3. Deploy the CDK application, specifying the S3 bucket name used in the previous section as a parameter. It takes about 20-30 minutes to deploy the MemoryDB and RDS instances. Replace <account-id> and <bucket-name> with your AWS account ID and the bucket name from the previous section.
    cdk bootstrap aws://<account-id>/us-west-2
    cdk deploy --all --parameters AppStack:S3BucketName=<bucket-name>

    Note: If you receive an error at this step, make sure Docker is running on the local host or laptop; running docker info verifies that the Docker daemon is available locally.

    4. Use the setup_helper.py script to access the bastion host.
     python3 setup_helper.py bastion

    To perform these steps, you must have the Session Manager plugin for the AWS CLI installed as per the prerequisites. You can verify the Session Manager plugin installation by following the steps to Verify the Session Manager plugin installation.

    Run the command provided by the script to use Session Manager, a capability of AWS Systems Manager, to access the bastion host. This will also set the SECRET_NAME environment variable for the session to configure the RDS database.

    5. On the bastion host, install the required dependencies.
    sudo yum install -y postgresql15 python3-pip
    python3 -m pip install "psycopg[binary]>=3.1.12" boto3 pandas numpy
    6. Copy the dataset file from the S3 bucket used in the previous section (Download the dataset and upload to Amazon S3) to the bastion host.
    aws s3 cp "s3://<bucket-name>/600K US Housing Properties.csv" .
    7. In the same bastion host terminal session, create the following Python script by copying and pasting the code into your terminal, then pressing Enter.
    cat > ~/load.py << EOF
    import os, json, boto3, psycopg, pandas as pd, numpy as np
    
    def get_secret(secret_name: str):
        """Retrieve database connection parameters from AWS Secrets Manager."""
        sm_client = boto3.client("secretsmanager", region_name="us-west-2")
        connector_params = json.loads(sm_client.get_secret_value(SecretId=secret_name)["SecretString"])
        
        required_fields = ["host", "username", "password", "port"]
        missing = [f for f in required_fields if f not in connector_params]
        if missing:
            raise ValueError(f"Required fields not found: {', '.join(missing)}")
        
        return connector_params
    
    def main():
        csv_file = '600K US Housing Properties.csv'
        
        try:
            print("Processing and importing data...")
            connection_params = get_secret(os.environ["SECRET_NAME"])
            conn_string = f"host={connection_params['host']} port={connection_params['port']} dbname=postgres user={connection_params['username']} password={connection_params['password']}"
            
            # Column names for the COPY statement - defined as a list for safety
            columns = ['property_url', 'property_id', 'address', 'street_name', 'apartment', 'city', 'state', 'latitude', 
                      'longitude', 'postcode', 'price', 'bedroom_number', 'bathroom_number', 'price_per_unit', 'living_space', 
                      'land_space', 'land_space_unit', 'broker_id', 'property_type', 'property_status', 'year_build', 
                      'total_num_units', 'listing_age', 'RunDate', 'agency_name', 'agent_name', 'agent_phone', 'is_owned_by_zillow']
    
            
            # Connect to database
            with psycopg.connect(conn_string) as conn:
                # Process and import data in chunks
                chunksize = 1000
                total_imported = 0
    
                # Prepare the INSERT statement with placeholders
                placeholders = ", ".join(["%s"] * len(columns))
                insert_query = f"INSERT INTO us_housing_properties ({', '.join(columns)}) VALUES ({placeholders})"
    
                for chunk in pd.read_csv(csv_file, chunksize=chunksize):
                    # Replace negative values with None
                    numeric_cols = chunk.select_dtypes(include=['float64', 'int64']).columns
                    chunk[numeric_cols] = chunk[numeric_cols].apply(lambda x: x.mask(x < 0, None))
    
                    # Replace null or empty values with None
                    chunk = chunk.replace({np.nan: None, '': None})
                    # Create a list of tuples from the DataFrame chunk
                    tuples = [tuple(x) for x in chunk.to_numpy()]
    
                    
                    # Use executemany for batch insertion
                    with conn.cursor() as cur:
                        cur.executemany(insert_query, tuples)
                        conn.commit()
                    
                    total_imported += len(chunk)
                    print(f"Imported {total_imported} rows...")
                
                print(f"Successfully imported {total_imported} rows into database.")
    
        except Exception as error:
            print(f"Error: {error}")
            raise
    
    if __name__ == "__main__":
        main()
    EOF
    8. Run the Python script to load the data from the CSV file into the RDS instance.
    python3 load.py
    9. After the script completes, open the AWS Lambda console in your browser.
    10. Search for the function named DataIndexerStack-DataIndexerFunction.
    11. Open the function, navigate to the Test tab, and choose Test. This populates the embeddings table with database schema information.
    12. Back in the Lambda console, search for the function named AppStack-TextToSQLFunction.
    13. Open the function, navigate to the Test tab, and edit the Event JSON with the following:
    { 
      "query": "What are the top homes in San Francisco, CA?" 
    }
    14. Choose Test to trigger the Lambda function to query the RDS instance.

    Note: If you receive an error on the initial test execution, the index in MemoryDB for Valkey may still be under construction. Choose Test again to retry, and the attempt should succeed.

    [Figure: Lambda test output]

    Test the text-to-SQL chatbot application

    To run the Streamlit app, perform the following steps from your local host or laptop. In this section, we generate a Streamlit secrets file to store the API key and capture the API endpoint from the deployed CDK application.

    1. Use the setup_helper.py script to set up Streamlit on your local machine:
    python3 setup_helper.py streamlit

    The output will look similar to the following:

    [Figure: Setup instructions for Streamlit]

    2. Follow the instructions provided by the setup_helper script, which walks through four steps:
      1. Retrieve the API key for the API Gateway deployment
      2. Create a ./streamlit/.streamlit/secrets.toml file with the api_key and api_endpoint values
      3. Install Streamlit
      4. Run the Streamlit app (a sketch of the app's API call follows this list)
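
    For orientation, the following is a minimal sketch of how the Streamlit front end could read the secrets file and call the API Gateway endpoint. The secret key names and request payload shape are assumptions that mirror the Lambda test event above, and may differ from the repository's app.

    import requests
    import streamlit as st

    st.title("Text-to-SQL chatbot")
    question = st.text_input("Ask a question about the housing data")

    if question:
        response = requests.post(
            st.secrets["api_endpoint"],                    # from .streamlit/secrets.toml
            headers={"x-api-key": st.secrets["api_key"]},  # API Gateway API key header
            json={"query": question},
            timeout=30,
        )
        st.write(response.json())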

    Try some of the following questions to see the responses from the solution:

    • What are the top 5 most expensive properties in San Francisco, CA?
    • What key factors influence home prices in the 90210 zip code?
    • List top homes in the Bay Area under $700 per sq ft and over 1000 sq ft, sorted by price

    Example output for: “What are the top homes in WA where avg sq ft is > $700 and sq ft is greater than 1000”

    [Figure: Chatbot example output, WA price per square foot query]

    Example output for: “What are properties in Seattle that are under $600 per sq ft and over 1000 sq ft?”

    [Figure: Chatbot example output, Seattle price per square foot query]

    In rare cases, you may experience a Gateway Timeout (HTTP code 504), which means the app was not able to respond within the 29-second timeout. Recommended actions:

    • Retry your prompt, as the result may have already been cached
    • Request an increase in the integration timeout for API Gateway

    While this demonstration uses a single housing data table, the underlying text-to-SQL application is capable of supporting complex JOINs across multiple tables in more advanced implementations.

    Clean up

    To clean up the resources, run cdk destroy --all. As an infrastructure management best practice, we recommend deleting the resources created in this demonstration when you're done.

    Conclusion

    In this post, we demonstrated a solution that lets you use natural language to generate complex SQL queries against data in Amazon RDS for PostgreSQL with the help of Amazon Bedrock. We improved the accuracy of the generated SQL through best practices such as semantic caching, parameterized SQL, and Table Augmented Generation. We also stored database metadata as pgvector embeddings in PostgreSQL so the RAG workflow could identify the tables referenced in a query. Finally, we tested the solution in realistic scenarios with different levels of query complexity.

    To get started, open the Amazon Bedrock console or review the new APIs in the Amazon Bedrock SDK, and start creating your prompts and flows today.

    To learn more about using Amazon RDS for PostgreSQL and pgvector for AI and ML workloads, see Amazon RDS for PostgreSQL support for pgvector for simplified ML model integration.

    For customers whose data is stored in Amazon Redshift or Amazon SageMaker Lakehouse, Amazon Bedrock Knowledge Bases now supports structured data retrieval. For more information, see Amazon Bedrock Knowledge Bases now supports structured data retrieval.


    About the Authors

    Frank Dallezotte is a Senior Solutions Architect at AWS and is passionate about working with independent software vendors to design and build scalable applications on AWS. He has experience creating software, implementing build pipelines, and deploying these solutions in the cloud.

    Archana Srinivasan is an Enterprise Support Lead within Enterprise Support at Amazon Web Services. Archana provides strategic technical guidance for independent software vendors (ISVs) to innovate and operate their workloads efficiently on the AWS Cloud.
