In today’s data-driven environments, where information is spread across numerous sources, translating complex problem statements into precise SQL code can be a significant challenge. The growth of large language models (LLMs) has helped simplify data access. Text-to-SQL can automatically transform analytical questions into executable SQL code for enhanced data accessibility and streamlined data exploration, from analyzing sales data and monitoring performance metrics to assessing customer feedback.
In this post, we explore how to use Amazon Relational Database Service (Amazon RDS) for PostgreSQL and Amazon Bedrock to build a generative AI text-to-SQL chatbot application using Retrieval Augmented Generation (RAG). We also show how to use Amazon MemoryDB with vector search to provide semantic caching that further accelerates this solution.
We use Amazon RDS for PostgreSQL as our database to store structured data, while leveraging Amazon Bedrock’s foundation models (FMs) for the generative AI capabilities. You can store both your data and embeddings (vector representations of text, images, and audio) in Amazon RDS for PostgreSQL, and use Amazon Bedrock to generate embeddings and translate natural language prompts into SQL queries that interact with your stored data.
Amazon Bedrock is a fully managed service that offers a choice of high-performing FMs from leading AI companies such as AI21 Labs, Anthropic, Cohere, Meta, Stability AI, and Amazon through a single API, along with a broad set of capabilities to help you build generative AI applications. We integrate Amazon Bedrock with the RDS for PostgreSQL database and use Amazon Bedrock to write SQL queries for the text prompts we provide as input. A Streamlit application and Amazon API Gateway serve as the front end, invoking Amazon Bedrock through AWS Lambda.
Best practices for effective text-to-SQL
To deliver an effective solution tailored to real-world use cases, this solution incorporates text-to-SQL best practices for performance optimization and security. This includes using Amazon MemoryDB with vector search capabilities and the Valkey engine for semantic caching, which can improve performance and reduce calls to a foundation model. In addition to a semantic cache, this solution also uses parameterized SQL, a read-only database role for security, and table augmented generation techniques.
Semantic caching strategy
Semantic caching stores data based on its contextual meaning, which means that two queries with the same meaning will return the same result. This can be useful for complex queries involving multiple tables or data sources.
Semantically caching the text-to-SQL query responses when using LLMs enables performance and cost optimization. By semantically caching the query results, the system can retrieve and reuse the previously generated LLM responses when appropriate, significantly improving the overall response times. Furthermore, minimizing the number of LLM inferences through Amazon Bedrock not only improves the response speed but also reduces the operational cost with fewer LLM interactions. This approach allows for improved scalability of the application without a proportional cost increase.
When implementing semantic caching for this solution, our process follows two steps:
- First, we check for an exact textual match in the cache.
- If no exact match exists, we perform a semantic similarity search to find closely related cached queries.
This semantic caching strategy provides an efficient and cost-effective approach to using LLMs for text-to-SQL conversions.
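A minimal sketch of this two-step lookup, assuming a MemoryDB for Valkey endpoint, an existing vector index named `idx:sql_cache`, and an `embed_fn` helper that returns a Titan embedding for the prompt (all names here are illustrative, not taken from the solution's source code):

```python
import hashlib

import numpy as np
import redis
from redis.commands.search.query import Query

# MemoryDB for Valkey is Redis API-compatible, so redis-py works for both
# the exact-match lookup and the FT.SEARCH vector query.
r = redis.Redis(host="<memorydb-endpoint>", port=6379, ssl=True)

def lookup_cached_sql(prompt: str, embed_fn, max_distance: float = 0.15):
    # Step 1: exact textual match, keyed by a hash of the normalized prompt.
    exact_key = "sqlcache:" + hashlib.sha256(prompt.strip().lower().encode()).hexdigest()
    cached = r.hget(exact_key, "sql")
    if cached:
        return cached.decode()

    # Step 2: semantic similarity search over embeddings of cached prompts.
    vec = np.array(embed_fn(prompt), dtype=np.float32).tobytes()
    query = (
        Query("*=>[KNN 1 @embedding $vec AS score]")
        .return_fields("sql", "score")
        .dialect(2)
    )
    results = r.ft("idx:sql_cache").search(query, query_params={"vec": vec})
    if results.docs and float(results.docs[0].score) <= max_distance:
        return results.docs[0].sql  # semantically close enough: reuse cached SQL
    return None  # cache miss: fall through to the LLM
```

The distance threshold controls how liberally the cache matches paraphrased questions; too loose a threshold risks returning SQL generated for a different question.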
Parameterized SQL
Because we’re dealing with user inputs that Amazon Bedrock converts to a SQL query, it’s crucial to create safeguards against SQL injection to prevent unauthorized access. For reference, see the Open Worldwide Application Security Project (OWASP) critical security risks to web applications.
In this solution, we use a parameterized query in which the parameter values are separated from the SQL syntax. We achieve this by requesting a parameterized SQL statement in the prompt to the LLM alongside the user input.
Without using parameterized queries, a third party could gain unauthorized access through user input that is executed by the database. By using parameterized queries, the database can first parse and plan the SQL query, and then execute the statement with the provided parameter values. This way, any SQL code in the user input is treated as regular data, not as executable code. Parameterized queries are considered a best practice to avoid SQL injection regardless of which database server or driver you are using.
As an additional layer of protection in this solution, we use a database role for the integration that only has select permissions on the tables in the database.
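As an illustration, here is a minimal sketch of executing LLM-generated parameterized SQL with psycopg2, assuming the model is prompted to return a statement with `%(name)s` placeholders and a separate parameter dictionary (connection details, role, and table names are illustrative):

```python
import psycopg2

# Connect with the read-only role described above (illustrative credentials).
conn = psycopg2.connect(
    host="<rds-endpoint>", dbname="postgres",
    user="app_readonly", password="<password>",
)

# What we ask the LLM to return: SQL text and values kept strictly separate.
sql = (
    "SELECT address, price FROM properties "
    "WHERE city = %(city)s ORDER BY price DESC LIMIT %(limit)s"
)
params = {"city": "San Francisco", "limit": 5}

with conn, conn.cursor() as cur:
    # The driver sends the parameter values separately from the SQL text, so
    # anything in user input is treated as data, never as executable code.
    cur.execute(sql, params)
    rows = cur.fetchall()
```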
Table Augmented Generation
Implementing a centralized data catalog with Table Augmented Generation (TAG) offers organizations a powerful tool for enhancing text-to-SQL responses from LLMs. In this solution, we create embeddings of database metadata, including tables, columns, and relationships, and use PostgreSQL’s pgvector extension to make those embeddings searchable. When a user submits a query, the system uses the embeddings table to identify the most relevant database elements, effectively augmenting the user’s input with crucial structural context. This augmented query is then passed to the LLM, enabling it to generate SQL with an understanding of the specific data landscape. By incorporating this table-aware context into the generation process, LLMs can produce accurate, relevant SQL responses that align with the organization’s data infrastructure. This method not only improves SQL generation, but also enhances the system’s ability to describe query results in natural language, providing a more comprehensive and user-friendly data interaction experience.
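A minimal sketch of this retrieval step, assuming a `schema_embeddings` table created with the pgvector extension and the Titan embeddings model (table and column names are illustrative):

```python
import json

import boto3
import psycopg2

bedrock = boto3.client("bedrock-runtime", region_name="us-west-2")

def embed(text: str) -> list[float]:
    # Generate an embedding with the Amazon Titan Embeddings G1 - Text model.
    resp = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v1",
        body=json.dumps({"inputText": text}),
    )
    return json.loads(resp["body"].read())["embedding"]

def relevant_tables(conn, question: str, k: int = 3) -> list[str]:
    vec = embed(question)
    with conn.cursor() as cur:
        # pgvector's <=> operator returns cosine distance; the closest table
        # descriptions become structural context for the SQL-generation prompt.
        cur.execute(
            "SELECT description FROM schema_embeddings "
            "ORDER BY embedding <=> %s::vector LIMIT %s",
            (json.dumps(vec), k),
        )
        return [row[0] for row in cur.fetchall()]
```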
Solution overview
The solution presented in this post is a chatbot application built on the following architecture.
This architecture workflow includes the following steps:
- A database is created on an Amazon RDS for PostgreSQL database instance that contains a target dataset. For this example, we’re using US Housing Data (see prerequisites) represented by a single table in PostgreSQL.
- Using a Streamlit chat application, a user submits their query to an API Gateway endpoint with a Lambda function backend.
- The `TextToSQLFunction` checks Amazon MemoryDB for Valkey to see if a similar semantic cache entry exists for the user’s prompt. If there’s a cache miss, the Lambda function performs a semantic search against the embeddings table (based on a text representation of database tables, columns, and relationships) using pgvector for tables similar to the prompt.
- The `TextToSQLFunction` constructs the prompt using the table information from the previous step, along with relevant instructions for constructing SQL.
- The `TextToSQLFunction` takes the SQL it generated in the previous step and invokes a query against the PostgreSQL database using a role that only has select permissions. The function then uses the query results to again prompt Amazon Bedrock for an interpretation of the results based on the user query.

There’s also a separate `DataIndexerFunction` that’s created to achieve the TAG design:

- A Lambda function, the `DataIndexerFunction`, reads from the information_schema in PostgreSQL.
- The `DataIndexerFunction` creates a text representation of the tables and relationships within the database.
- The `DataIndexerFunction` uses the text representation from the previous step to generate embeddings using the Amazon Titan Embeddings G1 – Text model in Amazon Bedrock.
- The embeddings are stored in an embeddings table in the RDS for PostgreSQL database and retrieved using the pgvector extension in the event of a cache miss.
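To make the prompt-construction step concrete, here is a minimal sketch of how the `TextToSQLFunction` might call Claude 3.5 Sonnet with the retrieved table context. The prompt wording is illustrative, not copied from the solution’s source:

```python
import json

import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-west-2")

def generate_sql(question: str, table_context: str) -> str:
    prompt = (
        "You are a PostgreSQL expert. Using only these tables and columns:\n"
        f"{table_context}\n\n"
        "Write a parameterized SQL query (with %(name)s placeholders) that "
        "answers the question below, and return JSON with keys 'sql' and 'params'.\n"
        f"Question: {question}"
    )
    resp = bedrock.invoke_model(
        modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",
        body=json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 1024,
            "messages": [{"role": "user", "content": prompt}],
        }),
    )
    # Claude returns a list of content blocks; the first block holds the text.
    return json.loads(resp["body"].read())["content"][0]["text"]
```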
Prerequisites
The following are needed to proceed with this post:
- An AWS account
- A Git client to clone the source code provided
- Docker installed and running on the local host or laptop
- The AWS Cloud Development Kit (AWS CDK) installed
- The AWS Command Line Interface (AWS CLI)
- The AWS Systems Manager Session Manager plugin
- Amazon Bedrock model access enabled for Anthropic’s Claude 3.5 Sonnet v1 and Amazon Titan Embeddings G1 – Text in the us-west-2 AWS Region
- Python 3.12 with the pip package manager
- The 500,000+ US Homes Data (For Sale Properties) dataset (License – CC0: Public Domain)
Download the dataset and upload to Amazon S3
- Navigate to the 500,000+ US Homes Data dataset, and download the dataset to your local machine.
- Unzip the zip file, which will expand into a file called 600k US Housing Properties.csv.
- Upload this file to an Amazon Simple Storage Service (Amazon S3) bucket of your choice in the AWS account where you’ll be deploying the AWS CDK application, as shown in the example command after this list. Replace `<bucket-name>` with the bucket name in your account.
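For example, with the AWS CLI (the object key shown is illustrative; use the key your deployment expects):

```bash
aws s3 cp "600k US Housing Properties.csv" s3://<bucket-name>/
```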
Deploy the solution
- Clone the repository from GitHub.
- Setup the Python environment and install dependencies.
- Deploy the CDK application, specifying the S3 bucket name used in the previous section as a parameter; an example follows this step. It takes about 20-30 minutes to deploy the MemoryDB and RDS instances. Replace `<account-id>` and `<bucket-name>` with your AWS account ID and bucket name from the previous section.
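The deployment commands look something like the following sketch; the parameter name here is hypothetical, so check the repository README for the exact syntax:

```bash
# Bootstrap the CDK environment once per account and Region, then deploy all stacks.
cdk bootstrap aws://<account-id>/us-west-2
cdk deploy --all --parameters S3BucketName=<bucket-name>  # parameter name is illustrative
```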
Note: if you receive an error at this step, ensure Docker is running on the local host or laptop; running `docker info` verifies that the Docker daemon is up locally.
- Use the provided Python helper script to access the bastion host.
To perform these steps, you must have the Session Manager plugin for the AWS CLI installed, as noted in the prerequisites. To confirm it’s present, follow the steps in Verify the Session Manager plugin installation.
Run the command provided by the script to use Session Manager, a capability of AWS Systems Manager, to access the bastion host. This also sets the `SECRET_NAME` environment variable for the session to configure the RDS database.
- On the bastion host, install required dependencies.
- Copy the dataset file from the S3 bucket used in the previous section (Download the dataset and upload to Amazon S3) to the bastion host.
- On the same bastion host terminal session, create the data-load Python script by copying and pasting the code into your terminal, then press Enter.
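A minimal sketch of what such a loader can look like, assuming a `properties` table whose columns match the CSV header and database credentials stored in the `SECRET_NAME` secret (the actual script in the repository may differ):

```python
import json
import os

import boto3
import psycopg2

# Fetch database credentials from the secret exported by the session setup.
secret = json.loads(
    boto3.client("secretsmanager").get_secret_value(
        SecretId=os.environ["SECRET_NAME"]
    )["SecretString"]
)

conn = psycopg2.connect(
    host=secret["host"], dbname=secret["dbname"],
    user=secret["username"], password=secret["password"],
)
with conn, conn.cursor() as cur, open("600k US Housing Properties.csv") as f:
    # COPY streams the CSV into PostgreSQL far faster than row-by-row INSERTs.
    cur.copy_expert("COPY properties FROM STDIN WITH (FORMAT csv, HEADER true)", f)
```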
- Run the Python code to input data into the RDS instance from the spreadsheet.
- After the script completes, navigate to the AWS Lambda console from your browser.
- Search for the function named `DataIndexerStack-DataIndexerFunction`.
- Open the function, navigate to the Test tab, and choose Test. This populates the embeddings table with database schema information.
- Next, also from the Lambda console, search for the function named `AppStack-TextToSQLFunction`.
- Open the function, navigate to the Test tab, and edit the Event JSON with the following:
- Choose Test to trigger the Lambda function to query the RDS instance.
Note: if you receive an error in the initial test execution, the index in MemoryDB for Valkey may still be under construction. Choose Test again to retry, and the attempt should be successful.
Test the text-to-SQL chatbot application
To run the Streamlit app, perform the following steps from your local host or laptop. In this section, we generate a Streamlit secrets file to store the API key and capture the API endpoint from the deployed CDK application.
- Use the setup_helper.py script to set up Streamlit on your local machine:
The output will look similar to the following:
- Follow the instructions provided by the setup_helper script, with the four steps to:
  - Retrieve the API key for the API Gateway deployment
  - Create a `./streamlit/.streamlit/secrets.toml` file with the `api_key` and `api_endpoint` values
  - Install Streamlit
  - Run the Streamlit app
Try some of the following questions to see the responses from the solution:
- What are the top 5 most expensive properties in San Francisco, CA?
- What key factors influence home prices in the 90210 zip code?
- List top homes in the Bay Area under $700 per sq ft and over 1000 sq ft, sorted by price
Example output for: “What are the top homes in WA where avg sq ft is > $700 and sq ft is greater than 1000”
Example output for: “What are properties in Seattle that are under $600 per sq ft and over 1000 sq ft?”
In rare cases, you may experience a Gateway Timeout (HTTP code 504), which means the app was not able to respond within the 29-second timeout. Recommended actions:
- Retry your prompt, as the result may have already been cached
- Request an increase in the integration timeout for API Gateway
While this demonstration uses a single housing data table, the underlying text-to-SQL application is capable of supporting complex JOINs across multiple tables in more advanced implementations.
Clean up
To avoid ongoing charges, we recommend deleting the resources created in this demonstration when you’re done. To clean up, run `cdk destroy --all`.
Conclusion
In this post, we demonstrated a solution that lets you use natural language to generate complex SQL queries, built on Amazon RDS for PostgreSQL and Amazon Bedrock. We improved the accuracy of the generated SQL through best practices such as semantic caching, parameterized SQL, and table augmented generation. Additionally, we stored database metadata as pgvector embeddings in PostgreSQL so the RAG workflow can resolve the table names referenced in a query. Finally, we tested the solution in realistic scenarios with different query complexity levels.
To get started, open the Amazon Bedrock console or review the new APIs in the Amazon Bedrock SDK, and start creating your prompts and flows today.
To learn more about using Amazon RDS for PostgreSQL and pgvector for AI and ML workloads, see Amazon RDS for PostgreSQL support for pgvector for simplified ML model integration.
For customers whose data is stored in Amazon Redshift or Amazon SageMaker Lakehouse, Amazon Bedrock Knowledge Bases now supports structured data retrieval. For more information, see Amazon Bedrock Knowledge Bases now supports structured data retrieval.
About the Authors
Frank Dallezotte is a Senior Solutions Architect at AWS and is passionate about working with independent software vendors to design and build scalable applications on AWS. He has experience creating software, implementing build pipelines, and deploying these solutions in the cloud.
Archana Srinivasan is an Enterprise Support Lead within Enterprise Support at Amazon Web Services. Archana provides strategic technical guidance for independent software vendors (ISVs) to innovate and operate their workloads efficiently on AWS cloud.