In today’s data-driven landscape, organizations must ensure their mission-critical databases can handle demanding workloads without sacrificing performance. Amazon Relational Database Service (Amazon RDS) is a managed service offering a cost-efficient, reliable, and highly scalable managed service that satisfies such requirements.
In PostgreSQL, as with many relational databases, transactions are first logged in a Write-Ahead Log (WAL) buffer before being recorded in a separate WAL segment. This process is crucial for data recovery and replication. The performance of PostgreSQL transactions is affected by the storage system where these WAL segments reside. By default, all files, including WAL, are stored in the same location. However, PostgreSQL documentation recommends placing the WAL on separate storage to enhance performance. As detailed in the post Enhance database performance with Amazon RDS dedicated log volumes, Amazon introduced the Dedicated Log Volume (DLV) feature, enabling customers to store WAL segments on a separate volume with a fixed size of 1,024 GiB and 3,000 Provisioned IOPS.
One key advantage of DLV is that it enables write coalescing, which reduces overall IOPS demand for the WAL by grouping log writes into larger, more efficient IO operations. In typical storage setups, log and data writes are mixed in smaller, frequent operations, whereas DLV keeps them separate, allowing logs to be written in larger chunks. This separation increases throughput and IO efficiency, potentially lowering the need for high-performance storage and reducing costs without sacrificing performance.
In this post, we guide you through the process of benchmarking the performance of Amazon RDS for PostgreSQL using the Dedicated Log Volume (DLV) feature. To do this, we use pgbench – a tool for running benchmark tests on PostgreSQL databases, pgbench
repeatedly executes a defined sequence of SQL commands across multiple concurrent database sessions. By analyzing the average transaction rate, measured in transactions per second, we can gain valuable insights into the performance characteristics of your RDS for PostgreSQL deployment. This allows you to make a direct comparison between the performance of Amazon RDS for PostgreSQL with and without the Dedicated Log Volume (DLV) feature enabled.
Through our benchmarking, you’ll learn how to quantify the performance improvements delivered by DLV. Let us get started and explore the potential of Amazon RDS for PostgreSQL with Dedicated Log Volume!
Solution overview
The solution involves running a benchmarking script on an Amazon Elastic Compute Cloud (Amazon EC2). The script tests the performance against two separate RDS for PostgreSQL instances where one instance has DLV enabled and the other does not have DLV enabled.
The following diagram illustrates the high-level architecture
Prerequisites
To get started, you must complete the following prerequisites:
- Launch Amazon EC2 Linux instance . It is recommended to launch the instance in the same Virtual Private Cloud (Amazon VPC) as your RDS instance to minimize network latency.
- Install
pgbench
on the EC2 instance. Since the PostgreSQL server installation package includes thepgbench
utility by default, you can install PostgreSQL from the amazon extras library.
- You need a psql client with connectivity to your Amazon RDS for PostgreSQL instance. The previous step of PostgreSQL server installation also includes the
psql
client by default.
- Launch two Amazon RDS for PostgreSQL instances one with DLV and one without.
To create instances, use the following AWS Command Line Interface (AWS CLI) command, providing your actual AWS Region and AWS account number. The parameter --dedicated-log-volume
enables the DLV feature, and --no-dedicated-log-volume
disables it. Refer to Enabling DLV when you create a DB instance for more information.
The following code creates an RDS for PostgreSQL instance with DLV:
The --no-dedicated-log-volume
flag is not required during instance creation, as instances are created as non-DLV by default. The following code creates an RDS for PostgreSQL instance without DLV:
We used AWS Secrets Manager to store the master user account password. Refer to Improve security of Amazon RDS master database credentials using AWS Secrets Manager for more information on using Secrets Manager to improve the security of your RDS instance.
Performance benchmarking
This section discusses the benchmark we ran to compare performance between an RDS for PostgreSQL instance with DLV enabled and an instance with no DLV.
Infrastructure
The following table summarizes the infrastructure of the EC2 instance on which the benchmarking script is run.
Instance class | Operating System | vCPU | Memory (GiB) |
r5.4xlarge | Amazon Linux 2023 | 16 | 128 |
The following table summarizes the infrastructure of each RDS instance.
Instance name | Instance class | vCPU | Memory (GiB) | Storage | Dedicated Log Volume |
dlv | db.r5.16xlarge | 64 | 512 | IO2 (10,000 PIOPS) | On |
withoutdlv | db.r5.16xlarge | 64 | 512 | IO2 (10,000 PIOPS) | Off |
Database configuration parameters
Like other relational database engines, the performance of Amazon RDS for PostgreSQL is directly impacted by database configuration parameters. Parameters related to commit, checkpoint, and WAL configuration are vital for evaluating performance. To obtain statistics for WAL, it’s necessary to enable the track_wal_io_timing
parameter. Throughout the test, the default parameters of Amazon RDS for PostgreSQL 15.5 are used; the following are the list of key parameters and their settings:
Benchmarking script
To achieve the most accurate performance comparisons, we benchmarked DLV and non-DLV instances independently using the following script. Prior to benchmarks, we reset the database’s WAL statistics with the pg_stat_reset_shared
function to make sure results reflect the specific test conditions rather than previous activity. Our performance evaluation used the pgbench
tool to create a TPC-B-like workload, simulating 64 concurrent clients processing half a million transactions each against data scale (-s
) of 10,000. TPC-B is a widely recognized industry standard benchmark that focuses on measuring transaction throughput (transactions per second), providing a valuable metric for comparing the performance of different database systems.
The following is the script:
Save the script to a file named dlv_bm.sh
and make it executable. To run this script, use the following command. You can adjust the benchmarking parameters as needed:
- SCALE = scale factor i.e Multiply the number of rows generated by the scale factor
- NUM_TRANSACTIONS = Number of transactions within the interval
- NUM_CLIENTS = Number of concurrent database sessions
- NUM_THREADS = Number of worker threads
- AWS_ACCOUNT_ALIAS = AWS account alias
- AWS_REGION = The region where your resources are located.
- DLV_INSTANCE = The name of your RDS instance with DLV
- NON_DLV_INSTANCE = The name of your RDS instance without DLV
For example:
Upon completion of the script, the following three log files are generated for further analysis:
The two key log files for the comparison are the ones with the dlv-blog-run-dlv-*.log
and dlv-blog-run-withoutdlv-*.log
patterns. These represent the runs with Dedicated Log Volume enabled and disabled, respectively.
By running a diff
command on these two log files, we can directly compare the transaction performance metrics reported by pgbench.
For example:
Following is the output from the benchmark logs we did:
Monitoring DLV
This section explores various techniques to monitor an RDS for PostgreSQL instance with DLV enabled.
Monitor DLV using Amazon CloudWatch
Keeping an eye on DLV that stores transaction logs (WALs) is just as important as monitoring other database volumes. Because the speed of writing WALs directly affects transaction performance, monitoring IOPS, latency, and throughput of the DLV is crucial for understanding overall database performance. You can use Amazon CloudWatch to monitor these metrics. The post Enhance database performance with Amazon RDS dedicated log volumes shows relevant CloudWatch metrics for DLV usage monitoring.
Monitor DLV using PostgreSQL’s pg_stat_wal
In addition to CloudWatch metrics, the PostgreSQL view pg_stat_wal, available in version 14 and later, can help you monitor WAL performance in your RDS for PostgreSQL instance. This view is part of PostgreSQL’s Cumulative Statistics System and provides detailed statistics about WAL activity.
Evaluating performance
To assess the impact of DLV on database performance, we reviewed following metrics:
- Runtime
- Transactions per second (TPS) and latency average recorded by
pgbench
- PostgreSQL’s cumulative statistic,
wal_sync_time
andwal_write_time
, captured frompg_stat_wal
- Graphs from Amazon RDS Performance Insights
Runtime
The runtime represents the duration required for processing all 32 million transactions using pgbench
, with each client handling half a million transactions. The runtime underwent a significant reduction, nearly halving from 77 minutes for instances without DLV to just 40 minutes for instances with DLV enabled. This indicated a 92.5% improvement in processing time, emphasizing the substantial improvement powered by the instance with DLV.
Metric | Without DLV | With DLV | % Improvement |
Time taken by the benchmark (minutes) | 77 | 40 | 92.5% |
TPS and latency
The following table shows the performance improvement by the instance with DLV enabled. TPS experienced an increase of 91.83%, nearly doubling the system’s capacity to process transactions. Simultaneously, latency, a measure of response time, was cut in half, with an improvement of by 95.56%. This reduction in latency translates to a more responsive and efficient system. Overall, the implementation of DLV has led to a considerable enhancement in system performance, making it faster and more capable of handling a higher volume of tasks.
Metric | Without DLV | With DLV | % Improvement |
TPS | 6984 | 13397 | 91.83% |
Latency average (milliseconds) | 8.8 | 4.5 | 95.56% |
WAL statistics
We closely examined key WAL statistics from pg_stat_wal
like wal_write_time
(total time spent writing WAL buffers to disk) and wal_sync_time
(total time spent syncing WAL files). DLV demonstrated improvement in both metrics: 102.6% for sync time and 9.2% for write time in our benchmark. This translates to faster transaction commits and a noticeable boost in overall database responsiveness, making DLV a good fit for optimizing PostgreSQL in high-demand scenarios.
Metric | Without DLV | With DLV | % Improvement |
wal_write_time (milliseconds) |
56656.796 | 51882.614 | 9.2 |
wal_sync_time (milliseconds) |
4431621.33 | 2186910.078 | 102.6 |
RDS Performance Insights
The following are graphs captured from RDS Performance Insights during the time period of benchmarking.
The first graph shows database load and top waits with DLV.
The following graph show database load and top waits without DLV.
The comparison in Performance Insights data highlighted a significant difference in resource consumption between DLV and non-DLV instances. The DLV-enabled instance consistently exhibited a lower average active session count, indicating a decrease in the overall demand for resources on the database. This improvement was attributed to DLV’s ability to offload the demanding task of WAL segment writing to a separate volume, freeing up the main database storage for other operations.
The standard RDS instance (without DLV) showed a higher prevalence of wait events, particularly those related to lightweight locks (LWLock
). This often signals I/O contention, where the database is struggling to keep up with data write demands. In contrast, the DLV-enabled instance experienced a drastic reduction in these wait events, leading to more efficient data retrieval.
The wait details are as follows:
- LWLock:WALWrite – This wait event was substantially lower in the DLV-enabled instance (12.97% of AAS) compared to the non-DLV instance (35.69% of AAS), highlighting DLV’s effectiveness in mitigating WAL-related bottlenecks.
- IO:WALSync – This wait event is higher in the non-DLV instance with 0.83% of AAS compared to the DLV instance with 0.65% of AAS. Although it’s less notable, this wait event was still slightly higher in the non-DLV instance, further supporting DLV’s role in improving WAL write performance.
Benchmarking results confirmed DLV’s effectiveness in mitigating WAL-related wait events, aligning with its intended function. This validation demonstrates DLV’s success in offloading WAL writes, leading to enhanced database performance.
Clean up
If you are testing this solution, complete the following steps to remove resources and avoid charges
- On the Amazon EC2 console, select the Linux EC2 instance, and on the Instance state menu, choose Terminate instance.
- On the Amazon RDS console, select the RDS for PostgreSQL instances, and on the Actions menu, choose Delete.
Conclusion
In this post, we demonstrated how DLV can empower your RDS database with enhanced write performance, reducing WAL contention, a common bottleneck in write-intensive workloads. By offloading WAL writes to a dedicated volume, DLV frees up your main database resources, enabling smoother data access and overall improved responsiveness. This translates to faster write operations, reduced latency, and a more efficient database environment, which is especially crucial for applications demanding consistent high performance.
Furthermore, DLV alleviates storage bottlenecks by isolating WAL activity, making sure critical write operations aren’t hindered by other database processes. This results in a more stable and predictable performance, even under heavy load. If your RDS workload involves frequent writes or demands low-latency transactions, enabling DLV is a strategic move to unlock your database’s full potential and achieve optimal performance under pressure.
DLV is compatible with PIOPS storage types (io1 and io2 Block Express) and are provisioned with a fixed size of 1,024 GiB and 3,000 Provisioned IOPS. They are supported in Amazon RDS for PostgreSQL versions 13.10 and higher, 14.7 and higher, and 15.2 and higher, across all Regions.
Do you have follow-up questions or feedback? Leave a comment. We’d love to hear your thoughts and suggestions.
About the Authors
Naga Appani is a Database Engineer at Amazon Web Services and Subject Matter Expert for RDS PostgreSQL, he has over a decade of experience in working with relational databases. At AWS, He is mainly focused on PostgreSQL deployments, he works with developers in bringing new features and assist customers in resolving critical issues with Aurora and RDS PostgreSQL Database systems.
Belma Canik is a Senior Database Specialist Technical Account Manager (STAM) at Amazon Web Services. She helps customers run optimized workloads on AWS and make the best out of their cloud journey especially in Amazon RDS and Aurora Database Engines.
Sid Vantair is a Solutions Architect with AWS covering Strategic accounts. With over a decade of experience in working with relational databases, he thrives on resolving complex technical issues to overcome customer hurdles. Outside of work, he cherishes spending time with his family and fostering inquisitiveness in his children.
Source: Read More