The purpose of this post is to qualitatively assess the performance impact, if any, of the available Amazon Relational Database Service (Amazon RDS) for PostgreSQL deployment options, and to discuss the potential trade-offs between performance and availability with OLTP-intensive workloads. Online transaction processing (OLTP) is a type of database workload in which many small transactions are processed by the database system. “Online” refers to the expectation that such systems respond to user requests and process transactions in real time.
Highly available database systems, like Amazon RDS Multi-AZ, must have no single point of failure; therefore, the database storage must be replicated to multiple physical locations—Availability Zones, in this case. Replicating transactional databases can be challenging because of the additional latency and complexity it adds.
Benchmarking your database is a crucial activity that serves several purposes:
Define baseline performance of database systems
Compare different database engines
Compare different architectures for the same database engine
Assess the performance impact of any configuration change
In this post, we present a qualitative performance comparison between RDS for PostgreSQL Single-AZ DB instance, Multi-AZ DB instance, and Amazon RDS Multi-AZ DB Cluster deployments.
OLTP benchmarking with HammerDB
For our benchmarking, we use HammerDB and TPC-C. HammerDB is the leading benchmarking and load testing software for the world’s most popular databases, supporting Oracle Database, Microsoft SQL Server, IBM Db2, MySQL, MariaDB, and PostgreSQL.
HammerDB is a fair-use implementation of the popular TPC-C and TPC-H benchmarks. It’s free and open source software, with source code hosted by the TPC on GitHub and managed by the TPC-OSS subcommittee.
Amazon RDS Single-AZ database instance
An Amazon RDS Single-AZ DB instance is a managed instance running in a single Availability Zone, with database files stored on Amazon Elastic Block Store (Amazon EBS) volumes. The database files are not replicated to any other Availability Zone. Both read and write requests from an application, potentially running in multiple Availability Zones, subnets, or even other AWS accounts, are routed to the single instance. The performance of the database is not influenced by any replication methodology, because none is used; it is governed by the performance of the virtual machine, the network, and the EBS volumes involved.
The following diagram illustrates the high-level architecture of a Single-AZ DB instance, which is available for all Amazon RDS engines.
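For reference, a Single-AZ DB instance can be provisioned with a single API call. The following boto3 sketch is illustrative only; the instance identifier, credentials, Region, and storage sizing are placeholder assumptions, not the exact configuration used in this benchmark.

```python
import boto3

rds = boto3.client("rds", region_name="us-east-1")

# Hypothetical identifier and credentials for illustration only.
rds.create_db_instance(
    DBInstanceIdentifier="pg-single-az",
    Engine="postgres",
    DBInstanceClass="db.r6gd.4xlarge",
    AllocatedStorage=200,
    StorageType="io1",
    Iops=3000,
    MasterUsername="postgres",
    MasterUserPassword="change-me",
    MultiAZ=False,  # single Availability Zone, no standby replica
)
```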
Amazon RDS Multi-AZ database instance
An Amazon RDS Multi-AZ DB instance deployment automatically provisions and maintains a synchronous standby replica in another Availability Zone. Amazon RDS uses synchronous replication to keep the data on your standby DB instance up to date with the primary. Running a DB instance with high availability can improve availability during planned system maintenance and during unplanned outages or failures.
This synchronous replication affects database performance because every write-ahead log (WAL) segment, as well as every modified database page in the data files, written on the primary instance must be sent to the standby instance in another Availability Zone, where it is written to the EBS volume belonging to that standby instance. This adds network and I/O latency to every write query. An OLTP workload generates many small transactions, which translate into as many small writes to disk.
The following diagram illustrates a Multi-AZ DB instance architecture.
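An existing Single-AZ instance can be converted to a Multi-AZ DB instance deployment by enabling the MultiAZ option; RDS then provisions the standby and sets up the synchronous replication. A minimal boto3 sketch, reusing the hypothetical identifier from the previous example:

```python
import boto3

rds = boto3.client("rds", region_name="us-east-1")

# Enable a synchronous standby replica in another Availability Zone.
# ApplyImmediately avoids waiting for the next maintenance window.
rds.modify_db_instance(
    DBInstanceIdentifier="pg-single-az",  # hypothetical identifier
    MultiAZ=True,
    ApplyImmediately=True,
)
```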
Amazon RDS Multi-AZ DB Cluster
An Amazon RDS Multi-AZ DB cluster deployment is a semi-synchronous, highly available deployment mode of Amazon RDS that gives you two readable replica instances. An Amazon RDS Multi-AZ DB cluster has a writer instance and two reader instances, each in a different Availability Zone within one AWS Region. With an RDS Multi-AZ DB cluster, Amazon RDS replicates only the WAL files from the writer instance to both reader instances using the DB engine’s native replication capabilities. When a change is made on the writer instance, the WAL files are sent to both reader instances in parallel. Amazon RDS Multi-AZ DB cluster deployments use semi-synchronous replication, which requires acknowledgment from at least one reader instance in order for a change to be committed. The data is then flushed to the EBS volumes (io1, io2, or gp3) associated with the respective writer and reader instances.
The following diagram illustrates an Amazon RDS Multi-AZ DB Cluster architecture.
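Unlike the instance-based deployments, a Multi-AZ DB cluster is created through the cluster API. The following boto3 sketch is a rough outline; the cluster identifier and credentials are placeholders, and the instance class and storage settings simply mirror the values described later in the benchmark settings.

```python
import boto3

rds = boto3.client("rds", region_name="us-east-1")

# Hypothetical identifier and credentials for illustration only.
# A Multi-AZ DB cluster gets one writer and two readers, each in a
# different Availability Zone of the chosen Region.
rds.create_db_cluster(
    DBClusterIdentifier="pg-multi-az-cluster",
    Engine="postgres",
    DBClusterInstanceClass="db.r6gd.4xlarge",
    StorageType="io1",
    Iops=3000,
    AllocatedStorage=200,
    MasterUsername="postgres",
    MasterUserPassword="change-me",
)
```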
Replica lag and flow control
Databases are logically divided into units called pages. When users query for data, the pages containing that information are loaded from storage into the database cache. When a transaction modifies data, the pages are modified in the cache, and the changes are then appended to the write-ahead log (WAL). After the WAL has been written, the transaction is considered complete. The WAL sender then transmits the WAL entries to the replicas. On each replica, a WAL receiver process receives the WAL records, writes them to the local copy of the WAL, and then sends an acknowledgement back to the writer. A WAL replay process then continuously applies the changes to the database.
The following diagram illustrates a sample architecture:
While the database instance normally writes WAL in parallel, the replay process is single-threaded. Under extremely write-intensive workloads, writer backends on the primary instance generate WAL segments faster than the WAL replay process can apply them on the secondary. This means that under such extreme conditions, data on the secondary instances is not fully in sync with the primary.
Replica lag is the time difference between the latest transaction on the writer DB instance and the latest applied transaction on a reader DB instance. The Amazon CloudWatch metric ReplicaLag represents this time difference. For more information about CloudWatch metrics, see Monitoring Amazon RDS metrics with Amazon CloudWatch.
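Replica lag is straightforward to monitor programmatically. The following boto3 sketch pulls recent ReplicaLag datapoints for a hypothetical reader instance identifier; the Region and time window are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone

import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

# Retrieve the last 15 minutes of ReplicaLag (in seconds) for a
# hypothetical reader instance of a Multi-AZ DB cluster.
now = datetime.now(timezone.utc)
response = cloudwatch.get_metric_statistics(
    Namespace="AWS/RDS",
    MetricName="ReplicaLag",
    Dimensions=[{"Name": "DBInstanceIdentifier", "Value": "pg-multi-az-cluster-instance-2"}],
    StartTime=now - timedelta(minutes=15),
    EndTime=now,
    Period=60,
    Statistics=["Average", "Maximum"],
)

for point in sorted(response["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], point["Average"], point["Maximum"])
```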
Increased replica lag might be an issue for the following reasons:
Reads on the replicas are not consistent with the data on the writer
Failover time increases because a replica can’t be promoted until after all pending log records are applied
To protect against runaway replica lag, Amazon RDS Multi-AZ DB Clusters have a flow control feature that allows you to specify exactly how much replica lag you are willing to tolerate. The default is 120 seconds, but you can tune the flow_control.target_standby_apply_lag parameter to suit your workload. You can also adjust this before and after planned maintenance to ensure smoother, shorter failover times.
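Because the flow control threshold is a DB cluster parameter, it can be adjusted in the cluster parameter group, for example tightened before planned maintenance and relaxed afterwards. A minimal boto3 sketch follows; the parameter group name is hypothetical, and the value is assumed to be expressed in seconds, in line with the 120-second default described above.

```python
import boto3

rds = boto3.client("rds", region_name="us-east-1")

# Tighten the allowed apply lag ahead of planned maintenance
# (value assumed to be in seconds; hypothetical parameter group name).
rds.modify_db_cluster_parameter_group(
    DBClusterParameterGroupName="pg-multi-az-cluster-params",
    Parameters=[
        {
            "ParameterName": "flow_control.target_standby_apply_lag",
            "ParameterValue": "60",
            "ApplyMethod": "immediate",
        }
    ],
)
```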
Write commit path
The Amazon RDS Multi-AZ DB cluster deployment is available only for RDS instance families with local SSD storage, because it uses an optimized write commit path. The following diagram illustrates this architecture.
In an Amazon RDS Multi-AZ DB cluster, the transaction changes are first written to the WAL files of the writer instance, which are stored on the local NVMe disk, greatly reducing the commit latency. The WAL sender on the writer instance then transmits these WAL files to the WAL receiver on the reader instances, where they are again written to the local NVMe disk. The solution does the following:
Writes are committed and then acknowledged to clients only after the first (fastest) standby acknowledges that the WAL files have been written to its local SSD
Data is moved asynchronously to EBS volumes (io1, io2, or gp3) attached to the individual instances for long-term durability
Reads are served from the database cache, or the EBS volume
Benchmark settings
For our benchmark, we loaded the HammerDB software on an m6i.xlarge Amazon Elastic Compute Cloud (Amazon EC2) instance running Windows Server 2022 Datacenter. The RDS instances were deployed in the same Region, keeping the primary instance in the same Availability Zone as the EC2 instance hosting the HammerDB software. We used db.r6gd.4xlarge instances for all three deployment types. We used io1 storage with 3,000 provisioned IOPS for all deployments; you can increase the provisioned IOPS to get more NOPM, but NOPM will not scale linearly with IOPS. We created a HammerDB database with the default tpcc name and 20 warehouses.
We ran a timed driver script test with 2 minutes of ramp-up time and 10 minutes of metered test time. We disabled the “Keying and Thinking time”, which is standard in official TPC-C testing. By doing so, every virtual user is able to drive a CPU core to its maximum.
We ran the tests with 10 concurrent users.
We chose 10 virtual users because we wanted to drive the RDS primary instance consistently to between 50% and 70% CPU utilization, consistent with a real-world intensive OLTP scenario, while assessing the performance impact of the various deployment options available. To eliminate the influence of a cold or warm buffer cache, the instances were rebooted after each test run, and the tpcc database was restored from a snapshot every time so that the AUTOVACUUM process behaved the same way on all the different test runs; a sketch of how this reset can be scripted follows.
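The following boto3 sketch shows one way such a reset could be automated: reboot a hypothetical instance and block until it is available again before starting the next test run. Restoring from the snapshot would follow the same pattern with the relevant restore API.

```python
import boto3

rds = boto3.client("rds", region_name="us-east-1")

def reset_instance(identifier: str) -> None:
    """Reboot an RDS instance and block until it reports 'available' again."""
    rds.reboot_db_instance(DBInstanceIdentifier=identifier)
    waiter = rds.get_waiter("db_instance_available")
    waiter.wait(DBInstanceIdentifier=identifier)

# Hypothetical identifier; run before each metered HammerDB test.
reset_instance("pg-multi-az-instance")
```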
Our main KPI was New Orders per Minute (NOPM).
Benchmark results
Based on our description of the different architectures above, it should be no surprise that Amazon RDS Single-AZ DB instances and RDS Multi-AZ DB Clusters performed best in this benchmark.
The RDS Multi-AZ DB cluster deployment offers almost the same performance as the RDS Single-AZ deployment because its optimized write commit path can compensate for the additional latency introduced by the semi-synchronous replication across Availability Zones.
The RDS Multi-AZ DB instance performed less well than either the RDS Single-AZ DB instance or the RDS Multi-AZ DB cluster because of the synchronous replication and the need to write both WAL files and data files across Availability Zones and to network-based EBS volumes.
We can observe that most of the waits for the RDS Multi-AZ DB instance came from processes waiting for WAL buffers to be written to disk (LWLock:WALWrite).
Under the same load, we can observe that the Amazon RDS Multi-AZ DB cluster is able to better utilize the CPU and produce a higher NOPM, because it doesn’t have to spend as much time waiting for data to be replicated to one of the reader instances (IPC:SyncRep).
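Wait events such as LWLock:WALWrite and IPC:SyncRep can also be sampled directly from pg_stat_activity on the primary, outside of Performance Insights. The following psycopg2 sketch uses a placeholder endpoint and credentials:

```python
import psycopg2

# Placeholder endpoint and credentials for illustration only.
conn = psycopg2.connect(
    host="pg-multi-az-instance.xxxxxxxx.us-east-1.rds.amazonaws.com",
    dbname="tpcc",
    user="postgres",
    password="change-me",
)

# Count active backends by wait event; LWLock:WALWrite and IPC:SyncRep
# are the events discussed above.
with conn, conn.cursor() as cur:
    cur.execute(
        """
        SELECT wait_event_type, wait_event, count(*)
        FROM pg_stat_activity
        WHERE state = 'active' AND wait_event IS NOT NULL
        GROUP BY wait_event_type, wait_event
        ORDER BY count(*) DESC
        """
    )
    for wait_event_type, wait_event, count in cur.fetchall():
        print(f"{wait_event_type}:{wait_event} -> {count}")
```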
Conclusion
In this post, we showed how Amazon RDS Multi-AZ DB clusters can improve data durability, as compared to RDS Single-AZ, and improve commit latencies, as compared to RDS Multi-AZ DB instances. We also showed how Amazon RDS Multi-AZ DB clusters can be configured to ensure that replication lag doesn’t exceed desired tolerances.
Leave your questions and feedback in the comments section.
About the Author
Andrea Caldarone is a Senior Technical Account Manager at AWS. He is a database enthusiast with over 25 years of experience in relational database systems. Outside of work, Andrea plays football for an amateur team and loves to make pizza for his family and friends.