Amazon Quantum Ledger Database (Amazon QLDB) is a fully managed ledger database service that provides a complete and verifiable history of all transactions committed to the ledger. At the core of an Amazon QLDB ledger is the journal, an append-only data structure that contains a sequential, immutable record of transactions stored as blocks. The blocks in the journal are chained together using cryptographic hashing, which makes the integrity of the transaction history cryptographically verifiable. These two characteristics, immutability and cryptographic verifiability of the transaction history, are unique to Amazon QLDB and the reason customers use it to provide a high-integrity audit and change history for their data.
In this post, we discuss how to use Amazon Aurora PostgreSQL-Compatible Edition as an alternative to Amazon QLDB for auditing and what features of Amazon Aurora PostgreSQL can replace some of the unique capabilities offered by Amazon QLDB.
Replacing the journal
In Amazon QLDB, the underlying journal stores an immutable record of all committed transactions, including query statements and data definition commands. The journal’s transaction history can be exported into an Amazon Simple Storage Service (Amazon S3) bucket and accessed for auditing purposes. Amazon Aurora PostgreSQL does not keep a permanent, immutable record of changes. Instead, that history must be generated as audit data and stored outside of the database.
Amazon Aurora PostgreSQL supports the use of the pgAudit open source audit logging extension, which provides finer-grained session and object audit logging than PostgreSQL standard logging. You can choose what events are audited, including DDL operations, reads, writes, changes to roles and privileges, runs of functions, and user-initiated operations like vacuum and checkpointing. Log output is sent to the standard postgres.log file that is accessible from the Amazon RDS console and can be retained for up to 7 days. To retain audit log data permanently, you can configure your Aurora cluster to send logs to Amazon CloudWatch Logs, where they can be retained indefinitely. Amazon CloudWatch Logs provides querying, monitoring, alerting, and management capabilities for logs. CloudWatch Logs encrypts log files at rest. You can manage access to logs in CloudWatch using AWS Identity and Access Management (IAM), including the ability to prohibit deletion of logs.
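pgAudit behavior is controlled through parameters in the DB cluster parameter group. The parameter names below are defined by the pgAudit extension; the values shown are illustrative, not a recommended configuration:

```ini
; Illustrative Aurora PostgreSQL cluster parameter group settings for pgAudit.
; Parameter names come from the pgAudit extension; the values are examples only.
shared_preload_libraries = pgaudit     ; load the extension at instance start
pgaudit.log = 'ddl, role, write'       ; log DDL, role/privilege changes, and writes
pgaudit.log_parameter = 1              ; include statement parameter values in log entries
```

Changing these parameters requires modifying the cluster parameter group; some, like shared_preload_libraries, take effect only after a restart.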
CloudWatch Logs may be a difficult interface for auditors to use when performing a database audit. The post Build a centralized audit data collection for Amazon RDS for PostgreSQL using Amazon S3 and Amazon Athena provides a solution that gives auditors a straightforward interface for querying audit data. The solution uses Amazon Data Firehose to deliver database logs from CloudWatch Logs into Amazon S3, where they can be queried with SQL using Amazon Athena. The following diagram illustrates this architecture.
Replacing history()
The Amazon QLDB history() function provides access to all revisions of every record in a table. You can use the history() function to see how data records have changed over time. You can duplicate this functionality in Amazon Aurora PostgreSQL using audit tables that store a copy of changes made to rows in another table.
The conventional approach for audit tables uses a table whose structure mirrors that of the table it audits. A trigger fires on every INSERT, UPDATE, and DELETE on the main table and writes a copy of the modified row into the audit table. A variation on this approach stores the audited row as JSON, which removes the need to modify the structure of the audit table when the structure of the source table changes. Table permissions are set to prohibit modification of rows in the audit table. The following diagram illustrates this design.
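The trigger pattern above can be sketched as follows. An Aurora PostgreSQL implementation would use a PL/pgSQL trigger function; SQLite (via Python's standard library) is used here only so the example is self-contained and runnable, and the table and column names are hypothetical:

```python
import sqlite3

# Minimal sketch of the audit-table pattern. In Aurora PostgreSQL this would be
# a PL/pgSQL trigger function; SQLite is used here only for a runnable example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (id INTEGER PRIMARY KEY, owner TEXT, balance REAL);

-- The audit table mirrors the structure of the table it audits, plus
-- columns recording the operation and when it happened.
CREATE TABLE account_audit (
    id         INTEGER,
    owner      TEXT,
    balance    REAL,
    operation  TEXT,
    audited_at TEXT DEFAULT CURRENT_TIMESTAMP
);

-- Triggers copy a row image into the audit table on every change:
-- the new image for INSERT, the prior image for UPDATE and DELETE.
CREATE TRIGGER account_ins AFTER INSERT ON account BEGIN
    INSERT INTO account_audit (id, owner, balance, operation)
    VALUES (NEW.id, NEW.owner, NEW.balance, 'INSERT');
END;
CREATE TRIGGER account_upd AFTER UPDATE ON account BEGIN
    INSERT INTO account_audit (id, owner, balance, operation)
    VALUES (OLD.id, OLD.owner, OLD.balance, 'UPDATE');
END;
CREATE TRIGGER account_del AFTER DELETE ON account BEGIN
    INSERT INTO account_audit (id, owner, balance, operation)
    VALUES (OLD.id, OLD.owner, OLD.balance, 'DELETE');
END;
""")

conn.execute("INSERT INTO account (id, owner, balance) VALUES (1, 'alice', 100.0)")
conn.execute("UPDATE account SET balance = 250.0 WHERE id = 1")
conn.execute("DELETE FROM account WHERE id = 1")

# The audit table now holds the revision history of the row.
history = list(conn.execute(
    "SELECT id, owner, balance, operation FROM account_audit ORDER BY rowid"))
print(history)
```

Note that in the source database, the application role would receive no privileges at all on the audit table; only the trigger's owner writes to it.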
This approach differs from Amazon QLDB history in that, with correct permissions, users can modify data in the audit tables. Auditing data access permissions is crucial to maintaining the integrity of the data in the audit tables. Separating audit tables into a secure, dedicated audit database eliminates this problem and increases trust in the audit data. The post Create an audit trail for an Amazon Aurora PostgreSQL table provides a solution that uses AWS Database Migration Service (AWS DMS) to capture changes in the source database and store them securely in the audit database, as shown in the following diagram.
Replacing Amazon QLDB streams
Amazon QLDB streams provide a near real-time feed of ledger events into Amazon Kinesis Data Streams. Stream records contain data updates to the ledger and a record of query statements run in the database. You can use streams to replicate data from the ledger into other systems. You can analyze stream events in near real time to identify suspicious activity or changes to critical tables or data.
Data replication
When using Aurora as a primary database, you can replicate data from Aurora into other systems using AWS DMS. AWS DMS is not just for one-time data migrations. It supports ongoing replication into one or more target databases using change data capture (CDC) from the source Aurora PostgreSQL database. Unlike replicating data using Amazon QLDB streams, replicating data from Amazon Aurora PostgreSQL through AWS DMS doesn’t require custom coding. AWS DMS uses a declarative JSON syntax to map and transform data between the source and target databases.
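For example, a minimal AWS DMS table mapping that selects every table in a hypothetical public schema for ongoing replication looks like the following (the rule ID and name are arbitrary labels):

```json
{
  "rules": [
    {
      "rule-type": "selection",
      "rule-id": "1",
      "rule-name": "include-public-schema",
      "object-locator": {
        "schema-name": "public",
        "table-name": "%"
      },
      "rule-action": "include"
    }
  ]
}
```

Additional transformation rules in the same document can rename, filter, or reshape objects on their way to the target, all without writing custom code.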
Near real-time audits
To get near real-time auditing capabilities using Amazon Aurora PostgreSQL, you can use Database Activity Streams. Database Activity Streams provides a near real-time feed of low-level auditing information from the Aurora cluster into a Kinesis data stream. As with Amazon QLDB streams, you can build custom Kinesis data stream consumers that filter, analyze, and process audit events. Database Activity Streams captures a much broader array of auditing data points than Amazon QLDB provides. Aurora reports DML and DDL events like Amazon QLDB, but also reports connections, changes to roles and privileges, runs of functions, and user-initiated operations like vacuum and checkpointing. One major difference between Database Activity Streams and Amazon QLDB streams is that Amazon QLDB only reports actions on committed transactions, so it is possible for users to perform undetected reads by canceling their transaction after reading. Aurora reports all actions to its activity stream, even for uncommitted transactions.
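A custom consumer can filter these events for suspicious activity. The following is a minimal sketch of such a filter; real activity stream records arrive in the Kinesis data stream encrypted with a KMS data key and must be decrypted and parsed first, and the simplified event shape and table names used here are assumptions for illustration only:

```python
# Sketch of a consumer-side filter for database activity stream events.
# Assumption: records have already been decrypted and parsed into dicts with a
# simplified shape; the activity classes and table names here are illustrative.
SENSITIVE_TABLES = {"payments", "accounts"}

def is_suspicious(event: dict) -> bool:
    """Flag DDL statements, role/privilege changes, and writes to sensitive tables."""
    if event.get("class") in {"DDL", "ROLE"}:
        return True
    return event.get("class") == "WRITE" and event.get("objectName") in SENSITIVE_TABLES

sample_events = [
    {"class": "READ",  "command": "SELECT", "objectName": "orders"},
    {"class": "WRITE", "command": "UPDATE", "objectName": "payments"},
    {"class": "ROLE",  "command": "GRANT",  "objectName": None},
]
flagged = [e for e in sample_events if is_suspicious(e)]
print([e["command"] for e in flagged])  # → ['UPDATE', 'GRANT']
```

In production, a function like this would run inside a Kinesis consumer (for example, an AWS Lambda function on the stream) and route flagged events to an alerting channel.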
Database Activity Streams provide a trustworthy log of database administrator activities. Activation of an activity stream for an Aurora cluster is performed outside of the database and access to that operation is controlled by IAM permissions, not by database permissions. Database administrators do not have sufficient access to interfere with the collection of audit data or its publication into the Kinesis data stream. Database administrators also don’t have access to the audit data inside the Kinesis data stream and cannot interfere with the processing of that data.
In Amazon QLDB, the underlying journal stores an immutable record of all committed transactions. The journal's transaction history can be accessed at any time by querying a table's history() function or by re-streaming the journal through Kinesis Data Streams. This capability doesn't exist for Database Activity Streams. After the streamed audit data expires, it is gone. To retain audit data, you can use Amazon Data Firehose to store streamed audit events in Amazon S3, where they can be retained indefinitely. The post Filter Amazon Aurora database activity stream data for segregation and monitoring provides a solution for filtering audit events and sending them to Amazon S3 using Firehose for long-term retention.
For even more information about auditing with pgAudit and Aurora Database Activity Streams, see Part 1: Audit Aurora PostgreSQL databases using Database Activity Streams and pgAudit.
Summary
In this post, we demonstrated how features of Amazon Aurora PostgreSQL can provide the auditability and change history that Amazon QLDB uniquely offers, making it a viable replacement for audit use cases.
In our next post, Migrate an Amazon QLDB Ledger to Aurora PostgreSQL, we’ll provide a solution for helping you migrate your data from an Amazon QLDB ledger into Amazon Aurora PostgreSQL.
For help planning your ledger migration or deciding if Amazon Aurora PostgreSQL is the right destination for your ledger data, contact your AWS representative.
About the Author
Dan Blaner is a Principal Solutions Architect specializing in ledger and relational databases. He enjoys learning, figuring stuff out, and helping other people figure stuff out. In his free time, he plays the bass and enjoys making bad music with good friends.