Close Menu
    DevStackTipsDevStackTips
    • Home
    • News & Updates
      1. Tech & Work
      2. View All

      CodeSOD: An Echo In Here in here

      September 19, 2025

      How To Minimize The Environmental Impact Of Your Website

      September 19, 2025

      Progress adds AI coding assistance to Telerik and Kendo UI libraries

      September 19, 2025

      Wasm 3.0 standard is now officially complete

      September 19, 2025

      Development Release: Ubuntu 25.10 Beta

      September 18, 2025

      Development Release: Linux Mint 7 Beta “LMDE”

      September 18, 2025

      Distribution Release: Tails 7.0

      September 18, 2025

      Distribution Release: Security Onion 2.4.180

      September 18, 2025
    • Development
      1. Algorithms & Data Structures
      2. Artificial Intelligence
      3. Back-End Development
      4. Databases
      5. Front-End Development
      6. Libraries & Frameworks
      7. Machine Learning
      8. Security
      9. Software Engineering
      10. Tools & IDEs
      11. Web Design
      12. Web Development
      13. Web Security
      14. Programming Languages
        • PHP
        • JavaScript
      Featured

      GenStudio for Performance Marketing: What’s New and What We’ve Learned

      September 19, 2025
      Recent

      GenStudio for Performance Marketing: What’s New and What We’ve Learned

      September 19, 2025

      Agentic and Generative Commerce Can Elevate CX in B2B

      September 19, 2025

      AI Momentum and Perficient’s Inclusion in Analyst Reports – Highlights From 2025 So Far

      September 18, 2025
    • Operating Systems
      1. Windows
      2. Linux
      3. macOS
      Featured

      Denmark’s Strategic Leap Replacing Microsoft Office 365 with LibreOffice for Digital Independence

      September 19, 2025
      Recent

      Denmark’s Strategic Leap Replacing Microsoft Office 365 with LibreOffice for Digital Independence

      September 19, 2025

      Development Release: Ubuntu 25.10 Beta

      September 18, 2025

      Development Release: Linux Mint 7 Beta “LMDE”

      September 18, 2025
    • Learning Resources
      • Books
      • Cheatsheets
      • Tutorials & Guides
    Home»Development»Databases»Understanding transaction visibility in PostgreSQL clusters with read replicas

    Understanding transaction visibility in PostgreSQL clusters with read replicas

    May 3, 2025

    On April 29, 2025, Jepsen published a report about transaction visibility behavior in Amazon Relational Database Service (Amazon RDS) for PostgreSQL  Multi-AZ clusters. We appreciate Jepsen’s thorough analysis and would like to provide additional context about this behavior, which exists both in Amazon RDS and community PostgreSQL. Coincidentally, our own internal testing recently found this same behavior, and we have already been working with the community to propose a fix for this long-standing issue (recognized and openly discussed by the PostgreSQL community since at least 2013).

    The reported issue relates to the order that transactions become visible (that is, their results reflected in reads) differing between the primary and replicas in cluster configurations. This issue doesn’t lead to data loss or corruption, isn’t present in Single-AZ PostgreSQL deployments, and doesn’t affect Amazon Aurora PostgreSQL Limitless Database or Amazon Aurora DSQL databases.

    In this post, we dive into the specifics of the issue to provide further clarity, discuss what classes of architectures it might affect, share workarounds, and highlight our ongoing commitment to improving community PostgreSQL in all areas, including correctness.

    Understanding the Long Fork behavior

    The report highlights what’s known as a Long Fork anomaly in database literature, which is considered a violation of Snapshot Isolation. Under this anomaly, it might be possible for two readers to observe the effects of transactions in a different order. For example, consider two concurrent transactions, T1 and T2, that modify distinct rows in a PostgreSQL database in a cluster configuration. The query Q1 issued against the primary might find that T1’s effects have already been recorded in the table whereas T2’s have not. Another query Q2 issued against a replica might observe the effects of T2 as already visible but not those of T1.

    The reason for this behavior is that on a PostgreSQL primary (in both standalone and replicated configurations), the order in which the effects of non-conflicting transactions become visible might deviate from the order in which they become durable. In other words, the visibility order of transactions doesn’t always match their logged commit order. When PostgreSQL acquires a snapshot, it records the list of transactions that were pending at that time, which are tracked in the ProcArray. The effects of those pending transactions are permanently excluded from the snapshot even after these transactions commit. At commit time, a transaction makes itself durable by recording its status in the Write-Ahead Log (WAL) and then asynchronously removes itself from the ProcArray. Therefore, if T1 and T2 commit concurrently, T1 might become durable before T2 (in WAL) but T2 might remove itself from the ProcArray before T1 does.

    This behavior has been known to the PostgreSQL community for many years, with detailed discussions occurring on the pgsql-hackers mailing list as far back as 2013. The Long Fork anomaly in community PostgreSQL affects all isolation levels (Read Committed, Repeatable Read, and Serializable). This behavior is not specific to Amazon RDS. It can be reproduced on self-managed PostgreSQL deployments.

    Example of potential impact

    To illustrate how Long Fork might manifest itself, consider a hypothetical dispute between Alice and Bob on whether the Jepsen post has ever reached the #1 spot on Hacker News. Assume that the number of page views for each post is recorded in separate rows of a relational table stored in a PostgreSQL database. The ranked list of posts is generated using a SQL query that sorts the posts by page view count. Alice and Bob access the website from different locations. Alice’s application server rendering the ranked list routes queries to the PostgreSQL primary, whereas Bob’s queries go to the replica.

    Now suppose that Alice and Bob keep refreshing their browsers, watching Jepsen’s post rise in popularity. Alice sees the post reach #1—she takes a screenshot of her webpage for the record. Bob observes the post to only reach #2. He disputes Alice’s finding and asks the maintainers of Hacker News to send him the commit log of transactions that incremented page view counters. From the log, Bob determines that the tracked post almost reached the top rank but right before it did, its page view count was beaten by another post due to someone’s concurrent click. Technically, Bob is right even though he was reading from the replica. Alice has the screenshot to prove her point, yet she witnessed a database state on the primary that she wasn’t supposed to see according to the commit history. If the replica didn’t exist and if Alice had no access to the commit logs, her claim would be valid and the observed behavior compliant with the Snapshot Isolation semantics.

    Aligning visibility order with commit order

    Although this behavior represents a deviation from formal Snapshot Isolation guarantees, it rarely impacts application correctness in practice. Most applications naturally serialize their operations through application-level constraints or by operating on related data that creates direct conflicts. PostgreSQL committers considered various solutions that were discussed on mailing lists and presented at PGConf.EU 2024. One such solution makes the visibility order match the commit order using Commit Sequence Numbers (CSNs). The proposed fix is rather involved and spans multiple patches.

    Even though the Long Fork anomaly is somewhat esoteric from the end-user perspective, fixing it is critical for bringing advanced enterprise-grade capabilities to PostgreSQL clusters. For example:

    • Support in distributed systems – Distributed PostgreSQL systems can’t use the visibility order because obtaining a consistent list of pending transactions across PostgreSQL nodes is practically infeasible. However, Aurora Limitless and Aurora DSQL implement consistent snapshots using time-based Multi-Version Concurrency Control (MVCC) instead, which avoids the Long Fork anomaly.
    • Query routing and read/write splitting – Offloading read-only queries and subqueries to synchronously updated or caught up read replicas might cause non-repeatable reads if visibility order deviates from commit order.
    • Data synchronization – Taking a copy of the database state using a snapshot query on the primary and rolling the state forward using the transaction log might cause inconsistencies.
    • Point-in-time restore – Restoring the database to a specific log sequence number (LSN) might produce a state that was never observable on the primary. This might complicate the analysis of application-caused data corruption because when a query returns incorrect results, it might be impossible to find a database state that the query ran on.
    • Storage layout optimization – Replacing the transaction identifiers in tuples by logical or clock-based commit time during query execution might make queries non-repeatable.
    • CPU utilization – Large production PostgreSQL servers support thousands of connections. In high-throughput, read-heavy workloads, a measurable fraction of CPU is spent on taking snapshots.

    AWS’s commitment to PostgreSQL

    At AWS, we’re deeply committed to PostgreSQL’s success. In 2022, we formed the PostgreSQL Contributors Team, dedicated to contributing to the core PostgreSQL engine. Our team actively participates in the PostgreSQL community’s development efforts. We employ leading database researchers advancing the state of the art in distributed databases. We maintain rigorous systems correctness practices, including formal methods for verification.

    We will continue to work with the PostgreSQL community to address the long-standing Snapshot Isolation anomaly in PostgreSQL.

    Conclusion

    In this post, we discussed the issue of transaction visibility in PostgreSQL clusters with read replicas, including what classes of architectures it might affect.

    While we work with the community on a long-term solution, consider taking the following actions:

    • Review your application’s assumptions about transaction ordering across nodes and endpoints. Applications should never rely on the implicit commit ordering of independent concurrent transactions.
    • Consider using explicit synchronization mechanisms if strict transaction ordering is required. Such mechanisms include shared counters (such as assigned ticket numbers or positions in the job queue), timestamps (such as time of the page view or stock trade execution time) or database constraints (for example, inventory must never go below zero).
    • Reach out to AWS Support with specific concerns about your deployments.

    We remain committed to transparency and will continue to work with the PostgreSQL community to advance the state of the art in database technology.

    Further reading

    To learn more about the work in the PostgreSQL community, how this problem is solved in Aurora DSQL and Aurora Limitless, and the research background, refer to the following resources:

    • Aasma, A.: High-concurrency distributed snapshots (PGConf.EU 2024)
    • Brooker, M.: AWS re:Invent 2024 – Deep dive into Amazon Aurora DSQL and its architecture
    • Sher, A. J., Wein, D.: AWS re:Invent 2024 – Achieving scale with Amazon Aurora PostgreSQL Limitless Database
    • Cerone, A., Bernardi, G. and Gotsman, A.,: A framework for transactional consistency models with atomic visibility. In 26th International Conference on Concurrency Theory (CONCUR 2015)
    • Brooker, M. and Desai, A.: Systems Correctness Practices at AWS: Leveraging Formal and Semi-formal Methods. Queue, 22(6), 2024

    About the author

    Sergey Melnik is a Senior Principal Technologist at AWS working on distributed systems, data management, and cloud computing.

    Source: Read More

    Facebook Twitter Reddit Email Copy Link
    Previous ArticleElden Ring Nightreign classes: All 8 Nightfarer characters in FromSoftware’s co-op spinoff explained
    Next Article CVE-2025-4199 – Abundatrade Plugin for WordPress CSRF Vulnerability

    Related Posts

    Development

    GenStudio for Performance Marketing: What’s New and What We’ve Learned

    September 19, 2025
    Development

    Agentic and Generative Commerce Can Elevate CX in B2B

    September 19, 2025
    Leave A Reply Cancel Reply

    For security, use of Google's reCAPTCHA service is required which is subject to the Google Privacy Policy and Terms of Use.

    Continue Reading

    CVE-2025-34091 – Google Chrome AppBound Encryption Padding Oracle

    Common Vulnerabilities and Exposures (CVEs)

    CVE-2025-4228 – Palo Alto Networks Cortex XDR Broker VM Privilege Escalation Vulnerability

    Common Vulnerabilities and Exposures (CVEs)

    DistroWatch Weekly, Issue 1121

    News & Updates

    KB5002700 crashes Office 2016 Word, Excel, Outlook on Windows

    Operating Systems

    Highlights

    CVE-2025-1478 – GitLab CE/EE Denial of Service Vulnerability in Board Names

    June 12, 2025

    CVE ID : CVE-2025-1478

    Published : June 12, 2025, 10:16 a.m. | 3 hours, 43 minutes ago

    Description : An issue has been discovered in GitLab CE/EE affecting all versions from 8.13 before 17.10.7, 17.11 before 17.11.3, and 18.0 before 18.0.1. A lack of input validation in Board Names could be used to trigger a denial of service.

    Severity: 6.5 | MEDIUM

    Visit the link for more details, such as CVSS details, affected products, timeline, and more…

    Proof That Aliens Exist Beneath the Ocean May Come Out Shocking!

    April 21, 2025

    CVE-2025-4639 – Peergos XML XXE Vulnerability

    May 14, 2025

    Salesloft Takes Drift Offline After OAuth Token Theft Hits Hundreds of Organizations

    September 3, 2025
    © DevStackTips 2025. All rights reserved.
    • Contact
    • Privacy Policy

    Type above and press Enter to search. Press Esc to cancel.