    Scale write performance on Amazon DocumentDB elastic clusters

    April 16, 2024

    Amazon DocumentDB (with MongoDB compatibility) is a scalable, highly durable, and fully managed document database service that makes it straightforward to store, query, and index native JSON workloads in the cloud. Amazon DocumentDB decouples compute and storage, so each component scales independently. Amazon DocumentDB supports two types of clusters: instance-based clusters and elastic clusters.

    In this post, we demonstrate the ability of Amazon DocumentDB elastic clusters to scale a write workload by increasing the number of shards.

    Amazon DocumentDB elastic clusters

    Amazon DocumentDB elastic clusters support workloads with millions of reads/writes per second and petabytes of storage capacity. Elastic clusters simplify how you interact with Amazon DocumentDB by automatically managing the underlying infrastructure and eliminating the need to choose, manage, or upgrade instances. They let you scale beyond the limits of instance-based clusters for write throughput and storage by distributing the database workload over multiple shards. Each shard has its own compute and storage volume distributed across three Availability Zones. Elastic clusters shard collections using a hash-based algorithm, partitioning the data into smaller data sets across the shards. You can deploy a maximum of 32 shards, and the vCPU count of each shard instance is configurable from 2 to 64 in powers of 2. You can change both the number of shards (scale out or in) and the number of vCPUs (scale up or down).
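
    To make the idea of hash-based partitioning concrete, here is a minimal sketch in Python. It is an illustration only: the post does not say which hash function the service uses, so SHA-256 modulo the shard count is an assumption, and the function names are hypothetical.

```python
import hashlib
from collections import Counter


def shard_for_key(shard_key_value, num_shards):
    """Map a shard key value to a shard index with a stable hash.

    Assumption: SHA-256 modulo the shard count stands in for whatever
    hash function the service uses internally, which is not documented here.
    """
    digest = hashlib.sha256(str(shard_key_value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def distribution(num_keys, num_shards):
    """Count how many of the keys 0..num_keys-1 land on each shard."""
    return Counter(shard_for_key(k, num_shards) for k in range(num_keys))


if __name__ == "__main__":
    for shards in (1, 2, 4, 8, 16):
        counts = distribution(10_000, shards)
        print(shards, "shards ->", sorted(counts.values()))
```

    With a well-mixed hash, each shard receives a roughly equal share of the key space, which is what lets write traffic spread across shards as more are added.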

    [Figure: Amazon DocumentDB elastic cluster architecture diagram]

    Benchmark configuration

    We used a simple insert benchmark application called bench02.py, with source code available on GitHub, to measure the performance on both Amazon DocumentDB instance-based and elastic clusters. The application creates and loads a collection with a 1 KB text field and three secondary indexes. The application is customizable, supporting collections with larger text fields and different numbers of secondary indexes.

    Our collection schema is as follows:

    customerId – int(10000)
    productId – int(1000000)
    quantity – int(100000)
    orderDate – date()
    textField – string
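
    As a rough sketch of what a document with this schema might look like, the following Python builds one at random. It assumes that int(N) means a random integer in the range [0, N) and that the text field is 1 KB of ASCII letters; bench02.py may generate its values differently, and the helper names here are hypothetical.

```python
import random
import string
from datetime import datetime, timezone


def make_document(rng, text_bytes=1024):
    """Build one document shaped after the collection schema.

    Assumption: int(N) is read as a random integer in [0, N).
    """
    return {
        "customerId": rng.randrange(10_000),
        "productId": rng.randrange(1_000_000),
        "quantity": rng.randrange(100_000),
        "orderDate": datetime.now(timezone.utc),
        "textField": "".join(rng.choices(string.ascii_letters, k=text_bytes)),
    }


def make_batch(rng, batch_size=1000):
    """Build a list of documents sized like one benchmark batch."""
    return [make_document(rng) for _ in range(batch_size)]


if __name__ == "__main__":
    doc = make_document(random.Random(0))
    # Print the text field's length instead of its 1 KB of content.
    print({k: (len(v) if k == "textField" else v) for k, v in doc.items()})
```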

    We use customerId as the shard key for elastic clusters. A shard key is a required field in your JSON documents in sharded collections that elastic clusters use to distribute read and write traffic to the matching shard.

    We use the following secondary indexes:

    idx_customerId_orderDate – {"customerId", "orderDate"}
    idx_customerId_productId – {"customerId", "productId"}
    idx_productId_orderDate – {"productId", "orderDate"}
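
    The three compound indexes could be declared along these lines with a pymongo-style collection. This is a sketch under assumptions: every field is indexed in ascending order (the post lists only field names), and create_secondary_indexes is a hypothetical helper, not part of bench02.py. The collection argument is expected to expose create_index(keys, name=...), as pymongo collections do.

```python
ASCENDING = 1  # same value as pymongo.ASCENDING

# Index name -> ordered list of (field, direction) pairs.
# Assumption: ascending order on every field.
SECONDARY_INDEXES = {
    "idx_customerId_orderDate": [("customerId", ASCENDING), ("orderDate", ASCENDING)],
    "idx_customerId_productId": [("customerId", ASCENDING), ("productId", ASCENDING)],
    "idx_productId_orderDate": [("productId", ASCENDING), ("orderDate", ASCENDING)],
}


def create_secondary_indexes(collection):
    """Create each index on the given collection and return the index names."""
    created = []
    for name, keys in SECONDARY_INDEXES.items():
        collection.create_index(keys, name=name)
        created.append(name)
    return created
```

    Two of the three indexes lead with customerId, the shard key, while idx_productId_orderDate does not.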

    To test performance across the various clusters, we used the following test configurations:

    Infrastructure – One Amazon DocumentDB instance-based cluster and five elastic clusters (with 1, 2, 4, 8, and 16 shards, respectively)
    Benchmark duration – 300 seconds
    Batch size – 1,000
    Concurrency – 128 for the instance-based cluster and for the 1-shard and 2-shard elastic clusters, 256 for the 4-shard elastic cluster, 512 for the 8-shard elastic cluster, and 1,024 for the 16-shard elastic cluster
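
    To show how duration and batch size interact, here is a minimal sketch of a single worker's insert loop. It is not the bench02.py implementation: the real application runs many client processes in parallel, while this sketch models one worker and takes the insert operation as a callable (for example, a bound insert_many method of a pymongo collection, which is an assumption here). All function names are hypothetical.

```python
import time


def run_worker(insert_batch, make_batch, run_seconds, clock=time.monotonic):
    """Insert batches until run_seconds elapse; return documents inserted.

    insert_batch: callable taking a list of documents.
    make_batch:   callable returning the next list of documents.
    clock:        time source, injectable so the loop can be tested.
    """
    inserted = 0
    deadline = clock() + run_seconds
    while clock() < deadline:
        batch = make_batch()
        insert_batch(batch)
        inserted += len(batch)
    return inserted


def inserts_per_second(total_inserted, run_seconds):
    """Aggregate throughput for one run, summed over all workers."""
    return total_inserted / run_seconds


if __name__ == "__main__":
    sink = []  # stands in for a database collection
    count = run_worker(sink.extend, lambda: [{"n": i} for i in range(1000)], 0.05)
    print(count, "documents,", inserts_per_second(count, 0.05), "inserts per sec")
```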

    In the following sections, we detail the steps to set up the client environment and run the benchmark on instance-based and elastic clusters.

    Set up the client environment

    For our tests, we used an m6i.24xlarge Amazon Elastic Compute Cloud (Amazon EC2) instance with 96 vCPUs and 37.5 Gbps network bandwidth as the client to ensure the benchmark client doesn’t become a bottleneck. The client machine used to run the application must meet the following prerequisites:

    Python 3.7+, pymongo
    Mongo shell

    To configure and use the application, perform the following actions:

    Clone the GitHub repo on the client:

    $ git clone https://github.com/aws-samples/amazon-documentdb-samples
    $ cd amazon-documentdb-samples/samples/python-bench02

    Download the Amazon DocumentDB Certificate Authority (CA) certificate required to authenticate to your cluster:

    $ wget https://truststore.pki.rds.amazonaws.com/global/global-bundle.pem

    Run the benchmark on the Amazon DocumentDB instance-based cluster

    We configured an instance-based Amazon DocumentDB cluster with a primary and a replica instance. We chose db.r6g.8xlarge instances with 32 vCPUs for the instance type and Amazon DocumentDB engine version 5.0. The application was run with a concurrency of 128 client processes and a batch size of 1,000 for 5 minutes. See the following code:

    python bench02.py \
      --uri 'mongodb://[DOCDBUSER]:[DOCDBPASS]@[DOCDBENDPOINT]:27017/?tls=true&tlsCAFile=global-bundle.pem&replicaSet=rs0&readPreference=secondaryPreferred&retryWrites=false' \
      --processes 128 \
      --database iibench \
      --collection test \
      --run-seconds 300 \
      --batch-size 1000 \
      --drop-collection \
      --file-name instance

    To configure an instance-based Amazon DocumentDB cluster, refer to Creating an Amazon DocumentDB cluster.
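
    If you assemble the connection string in code rather than typing it by hand, credentials that contain characters such as @ or / need to be percent-encoded. The following sketch builds the same URI shape as the command above; build_uri is a hypothetical helper, and the user, password, and endpoint values are placeholders.

```python
from urllib.parse import quote_plus, urlencode


def build_uri(user, password, endpoint, port=27017, **options):
    """Assemble a MongoDB connection string with percent-encoded credentials."""
    query = urlencode(options)
    return (
        f"mongodb://{quote_plus(user)}:{quote_plus(password)}"
        f"@{endpoint}:{port}/?{query}"
    )


# Query options taken from the instance-based cluster command.
INSTANCE_OPTIONS = {
    "tls": "true",
    "tlsCAFile": "global-bundle.pem",
    "replicaSet": "rs0",
    "readPreference": "secondaryPreferred",
    "retryWrites": "false",
}

if __name__ == "__main__":
    # Placeholder values; substitute your own cluster endpoint and credentials.
    print(build_uri("user", "p@ss/word", "example-endpoint", **INSTANCE_OPTIONS))
```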

    Run the benchmark on the Amazon DocumentDB elastic clusters

    We configured five elastic clusters with 1, 2, 4, 8, and 16 shards, respectively, each with a shard capacity of 32 vCPUs. The application was run with a batch size of 1,000 for 5 minutes while varying the concurrency on each elastic cluster. For the elastic clusters, we specify the --shard parameter to ensure a sharded collection is created using the customerId field as the shard key. See the following code:

    python bench02.py \
      --uri 'mongodb://[DOCDBUSER]:[DOCDBPASS]@[DOCDBENDPOINT]:27017/?tls=true' \
      --processes 128 \
      --database iibench \
      --collection test \
      --run-seconds 300 \
      --batch-size 1000 \
      --drop-collection \
      --shard \
      --file-name elastic

    To configure an Amazon DocumentDB elastic cluster, refer to Create an elastic cluster.
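
    For readers who want to create a sharded collection themselves, the following sketch builds a shardCollection command document with a hashed shard key. It assumes the cluster accepts the standard MongoDB shardCollection admin command; how bench02.py implements its --shard option is not shown in this post, and shard_collection_command is a hypothetical helper.

```python
def shard_collection_command(database, collection, shard_key):
    """Build a shardCollection admin command with a hashed shard key.

    The command name must be the first key in the document, which the
    insertion order of a Python dict preserves.
    """
    return {
        "shardCollection": f"{database}.{collection}",
        "key": {shard_key: "hashed"},
    }


if __name__ == "__main__":
    command = shard_collection_command("iibench", "test", "customerId")
    print(command)
    # With a connected pymongo client (not created here), the command
    # could be sent as: client.admin.command(command)
```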

    Results

    The benchmark results are the following:

    Cluster       Client processes    Throughput (inserts per sec)
    instance      128                 98,319
    elastic-1     128                 80,216
    elastic-2     128                 151,159
    elastic-4     256                 284,208
    elastic-8     512                 602,255
    elastic-16    1,024               1,215,902

    [Figure: throughput graph. The x-axis shows the Amazon DocumentDB instance-based cluster and the various elastic cluster configurations with their respective concurrency (client processes). The y-axis shows the transaction throughput in inserts per second.]

    The single-shard elastic cluster delivers lower throughput than the instance-based cluster (80,216 versus 98,319 inserts per second). This is expected and can be attributed to the overhead of the request router in the elastic cluster architecture. However, write throughput scales as the shard count increases, reaching over 1.2 million inserts per second on the 16-shard elastic cluster.
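
    One way to read these results is to compute the speedup of each elastic cluster relative to the single-shard cluster, and how close that speedup comes to linear. The sketch below uses only the throughput figures reported above; the function names are hypothetical. Keep in mind that client concurrency also increased with shard count, so this is not a comparison at fixed load.

```python
# Throughput figures copied from the benchmark results (inserts per second).
ELASTIC_THROUGHPUT = {1: 80_216, 2: 151_159, 4: 284_208, 8: 602_255, 16: 1_215_902}
INSTANCE_THROUGHPUT = 98_319


def speedup(shards, baseline_shards=1):
    """Throughput relative to the baseline shard count."""
    return ELASTIC_THROUGHPUT[shards] / ELASTIC_THROUGHPUT[baseline_shards]


def scaling_efficiency(shards):
    """Speedup divided by shard count; 1.0 would be perfectly linear."""
    return speedup(shards) / shards


if __name__ == "__main__":
    for shards in ELASTIC_THROUGHPUT:
        print(f"{shards:>2} shards: speedup {speedup(shards):5.2f}x, "
              f"efficiency {scaling_efficiency(shards):.0%}")
    gap = 1 - ELASTIC_THROUGHPUT[1] / INSTANCE_THROUGHPUT
    print(f"1-shard elastic vs instance-based: {gap:.0%} lower")
```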

    Conclusion

    Because Amazon DocumentDB elastic clusters distribute data and queries across multiple shards, they can support workloads with high throughput requirements and petabyte-scale storage. As workloads grow, you can add shards to meet the higher demands of the application, and remove shards if the workload decreases. In this post, we showed that write throughput on Amazon DocumentDB elastic clusters scales with the number of shards, exceeding one million inserts per second on a 16-shard cluster.

    AWS welcomes your feedback. Leave your thoughts or questions in the comments section.

    About the Authors

    Sreejit Unny is a Senior Database Specialist Solutions Architect at AWS. He specializes in helping customers migrate and modernize databases on AWS. He collaborates closely with customers to design resilient, secure, and high-performing solutions that meet their current and future business requirements.

    Tim Callaghan is a Principal DocumentDB Specialist Solutions Architect at AWS. He enjoys working with customers looking to modernize existing data-driven applications and build new ones. Prior to joining AWS, he was both a producer and consumer of relational and NoSQL databases for over 30 years.
