Close Menu
    DevStackTipsDevStackTips
    • Home
    • News & Updates
      1. Tech & Work
      2. View All

      Sunshine And March Vibes (2025 Wallpapers Edition)

      May 16, 2025

      The Case For Minimal WordPress Setups: A Contrarian View On Theme Frameworks

      May 16, 2025

      How To Fix Largest Contentful Paint Issues With Subpart Analysis

      May 16, 2025

      How To Prevent WordPress SQL Injection Attacks

      May 16, 2025

      Microsoft has closed its “Experience Center” store in Sydney, Australia — as it ramps up a continued digital growth campaign

      May 16, 2025

      Bing Search APIs to be “decommissioned completely” as Microsoft urges developers to use its Azure agentic AI alternative

      May 16, 2025

      Microsoft might kill the Surface Laptop Studio as production is quietly halted

      May 16, 2025

      Minecraft licensing robbed us of this controversial NFL schedule release video

      May 16, 2025
    • Development
      1. Algorithms & Data Structures
      2. Artificial Intelligence
      3. Back-End Development
      4. Databases
      5. Front-End Development
      6. Libraries & Frameworks
      7. Machine Learning
      8. Security
      9. Software Engineering
      10. Tools & IDEs
      11. Web Design
      12. Web Development
      13. Web Security
      14. Programming Languages
        • PHP
        • JavaScript
      Featured

      The power of generators

      May 16, 2025
      Recent

      The power of generators

      May 16, 2025

      Simplify Factory Associations with Laravel’s UseFactory Attribute

      May 16, 2025

      This Week in Laravel: React Native, PhpStorm Junie, and more

      May 16, 2025
    • Operating Systems
      1. Windows
      2. Linux
      3. macOS
      Featured

      Microsoft has closed its “Experience Center” store in Sydney, Australia — as it ramps up a continued digital growth campaign

      May 16, 2025
      Recent

      Microsoft has closed its “Experience Center” store in Sydney, Australia — as it ramps up a continued digital growth campaign

      May 16, 2025

      Bing Search APIs to be “decommissioned completely” as Microsoft urges developers to use its Azure agentic AI alternative

      May 16, 2025

      Microsoft might kill the Surface Laptop Studio as production is quietly halted

      May 16, 2025
    • Learning Resources
      • Books
      • Cheatsheets
      • Tutorials & Guides
    Home»Development»Databases»Enhanced Full Load Performance in AWS DMS Serverless

    Enhanced Full Load Performance in AWS DMS Serverless

    June 3, 2024

    With AWS Database Migration Service (AWS DMS), you can migrate your data from relational databases and data warehouses to AWS or a combination of a cloud and on-premises configurations. In June 2023, AWS DMS Serverless was released, which automatically provisions, scales, and manages migration resources to make database migrations straightforward and more cost-effective. It removes the necessity of handling infrastructure tasks like capacity estimation, provisioning, cost-optimization, and managing versions and patching.

    Until now, concurrent scaling of full-load migration tasks at a table level required you to manually configure a table mapping rule of type table-settings with the parallel-load option. A significant amount of effort is required to manually analyze a table and appropriately configure table mappings for an efficient parallel load on a table. Today, we’re excited to announce the new Enhanced Full Load Performance feature in AWS DMS Serverless. This feature automatically analyzes the table’s metadata, internally implements auto segmentation, and migrates these segments in parallel. As of this writing, this feature supports migration from Oracle as a source to Amazon Redshift as a target.

    In this post, we provide an overview of this new feature and present benchmarking results for two use cases.

    Enhanced Full Load Performance overview

    With Enhanced Full Load Performance, the AWS DMS Serverless replication orchestrator component receives a replication creation request, and then queries the source endpoint to fetch the metadata for all the tables available in the table mapping. After it gets the metadata, the replication orchestrator creates equally weighted segments per table. The replication orchestrator uses the Oracle ROWID pseudocolumn internally to create equal weighted segments, then it passes the segment information to the core engine, which does the actual migration.

    The core engine spins up subtasks, which are configurable using replication settings during replication creation and modification requests. A maximum of 49 subtasks are allowed. Each subtask is assigned a segment internally, and each migrates the data from the source to target endpoint in parallel. If the segments count is greater than the subtasks count, then as soon as a subtask completes the migration of its assigned segment, it will take the next available segment to migrate.

    Performance has also increased by avoiding writing to disk. AWS DMS Serverless fetches data from the Oracle Database and directly moves records to and from Amazon Simple Storage Service (Amazon S3). AWS DMS runs an Amazon Redshift COPY command at the directory level to copy all the data from Amazon S3 to Amazon Redshift.

    This architecture helps AWS DMS Serverless equally distribute the load between subtasks and migrate data in parallel more efficiently.

    The following are the key areas of Enhanced Full Load Performance improvements:

    Removal of manual configuration – The feature can automatically assess and distribute the load evenly among subtasks. Without this feature, when using parallel-load with partitions-auto or boundaries, you have to make sure the partitions are of equal or similar size. This is difficult to achieve for highly skewed distributions of data.
    Improved throughput – The feature offers a faster access path to the source table and an improved writing mechanism to minimize the overall time to migrate.

    The following diagram shows how AWS DMS Serverless can analyze and orchestrate parallelism with Enhanced Full Load Performance.

    The following are best practices to follow to achieve enhanced throughput between Oracle and Amazon Redshift:

    Mapping rules – This feature is automatically enabled for tables without the custom parallel-load option in table mappings. For tables with customized parallel-load options, AWS DMS Serverless will distribute the table load based on the given table mapping configurations.
    Target endpoint settings – We recommend the following target endpoint configurations:

    For dense data (not many null or duplicate values in the data), we recommend setting compression as disabled under the target endpoint settings.
    For sparse data (many null or duplicate values in the data), we recommend setting compression as enabled under the target endpoint settings.

    Replication settings – The default value for MaxFullLoadSubTasks is 8. This instructs AWS DMS on how many concurrent sessions to extract from the source table. Increasing MaxFullLoadSubTasks enables more parallelism. If your source database permits, set it to 49, which is the maximum value that is currently supported. To be resilient to CPU utilization bursts, it is recommended to choose a higher maximum DMS Capacity Units (DCU) when you increase MaxFullLoadSubTasks (for example, 49). AWS DMS Serverless will auto scale and auto provision based on the configuration. However, it will only scale up to the maximum DCU provided.

    Benchmark test

    The purpose of this benchmark is to show the effectiveness and simplicity of the new Enhanced Full Load Performance feature, regardless of whether the table has large object (LOB) columns or not. To compare the performance of the new feature with manual configuration, we performed a number of benchmark tests. We compared the time taken for each test case to migrate the entire table using the standard method in AWS DMS vs. the new approach in AWS DMS Serverless, where the table is automatically divided into equally weighted segments. We look at two examples in this post to show you the variety of situations in which the Enhanced Full Load Performance feature might be useful.

    The examples include benchmark tests for two tables: CASE1 and CASE2. The CASE1 table contained approximately 1.2 TB of scalar data types. In this table, the data is represented in 10 almost equal partitions with 1,000 columns. The CASE2 table contained about 5 TB of data, including LOBs. This table includes 150 columns with 180 partitions and 2160 sub-partitions.

    The following table summarizes the tables statistics.

    TABLE
    CASE1
    CASE2

    Table size (GB)
    1,146.74
    5,694.02

    LOB size (GB)
    NA
    2,375.46

    Number of rows
    10,643,228
    268,800,000

    Number of blocks
    139,983,421
    402,631,265

    Average row length (Bytes)
    94,413
    10,809

    Number of partitions
    10
    180

    Number of sub-partitions
    NA
    2,160

    The following summarizes the Oracle source and Amazon Redshift target resources:

    Oracle DB instance – m6i.32xlarge (vCPUs: 128, RAM: 512GiB, Network: 50Gbps, EBS: 40Gbps)
    Redshift cluster – ra3.16xlarge (nodes: 8, vCPUs: 48, RAM: 384GiB, Storage: 1 PB)

    The following table summarizes the standard AWS DMS compared to AWS DMS Serverless configuration we used in us-east-1.

    AWS DMS
    AWS DMS Serverless

    Replication instance type

    dms.r6i.32xlarge

    (vCPUs: 128, RAM: 1024GiB)

    DMS Capacity Units

    128 DCU

    (vCPUs: 32, RAM: 256 GB)

    AWS DMS pricing, USD/Hour
    Single AZ
    14.43456
    11.073

    Multi AZ
     28.86912
    22.146

    MaxFullLoadSubTasks
    49
    49

    Limited LOB size, KB
    63
    63

    Table settings manual configuration used
    “parallel-load”: {
    “type”: “partitions-auto”}
    Auto configured by Enhanced Full Load Performance

    AWS DMS supports a wide range of extracting data settings. By default, it is designed to use one thread for each table to extract data. For more information about the configuration settings, refer to Table and collection settings rules and operations.

    CASE1 test

    The CASE1 table contains 1,000 columns and 28 columns that have the NOT NULL constraint. This table includes a non-unique and non-partitioned B-Tree index based on three columns of numeric data type. The following table lists the data types and column numbers for the CASE1 table.

    Column Data Type
    Number of Columns in the Table

    BINARY_DOUBLE
    36

    BINARY_FLOAT
    35

    CHAR of 10, 100 or 500 BYTE
    105

    DATE
    36

    NCHAR of 10, 100 or 500 CHAR
    105

    NUMBER from (5,0) to (38,10) range
    252

    NVARCHAR2 10, 100 or 500 CHAR
    108

    TIMESTAMP with and without time zone
    215

    VARCHAR2 10, 100 or 500 BYTE
    108

    The data distribution in the CASE1 table is illustrated in the following chart.

    AWS DMS configuration

    For testing, we set up the partitions-auto option of the table mapping configuration, as shown in the following example code:

    {
    “rules”: [
    {
    “rule-type”: “selection”,
    “rule-id”: “324263196”,
    “rule-name”: “324263196”,
    “object-locator”: {
    “schema-name”: “SCHEMA_NAME”,
    “table-name”: “CASE1”
    },
    “rule-action”: “include”,
    “filters”: [],

    We set the table-settings rule type of the parallel-load option to partitions-auto:

    “parallel-load”: {
    “type”: “partitions-auto”
    },
    “isAutoSegmentationChecked”: false
    }
    ]
    }

    Results

    The following chart summarizes the benchmark results for the CASE1 table. It shows the average load time to complete in percentage (lower is better).

    CASE2 test

    The CASE2 table has 150 columns with two LOB data types and 30 columns with a NOT NULL constraint. The following table lists the unique data types and column numbers for the CASE2 table.

    Column Data Type
    Number of Columns in the Table

    BINARY_DOUBLE
    7

    BINARY_FLOAT
    7

    CHAR of 10 or 100 BYTE
    14

    CLOB
    1

    DATE
    7

    NCHAR of 50 or 100 CHAR
    14

    NCLOB
    1

    NUMBER from (5,0) to (38,10) range
    29

    NVARCHAR2 of 50, 100 or 500 CHAR
    21

    TIMESTAMP with and without time zone
    28

    VARCHAR2 of 50 or 100 BYTE
    21

    The following chart describes the data distribution in the CASE2 table. The distribution is shown by partitions (the sub-partition data aggregated for proper partitions).

    AWS DMS configuration

    For this use case, we used the same configuration as the AWS DMS tasks for CASE1. We set the parallel-load option to partitions-auto.

    Results

    The following chart summarizes the benchmark results for the CASE2 table. It shows the average load time to complete in percentage (lower is better).

    Summary of results

    Based on these results, the new Enhanced Full Load Performance feature in AWS DMS Serverless outperforms manual configuration. We observed in both cases that the Enhanced Full Load Performance feature simplifies configurations while improving performance. Manual configuration, especially for tables with significantly skewed distributions, requires an in-depth understanding of data distribution in the table as well as advanced configurations, which may not always result in an evenly distributed load among the subtasks.

    Conclusion

    In this post, we introduced the Enhanced Full Load Performance feature in AWS DMS Serverless. Our benchmark tests demonstrated that the new throughput feature outperforms manual table mapping rules of type table-settings with the parallel-load option. We showed how you can use Enhanced Full Load Performance to help you migrate your data faster by seamlessly analyzing your table’s metadata, internally auto segmenting the data, and loading the table in parallel. For more information about automatic table concurrency scaling, refer to AWS DMS Serverless.

    About the authors

    Nelly Susanto is a Senior Database Migration Specialist at AWS Database Migration Accelerator. She has over 10 years of technical experience focusing on migrating and replicating databases and data warehouse workloads. She is passionate about helping customers on their cloud journey.

    Denys Nour is a Database Engineer II at AWS DMS. His background is based on more than 7 years of database migration experience. He is fully committed to providing a smooth and successful cloud migration for every customer.

    Nishit Raval is a Software Engineer at AWS. He has over 10 years of experience in building software systems and leading technical teams. He enjoys problem-solving. In his free time, he loves playing sports and traveling to experience new cultures.

    Sneha Pradhan is a Software Engineer at AWS DMS with 7 years of technical experience in distributed software systems. They are passionate about solving challenging problems and integrating new technology with existing solutions.

    Source: Read More

    Hostinger
    Facebook Twitter Reddit Email Copy Link
    Previous ArticleUnit testing Apache TinkerPop transactions: From TinkerGraph to Amazon Neptune
    Next Article Right-sizing Amazon RDS for Db2 by replaying the Db2 LUW workload

    Related Posts

    Security

    Nmap 7.96 Launches with Lightning-Fast DNS and 612 Scripts

    May 17, 2025
    Common Vulnerabilities and Exposures (CVEs)

    CVE-2024-47893 – VMware GPU Firmware Memory Disclosure

    May 17, 2025
    Leave A Reply Cancel Reply

    Continue Reading

    10 Reasons You’re Not Closing the Deal in Your Job Search (Free Download)

    News & Updates

    Why Should SMBs and Large Enterprises Invest in Red Team Assessments?

    Development

    New Phishing Tool GoIssue Targets GitHub Developers in Bulk Email Campaigns

    Development

    Researchers at Stanford Introduces In-Context Vectors (ICV): A Scalable and Efficient AI Approach for Fine-Tuning Large Language Models

    Development
    GetResponse

    Highlights

    News & Updates

    OpenAI CEO touts Trump’s hands-on approach with $500B Stargate project as a “welcome breath of fresh air” — Altman predicts a “vibe shift” in Silicon Valley

    February 11, 2025

    OpenAI CEO Sam Altman recently touted President Trump’s approach to tech and AI as opposed…

    Q&A: Developing software-defined vehicles

    August 13, 2024

    Online Notes Sharing System using Python Django and MySQL

    May 25, 2024

    New AI system successfully identifies Alzheimer’s disease using speech analysis

    July 4, 2024
    © DevStackTips 2025. All rights reserved.
    • Contact
    • Privacy Policy

    Type above and press Enter to search. Press Esc to cancel.