Close Menu
    DevStackTipsDevStackTips
    • Home
    • News & Updates
      1. Tech & Work
      2. View All

      From Data To Decisions: UX Strategies For Real-Time Dashboards

      September 13, 2025

      Honeycomb launches AI observability suite for developers

      September 13, 2025

      Low-Code vs No-Code Platforms for Node.js: What CTOs Must Know Before Investing

      September 12, 2025

      ServiceNow unveils Zurich AI platform

      September 12, 2025

      DistroWatch Weekly, Issue 1139

      September 14, 2025

      Building personal apps with open source and AI

      September 12, 2025

      What Can We Actually Do With corner-shape?

      September 12, 2025

      Craft, Clarity, and Care: The Story and Work of Mengchu Yao

      September 12, 2025
    • Development
      1. Algorithms & Data Structures
      2. Artificial Intelligence
      3. Back-End Development
      4. Databases
      5. Front-End Development
      6. Libraries & Frameworks
      7. Machine Learning
      8. Security
      9. Software Engineering
      10. Tools & IDEs
      11. Web Design
      12. Web Development
      13. Web Security
      14. Programming Languages
        • PHP
        • JavaScript
      Featured

      Optimizely Mission Control – Part III

      September 14, 2025
      Recent

      Optimizely Mission Control – Part III

      September 14, 2025

      Learning from PHP Log to File Example

      September 13, 2025

      Online EMI Calculator using PHP – Calculate Loan EMI, Interest, and Amortization Schedule

      September 13, 2025
    • Operating Systems
      1. Windows
      2. Linux
      3. macOS
      Featured

      DistroWatch Weekly, Issue 1139

      September 14, 2025
      Recent

      DistroWatch Weekly, Issue 1139

      September 14, 2025

      sudo vs sudo-rs: What You Need to Know About the Rust Takeover of Classic Sudo Command

      September 14, 2025

      Dmitry — The Deep Magic

      September 13, 2025
    • Learning Resources
      • Books
      • Cheatsheets
      • Tutorials & Guides
    Home»Development»Databases»Data masking and performance improvements in AWS DMS 3.5.4

    Data masking and performance improvements in AWS DMS 3.5.4

    April 30, 2025

    We are excited to announce the availability of new features in AWS Database Migration Service (AWS DMS) replication engine version 3.5.4. This release includes two major enhancements: data masking for enhanced security and improved data validation performance.

    In this post, we deep dive into these two features. Refer to the release notes to see a list of all the new features available in this version.

    Data masking for enhanced security

    A data masking capability was requested by our customers to enhance data security during migrations, enabling you to transform sensitive data at the column level during migration and helping you comply with data protection regulations like GDPR. With AWS DMS, you can now create copies of data that redacts information that you need to protect at a column level.

    One of the biggest concerns for our customers during database migrations is the secure handling of sensitive information, such as account numbers, phone numbers, and email addresses. With AWS DMS 3.5.4, we have implemented three flexible data transformation rules:

    • Digits Mask
    • Digits Randomize
    • Hashing Mask

    To illustrate these transformation rules, we migrate a table called “EMPLOYEES” from an Amazon RDS for Oracle instance to Amazon RDS for PostgreSQL instance. Complete the following steps:

    1. Use the following table DDL on your source (Oracle) instance to create the EMPLOYEES table:
    CREATE TABLE EMPLOYEES (
        EMPLOYEE_ID NUMBER(6) PRIMARY KEY,
        FIRST_NAME VARCHAR2(50) NOT NULL,
        LAST_NAME VARCHAR2(50) NOT NULL,
        EMAIL VARCHAR2(100) UNIQUE,
        PHONE_NUMBER VARCHAR2(20),
        HIRE_DATE DATE NOT NULL,
        JOB_TITLE VARCHAR2(50),
        SALARY NUMBER(10,2),
        DEPARTMENT_ID NUMBER(4),
        MANAGER_ID NUMBER(6),
        ACCOUNT_NUMBER VARCHAR2(20),
        CREATED_DATE DATE DEFAULT SYSDATE
      
    );
    CREATE SEQUENCE emp_seq
        START WITH 1
        INCREMENT BY 1
        NOCACHE
        NOCYCLE;
    
    1. Insert the table EMPLOYEES with a few records.
    INSERT INTO EMPLOYEES (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, EMAIL, PHONE_NUMBER, HIRE_DATE, JOB_TITLE, SALARY, DEPARTMENT_ID, MANAGER_ID,ACCOUNT_NUMBER)
    VALUES (emp_seq.NEXTVAL, 'John', 'Smith', 'john.smith@company.com', '555-0101', DATE '2020-01-15', 'CEO', 150000, 10, NULL,'456-123-456-789');
    
    INSERT INTO EMPLOYEES (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, EMAIL, PHONE_NUMBER, HIRE_DATE, JOB_TITLE, SALARY, DEPARTMENT_ID, MANAGER_ID,ACCOUNT_NUMBER)
    VALUES (emp_seq.NEXTVAL, 'Sarah', 'Johnson', 'sarah.johnson@company.com', '555-0102', DATE '2020-03-20', 'IT Director', 120000, 20, 1,'666-000-111-222');
    
    INSERT INTO EMPLOYEES (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, EMAIL, PHONE_NUMBER, HIRE_DATE, JOB_TITLE, SALARY, DEPARTMENT_ID, MANAGER_ID,ACCOUNT_NUMBER)
    VALUES (emp_seq.NEXTVAL, 'Michael', 'Brown', 'michael.brown@company.com', '555-0103', DATE '2021-02-10', 'Software Engineer', 85000, 20, 2,'777-333-444-555');
    
    INSERT INTO EMPLOYEES (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, EMAIL, PHONE_NUMBER, HIRE_DATE, JOB_TITLE, SALARY, DEPARTMENT_ID, MANAGER_ID,ACCOUNT_NUMBER)
    VALUES (emp_seq.NEXTVAL, 'Emily', 'Davis', 'emily.davis@company.com', '555-0104', DATE '2021-06-15', 'HR Manager', 75000, 30, 1,'899-987-654-321');
    
    INSERT INTO EMPLOYEES (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, EMAIL, PHONE_NUMBER, HIRE_DATE, JOB_TITLE, SALARY, DEPARTMENT_ID, MANAGER_ID,ACCOUNT_NUMBER)
    VALUES (emp_seq.NEXTVAL, 'David', 'Wilson', 'david.wilson@company.com', '555-0105', DATE '2022-01-20', 'Software Engineer', 80000, 20, 2,'567-111-222-333');
    1. Create an AWS DMS task using the option “Migrate” or “Migrate and replicate”.
    1. Set up the AWS DMS task with the following table mapping rule JSON. We mask the column ACCOUNT_NUMBER with the character #, the column PHONE_NUMBER with random numbers, and the column EMAIL with a hash. We have also used transformation rules to convert everything in lowercase which is optional.
    {
        "rules": [
            {
                "rule-type": "transformation",
                "rule-id": "171087779",
                "rule-name": "171087779",
                "rule-target": "column",
                "object-locator": {
                    "schema-name": "ADMIN",
                    "table-name": "EMPLOYEES",
                    "column-name": "ACCOUNT_NUMBER"
                },
                "rule-action": "data-masking-digits-mask",
                "value": "*",
                "old-value": null
            },
            {
                "rule-type": "transformation",
                "rule-id": "171057753",
                "rule-name": "171057753",
                "rule-target": "column",
                "object-locator": {
                    "schema-name": "ADMIN",
                    "table-name": "EMPLOYEES",
                    "column-name": "PHONE_NUMBER"
                },
                "rule-action": "data-masking-digits-randomize",
                "value": null,
                "old-value": null
            },
            {
                "rule-type": "transformation",
                "rule-id": "169940283",
                "rule-name": "169940283",
                "rule-target": "column",
                "object-locator": {
                    "schema-name": "ADMIN",
                    "table-name": "EMPLOYEES",
                    "column-name": "EMAIL"
                },
                "rule-action": "data-masking-hash-mask",
                "value": null,
                "old-value": null
            },
            {
                "rule-type": "transformation",
                "rule-id": "169926638",
                "rule-name": "169926638",
                "rule-target": "column",
                "object-locator": {
                    "schema-name": "%",
                    "table-name": "%",
                    "column-name": "%"
                },
                "rule-action": "convert-lowercase",
                "value": null,
                "old-value": null
            },
            {
                "rule-type": "transformation",
                "rule-id": "169918368",
                "rule-name": "169918368",
                "rule-target": "table",
                "object-locator": {
                    "schema-name": "%",
                    "table-name": "%"
                },
                "rule-action": "convert-lowercase",
                "value": null,
                "old-value": null
            },
            {
                "rule-type": "transformation",
                "rule-id": "169908300",
                "rule-name": "169908300",
                "rule-target": "schema",
                "object-locator": {
                    "schema-name": "%"
                },
                "rule-action": "convert-lowercase",
                "value": null,
                "old-value": null
            },
            {
                "rule-type": "selection",
                "rule-id": "169895493",
                "rule-name": "169895493",
                "object-locator": {
                    "schema-name": "ADMIN",
                    "table-name": "EMPLOYEES"
                },
                "rule-action": "include",
                "filters": []
            }
        ]
    }

    In the following sample output, we can see the output on the PostgreSQL instance with the data masking:

    dmsdb=> select employee_id,phone_number,email,account_number from admin.employees;
     employee_id | phone_number |                              email                               | account_number
    -------------+--------------+------------------------------------------------------------------+-----------------
              20 | 685-9897     | FDC2A4ABC53872D0F934B5614DDC312DAA165895065BB00A5986849AADE8C322 | ***-***-***-***
              21 | 579-3441     | 3A4FA9FE0AA0A2B468EDF13A29A75C4E3A20650243143D834D7898D40AA0FA2F | ***-***-***-***
              22 | 156-9277     | 0985B1D142A4067E397DF5AB56B03E3BF4857FB1F229CB39B49CE06E46B7AA98 | ***-***-***-***
              23 | 238-5321     | D07EAB207F4F1366C1E35B35E33F7842FF3EEB2C80E47FDAEB0900B49EE77697 | ***-***-***-***
              24 | 536-1233     | 3D438AF13A839ACDC24FD0CE8EB8C8C45083B90A31DF3583A88131614086C3B9 | ***-***-***-***
    (5 rows)

    The following image shows the sample output on Oracle for comparison.

    Oracle output for comparison

    In the preceding example, we have shown how you can use the data masking capability to hide sensitive information. Refer to Using data masking to hide sensitive information for further information.

    Enhanced data validation performance

    Maintaining data integrity is crucial during any database migration, but it can often be a time-consuming and resource-intensive process. AWS DMS 3.5.4 addresses this challenge with the enhanced data validation feature, which uses innovative techniques like fast partition validation to streamline the validation process.

    Some of the key benefits of enhanced data validation are:

    • Redistribution of resource usage from a replication instance to the AWS DMS source and target endpoint
    • Potential decrease of network usage
    • Efficient for wide tables without LOB data types

    The enhanced data validation feature is now available for specific AWS DMS migration paths, including Oracle to PostgreSQL, SQL Server to PostgreSQL, Oracle to Oracle, and SQL Server to SQL Server. To use this feature, make sure your environment meets the prerequisites.

    You can confirm AWS DMS is using enhanced data validation by reviewing the Amazon CloudWatch logs, which will show messages like the following:

    2025-02-12T21:23:26 [VALIDATOR ]I: Fast validation of table 'dbo'.'customer' : partition : 178 (partition_validator.c:1001)

    To quantify the performance improvements, we conducted benchmarking using HammerDB with the settings as shown in the following screenshot.

    HammerDB settings

    We created a full load and change data capture (CDC) task with validation disabled to migrate approximately 93 million records (15 GB in size) from an Amazon Relational Database Service (Amazon RDS) for SQL Server to Amazon Aurora PostgreSQL-Compatible Edition across a total of nine tables as a baseline.

    We then ran two validation-only tasks: one on AWS DMS 3.5.3 and one on AWS DMS 3.5.4 using r6i.xlarge instances. To speed up validation, we increased PartitionSize to 100,000 and ThreadCount to 15:

    "ValidationSettings": {
    "PartitionSize": 100000,
    "ThreadCount": 15,
    "ValidationOnly": true
    }

    The following screenshots show resource consumption on the AWS DMS replication instance running on engine version 3.5.4.

    CPU utilization on engine version 3.5.4

    Task memory usage on engine version 3.5.4

    The following screenshots show resource consumption on the AWS DMS replication instance running running on engine version 3.5.3.

    CPU utilization on engine version 3.5.3

    Task memory usage on engine version 3.5.3

    We can see a 91% reduction in TaskMemoryUsage of a validation-only task when run on AWS DMS 3.5.4 as compared to AWS DMS 3.5.3, and a 95% reduction in the CPU utilization of the underlying AWS DMS replication instance. For customers who want to run a separate validation-only task, you can use this feature and use the compute and memory of the AWS DMS replication instance in a more resourceful manner.

    Conclusion

    In this post, we discussed the transformation rules for data masking and enhanced data validation in AWS DMS 3.5.4. By implementing data masking features, you can ensure sensitive information remains protected throughout your database migration journey. With enhanced data validation feature, you have all the benefits of running validation with less resource consumption on DMS replication instance. Try these features and let us know how it helped your use case in the comments section.


    About the authors

    Suchindranath HegdeSuchindranath Hegde is a Senior Data Migration Specialist Solutions Architect at Amazon Web Services. He works with our customers to provide guidance and technical assistance on data migration to the AWS Cloud using AWS DMS.

    Mahesh KansaraMahesh Kansara is a database engineering manager at Amazon Web Services. He closely works with development and engineering teams to improve the migration and replication service. He also works with our customers to provide guidance and technical assistance on various database and analytical projects, helping them improving the value of their solutions when using AWS.

    Leonid SlepukhinLeonid Slepukhin is a Senior Database Engineer with the Database Migration Service (DMS) team at Amazon Web Services. He works on developing core features for AWS DMS and specialize in helping both internal and external customers resolve complex database migration and replication challenges. His focus areas include enhancing DMS capabilities and providing technical expertise to ensure successful database migrations to AWS cloud.

    Sridhar RamasubramanianSridhar Ramasubramanian is a database engineer with the AWS Database Migration Service team. He works on improving the DMS service to better suit the needs of AWS customers.

    Source: Read More

    Facebook Twitter Reddit Email Copy Link
    Previous ArticleTop 8 Retail Arbitrage Apps By Category
    Next Article Smashing Security podcast #415: Hacking hijinks at the hospital, and WASPI scams

    Related Posts

    Repurposing Protein Folding Models for Generation with Latent Diffusion
    Artificial Intelligence

    Repurposing Protein Folding Models for Generation with Latent Diffusion

    September 14, 2025
    Artificial Intelligence

    Scaling Up Reinforcement Learning for Traffic Smoothing: A 100-AV Highway Deployment

    September 14, 2025
    Leave A Reply Cancel Reply

    For security, use of Google's reCAPTCHA service is required which is subject to the Google Privacy Policy and Terms of Use.

    Continue Reading

    Rilasciata Oracle Linux 9.6: Scopri le Novità e i Miglioramenti nella Sicurezza e nelle Prestazioni

    Linux

    Toot, Toot: Linux Mastodon App Tuba Gets a Huge Update

    Linux

    CVE-2022-50226 – Linux Kernel – CCP Kernel Memory Leak Vulnerability

    Common Vulnerabilities and Exposures (CVEs)

    Former head of PlayStation says Microsoft’s Xbox Game Pass is “dangerous” because “big companies dictate what games can be created,” and he could not be more wrong

    News & Updates

    Highlights

    Microsoft Copilot (web), Windows 11 app gets ChatGPT’s multi-file analysis feature for free

    August 31, 2025

    Microsoft has quietly rolled out ChatGPT’s advanced multi-file analysis support to Copilot on the web…

    I was skeptical of clip-style earbuds, then I took this budget pair on a run

    April 1, 2025

    What Is Conversational AI- Complete Guide

    May 7, 2025

    Xbox has become a Game Pass machine and nothing more — Is it enough to justify Microsoft’s console over a costly gaming PC?

    May 6, 2025
    © DevStackTips 2025. All rights reserved.
    • Contact
    • Privacy Policy

    Type above and press Enter to search. Press Esc to cancel.