
    How Skello uses AWS DMS to synchronize data from a monolithic application to microservices

    January 21, 2025

    This is a guest post co-written by Skello.

    Skello is a human resources (HR) software-as-a-service (SaaS) platform that focuses on employee scheduling and workforce management. It caters to various sectors, including hospitality, retail, healthcare, construction, and industry. Skello features include schedule creation, time tracking, and payroll preparation. As of 2024, Skello supports 20,000 customers and has 400,000 daily users on the platform in Europe.

    Planning and optimizing shift-based teams involves many parameters: legal constraints linked to collective agreements, the structure of the establishment, the way payroll works, and so on. Skello's value lies in its ability to coordinate and optimize the management of schedules between teams: taking individual requests into account, managing unforeseen events, and adapting to changing activity flows.

    From a technical point of view, these business functionalities are broken down into microservices. It’s a big challenge! How can these services communicate with each other while keeping each other’s data up to date?

    In 2021, we, the Skello IT team, migrated our platform to Amazon Web Services (AWS) because of the solutions available for hosting our monolithic application and the limitations of our former platform in scaling to accommodate our growing customer base. Prior to the migration, our database was already hosted on an AWS account, but it didn't adhere to the Well-Architected Framework multi-account strategy: it consisted of a single server without optimization or security measures in place.

    After the migration to AWS, we implemented security standards for our database, such as encryption of data at rest and in transit and network isolation. The application stack was deployed across multiple Availability Zones (a Multi-AZ deployment) with a comprehensive backup strategy: automated backups with a defined retention period. Our post-migration architecture on Amazon Elastic Compute Cloud (Amazon EC2) for the monolith complied with AWS architectural recommendations. We implemented an Application Load Balancer with scaling policies configured based on traffic patterns and business metrics.

    After our monolith was deployed on AWS, we began deploying new projects as microservices, each with its own database suited to the component, such as Amazon DynamoDB or Amazon Relational Database Service (Amazon RDS). Our focus then shifted to splitting the monolith into microservices to reduce dependencies within the platform. The migration from a monolithic to a microservice architecture is an ambitious project, requiring synchronization and availability to be maintained across the entire platform until the full microservice architecture is achieved. During this period, data must be synchronized in real time between the monolith components and the serverless components.

    In this post, we show how Skello uses AWS Database Migration Service (AWS DMS) to synchronize data from a monolithic architecture to microservices and to ingest data from the monolithic architecture and microservices into our data lake.

    Solution overview

    We had the following problem: we needed to keep data in sync between the monolith and the microservices deployed on AWS in the production and data accounts. We have application usage data (from clients of the Skello platform) and data from internal sources (business and product). The solutions architects and the data teams own the application data. Internal data is used by different teams at Skello (such as sales and product) and feeds our business dashboards and key performance indicators (KPIs). Data engineers own this data and define the corresponding data formats.

    We synchronize application data with change data capture (CDC) so that updates to the monolith's database are propagated to the microservices through AWS DMS. The following diagram illustrates our solution architecture.

    After the data is loaded into the Amazon Kinesis Data Streams stream, the AWS Lambda functions belonging to the microservices retrieve this data from specific shards. Kinesis Data Streams is a serverless streaming data service that makes it straightforward to capture, process, and store data streams at any scale.
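
    As an illustration, a microservice can subscribe one of its Lambda functions to the stream with an event source mapping. The following Terraform sketch is not Skello's configuration; the resource names (cdc_stream, microservice_consumer) and the filtered table name are hypothetical:

    # Subscribe a microservice Lambda function to the CDC stream (sketch; names are illustrative)
    resource "aws_lambda_event_source_mapping" "microservice_consumer" {
        event_source_arn  = aws_kinesis_stream.cdc_stream.arn
        function_name     = aws_lambda_function.microservice_consumer.arn
        starting_position = "LATEST"
        batch_size        = 100

        # Optionally keep only the change records for the tables this microservice cares about.
        # AWS DMS writes JSON records whose metadata includes the source table name.
        filter_criteria {
            filter {
                pattern = jsonencode({
                    data = { metadata = { "table-name" = ["shifts"] } }
                })
            }
        }
    }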

    AWS DMS is a managed migration and replication service that helps move your database and analytics workloads to AWS quickly, securely, and with minimal downtime and zero data loss. AWS DMS provides a fast and straightforward way to replicate data across multiple target databases.

    The monolithic database of Skello is based on Amazon RDS for PostgreSQL. It is the historical database at Skello and the central point of the platform. Amazon RDS is a managed database service that makes it straightforward to set up, operate, and scale a relational database in the cloud.

    In the following sections, we discuss the details of setting up the solution and its key components.

    AWS DMS replication instance

    To migrate the application data to several microservices, we need an AWS DMS replication instance to run the replication.

    The instance can run in a Multi-AZ deployment, which provides high availability in production. To deploy the AWS DMS service, we need to create several resources (the replication instance and the replication tasks that write to the target). The following Terraform code is the configuration we used to deploy the replication instance, although we could achieve the same result with AWS CloudFormation.

    # ----- DMS Replication Instance -----
    # Terraform registry:
    # https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/dms_replication_instance
    # Description: Provides a DMS (Database Migration Service) replication instance resource.

    resource "aws_dms_replication_instance" "replication_instance" {
        allocated_storage          = local.env_vars.replication_instance_storage
        apply_immediately          = local.env_vars.apply_immediately
        auto_minor_version_upgrade = local.env_vars.auto_minor_version_upgrade
        engine_version             = local.env_vars.replication_instance_version
        multi_az                   = local.env_vars.multi_az
        kms_key_arn                = local.env_vars.kms_key_arn
        publicly_accessible        = false

        replication_instance_class  = local.env_vars.replication_instance_class
        replication_instance_id     = "${local.application}-${local.workspace}"
        replication_subnet_group_id = aws_dms_replication_subnet_group.dms_security_group.id
        allow_major_version_upgrade = local.env_vars.allow_major_version_upgrade

        vpc_security_group_ids = [
            data.aws_security_group.rds_security_group_replication_instance.id
        ]

        # The IAM role attachments that DMS requires must exist before the instance is created
        depends_on = [
            aws_iam_role_policy_attachment.dms_access_for_endpoint_AmazonDMSKinesisRole,
            aws_iam_role_policy_attachment.dms_cloudwatch_logs_role_AmazonDMSCloudWatchLogsRole,
            aws_iam_role_policy_attachment.dms_vpc_role_AmazonDMSVPCManagementRole
        ]
    }

    For the source endpoint, we point to the source database name and provide the connection information so that our AWS DMS replication instance has network access to, and permissions on, the source.

    # ----- DMS Endpoints -------
    # Terraform registry:
    # https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/dms_endpoint
    # Description: Create an endpoint for the source database (monolith RDS)
    resource "aws_dms_endpoint" "monolith_rds_endpoint" {
        database_name = data.aws_ssm_parameter.db_name.value
        endpoint_id   = "${local.rds_endpoint_name}-${local.workspace}"
        endpoint_type = "source"
        engine_name   = "postgres"
        port          = 5432
        server_name   = local.env_vars.rds_url
        ssl_mode      = "none"
        kms_key_arn   = local.env_vars.kms_key_arn
        username      = data.aws_ssm_parameter.db_user.value
        password      = data.aws_ssm_parameter.db_password.value
    }
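
    One prerequisite for CDC from Amazon RDS for PostgreSQL is that logical replication is enabled on the source instance. A minimal sketch of the corresponding parameter group (the name and engine family are illustrative, not Skello's actual values):

    # CDC from RDS for PostgreSQL requires logical replication on the source instance
    resource "aws_db_parameter_group" "monolith_cdc" {
        name   = "monolith-cdc"
        family = "postgres14"    # illustrative; must match the source engine version

        parameter {
            name         = "rds.logical_replication"
            value        = "1"
            apply_method = "pending-reboot"    # static parameter, applied at reboot
        }
    }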

    AWS DMS tasks

    The migration of data is mainly done through AWS DMS tasks. The migration is divided into three parts:

    1. Source data migration involves the configuration of the data source to make sure the task connects to the correct source endpoint. This process is important for the success of integration and data transfer.
    2. We apply the changes made by Terraform (as shown in the preceding code snippet) through the terraform apply command. This command executes the actions proposed in a Terraform plan to create, update, or destroy infrastructure resources.
    3. We start the replication of data on the target.

    Our CDC task uses the PostgreSQL database as the source endpoint, a Kinesis data stream as the target endpoint, and the replication instance mentioned earlier.

    We use a single Kinesis data stream specifically for the Skello app to handle transactions with microservices. AWS DMS is configured with only one replication task that replicates all tables from the Skello app RDS to Kinesis. This approach streamlines the data flow and simplifies the architecture for change data capture (CDC) and real-time data processing.
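
    As a sketch (not Skello's exact configuration), such a replication task can look like the following in Terraform. It reuses the replication instance and source endpoint shown earlier; the Kinesis target endpoint (kinesis_endpoint) is sketched further down, and the table-mapping rule simply includes every table in the schema:

    # CDC replication task covering all tables of the monolith (sketch)
    resource "aws_dms_replication_task" "monolith_cdc" {
        replication_task_id      = "monolith-cdc-${local.workspace}"
        migration_type           = "cdc"
        replication_instance_arn = aws_dms_replication_instance.replication_instance.replication_instance_arn
        source_endpoint_arn      = aws_dms_endpoint.monolith_rds_endpoint.endpoint_arn
        target_endpoint_arn      = aws_dms_endpoint.kinesis_endpoint.endpoint_arn

        # Select every table so a single task covers the whole monolith
        table_mappings = jsonencode({
            rules = [{
                "rule-type"      = "selection"
                "rule-id"        = "1"
                "rule-name"      = "include-all-tables"
                "object-locator" = {
                    "schema-name" = "public"    # illustrative schema name
                    "table-name"  = "%"
                }
                "rule-action" = "include"
            }]
        })
    }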

    We use the Kinesis data stream in on-demand mode, so the provisioned shard count for the stream is 0.

    On-demand mode automates capacity management for us, making it suitable for workloads whose traffic is variable or unpredictable, as our application traffic is.
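
    A minimal sketch of an on-demand stream in Terraform (the stream name and KMS key reference are illustrative):

    # CDC stream in on-demand capacity mode (sketch; names are illustrative)
    resource "aws_kinesis_stream" "cdc_stream" {
        name = "skello-app-cdc-${local.workspace}"

        # On-demand mode scales shards automatically, so no shard_count is declared
        stream_mode_details {
            stream_mode = "ON_DEMAND"
        }

        encryption_type = "KMS"
        kms_key_id      = local.env_vars.kms_key_arn
    }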

    We deployed a specific IAM role that AWS DMS uses for working with Kinesis Data Streams, with the following policy:

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "kinesis:DescribeStream",
                    "kinesis:PutRecord",
                    "kinesis:PutRecords"
                ],
                "Resource": "arn:aws:kinesis:region:accountID:stream/streamName"
            }
        ]
    }

    We then created the Kinesis target endpoint with Terraform, referencing the role we just created, called dms_access_for_endpoint.
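
    A minimal sketch of such a target endpoint, assuming the stream and role resources named in the earlier sketches (this is not Skello's exact configuration):

    # Kinesis target endpoint for the CDC task (sketch; resource names are illustrative)
    resource "aws_dms_endpoint" "kinesis_endpoint" {
        endpoint_id   = "${local.kinesis_endpoint_name}-${local.workspace}"
        endpoint_type = "target"
        engine_name   = "kinesis"

        kinesis_settings {
            message_format          = "json"
            stream_arn              = aws_kinesis_stream.cdc_stream.arn
            service_access_role_arn = aws_iam_role.dms_access_for_endpoint.arn
        }
    }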

    We have another process to synchronize the application data with CDC to address internal demands with a data visualization tool. We use the same AWS DMS task and Kinesis data stream as the first CDC task from the monolith. The following diagram illustrates the CDC workflow to migrate the data from the monolith and microservices to the data lake.

    This process copies the microservices' data to the data lake. The data microservice contains two Lambda functions: one captures the data from the monolith and the other microservices and pushes the raw data to DynamoDB; the second sends the formatted data to Amazon Data Firehose, which aggregates and pushes the data to the data lake in an Amazon Simple Storage Service (Amazon S3) bucket. The replication model is the same for all ingestion, which makes sure all changes made in the microservices' databases are replicated to the data lake.
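
    For the last hop, a Firehose delivery stream writes to the data lake bucket. The following Terraform sketch shows the general shape; the role, bucket, and stream names are illustrative rather than Skello's actual resources:

    # Firehose delivery stream aggregating formatted records into the data lake bucket (sketch)
    resource "aws_kinesis_firehose_delivery_stream" "data_lake" {
        name        = "skello-data-lake-${local.workspace}"
        destination = "extended_s3"

        extended_s3_configuration {
            role_arn   = aws_iam_role.firehose_delivery.arn
            bucket_arn = aws_s3_bucket.data_lake.arn
        }
    }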

    Results from this solution

    This solution keeps our various databases continuously up to date with the monolith. The CDC approach is used on all our application workloads. When adding new tables, we accept a brief interruption so that the AWS DMS task is updated automatically through a terraform apply after the Terraform code has been updated. Because AWS DMS can resume replication from a recovery checkpoint, we are confident in adding new tables.

    Full load

    Full loads are used at Skello to catch up data on new services or to copy data to our data lake to meet Skello’s internal needs.

    In this case, data ingestion helps product managers understand the data and track more KPIs on how it is used. The ingestion comes from a single AWS DMS task that pushes data to the data lake in Parquet format. The following diagram illustrates this workflow.
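
    For this workflow, the AWS DMS target is the data lake bucket itself. A sketch of an S3 target endpoint that writes Parquet (bucket and role names are illustrative):

    # S3 target endpoint writing Parquet files to the data lake (sketch; names are illustrative)
    resource "aws_dms_s3_endpoint" "data_lake" {
        endpoint_id             = "data-lake-${local.workspace}"
        endpoint_type           = "target"
        bucket_name             = aws_s3_bucket.data_lake.id
        service_access_role_arn = aws_iam_role.dms_s3_access.arn

        # Parquet output can be queried directly from the data lake
        data_format     = "parquet"
        parquet_version = "parquet-2-0"
    }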

    This solution can be applied at any time of the day. However, as a precaution, we prefer to apply these changes when traffic is at its lowest, such as during lunchtime or at the end of the day.

    Full loads put load on the source database, depending on the table being loaded (some of our tables, such as shifts, can contain millions of rows).
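
    One way to limit that pressure is to cap how many tables a full-load task processes in parallel through its task settings. A sketch, assuming the S3 endpoint above and an illustrative table selection:

    # Full-load task with a reduced number of parallel sub-tasks (sketch, not Skello's exact configuration)
    resource "aws_dms_replication_task" "full_load_shifts" {
        replication_task_id      = "full-load-shifts-${local.workspace}"
        migration_type           = "full-load"
        replication_instance_arn = aws_dms_replication_instance.replication_instance.replication_instance_arn
        source_endpoint_arn      = aws_dms_endpoint.monolith_rds_endpoint.endpoint_arn
        target_endpoint_arn      = aws_dms_s3_endpoint.data_lake.endpoint_arn

        table_mappings = jsonencode({
            rules = [{
                "rule-type"      = "selection"
                "rule-id"        = "1"
                "rule-name"      = "shifts-only"
                "object-locator" = {
                    "schema-name" = "public"    # illustrative schema name
                    "table-name"  = "shifts"
                }
                "rule-action" = "include"
            }]
        })

        # Load fewer tables in parallel to reduce pressure on the source database
        replication_task_settings = jsonencode({
            FullLoadSettings = {
                MaxFullLoadSubTasks = 2
            }
        })
    }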

    Conclusion

    In this post, we demonstrated how we use AWS DMS to synchronize data from our monolithic architecture to microservices and to our data lake. AWS DMS provides a versatile solution that addresses our various data synchronization requirements across different use cases, while enabling a streamlined configuration through infrastructure as code (IaC) with Terraform.

    At Skello, we operate within a diverse data ecosystem. We invite you to explore more Skello blog posts about our data ecosystem, including one written by our Lead Data Engineer on data management across multiple AWS accounts.

    If you have any questions or suggestions, feel free to share them in the comments section.

    About the Author

    Mihenandi-Fuki Wony is Lead Cloud & IT at Skello. He is responsible for optimizing AWS costs and managing the platform’s infrastructure. His main challenges include maintaining a scalable platform on AWS, implementing FinOps practices, and continuously improving infrastructure-as-code for deployments. In addition to his professional activities, Mihenandi-Fuki has been involved in solidarity actions to support families and volunteer work focused on integrating young people into the professional world.


    Nicolas de Place is a Startup Solutions Architect who collaborates with emerging companies to develop effective technical solutions. He assists startups in addressing their technological challenges, with a current focus on data and AI/ML strategies. Drawing from his experience in various tech domains, Nicolas helps businesses optimize their data operations and leverage advanced technologies to drive growth and innovation.
