When planning and implementing database migrations to the AWS, one of the phases is data migration, which may involve the usage of a data migration and replication tool such as AWS Database Migration Service (AWS DMS). AWS DMS is a cloud service that enables the migration of relational databases, data warehouses, NoSQL databases, and other types of data stores in real time. AWS DMS simplifies the migration process, automating schema conversion, data replication, and maintaining data consistency. You can seamlessly transition from on-premises databases or other cloud providers to AWS with a smooth and hassle-free migration experience.
Many organizations spend time designing, planning, and implementing those database migrations manually, which is a time-consuming activity and prone to human error. Automating the AWS DMS database migration with PowerShell has several advantages, such as:
PowerShell is native to the Windows OS, but in recent versions it provides cross-platform support to Linux and MacOs.
It has built-in support for AWS services when using the AWS Tools for PowerShell, including AWS DMS. This means that you can use PowerShell to manage AWS DMS directly from the command line or in scripts.
PowerShell can interact not only with AWS services, but also with different database engines such as Microsoft SQL Server, PostgreSQL, Amazon Relational Database Service (Amazon RDS), and more. This makes it straightforward to have a single tool to perform the end-to-end migration.
In this post, we show you how to use PowerShell and the AWS Tools for PowerShell to automate the creation of an AWS DMS instance and tasks to perform a database migration of your database to AWS. We discuss how to set up the necessary resources, configure the AWS DMS replication instance and tasks, monitor the migration process, and troubleshoot errors using PowerShell commands.
Solution overview
In this post, we automate the migration of databases to Amazon RDS for SQL Server using PowerShell. The following architecture diagram presents an overview of the solution.
The process includes the following steps:
Download the PowerShell scripts from the GitHub repository to a local workstation.
Run the PowerShell scripts in AWS DMS to create the necessary resources, such as:
Security groups
Replication subnets
Replication instance
Database endpoints
Migration tasks
The AWS DMS migration tasks are invoked by the automation scripts to load data into the target database instance.
Use PowerShell scripts to troubleshoot AWS DMS migration task failures through Amazon CloudWatch.
Prerequisites
To follow along, you must have the following prerequisites:
An active AWS account with an RDS instance deployed. For a list of supported engines, refer to Amazon RDS features.
Microsoft PowerShell 3.1 through 5.1, or PowerShell Core 6.0 or later on the workstation from where you will run the scripts.
AWSPowerShell installed.
An AWS Identity and Access Management (IAM) user in the AWS target account with permissions to implement and manage the following resources:
AWS DMS
CloudWatch dashboards and alarms
IAM credentials set up via the PowerShell console.
A Microsoft SQL Server instance running either running on premises or on Amazon Elastic Compute Cloud (Amazon EC2).
An RDS for SQL Server instance running in AWS.
The AdventureWorks2016 database is restored on source environment.
The AdventureWorks2016 database schema created on the target environment.
The source and target database instances are configured to be used by AWS DMS as described in the following documentation:
Sources for data migration
Targets for data migration
Although the solution uses Amazon RDS for SQL Server as the target platform, the steps described in this post can also work for other supported endpoints, with small modifications.
Create the required AWS DMS roles
Because you’re using PowerShell to provision your AWS DMS resources for the first time, you need create two IAM roles called dms-vpc-role and dms-cloudwatch-logs-role. As a best practice, you should use these role names. Those roles allow AWS DMS to create and manage resources in the AWS account on your behalf.
To create these roles, run the following script, or run the create-iam-role.ps1 script from the GitHub repository:
#
# // Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
# // SPDX-License-Identifier: MIT-0
[string]$dmsAssumeRolePolicyDocument = ‘{
“Version”: “2012-10-17”,
“Statement”: [
{
“Effect”: “Allow”,
“Principal”: {
“Service”: “dms.amazonaws.com”
},
“Action”: “sts:AssumeRole”
}
]
}’
try{
$DMSVpcRoleName = “dms-vpc-role”
New-IAMRole -RoleName $DMSVpcRoleName -AssumeRolePolicyDocument $dmsAssumeRolePolicyDocument
Start-Sleep -Seconds 5
Register-IAMRolePolicy -RoleName $DMSVpcRoleName -PolicyArn “arn:aws:iam::aws:policy/service-role/AmazonDMSVPCManagementRole”
}
catch {
Write-Error $_
}
try{
$DMSCloudWatchRoleName = “dms-cloudwatch-logs-role”
New-IAMRole -RoleName $DMSCloudWatchRoleName -AssumeRolePolicyDocument $dmsAssumeRolePolicyDocument
Start-Sleep -Seconds 5
Register-IAMRolePolicy -RoleName $DMSCloudWatchRoleName -PolicyArn “arn:aws:iam::aws:policy/service-role/AmazonDMSCloudWatchLogsRole”
}
catch {
Write-Error $_
}
You should get the following output:
PS C:TempAWS Blog> C:TempAWS Blogcreate-iam-role.ps1
Path RoleName RoleId CreateDate Description
—- ——– —— ———- ———–
/ dms-vpc-role AROA3L4QBF7IAPKWYEXEC 01/03/2024 14:15:55
/ dms-cloudwatch-logs-role AROA3L4QBF7IMWL7XJAN4 01/03/2024 14:16:01
Provision a new AWS DMS replication instance
An AWS DMS replication instance is one of the core components for the database migration. When you deploy AWS DMS for the first time, a DMS replication instance is created in a VPC instance in a VPC. You use this replication instance to perform your database migration.
Create a replication subnet group
A replication subnet group is a logical group of two or more subnets that are available as part of your VPC. Before you deploy an AWS DMS instance, you should, as a best practice, create a new replication subnet group and include only the subnets you wish to use for AWS DMS traffic. If a replication subnet group already exists, you can skip this step in your setup, saving time and effort.
To create a new replication subnet group, run the following script, or run the create-dms-replication-subnet-group.ps1 script from the GitHub repository:
# Name of the Replication subnet
$ReplicationSubnetGroupIdentifier = “dms-replication-subnet-group”
$ReplicationSubnetGroupDescription = “DMS Subnet Group”
# The Subnet ID’s you want to include as part of the DMS Replication Subnet Group
$AWSSubnets = @(“YourSubnetId1″,” YourSubnetId2″)
if (!(Get-DMSReplicationSubnetGroup | Where-Object {$_.ReplicationSubnetGroupIdentifier -eq $ReplicationSubnetGroupIdentifier})){
# Creating a new DMS Replication Subnet Group.
New-DMSReplicationSubnetGroup -ReplicationSubnetGroupIdentifier $ReplicationSubnetGroupIdentifier -ReplicationSubnetGroupDescription $ReplicationSubnetGroupDescription -SubnetId $AWSSubnets
}
The output of the command returns the properties of the replication subnet group created. Take note of the ReplicationSubnetGroupIdentifier, because you will use it later on.
PS C:TempAWS Blog> C:TempAWS Blogcreate-dms-replication-subnet-group.ps1
ReplicationSubnetGroupDescription : DMS Subnet Group
ReplicationSubnetGroupIdentifier : dms-replication-subnet-group
SubnetGroupStatus : Complete
Subnets : {Amazon.DatabaseMigrationService.Model.Subnet, Amazon.DatabaseMigrationService.Model.Subnet}
VpcId : vpc-0fe44eb01203a54bb
You can also check the status of the replication subnet group on the AWS DMS console.
Create a security group
To enhance security and improve visibility of resource communication within your AWS account, it’s recommended to create a security group to be dedicated to the AWS DMS replication instance.
To create a new security group, run the following script, or run the create-dms-security-group.ps1 script from the GitHub repository:
$SecurityGroupName = “dms-sec-group”
$Description = “Security Group for AWS DMS”
# Vpc Id where you want to deploy your DMS resources
$VpcId = “YourVpcIp”
New-EC2SecurityGroup -GroupName $SecurityGroupName -Description $Description -VpcId $VpcId
Take note of the security group ID returned by the command’s output to use in later steps:
PS C:TempAWS Blog> C:TempAWS Blogcreate-dms-security-group.ps1
sg-0f57d952a8ec41412
Create the AWS DMS replication instance
You can now provision your new AWS DMS replication instance.
Change the command parameters in the provided code according to your environment and needs. By default, the script creates an AWS DMS replication instance of instance size dms.r5.large, equipped with 50 GB of storage. These default values are suitable for the purposes of this post and can be adjusted as per your needs. The script creates a Single-AZ AWS DMS replication instance and makes sure that it’s using the latest AWS DMS version available at the time of this writing.
To create an AWS DMS instance, run the following script, or run the create-dms-replication-instance.ps1 script from the GitHub repository:
# Change this according to your environment needs
$ReplicationInstanceIdentifier = “dms-instance-01”
# This is the Replication Subnet Group created in the previous section
$ReplicationSubnetGroupIdentifier = “dms-replication-subnet-group”
# This is the Security Group created in the previous section
$VpcSecurityGroupId = “sg-0f57d952a8ec41412”
# Change this according to your environment
$ResourceIdentifier = “dms-instance-01”
$ReplicationInstanceClass = “dms.r5.large”
$AllocatedStorage = 50
$EngineVersion = “3.5.2”
$DMSInstanceObject = New-DMSReplicationInstance -ReplicationInstanceIdentifier $ReplicationInstanceIdentifier -ReplicationSubnetGroupIdentifier $ReplicationSubnetGroupIdentifier -VpcSecurityGroupId $VpcSecurityGroupId -ResourceIdentifier $ResourceIdentifier -ReplicationInstanceClass $ReplicationInstanceClass -AllocatedStorage $AllocatedStorage -EngineVersion $EngineVersion
While ($DMSInstanceObject.ReplicationInstanceStatus -eq “creating”) {
$DMSInstanceObject = Get-DMSReplicationInstance | Where-Object { $_.ReplicationInstanceIdentifier -eq $ReplicationInstanceIdentifier }
Write-Output “DMS instance creation status: $($DMSInstanceObject.ReplicationInstanceStatus)”
Start-Sleep -Seconds 10
}
Every 10 seconds, the script checks for the status of the instance creation. You should get the following output:
PS C:TempAWS Blog> C:TempAWS Blogcreate-dms-replication-instance.ps1
DMS Instance creation status: creating
DMS Instance creation status: creating
DMS Instance creation status: creating
DMS Instance creation status: creating
DMS Instance creation status: creating
When your AWS DMS replication instance provision process is complete, the AWS DMS instance creation status changes to available:
DMS Instance creation status: creating
DMS Instance creation status: creating
DMS Instance creation status: available
Create source and target endpoints
An endpoint provides connection, data store type, and location information about your data store. AWS DMS uses endpoints to connect to a data store and migrate the data from a source to a target endpoint. An endpoint provides several configuration settings depending on the endpoint type chosen.
Create a source endpoint
In our example, we create a source endpoint that connects to a SQL Server instance running on Amazon EC2. If you’re migrating from an on-premises SQL Server, the steps are the same.
For your environment, change the highlighted values to adapt to your needs. Additionally, AWS Secrets Manager can be used to store secrets, and retrieve them using AWSPowerShell as part of this script.
To create the source endpoint, run the following script, or run the create-dms-source-endpoint.ps1 script from the GitHub repository:
$EndpointSettings = @{ “EndpointIdentifier” = “source-sql2019”
“MicrosoftSQLServerSettings_DatabaseName” = “AdventureWorks2016”
“MicrosoftSQLServerSettings_Password” = Read-Host “Enter a password:” -AsSecureString
“MicrosoftSQLServerSettings_Port” = “1433”
“MicrosoftSQLServerSettings_ServerName” = “10.0.1.192”
“MicrosoftSQLServerSettings_Username” = “dms_src_usr”
“EngineName” = “sqlserver”
“EndpointType” = “source”}
$BSTR = [System.Runtime.InteropServices.Marshal]::SecureStringToBSTR($EndpointSettings.MicrosoftSQLServerSettings_Password)
$EndpointPassword = [System.Runtime.InteropServices.Marshal]::PtrToStringAuto($BSTR)
$EndpointStatus = New-DMSEndpoint -EndpointIdentifier $EndpointSettings.EndpointIdentifier`
-EndpointType $EndpointSettings.EndpointType -EngineName $EndpointSettings.EngineName `
-MicrosoftSQLServerSettings_DatabaseName $EndpointSettings.MicrosoftSQLServerSettings_DatabaseName `
-MicrosoftSQLServerSettings_Password $EndpointPassword `
-MicrosoftSQLServerSettings_Port $EndpointSettings.MicrosoftSQLServerSettings_Port `
-MicrosoftSQLServerSettings_ServerName $EndpointSettings.MicrosoftSQLServerSettings_ServerName `
-MicrosoftSQLServerSettings_Username $EndpointSettings.MicrosoftSQLServerSettings_Username
Write-Output “Endpoint Status: $($EndpointStatus.Status)”
The EndpointIdentifierparameter name may not contain underscore (_) values.
The endpoint provision is almost instant, and you should receive an active status displayed in the output:
PS C:TempAWS Blog> C:TempAWS Blogcreate-dms-source-endpoint.ps1
Endpoint Status: active
Create a target endpoint
In our example, we create the target endpoint to an RDS for SQL Server instance. Creating this endpoint is similar to creating the source endpoint with a small change in the parameter EndpointTypefrom source to target.
For your environment, change the highlighted values to adapt to your needs. Additionally, AWS Secrets Manager can be used to store secrets, and retrieve them using AWSPowerShell as part of this script.
To create the target endpoint, run the following script, or run the create-dms-target-endpoint.ps1 script from the GitHub repository:
$EndpointSettings = @{ “EndpointIdentifier” = “target-sql2019”
“MicrosoftSQLServerSettings_DatabaseName” = “AdventureWorks2016”
“MicrosoftSQLServerSettings_Password” = Read-Host “Enter a password:” -AsSecureString
“MicrosoftSQLServerSettings_Port” = “1433”
“MicrosoftSQLServerSettings_ServerName” = “RDSInstanceEndpoint” #This should be changed according to your environment needs
“MicrosoftSQLServerSettings_Username” = “dms_tgt_usr”
“EngineName” = “sqlserver”
“EndpointType” = “target”}
$BSTR = [System.Runtime.InteropServices.Marshal]::SecureStringToBSTR($EndpointSettings.MicrosoftSQLServerSettings_Password)
$EndpointPassword = [System.Runtime.InteropServices.Marshal]::PtrToStringAuto($BSTR)
$EndpointStatus = New-DMSEndpoint -EndpointIdentifier $EndpointSettings.EndpointIdentifier `
-EndpointType $EndpointSettings.EndpointType `
-EngineName $EndpointSettings.EngineName `
-MicrosoftSQLServerSettings_DatabaseName $EndpointSettings.MicrosoftSQLServerSettings_DatabaseName `
-MicrosoftSQLServerSettings_Password $EndpointPassword `
-MicrosoftSQLServerSettings_Port $EndpointSettings.MicrosoftSQLServerSettings_Port `
-MicrosoftSQLServerSettings_ServerName $EndpointSettings.MicrosoftSQLServerSettings_ServerName `
-MicrosoftSQLServerSettings_Username $EndpointSettings.MicrosoftSQLServerSettings_Username
Write-Output “Endpoint Status: $($EndpointStatus.Status)”
The endpoint provision is almost instant, and you should receive an active status right after you create it, which is displayed in the output:
PS C:TempAWS Blog> C:TempAWS Blogcreate-dms-target-endpoint.ps1
Endpoint Status: active
You can also use the AWS DMS console and check that the endpoints were created.
Perform network connectivity and endpoint tests
After you have provisioned the source and target endpoints, it’s a good practice to perform a connectivity test. You can test the endpoint connectivity with the AWS DMS instance using PowerShell. However, before you can do that, make sure to add the proper inbound rules to both source and target endpoints, allowing AWS DMS to connect to them.
Create inbound security group rules
AWS DMS needs permissions to access the source and target database instances. If you’re migrating your on-premises database to AWS, make sure there’s network connectivity from the AWS account where AWS DMS is deployed to the network where your database server is located.
Because your target database is Amazon RDS, it’s necessary to create an inbound rule in the RDS security group. This allows AWS DMS to access the TCP port specified during your Amazon RDS deployment. You can find that information on the Connectivity & security tab on the Amazon RDS console, as shown in the following screenshot.
On the same tab, you will also find the security group ID. This is the security group to be updated by the new inbound rule.
# Change to the name of your RDS Instance
$RDSInstanceIdentifier = “<YourRDSInstanceIdentifier>”
(Get-RDSDBInstance -DBInstanceIdentifier $RDSInstanceIdentifier).VpcSecurityGroups
(Get-RDSDBInstance -DBInstanceIdentifier $RDSInstanceIdentifier).Endpoint
You should receive the following results:
Status VpcSecurityGroupId
—— ——————
active sg-0a7f47c256606e4d6
Address : database-1.xxxxxxxxxxxx.eu-west-1.rds.amazonaws.com
HostedZoneId : Z29XKXDKYMONMX
Port : 1433
You also need the security group ID of the AWS DMS instance, which you saved previously when you deployed the AWS DMS instance security group. You use this as the input to your script.
The following script updates the RDS for SQL Server security group to allow AWS DMS to access the RDS instance on the given TCP port. You can also run the create-inbound-security-group-rds.ps1 script from the GitHub repository.
# This should be changed according to your environment needs
$ReplicationInstanceIdentifier = “dms-instance-01”
# This should be changed according to your environment needs
$RDSInstanceIdentifier = “database-1”
# Provide Security Group ID of your RDS Instance
$RDSSecurityGroupId = “TargetRDSSecGroupId”
$TCPPort = (Get-RDSDBInstance -DBInstanceIdentifier $RDSInstanceIdentifier).Endpoint.Port
$DMSSecurityGroupId = (Get-DMSReplicationInstance | Where-Object { $_.ReplicationInstanceIdentifier -eq $ReplicationInstanceIdentifier}).VpcSecurityGroups.VpcSecurityGroupId
$InboundRule = New-Object Amazon.EC2.Model.IpPermission
$UserIdGroupPair = New-Object Amazon.EC2.Model.UserIdGroupPair
$InboundRule.FromPort = $TCPPort
$InboundRule.ToPort = $TCPPort
$InboundRule.IpProtocol = “tcp”
$UserIdGroupPair.Description = “TCP Port to be used by DMS to access RDS for SQL Server”
$UserIdGroupPair.GroupId = $DMSSecurityGroupId
$InboundRule.UserIdGroupPair = $UserIdGroupPair
Grant-EC2SecurityGroupIngress -GroupId $RDSSecurityGroupId -IpPermission $InboundRule
After the rule is added, you can check by running the following script:
(Get-EC2SecurityGroup -GroupId $RDSSecurityGroupId).IpPermissions.UserIdGroupPairs
You should receive the following results:
PS C:TempAWS Blog> (Get-EC2SecurityGroup -GroupId $RDSSecurityGroupId).IpPermissions.UserIdGroupPairs
Description : TCP Port to be used by DMS to access RDS for SQL Server
GroupId : sg-0f57d952a8ec41412
GroupName :
PeeringStatus :
UserId : 781449244624
VpcId :
VpcPeeringConnectionId :
Test the AWS DMS endpoints
As you further progress through the deployment of an AWS DMS replication instance and its components, it’s necessary to conduct tests on the AWS DMS endpoints, to guarantee a successful connection the source and target databases can be made.
The following script runs tests against the source and target AWS DMS endpoints, and reports the connection a result message. You can also run the test-aws-dms-endpoints.ps1 script from the GitHub repository.
# This should be changed according to your environment needs
$ReplicationInstanceIdentifier = “dms-instance-01”
# This should be changed according to your environment needs
$SourceEndpointIdentifier = “source-sql2019”
# This should be changed according to your environment needs
$TargetEndpointIdentifier = “target-sql2019”
$DMSInstanceObject = Get-DMSReplicationInstance | Where-Object { $_.ReplicationInstanceIdentifier -eq $ReplicationInstanceIdentifier}
$SourceEndpointDimension = New-Object Amazon.DatabaseMigrationService.Model.Filter
$SourceEndpointObject = Get-DMSEndpoint | Where-Object {$_.EndpointIdentifier -eq $SourceEndpointIdentifier}
$SourceEndpointDimension.Name = ‘endpoint-arn’
$SourceEndpointDimension.Values = $SourceEndpointObject.EndpointArn
$TargetEndpointDimension = New-Object Amazon.DatabaseMigrationService.Model.Filter
$TargetEndpointObject = Get-DMSEndpoint | Where-Object {$_.EndpointIdentifier -eq $TargetEndpointIdentifier}
$TargetEndpointDimension.Name = ‘endpoint-arn’
$TargetEndpointDimension.Values = $TargetEndpointObject.EndpointArn
$DMSConnectionTestStatus = Test-DMSConnection -EndpointArn $SourceEndpointObject.EndpointArn -ReplicationInstanceArn $DMSInstanceObject.ReplicationInstanceArn
Write-Output “Testing DMS Endpoint connection for: [$($DMSConnectionTestStatus.EndpointIdentifier)]”
while($DMSConnectionTestStatus.Status -eq “testing”){
$DMSConnectionTestStatus = Get-DMSConnection -Filter $SourceEndpointDimension | Where-Object {$_.ReplicationInstanceIdentifier -eq $ReplicationInstanceIdentifier}
Write-Output “DMS Test Connection status: $($DMSConnectionTestStatus.Status)”
Start-Sleep -Seconds 5
}
if($null -eq $($DMSConnectionTestStatus.LastFailureMessage)){
Write-Output “DMS Test Connection Message: $($DMSConnectionTestStatus.Status)”
}
else{
Write-Output “DMS Test Connection Message: $($DMSConnectionTestStatus.LastFailureMessage)”
}
$DMSConnectionTestStatus = Test-DMSConnection -EndpointArn $TargetEndpointObject.EndpointArn -ReplicationInstanceArn $DMSInstanceObject.ReplicationInstanceArn
Write-Output “Testing DMS Endpoint connection for: [$($DMSConnectionTestStatus.EndpointIdentifier)]”
while($DMSConnectionTestStatus.Status -eq “testing”){
$DMSConnectionTestStatus = Get-DMSConnection -Filter $TargetEndpointDimension | Where-Object {$_.ReplicationInstanceIdentifier -eq $ReplicationInstanceIdentifier}
Write-Output “DMS Test Connection status: $($DMSConnectionTestStatus.Status)”
Start-Sleep -Seconds 5
}
if($null -eq $($DMSConnectionTestStatus.LastFailureMessage)){
Write-Output “DMS Test Connection Message: $($DMSConnectionTestStatus.Status)”
}
else{
Write-Output “DMS Test Connection Message: $($DMSConnectionTestStatus.LastFailureMessage)”
}
Assuming the connections are in place, you should get the following output:
PS C:TempAWS Blog> C:TempAWS Blogtest-aws-dms-endpoints.ps1
Testing DMS Endpoint connection for: [source-sql2019]
DMS Test Connection status: testing
DMS Test Connection status: testing
DMS Test Connection status: testing
DMS Test Connection status: testing
DMS Test Connection status: testing
DMS Test Connection status: testing
DMS Test Connection status: successful
DMS Test Connection Message: successful
Testing DMS Endpoint connection for: [target-sql2019]
DMS Test Connection status: testing
DMS Test Connection status: testing
DMS Test Connection status: testing
DMS Test Connection status: testing
DMS Test Connection status: testing
DMS Test Connection status: testing
DMS Test Connection status: successful
DMS Test Connection Message: successful
Create AWS DMS replication tasks
An AWS DMS task is a component that allows the schema and data to be migrated. In the next sections, you learn how to create a migration task using PowerShell.
Create a table mappings file
Table mapping uses several types of rules to specify the data source, source schema, data, and any transformations that should occur during the task. You can use table mapping to specify individual tables in a database to migrate and the schema to use for the migration. For this post, you can use the examples provided in the GitHub repository.
Create a task settings file
Each replication task has settings that you can configure according to the needs of your database migration. You can create these settings in a JSON file, and you can later modify some of them using the AWS DMS console. For this post, you use the default task settings of an AWS DMS task, with a small modification to enable Amazon CloudWatch Logs and AWS DMS validation. For this post, you can use the example provided in the GitHub repository.
Create full load with CDC replication tasks
In this first method, we create a migration task that is both full load and change data capture (CDC). To learn more about this migration method, refer to Creating a task.
In this post, you use some tables of the HumanResources, dbo, and Production schemas of the AdventureWorks2016 database. You create three AWS DMS replication tasks.
To get started, download the following files from the provided GitHub repository and save them to a local folder of your preference:
dbo-full-load-cdc
human-resources-full-load-cdc
production-full-load-cdc
task-settings
To create a new replication task, run the following script, or run the create-full-load-cdc-task.ps1 script from the GitHub repository:
#
# // Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
# // SPDX-License-Identifier: MIT-0
$LocalFolder = “C:TempAWS Blog”
$Files = @(“dbo-full-load-cdc.json”, “human-resources-full-load-cdc.json”, “production-full-load-cdc.json”)
# Change this according to your environment
$TaskSettingsFile = Join-Path -Path $LocalFolder -ChildPath “task-settings.json”
[string]$TaskSettings = Get-Content -Path $TaskSettingsFile
# Change this according to your environment
$ReplicationInstanceIdentifier = “dms-instance-01”
# Change this according to your environment
$SourceEndpointName = “source-sql2019”
# Change this according to your environment
$TargetEndpointName = “target-sql2019”
$ReplicationInstanceObject = Get-DMSReplicationInstance | Where-Object { $_.ReplicationInstanceIdentifier -eq $ReplicationInstanceIdentifier }
$SourceEndpointObject = Get-DMSEndpoint | Where-Object { $_.EndpointIdentifier -eq $SourceEndpointName }
$TargetEndpointObject = Get-DMSEndpoint | Where-Object { $_.EndpointIdentifier -eq $TargetEndpointName }
$MigrationType = “full-load-and-cdc”
foreach ($file in $Files) {
$filename = Join-Path -Path $LocalFolder -ChildPath $file
$TaskName = $file.split(“.”)[0]
Write-Output “Working to create [$file]”
# Change this according to your environment
[string]$TableMapping = Get-Content $filename
$NewDMSTaskResult = New-DMSReplicationTask -ReplicationInstanceArn $ReplicationInstanceObject.ReplicationInstanceArn -CdcStartPosition $CdcStartPosition -MigrationType $MigrationType -ReplicationTaskIdentifier $TaskName -ReplicationTaskSetting $TaskSettings -SourceEndpointArn $SourceEndpointObject.EndpointArn -TableMapping $TableMapping -TargetEndpointArn $TargetEndpointObject.EndpointArn
while ($NewDMSTaskResult.Status -eq “creating”) {
$NewDMSTaskResult = Get-DMSReplicationTask | Where-Object { $_.ReplicationTaskIdentifier -eq $TaskName }
Write-Output “Task Status: $($NewDMSTaskResult.Status)”
Start-Sleep -Seconds 10
}
}
You get the following output:
PS C:TempAWS Blog> C:TempAWS Blogcreate-full-load-cdc-task.ps1
Working to create [dbo-full-load-cdc.json]
Task Status: creating
Task Status: creating
Task Status: creating
Task Status: creating
Task Status: ready
Working to create [human-resources-full-load-cdc.json]
Task Status: creating
Task Status: creating
Task Status: creating
Task Status: creating
Task Status: ready
Working to create [production-full-load-cdc.json]
Task Status: creating
Task Status: creating
Task Status: creating
Task Status: creating
Task Status: creating
Task Status: ready
You can also check the AWS DMS console to see the tasks you have created.
Start, stop, and edit AWS DMS replication tasks
For a replication task that is full-load and full-load-and-cdc, the only option available is start-replication. You can invoke the Start-DMSReplicationTask for the individual migration task you want to start.
Start an AWS DMS replication task
To start a replication task, use the following code, or run the start-full-load-cdc-task.ps1 script from the GitHub repository:
#
# // Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
# // SPDX-License-Identifier: MIT-0
$Files = @(“dbo-full-load-cdc”, “human-resources-full-load-cdc”, “production-full-load-cdc”)
foreach ($file in $Files) {
$TaskName = $file.split(“.”)[0]
Write-Output “Starting task [$TaskName]”
$ReplicationTaskObject = Get-DMSReplicationTask | Where-Object { $_.ReplicationTaskIdentifier -eq $TaskName }
$StartTaskStatus = Start-DMSReplicationTask -ReplicationTaskArn $ReplicationTaskObject.ReplicationTaskArn -StartReplicationTaskType start-replication
Start-Sleep -Seconds 5
Write-Output “Starting [$($StartTaskStatus.ReplicationTaskIdentifier)]. Current Status is [$($StartTaskStatus.Status)]”
}
You should get the following output:
PS C:TempAWS Blog> C:TempAWS Blogstart-full-load-cdc-task.ps1
Starting task [dbo-full-load-cdc]
Starting [dbo-full-load-cdc]. Current Status is [starting]
Starting task [human-resources-full-load-cdc]
Starting [human-resources-full-load-cdc]. Current Status is [starting]
Starting task [production-full-load-cdc]
Starting [production-full-load-cdc]. Current Status is [starting]
After you start the replication task, you can view the status in PowerShell with the following script, or run the retrieve-replication-task-statistics.ps1 script from the GitHub repository:
#
# // Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
# // SPDX-License-Identifier: MIT-0
$Tasks = @(“dbo-full-load-cdc”, “human-resources-full-load-cdc”, “production-full-load-cdc”)
foreach ($task in $Tasks) {
$TaskStatus = (Get-DMSReplicationTask | Where-Object { $_.ReplicationTaskIdentifier -eq $task}).Status
$TaskStatistics = (Get-DMSReplicationTask | Where-Object { $_.ReplicationTaskIdentifier -eq $task}).ReplicationTaskStats
Write-Output “[$task] current status: $TaskStatus”
Write-Output “Task statistics:`n Tables Loaded: $($TaskStatistics.TablesLoaded)`n Tables Loading: $($TaskStatistics.TablesLoading)`n Tables Errored: $($TaskStatistics.TablesErrored)”
}
For a task that is running, but has errors, it produces an output like the following:
PS C:TempAWS Blog> C:TempAWS Blogretrieve-replication-task-statistics.ps1
[dbo-full-load-cdc] current status: running
Task statistics:
Tables Loaded: 3
Tables Loading: 0
Tables Errored: 0
[human-resources-full-load-cdc] current status: running
Task statistics:
Tables Loaded: 2
Tables Loading: 0
Tables Errored: 4
[production-full-load-cdc] current status: running
Task statistics:
Tables Loaded: 4
Tables Loading: 0
Tables Errored: 1
To further investigate the problems with the mentioned task, you can check the task log stream in CloudWatch.
Stop an AWS DMS replication task
In the previous section, the AWS DMS replication task named human-resources-full-load-cdc showed three errors. You must stop the task to change some settings. To stop an AWS DMS replication task, run the following script, or run the stop-full-load-cdc-task.ps1 script from the GitHub repository:
# Change this according to your environment
$TaskName = “human-resources-full-load-cdc”
$ReplicationTaskObject = Get-DMSReplicationTask | Where-Object { $_.ReplicationTaskIdentifier -eq $TaskName }
$StopTaskStatus = Stop-DMSReplicationTask -ReplicationTaskArn $ReplicationTaskObject.ReplicationTaskArn
while($StopTaskStatus.Status -eq “stopping”){
$StopTaskStatus = Get-DMSReplicationTask | Where-Object { $_.ReplicationTaskIdentifier -eq $TaskName }
Write-Output “Stopping [$($StopTaskStatus.ReplicationTaskIdentifier)]. Current Status is [$($StopTaskStatus.Status)]”
Start-Sleep -Seconds 5
}
You should get the following output:
PS C:TempAWS Blog> C:TempAWS Blogstop-full-load-cdc-task.ps1
Stopping [human-resources-full-load-cdc]. Current Status is [stopping]
Stopping [human-resources-full-load-cdc]. Current Status is [stopping]
Stopping [human-resources-full-load-cdc]. Current Status is [stopping]
Stopping [human-resources-full-load-cdc]. Current Status is [stopping]
Stopping [human-resources-full-load-cdc]. Current Status is [stopping]
Stopping [human-resources-full-load-cdc]. Current Status is [stopping]
Stopping [human-resources-full-load-cdc]. Current Status is [stopping]
Stopping [human-resources-full-load-cdc]. Current Status is [stopping]
Stopping [human-resources-full-load-cdc]. Current Status is [stopping]
Stopping [human-resources-full-load-cdc]. Current Status is [stopped]
Edit an AWS DMS replication task
To edit an AWS DMS replication task, it must be at a stopped state. You change the AWS DMS replication task by enabling the debugging levels for specific components.
The task settings component is a JSON file that contains all your AWS DMS replication task configurations, such as CloudWatch log group, validation settings, full load and CDC configuration, and more.
To update any of these settings, you need first to retrieve this file, edit it, save it, and submit it back to the AWS DMS APIs. In the following script, we show how to change the logging level of the TARGET_APPLY component in the task from LOGGER_SEVERITY_DEFAULT to LOGGER_SEVERITY_DEBUG.
To download and edit an AWS DMS replication task, run the following script, or run the download-dms-replication-task-settings.ps1 script from the GitHub repository:
# Change this according to your environment
$TaskName = “human-resources-full-load-cdc”
# Change this according to your needs
$FolderToSave = “C:TempAWS Blog”
$ReplicationTaskObject = Get-DMSReplicationTask | Where-Object { $_.ReplicationTaskIdentifier -eq $TaskName }
$FullPath = -join($FolderToSave,”TaskSettings_”,(Get-Date -Format “yyyyMMdd_HHmmss”),”.json”)
Write-Output “Saving the file at: $FullPath”
$ReplicationTaskObject.ReplicationTaskSettings | Out-File $FullPath
You should get the following output:
PS C:TempAWS Blog> C:TempAWS Blogdownload-dms-replication-task-settings.ps1
Saving the file at: C:TempAWS BlogTaskSettings_20240301_161534.json
After you save the file, open it with your editor of choice, search for TARGET_APPLY, and change the severity as shown in the following screenshot.
Make sure to take note of the value of $FullPath, because it will be used in the next script to resubmit the saved JSON file to AWS DMS.
Replace {“Severity”:”LOGGER_SEVERITY_DEFAULT”,”Id”:”TARGET_APPLY”} with {“Severity”:”LOGGER_SEVERITY_DEBUG”,”Id”:”TARGET_APPLY”}.
Save these changes and submit the new task settings to the AWS DMS replication task. To submit, run the following script, or run the enable-debugging-full-load-cdc-task.ps1 script from the GitHub repository:
# After changes are done, file is re-submitted to the AWS DMS.
# Change this according to your environment
$TaskName = “human-resources-full-load-cdc”
# Change this according to the full task settings file location
$TaskSettingsFile = “C:TempAWS BlogTaskSettings_20240301_161534.json”
[string]$TaskSettings = Get-Content $TaskSettingsFile
$ReplicationTaskObject = Get-DMSReplicationTask | Where-Object { $_.ReplicationTaskIdentifier -eq $TaskName }
Edit-DMSReplicationTask -ReplicationTaskArn $ReplicationTaskObject.ReplicationTaskArn -ReplicationTaskSetting $TaskSettings | Out-Null
Start-Sleep -Seconds 2
$ReplicationTask = Get-DMSReplicationTask | Where-Object { $_.ReplicationTaskIdentifier -eq $TaskName }
While ($ReplicationTask.Status -eq “modifying”) {
$ReplicationTask = Get-DMSReplicationTask | Where-Object { $_.ReplicationTaskIdentifier -eq $TaskName }
Write-Output “Task status: $($ReplicationTask.Status)”
Start-Sleep -Seconds 5
}
The task enters into a modifying state, and when the process is complete, it remains stopped.
You get the following output:
PS C:TempAWS Blog> C:TempAWS Blogenable-debugging-full-load-cdc-task.ps1
Task status: modifying
Task status: modifying
Task status: modifying
Task status: modifying
Task status: modifying
Task status: modifying
Task status: modifying
Task status: modifying
Task status: stopped
After you modify the AWS DMS replication task, you must resume the task.
Troubleshoot AWS DMS replication tasks
Troubleshooting an AWS DMS replication task might require you to look in different places, such as the source and target endpoints, the AWS DMS replication instance, and CloudWatch for logging. The following series of posts can help you perform in-depth troubleshooting of your replication task:
Debugging Your AWS DMS Migrations: What to Do When Things Go Wrong (Part 1)
Debugging Your AWS DMS Migrations: What to Do When Things Go Wrong (Part 2)
Debugging Your AWS DMS Migrations: What to Do When Things Go Wrong? (Part 3)
The intention of this post is to provide some of the PowerShell cmdlets that can help you retrieve logs from CloudWatch.
Find the log group
Whenever you deploy an AWS DMS instance, a new log group is created in CloudWatch that uses the following structure: dms-tasks-NameOfDmsInstance. In this post, you use an AWS DMS instance called dms-instance-01, so the CloudWatch log group is dms-tasks-dms-instance-01.
The following script returns the log group that your AWS DMS replication tasks are using:
# Change this according to your environment
$ReplicationInstanceIdentifier = “dms-instance-01”
$LogGroupName = -join(‘dms-tasks-‘,$ReplicationInstanceIdentifier)
Get-CWLLogGroup -LogGroupNamePrefix $LogGroupName
You get the following output:
LogGroupName CreationTime RetentionInDays StoredBytes MetricFilterCount
———— ———— ————— ———– —————–
dms-tasks-dms-instance-01 26/10/2022 15:14:25 147147 0
Find the log streams
Because you enabled CloudWatch logging for your AWS DMS replication tasks, a log stream is created for each task. The log stream name uses the following structure: dms-task-ReplicationTaskId.
The following script returns the log stream name that your AWS DMS replication tasks are using. You can also run the find-cw-log-stream.sp1 script from the GitHub repository.
# Change this according to your environment
$ReplicationInstanceIdentifier = “dms-instance-01”
# Change this according to your environment
$TaskName = “human-resources-full-load-cdc”
$ReplicationTaskObject = Get-DMSReplicationTask | Where-Object { $_.ReplicationTaskIdentifier -eq $TaskName }
$LogGroupName = -join(‘dms-tasks-‘,$ReplicationInstanceIdentifier)
$LogStreamName = -join(‘dms-task-‘,$ReplicationTaskObject.ReplicationTaskArn.Split(“:”)[6])
Get-CWLLogStream -LogGroupName $LogGroupName -LogStreamNamePrefix $LogStreamName
You get the following output:
PS C:TempAWS Blog> C:TempAWS Blogfind-cw-log-stream.ps1
LogStreamName RetentionInDays CreationTime FirstEventTimestamp LastEventTimestamp LastIngestionTime
————- ————— ———— ——————- —————— —————–
dms-task-FHKGGXYZTTCDFPBXXQEGHCW4FSF6MXTMMPDXSHY 01/03/2024 16:06:29 01/03/2024 16:05:54 01/03/2024 16:06:27 01/03/2024 16:06:29
Retrieve the events from log streams
Now that you know the log group and log stream, you can retrieve the events that were logged by the replication task on it, and specifically filter by any errors that may have occurred.
There are two ways to retrieve the log events. If you’re using PowerShell ISE or Visual Studio Code with the PowerShell extension to run the scripts, you can use the following code, or you can run the retrieve-full-load-cdc-cw-events-outgridview.ps1 script from the GitHub repository:
# Change this according to your environment
$ReplicationInstanceIdentifier = “dms-instance-01”
# Change this according to your environment
$TaskName = “human-resources-full-load-cdc”
$ReplicationTaskObject = Get-DMSReplicationTask | Where-Object { $_.ReplicationTaskIdentifier -eq $TaskName }
$LogGroupName = -join(‘dms-tasks-‘,$ReplicationInstanceIdentifier)
$LogStreamName = -join(‘dms-task-‘,$ReplicationTaskObject.ReplicationTaskArn.Split(“:”)[6])
(Get-CWLLogEvent -LogGroupName $LogGroupName -LogStreamName $LogStreamName).Events | Out-GridView
By using the Out-GridView option, the output of your PowerShell script will show up in an interactive table in a separate window.
Additionally, you can perform search criteria on any of the columns returned in the output window. In the following screenshot, we have added the column Message as search criteria, and we’re looking only for errors.
Although this may not be the option with the best performance, because the filtering happens at the client side, this can help you initially to know what you’re looking for.
You can run the same search criteria directly at the API level to retrieve the filtered log event. You can also run the retrieve-full-load-cdc-cw-events.ps1 script from the GitHub repository.
# Change this according to your environment
$ReplicationInstanceIdentifier = “dms-instance-01”
# Change this according to your environment
$TaskName = “human-resources-full-load-cdc”
$ReplicationTaskObject = Get-DMSReplicationTask | Where-Object { $_.ReplicationTaskIdentifier -eq $TaskName }
$LogGroupName = -join(‘dms-tasks-‘,$ReplicationInstanceIdentifier)
$LogStreamName = -join(‘dms-task-‘,$ReplicationTaskObject.ReplicationTaskArn.Split(“:”)[6])
# Filtering only for errors
$FilterPattern = ‘”]E:”‘
$Data = Get-CWLFilteredLogEvent -LogStreamName $LogStreamName -LogGroupName $LogGroupName -FilterPattern $FilterPattern
$Data.Events
This produces the following output, assuming that there are errors to be displayed:
PS C:TempAWS Blog> C:TempAWS Blogretrieve-full-load-cdc-cw-events.ps1
EventId : 38118867975772177082809538298993353219742453207757226204
IngestionTime : 1709309189576
LogStreamName : dms-task-FHKGGXYZTTCDFPBXXQEGHCW4FSF6MXTMMPDXSHY
Message : 2024-03-01T16:05:57 [TABLES_MANAGER ]E: Table metadata Id (12) incorrect. [1021701] (tasktablesmanager.c:4720)
Timestamp : 1709309157000
EventId : 38118868020373667479870784582064789765039176219718058294
IngestionTime : 1709309189576
LogStreamName : dms-task-FHKGGXYZTTCDFPBXXQEGHCW4FSF6MXTMMPDXSHY
Message : 2024-03-01T16:05:59 [TARGET_LOAD ]E: SqlState: 23000 NativeError: 3621 Message: [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]The statement has been terminated.
[1022510] (ar_odbc_stmt.c:2864)
Timestamp : 1709309159000
EventId : 38118868020373667479870784582064789765039176219718058295
IngestionTime : 1709309189576
LogStreamName : dms-task-FHKGGXYZTTCDFPBXXQEGHCW4FSF6MXTMMPDXSHY
Message : 2024-03-01T16:05:59 [TARGET_LOAD ]E: Invalid input for table ‘HumanResources’.’Employee’ in line number 290 [1022510] (sqlserver_endpoint_imp.c:2859)
Timestamp : 1709309159000
EventId : 38118868020373667479870784582064789765039176219718058296
IngestionTime : 1709309189576
LogStreamName : dms-task-FHKGGXYZTTCDFPBXXQEGHCW4FSF6MXTMMPDXSHY
Message : 2024-03-01T16:05:59 [TARGET_LOAD ]E: Execute statement failed [1020403] (sqlserver_endpoint_imp.c:4622)
Timestamp : 1709309159000
EventId : 38118868020373667479870784582064789765039176219718058297
IngestionTime : 1709309189576
LogStreamName : dms-task-FHKGGXYZTTCDFPBXXQEGHCW4FSF6MXTMMPDXSHY
Message : 2024-03-01T16:05:59 [TARGET_LOAD ]E: Handling End of table ‘HumanResources’.’Employee’ loading failed by subtask 1 thread 1 [1022510] (endpointshell.c:3039)
Timestamp : 1709309159000
Clean up
To clean up the resources you created as part of this post, complete the following steps, or you can run the clean-up-resources.ps1 script from the GitHub repository:
Log in to your AWS account and open the AWS DMS console.
Choose Database migration tasks in the navigation pane.
Select the tasks created by post, and on the Actions menu, choose Delete.
Choose Endpoints in the navigation pane.
Select the endpoints created by this post, and on the Actions menu, choose Delete.
Choose Replication instances in the navigation pane.
Select the instance created by this post, and on the Actions menu, choose Delete.
If you no longer need the databases, remove them as well.
Conclusion
In this post, we showed you how a combination of PowerShell and the AWSPowerShell module can be a powerful tool to use in your database migration using AWS DMS. We looked at provisioning the required IAM roles and provisioning an AWS DMS replication instance, endpoints, and tasks. We also learned how to update AWS DMS replication tasks and interact with CloudWatch to investigate potential issues with the migration tasks.
If you have any questions or suggestions, leave a comment.
About the Author
Marcos Freccia is a Sr. Database Specialty Architect with the AWS Professional Services Team. He has been supporting and enabling customers on their journey to migrate and modernize their database solutions from on-premises data centres to AWS.
Source: Read More