Overview and Objectives
The objective of this guide is to outline a comprehensive approach to migrating a Microsoft SQL Server database that is hosted on-premises to Amazon Web Services (AWS) using the AWS Database Migration Service (DMS). This document provides a step-by-step guide covering all important aspects of the migration process, including the setup of the DMS infrastructure, creation of endpoints, replication tasks, monitoring, and configuration of the source MSSQL database, and the destination database.
In line with AWS recommended practices, the DMS instance will be deployed within a private subnet. The connection between the on-premises source and AWS will be established through a VPN connection, which will not be addressed in this article. As an alternative, in instances where VPN is not set up, the DMS can be deployed within a public subnet with restricted access through a security group attached to it. Access will be granted only from a designated source IP address.
The business benefits of this migration include:
Zero Downtime: With this migration process, you can move any MSSQL servers from on-premises or from other hosting providers, such as Azure or GCP, to AWS RDS with no interruption to your operations.
Reproducible Setup: By using the same setup for all customers, you can reduce the variations in deployment and improve knowledge management, reducing the risk of errors.
High-Level Solution Design
The following sections will outline the components included in the offering, provide a high-level overview of the architecture, and present a detailed plan for constructing the solution.
Networking Setup and DMS Replication instance setup
VPC Networking Setup
The following prerequisites for setting up a DMS migration must include all essential network components to ensure the product's viability as a turn-key solution for the migration process. These components are:
- Virtual Private Cloud (VPC) to host the destination MSSQL RDS instance.
- Three-tiered Subnets - frontend (public), Database and DMS replication instance (private network with internet access through a NAT gateway), Data layer (internal network without internet access for managed services such as RDS, VPC endpoints, etc.)
- Setting up VPC Endpoints for RDS, S3, and Cloudwatch,with the capability to add or remove endpoints per customer deployment, is a recommended best practice, especially when using the DMS instance version 3.4.7 This allows for flexibility in case additional services are required in the future.
DMS and RDS Replication Instance Setup
Before starting the migration, it's important to set up the MSSQL RDS instances properly, including:
- MSSQL RDS Instance
- Security Groups (one for the Database Listener for the RDS and another for the DMS security group)
DMS Replication Instance using console or automation option.
- AWS RDS target instance nodes - with pre-set recommendations for the appropriate instance type based on the workload.
When selecting the instance type, several factors should be taken into consideration, such as:
- The environment's lifecycle (development, QA, production)
- The workloads running on the instance, including memory, CPU, and network requirements
- Storage allocation to ensure the ability to migrate and host the production databases.
Pre-Migration configuration steps on the MSSQL source instance
Pre-requirements:
To enable Transactional Replication and CDC for tables without primary keys via the DMS source endpoint, sysadmin permission must be granted to the dms_user. This can be accomplished by adding the "setUpMsCdcForTables=true" flag as an extra connection attribute to the source connection endpoint in DMS version 3.4.7, which is a new and available feature that has been released for version 3.4.7.
If you would like to perform the setup manually the steps below:
- The dms_user will be created for use in the DMS source endpoint and will be granted the necessary permissions.
- Enable Transactional Replication on the instance for tables with primary keys
- Enable CDC on the source database use:
use [source_db_name] EXEC sys.sp_cdc_enable_db;
- Next, set-up MS-CDC for each of the source tables. For each table with unique keys but no primary key, run the following query to set up MS-CDC.
exec sys.sp_cdc_enable_tabl
@source_schema = N'schema_name',
@source_name = N'table_name',
@index_name = N'unique_index_name',
@role_name = NULL,
@supports_net_changes = 1
GO
- For each table with no primary key or no unique keys, run the following query to set up MS-CDC.
exec sys.sp_cdc_enable_tabl
@source_schema = N'schema_name',
@source_name = N'table_name',
@role_name = NULL
GO
Pre-migration process steps
To minimize migration downtime, the following task needs to be performed on the source database:
- A full backup must be taken of the source database, which must be using either the Bulk-Logged or Full recovery model.
BACKUP DATABASE <database_name> TO DISK = 'D:\Backups\database_name_datetime.bak
WITH NOFORMAT, NOINIT, NAME = 'database_name_datetime',
SKIP, NOREWIND, NOUNLOAD, STATS = 10 GO
- For the ongoing replication we need to capture the LSN number that will be used for the DMS replication task, take a record of the LSN
select top 1 [Current LSN
from sys.fn_dump_dblog (NULL, NULL,NULL, 1,'C:\Backups\BackupDB\databasename_datetime.bak',NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL)
where operation='LOP_BEGIN_XACT'
and [Begin Time]>= cast('yyyy-mm-ddTMM:SS:ms' as datetime); –-< Datetime of the backup completed
Here is the example of the output of the query above:
- Upload the backup files of the database to an S3 bucket for restoring the RDS instance at the destination. Before uploading, it's necessary to have access to the S3 via the AWS Console or CLI. This requires that the AWS CLI and the IAM user's credentials have access to the S3 bucket.
aws s3 cp database_name_datetime.bak s3://<s3-bucket-name>/database_name_datetime.bak
Restore backup from S3 bucket on RDS instance
Create IAM Role and manage or inline policy:
Trusted entities:
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "",
"Effect": "Allow",
"Principal": {
"Service": "rds.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}
Custom managed or inline policy:
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"kms:DescribeKey",
"kms:GenerateDataKey",
"kms:Encrypt",
"kms:Decrypt"
],
"Resource": "arn:aws:kms:us-east-1:<aws_account_id>:key/<kms_key_id>"
},
{
"Effect": "Allow",
"Action": [
"s3:ListBucket",
"s3:GetBucketLocation"
],
"Resource": [
"arn:aws:s3:::s3-example-dmsupload",
"arn:aws:s3:::s3-example-mssql-audit-logs"
]
},
{
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:PutObject",
"s3:ListMultipartUploadParts",
"s3:AbortMultipartUpload"
],
"Resource": [
"arn:aws:s3:::s3-example-dmsupload/*",
"arn:aws:s3:::s3-example-mssql-audit-logs/*"
]
}
]
}
- To restore the database from an S3 bucket on RDS, it is necessary to have the SQLSERVER_BACKUP_RESTORE option pre-configured in the option group and an IAM role was created, as shown in the screenshots below:
To access the MSSQL option group created during RDS setup in the AWS console, go to RDS and click on "Option Groups" on the left-hand side. Then, add the value and IAM role, and apply the changes to the RDS instance.
Example of the ARN:arn:aws:iam::<aws_account_id>: role/<role_name>
The SQLSERVER_BACKUP_RESTORE option should have the correct IAM role and policy that grants access to the S3 bucket containing the database backup files.
- Restore the database on the RDS instance
exec msdb.dbo.rds_restore_database
@restore_db_name='<DatabaseName>',
@s3_arn_to_restore_from='arn:aws:s3:::<s3_bucket_name>/databasename_datetime.bak',
@with_norecovery=0,
@type='FULL';
Check the status of the restore process
exec msdb.dbo.rds_task_status;
Run the following command on the source database to enable CDC.
use [DatabaseName]
EXEC sys.sp_cdc_enable_dbDisable Triggers on the destination RDS database schema
DISABLE TRIGGER ALL ON [destanation_db_name]
EXEC sp_MSforeachtable "ALTER TABLE ? DISABLE TRIGGER all"
Disable constraints on the target database
EXEC sp_MSforeachtable "ALTER TABLE ? NOCHECK CONSTRAINT all"
At last, we are prepared to set up the replication task. Ensure that the necessary DMS instance has been deployed and that all the security access required between the source and destination has been configured
Create Target and Source Endpoints
Create Source and Target endpoints used by the migration task per database in the AWS account.
This step is done with the following pre-requisite:
- The DMS instance has been deployed.
- The network connectivity/security for the DMS and RDS instance is configured between the source and destination.
The task can be created as a part of the DMS creation via console or automation using Terraform.
Example of the source endpoint
Example of the target endpoint
Create DMS migration type ongoing replication
The task can be created manually or via automation
When you create a task please use the following options:
- Select "Enable custom CDC start mode"
- Specify a log sequence number ( The LSN was recorded during the performance of the source backup.)
- Target table preparation mode "Do nothing"
- Include LOB columns in replication "Full LOB mode"
- LOB chunk size (kb) = 64 This is for the LOB
- Task logs = True
Once the task is created, it's time to kick it off! The DMS task will automatically enable transactional replication and change data capture (CDC) on the source database. To keep an eye on its progress, take a look at the stats tab within the task and monitor the CloudWatch logs for any error messages.
Tip: In order to minimize IO and high CPU usage on the source instance and reduce the delta, it is advisable to use the latest backup from the source for CDC replication, which can help to lower the amount of transaction log reads required from the source database.
Tip: For managing large tables, it is advisable to create individual DMS tasks for each table.
Tip: To monitor DMS tasks, you can use CloudWatch to check the DMS instance and task statistics.
Top comments (0)