Irtiza Ali

Data Migration to New AWS Elasticsearch Service Domain

This story provides guidelines for migrating data to a new AWS Elasticsearch Service domain.

Overview

Data migration to the new AWS Elasticsearch Service domain consists of two steps:

  1. Create a manual snapshot of the existing Elasticsearch Service domain's data in an S3 bucket.

  2. Restore the snapshot from S3 into the new Elasticsearch Service domain.

Assumption

I am assuming that you already know how to create an AWS Elasticsearch Service domain.

Manual Snapshot/Backup

  • Create a bucket in the same region where the Elasticsearch domain exists.

  • Copy the bucket ARN.

  • Create an IAM role that will allow Elasticsearch to use S3. Initially, create the role with the EC2 use case and without any policy; both the trust relationship and the policy are set in the following steps.

  • Add an inline JSON policy to the role, using the bucket ARN copied in step 2:

{
   "Version": "2012-10-17",
   "Statement": [{
       "Action": [
         "s3:ListBucket"
       ],
       "Effect": "Allow",
       "Resource": [
         "arn:aws:s3:::<bucket-name>"
       ]
     },
     {
       "Action": [
         "s3:GetObject",
         "s3:PutObject",
         "s3:DeleteObject"
       ],
       "Effect": "Allow",
       "Resource": [
         "arn:aws:s3:::<bucket-name>/*"
       ]
     }
   ]
 }
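If you prefer not to hand-edit the bucket name in two places, the same policy can be generated from the bucket name. This is only a minimal sketch: the helper name `build_snapshot_bucket_policy` and the bucket name `my-es-snapshots` are hypothetical and not part of any AWS SDK.

```python
import json


def build_snapshot_bucket_policy(bucket_name):
    """Return the S3 permissions policy shown above for the given bucket.

    Hypothetical helper for illustration; it is not part of any AWS SDK.
    """
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Action": ["s3:ListBucket"],
                "Effect": "Allow",
                "Resource": [bucket_arn],
            },
            {
                # Object operations apply to the keys inside the bucket.
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Effect": "Allow",
                "Resource": [f"{bucket_arn}/*"],
            },
        ],
    }


if __name__ == "__main__":
    # "my-es-snapshots" is a made-up bucket name.
    print(json.dumps(build_snapshot_bucket_policy("my-es-snapshots"), indent=2))
```

The printed JSON can be pasted into the IAM console as the inline policy.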
  • Add a trust relationship so that Elasticsearch can assume this role:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "es.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
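As an alternative to creating the role in the console and editing it afterwards, the role can probably be created in one go with boto3, passing the trust relationship above at creation time. The sketch below assumes that your credentials are allowed to manage IAM; the function name, the role name you pass in, and the inline policy name `es-snapshot-s3-access` are hypothetical.

```python
import json

# Trust policy from the step above: lets the Elasticsearch service assume the role.
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "es.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}


def create_snapshot_role(role_name, permissions_policy):
    """Create the role with the trust policy and attach the inline S3 policy.

    permissions_policy is the S3 policy from the previous step as a dict.
    Requires configured AWS credentials with IAM permissions.
    """
    import boto3  # imported lazily so this module loads without boto3 installed

    iam = boto3.client("iam")
    role = iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(TRUST_POLICY),
    )
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName="es-snapshot-s3-access",  # hypothetical policy name
        PolicyDocument=json.dumps(permissions_policy),
    )
    # The role ARN is what the repository registration needs later.
    return role["Role"]["Arn"]
```

Calling `create_snapshot_role(...)` returns the role ARN that is used when registering the snapshot repository.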
  • Create an IAM user with programmatic (AWS CLI) access. This user will be used to register the manual snapshot repository. Attach the inline JSON policy given below:
{
   "Version": "2012-10-17",
   "Statement": [
     {
       "Effect": "Allow",
       "Action": "iam:PassRole",
       "Resource": "<role-arn-created-in-step-3>"
     },
     {
       "Effect": "Allow",
       "Action": "es:ESHttpPut",
       "Resource": "<elasticsearch-arn>"
     }
   ]
 }
  • Configure the AWS CLI with the access key ID and secret access key of the user created in step 6:
aws configure

Enter data for each prompt.

  • Install pip and the packages the script needs:
sudo apt-get install python-pip  # assumes a Debian/Ubuntu system
sudo pip install boto3 requests requests-aws4auth
  • Create a Python file and paste the script given below:
import boto3
import requests
from requests_aws4auth import AWS4Auth

host = '<existing elasticsearch service domain url>' # include https:// and a trailing /
region = '<elasticsearch service domain region>'
service = 'es'
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service, session_token=credentials.token)

# Register repository
path = '_snapshot/<snapshot-repository-name>' # the Elasticsearch API endpoint
url = host + path

payload = {
  "type": "s3",
  "settings": {
    "bucket": "<enter bucket name created in step-1>",
    "region": "<bucket region>",
    "role_arn": "<arn of role created in step-3>"
  }
}

headers = {"Content-Type": "application/json"}

r = requests.put(url, auth=awsauth, json=payload, headers=headers)

print(r.status_code)
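The script above only prints the status code, so a failed registration is easy to miss, and `host + path` silently builds a wrong URL if the trailing slash is forgotten. A slightly more defensive variant might look like the sketch below. The two function names are hypothetical; `awsauth` is the same `AWS4Auth` object built in the script above.

```python
def build_repository_request(host, repository, bucket, bucket_region, role_arn):
    """Return (url, payload) for registering an S3 snapshot repository."""
    # Tolerate a host given with or without a trailing slash.
    url = host.rstrip("/") + "/_snapshot/" + repository
    payload = {
        "type": "s3",
        "settings": {
            "bucket": bucket,
            "region": bucket_region,
            "role_arn": role_arn,
        },
    }
    return url, payload


def register_repository(url, payload, awsauth):
    """Send the signed PUT and raise if the domain rejects it."""
    import requests  # imported lazily so the helper above works without it

    # json= serializes the payload and sets the Content-Type header.
    r = requests.put(url, auth=awsauth, json=payload)
    if r.status_code != 200:
        # The response body usually explains what went wrong.
        raise RuntimeError(f"Registration failed ({r.status_code}): {r.text}")
    return r.json()
```

Splitting the URL and payload construction from the network call keeps the part that is easy to get wrong checkable without touching the domain.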
  • Run the script. If the repository is registered successfully, it prints the HTTP status code shown below.

Note
If you ran aws configure with sudo, run the Python script with sudo as well; otherwise sudo is not needed.

200
  • Take the manual snapshot using either the Elasticsearch API or the Kibana Dev Tools console:
PUT _snapshot/<snapshot-repository-name>/<date/snapshot-name>
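If the domain only accepts signed requests, the same PUT can be sent from Python with the `awsauth` object used for the repository registration. The sketch below also derives a date-based snapshot name; the naming scheme and both function names are my own assumptions, not something the Elasticsearch API requires.

```python
from datetime import date


def snapshot_name(prefix="snapshot", day=None):
    """Build a lowercase, date-based snapshot name such as snapshot-2020-01-15."""
    day = day or date.today()
    return f"{prefix}-{day.isoformat()}"


def take_snapshot(host, repository, name, awsauth):
    """Send a signed PUT that starts the snapshot; raise on an HTTP error."""
    import requests  # imported lazily so snapshot_name works without it

    url = f"{host.rstrip('/')}/_snapshot/{repository}/{name}"
    r = requests.put(url, auth=awsauth)
    r.raise_for_status()
    return r.json()
```

For example, `take_snapshot(host, '<snapshot-repository-name>', snapshot_name(), awsauth)` would start a snapshot named after today's date.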
  • Check that the snapshot was created successfully and see which indices it contains:
GET _snapshot/<snapshot-repository-name>/_all?pretty
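The response to this request is a JSON object with a `snapshots` list, where each entry carries the snapshot name, its `state`, and its `indices`. Reading it by eye gets tedious with many snapshots, so a small helper can summarize it. This is a sketch with hypothetical function names; the sample response is made up and trimmed to the fields used here.

```python
def summarize_snapshots(response_body):
    """Map each snapshot name to its state and the indices it contains."""
    return {
        snap["snapshot"]: {"state": snap["state"], "indices": snap["indices"]}
        for snap in response_body.get("snapshots", [])
    }


def incomplete_snapshots(response_body):
    """Return the names of snapshots whose state is not SUCCESS."""
    return [
        snap["snapshot"]
        for snap in response_body.get("snapshots", [])
        if snap["state"] != "SUCCESS"
    ]


if __name__ == "__main__":
    # Made-up response, trimmed to the fields the helpers read.
    sample = {
        "snapshots": [
            {"snapshot": "snapshot-1", "state": "SUCCESS", "indices": ["logs"]},
            {"snapshot": "snapshot-2", "state": "IN_PROGRESS", "indices": ["logs"]},
        ]
    }
    print(summarize_snapshots(sample))
    print(incomplete_snapshots(sample))
```

Wait until the snapshot you want to migrate reports SUCCESS before moving on to the restore.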
  • Check the S3 bucket to confirm that the snapshot data has been written.

Restore Snapshot

  • Create a new Elasticsearch Service Domain.

  • A role is required that allows the new Elasticsearch Service domain to access the S3 bucket that stores the snapshots. There is no need to create a new role, because the role created in step 3 of Manual Snapshot can be reused here.

  • Create an IAM user with programmatic (AWS CLI) access. This user will be used to register the manual snapshot repository with the new Elasticsearch Service domain. Attach the inline JSON policy:

{
   "Version": "2012-10-17",
   "Statement": [
     {
       "Effect": "Allow",
       "Action": "iam:PassRole",
       "Resource": "<role-arn-referred-to-in-step-2>"
     },
     {
       "Effect": "Allow",
       "Action": "es:ESHttp*",
       "Resource": "<new-elasticsearch-arn>"
     }
   ]
 }
  • Configure the AWS CLI with this user's credentials:
aws configure
  • Install pip and the required packages; skip this step if they are already installed:
sudo apt-get install python-pip  # assumes a Debian/Ubuntu system
sudo pip install boto3 requests requests-aws4auth
  • Create a file and paste the Python script given below:
import boto3
import requests
from requests_aws4auth import AWS4Auth

host = '<new elasticsearch service domain url>' # include https:// and a trailing /
region = '<new elasticsearch service domain region>'
service = 'es'
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service, session_token=credentials.token)

# Register repository
path = '_snapshot/<snapshot-repository-name-used-in-manual-snapshot>' # the Elasticsearch API endpoint
url = host + path

payload = {
  "type": "s3",
  "settings": {
    "bucket": "<bucket name created in the manual snapshot process>",
    "region": "<bucket region>",
    "role_arn": "<arn of role referred to in step-2>"
  }
}

headers = {"Content-Type": "application/json"}

r = requests.put(url, auth=awsauth, json=payload, headers=headers)

print(r.status_code)
  • Run the Python script. If the repository is registered successfully, it prints the HTTP status code shown below.

Note
If you ran aws configure with sudo, run the Python script with sudo as well; otherwise sudo is not needed.

200
  • To check that the snapshot repository is configured, list its snapshots using either the Elasticsearch API or the Kibana Dev Tools console:
GET _snapshot/<snapshot-repository-name>/_all?pretty
  • The response should list the snapshot that was created in the manual snapshot process.

  • To check existing indices:

GET _aliases?pretty=true
  • Restore the snapshot using either the Elasticsearch API or the Kibana Dev Tools console:
POST _snapshot/<snapshot-repository-name>/<date/snapshot-name>/_restore
{
  "indices": "<index-name>",
  "ignore_unavailable": false,
  "include_global_state": false
}
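When several indices need to be restored, the request body can be built from a list instead of being typed by hand. A restore can also fail if an open index with the same name already exists on the new domain; in that case the restore API's `rename_pattern` and `rename_replacement` options can restore the index under a different name. The sketch below shows both; the function name and the suffix are hypothetical.

```python
def build_restore_body(indices, rename_suffix=None):
    """Build the JSON body for the _restore request.

    indices is a list of index names. If rename_suffix is given, every
    restored index gets that suffix appended to avoid name clashes.
    """
    body = {
        "indices": ",".join(indices),
        "ignore_unavailable": False,
        "include_global_state": False,
    }
    if rename_suffix:
        # Capture the whole index name and append the suffix to it.
        body["rename_pattern"] = "(.+)"
        body["rename_replacement"] = f"$1{rename_suffix}"
    return body


if __name__ == "__main__":
    # "logs" and "users" are made-up index names.
    print(build_restore_body(["logs", "users"]))
    print(build_restore_body(["logs"], rename_suffix="_restored"))
```

The resulting dict can be sent as the JSON body of the POST request above.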
  • Verify that the index has been restored:
GET _aliases?pretty=true
  • Verify the data of the index:
GET /<index-name>/_search/
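Beyond spot-checking search results, one quick sanity check is to compare document counts between the old and the new domain. Running `GET _cat/indices?format=json` on each domain returns one row per index with `index` and `docs.count` fields. The sketch below compares two such responses; the function names and the sample rows are made up for illustration.

```python
def doc_counts(cat_indices_rows):
    """Map index name to document count from _cat/indices?format=json rows."""
    # docs.count arrives as a string and can be null for closed indices.
    return {
        row["index"]: int(row.get("docs.count") or 0)
        for row in cat_indices_rows
    }


def count_mismatches(source_rows, target_rows, indices):
    """Return {index: (source_count, target_count)} for indices that differ."""
    src = doc_counts(source_rows)
    dst = doc_counts(target_rows)
    return {
        name: (src.get(name), dst.get(name))
        for name in indices
        if src.get(name) != dst.get(name)
    }


if __name__ == "__main__":
    # Made-up rows, trimmed to the fields the helpers read.
    old_domain = [{"index": "logs", "docs.count": "10"}]
    new_domain = [{"index": "logs", "docs.count": "10"}]
    print(count_mismatches(old_domain, new_domain, ["logs"]))
```

An empty result means the counts match for every index you listed; an index that is missing on one side shows up with `None` as its count.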

Final Thoughts

I hope you have liked this tutorial. Do give me feedback about anything that can be improved. Thank you.
