
Franck Pachot for YugabyteDB Distributed PostgreSQL Database


Starting a YugabyteDB lab cluster with AWS CLI

Here are some command lines to start a YugabyteDB cluster across multiple AWS regions using the AWS CLI. Please note that this setup is for lab purposes only and features no security measures (all ports are open, and the data is transmitted over the public internet without encryption). I use this configuration solely for quick tests.
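If you want to limit the exposure a little while keeping the setup simple, one option (not used here) is to open the security group only to your own public IP instead of 0.0.0.0/0. A minimal sketch, assuming checkip.amazonaws.com is reachable from your workstation:

# derive a /32 CIDR for your current public IP, to use in place of 0.0.0.0/0 below
MY_CIDR="$(curl -s https://checkip.amazonaws.com)/32"
echo "$MY_CIDR"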

I set some environment variables, with the most important being zones, which lists the zones where I will start a node.

export AWS_PAGER=""
KEY_NAME=lab.pub # name of the EC2 key pair to import (the key material comes from ~/.ssh/id_rsa.pub)
INSTANCE_TYPE=m7i.large
VOLUMESIZE=500

zones="us-east-1a ap-northeast-1a ap-southeast-5a"

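Before running the loop below, a quick sanity check helps to confirm that the AWS CLI is authenticated and that the public key file exists (I assume the same ~/.ssh/id_rsa.pub used later):

# fails immediately if no AWS credentials are configured
aws sts get-caller-identity --output text
# the public key that will be imported as the EC2 key pair
ls -l ~/.ssh/id_rsa.pub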

For each zone, I set AWS_REGION from the zone name, import my SSH public key as an EC2 key pair, create a security group that allows all traffic, and launch an instance using the environment variables mentioned above.

for zone in $zones
do
 export AWS_REGION=${zone%?} # the region name is the zone without its trailing letter (e.g. us-east-1a -> us-east-1)
 ZONE=${zone}
 # Import the key to ssh
 aws ec2 import-key-pair --key-name $KEY_NAME --public-key-material "$(base64 ~/.ssh/id_rsa.pub)"
 # Security group with everything open (restrict the CIDR to your own network outside of a lab)
 aws ec2 create-security-group \
 --group-name lab-public \
 --description "Security group that allows all traffic"
 aws ec2 authorize-security-group-ingress \
    --protocol -1 \
    --port all \
    --cidr 0.0.0.0/0 \
    --group-id $(
  aws ec2 describe-security-groups \
   --filters "Name=group-name,Values=lab-public" \
   --query "SecurityGroups[*].[GroupId]" \
   --output text
  )
# run an instance
 aws ec2 run-instances \
 --count 1 \
 --instance-type $INSTANCE_TYPE \
 --key-name $KEY_NAME \
 --associate-public-ip-address \
 --placement "AvailabilityZone=${ZONE}" \
 --block-device-mappings "DeviceName=/dev/sda1,Ebs={VolumeSize=${VOLUMESIZE}}" \
 --image-id $(
 aws ec2 describe-images --owners 'aws-marketplace' \
  --filters "Name=name,Values=AlmaLinux OS 8*" "Name=architecture,Values=x86_64" \
  --query "Images | sort_by(@, &CreationDate) | [-1].[ImageId]" \
  --output text | tee /dev/stderr
  ) \
 --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=yb},{Key=Environment,Value=lab}]' \
 --security-group-ids $(aws ec2 describe-security-groups \
  --filters "Name=group-name,Values=lab-public" \
  --query "SecurityGroups[*].[GroupId]" \
  --output text | tee /dev/stderr ) \
  --user-data '#!/bin/bash
  sudo dnf update -y
  ' \
  --output text
done

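The instances take a moment to boot. Before connecting with SSH, it can help to wait until they reach the running state; a sketch using the standard EC2 waiter with the same tag filter (sshd may still need a few more seconds after that):

for zone in $zones
do
 export AWS_REGION=${zone%?}
 # block until all lab-tagged instances in this region are running
 aws ec2 wait instance-running --filters "Name=tag:Environment,Values=lab"
done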

I use the tag Environment=lab to identify the instances and list the ones running with this tag in each zone.

# describe
for zone in $zones
do
 export AWS_REGION=${zone%?}
 aws ec2 describe-instances \
 --filters "Name=tag:Environment,Values=lab" "Name=instance-state-name,Values=running" \
 --query "Reservations[*].Instances[*].[Tags[?Key=='Name'].Value|[0] , InstanceId, State.Name, Placement.AvailabilityZone, PublicDnsName ]" \
 --output text | awk '{$NF="http://"$NF":7000";print}' # append the master web UI port (7000) to the public DNS name
done


(Screenshot: the instance listing produced by the loop above, one line per instance with its master web UI URL on port 7000.)

I've realized that I have some instances that I started before and forgot to terminate. If you care about your cloud credits, don't make the same mistake. I will show you how to terminate them at the end.
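To catch instances forgotten in regions that are no longer listed in $zones, you can scan every region for the Environment=lab tag. A sketch that simply reuses describe-instances over the output of describe-regions:

# list lab-tagged running instances in all regions, not only those in $zones
for region in $(aws ec2 describe-regions --query "Regions[].RegionName" --output text)
do
 export AWS_REGION=$region
 aws ec2 describe-instances \
  --filters "Name=tag:Environment,Values=lab" "Name=instance-state-name,Values=running" \
  --query "Reservations[*].Instances[*].[InstanceId,Placement.AvailabilityZone,PublicDnsName]" \
  --output text
done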

At this point, the instances are up but nothing is running on them yet. Here is how I install and start YugabyteDB on all nodes.

# ssh to install and start YugabyteDB (uses the SSH key)
join="" # will be set to join the previous node to attach to the cluster
for zone in $zones
do
 export AWS_REGION=${zone%?}
 for host in $(
 aws ec2 describe-instances \
 --filters "Name=tag:Environment,Values=lab" "Name=availability-zone,Values=${zone}" "Name=instance-state-name,Values=running" \
 --query "Reservations[*].Instances[*].[ PublicDnsName ]" \
 --output text | tee /dev/stderr
 ) ; do ssh -o StrictHostKeyChecking=no ec2-user@$host '
# install python and YugabyteDB
cd ~
sudo dnf install -y python3
[ -f ~/yugabyte/bin/yugabyted ] || {
curl -Ls https://downloads.yugabyte.com/releases/2.23.1.0/yugabyte-2.23.1.0-b220-linux-x86_64.tar.gz | tar xzvf -
cd ~/yugabyte-*.0
./bin/post_install.sh
# the tarball extracts to ~/yugabyte-2.23.1.0, so link it to the ~/yugabyte path used below
ln -sfn ~/yugabyte-*.0 ~/yugabyte
}
# find zone and region name
TOKEN=$(curl -sX PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
PGHOST=$(curl -sH "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/public-hostname)
ZONE=$(curl -sH "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/placement/availability-zone)
REGION=${ZONE%?}
CLOUD=aws
# start YugabyteDB
cd ~/yugabyte
set -x
sudo bash ./bin/configure_ptp.sh
./bin/yugabyted destroy
./bin/yugabyted start --advertise_address=$PGHOST --cloud_location=$CLOUD.$REGION.$ZONE ' "$join" '
./bin/yugabyted status
./bin/yugabyted connect ysql <<<"select version() ; select host, cloud, region, zone from yb_servers()"
'
  # this host can be used for the next to join to
  join="--join=$host"
 done
done


I have specified the YugabyteDB version I want to use. The script downloads the binaries and starts each node with yugabyted, adding a --join option that points to a previously started node for every node after the first.
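To verify that all nodes joined the same cluster, the yb_servers() query used above can be run from any single node; a sketch where <host> is a placeholder for one of the public DNS names from the listing:

# <host> is any node's public DNS name; yb_servers() should return one row per node
ssh -o StrictHostKeyChecking=no ec2-user@<host> \
 '~/yugabyte/bin/yugabyted connect ysql <<<"select host, cloud, region, zone from yb_servers();"'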

Once the nodes are started, you can access the yugabyted UI on port 15433.

(Screenshot: the yugabyted UI showing the cluster nodes.)
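The same describe-instances loop as before can print those UI URLs directly, just switching the port in the awk expression from 7000 to 15433:

for zone in $zones
do
 export AWS_REGION=${zone%?}
 aws ec2 describe-instances \
  --filters "Name=tag:Environment,Values=lab" "Name=instance-state-name,Values=running" \
  --query "Reservations[*].Instances[*].[PublicDnsName]" \
  --output text | awk '{print "http://"$1":15433"}'
done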

To terminate the instances, ensure that the zones environment variable is set.

# terminate
for zone in $zones
do
 export AWS_REGION=${zone%?}
aws ec2 terminate-instances --instance-ids $(
aws ec2 describe-instances \
 --filters "Name=tag:Environment,Values=lab" \
 --query "Reservations[*].Instances[*].InstanceId" \
 --output text --no-paginate | tee /dev/stderr
) --output text
done


If you modify the zones or the tags, make sure the same values are used when listing and terminating the instances.
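Terminating the instances does not remove the key pairs and security groups created earlier. Once the instances are fully terminated (a security group cannot be deleted while an instance still references it), those can be dropped as well; a sketch reusing the same lookups as above:

for zone in $zones
do
 export AWS_REGION=${zone%?}
 # remove the imported key pair and the wide-open security group
 aws ec2 delete-key-pair --key-name $KEY_NAME
 aws ec2 delete-security-group --group-id $(
  aws ec2 describe-security-groups \
   --filters "Name=group-name,Values=lab-public" \
   --query "SecurityGroups[*].[GroupId]" \
   --output text
  )
done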
