In the previous part we created our EKS cluster. In this part we will configure the Amazon RDS Instance.
The following resources will be created:
- A Private Multi-AZ RDS PostgreSQL Instance.
- A DB subnet group and a VPC security group.
- (Optional) A network load balancer to expose the private RDS instance to a specific range of IP addresses.
- (Optional) A Lambda function that populates the NLB target group with the RDS private IP address.
Amazon RDS
- The Amazon RDS Instance used is a PostgreSQL database server.
- The Multi-AZ option is enabled to ensure high availability.
- The instance is not publicly accessible and is hosted in private subnets.
- IAM database authentication is enabled. The root user created below still gets a generated password.
- Automated backup is enabled.
Create the Terraform file infra/plan/rds.tf
resource "random_string" "db_suffix" {
length = 4
special = false
upper = false
}
resource "random_string" "root_username" {
length = 12
special = false
upper = true
}
resource "random_password" "root_password" {
length = 12
special = true
upper = true
}
resource "aws_db_instance" "postgresql" {
# Engine options
engine = "postgres"
engine_version = "12.5"
# Settings
name = "postgresql${var.env}"
identifier = "postgresql-${var.env}"
# Credentials Settings
username = "u${random_string.root_username.result}"
password = "p${random_password.root_password.result}"
# DB instance size
instance_class = "db.m5.large"
# Storage
storage_type = "gp2"
allocated_storage = 100
max_allocated_storage = 200
# Availability & durability
multi_az = true
# Connectivity
db_subnet_group_name = aws_db_subnet_group.sg.id
publicly_accessible = false
vpc_security_group_ids = [aws_security_group.sg.id]
port = var.rds_port
# Database authentication
iam_database_authentication_enabled = true
# Additional configuration
parameter_group_name = "default.postgres12"
# Backup
backup_retention_period = 14
backup_window = "03:00-04:00"
final_snapshot_identifier = "postgresql-final-snapshot-${random_string.db_suffix.result}"
delete_automated_backups = true
skip_final_snapshot = false
# Encryption
storage_encrypted = true
# Maintenance
auto_minor_version_upgrade = true
maintenance_window = "Sat:00:00-Sat:02:00"
# Deletion protection
deletion_protection = false
tags = {
Environment = var.env
}
}
Add the following outputs
output "rds-username" {
value = "u${random_string.root_username.result}"
}
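output "rds-password" {
# The random_password result is a sensitive value, so the output has to be marked sensitive on recent Terraform versions
value = "p${random_password.root_password.result}"
sensitive = true
}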
output "private-rds-endpoint" {
value = aws_db_instance.postgresql.address
}
DB Subnet Group
We deploy the Amazon RDS Instance on private subnets.
resource "aws_db_subnet_group" "sg" {
name = "postgresql-${var.env}"
subnet_ids = [aws_subnet.private["private-rds-1"].id, aws_subnet.private["private-rds-2"].id]
tags = {
Environment = var.env
Name = "postgresql-${var.env}"
}
}
VPC Security Group
In the VPC security group we allow:
- inbound and outbound TCP traffic on the RDS port (5432 by default) with the RDS public subnets.
- inbound TCP traffic on the RDS port from the RDS private subnets, and outbound TCP traffic on all ports to them.
resource "aws_security_group" "sg" {
name = "postgresql-${var.env}"
description = "Allow inbound/outbound traffic"
vpc_id = aws_vpc.main.id
ingress {
from_port = var.rds_port
to_port = var.rds_port
protocol = "tcp"
cidr_blocks = [aws_subnet.private["private-rds-1"].cidr_block]
}
ingress {
from_port = var.rds_port
to_port = var.rds_port
protocol = "tcp"
cidr_blocks = [aws_subnet.private["private-rds-2"].cidr_block]
}
ingress {
from_port = var.rds_port
to_port = var.rds_port
protocol = "tcp"
cidr_blocks = [aws_subnet.public["public-rds-1"].cidr_block]
}
ingress {
from_port = var.rds_port
to_port = var.rds_port
protocol = "tcp"
cidr_blocks = [aws_subnet.public["public-rds-2"].cidr_block]
}
egress {
from_port = 0
to_port = 65535
protocol = "tcp"
cidr_blocks = [aws_subnet.private["private-rds-1"].cidr_block]
}
egress {
from_port = 0
to_port = 65535
protocol = "tcp"
cidr_blocks = [aws_subnet.private["private-rds-2"].cidr_block]
}
egress {
from_port = var.rds_port
to_port = var.rds_port
protocol = "tcp"
cidr_blocks = [aws_subnet.public["public-rds-1"].cidr_block]
}
egress {
from_port = var.rds_port
to_port = var.rds_port
protocol = "tcp"
cidr_blocks = [aws_subnet.public["public-rds-2"].cidr_block]
}
tags = {
Name = "postgresql-${var.env}"
Environment = var.env
}
}
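To make the effect of these rules concrete, here is a minimal Python sketch that checks whether a source address would match one of the ingress rules above. The CIDR blocks and the function name are hypothetical placeholders, since the actual subnet ranges are defined in an earlier part. Only the default port (5432) comes from this article.

```python
import ipaddress

# Hypothetical CIDR blocks standing in for the four RDS subnets (assumptions).
ALLOWED_INGRESS_CIDRS = [
    "10.0.1.0/24",    # private-rds-1 (assumed)
    "10.0.2.0/24",    # private-rds-2 (assumed)
    "10.0.101.0/24",  # public-rds-1 (assumed)
    "10.0.102.0/24",  # public-rds-2 (assumed)
]
RDS_PORT = 5432


def ingress_allowed(source_ip, port, cidrs=ALLOWED_INGRESS_CIDRS, rds_port=RDS_PORT):
    """Return True if a TCP connection from source_ip to port matches an ingress rule."""
    if port != rds_port:
        return False
    address = ipaddress.ip_address(source_ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in cidrs)


if __name__ == "__main__":
    print(ingress_allowed("10.0.101.17", 5432))   # address inside an assumed public subnet
    print(ingress_allowed("203.0.113.10", 5432))  # address outside every listed range
```

An address outside the four subnet ranges does not match any ingress rule, and neither does traffic on a port other than the RDS port.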
(Optional) Exposing the RDS instance
If you want to access the RDS instance databases from your local machine or through an external CI / CD tool, you can create an external network load balancer and target the private IP address of the RDS instance. As the private IP address in the network interface can change if an instance fails, a Lambda function can be deployed to continuously check the current private IP address, unregister the old IP address, and register a new target with the new private IP address.
Network Load Balancer
In order to reach the RDS private IP address, the RDS instance and the external network load balancer must be in the same Availability Zone. Thus, the NLB is deployed in the public subnet that shares an Availability Zone with the primary RDS instance.
We create a target group with a target type of IP address. A CloudWatch alarm is added to monitor connectivity between the NLB and RDS.
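The subnet choice is a simple two-way conditional, which the locals block in nlb.tf expresses in Terraform. As an illustration only, the same decision can be sketched in Python. The subnet ids and Availability Zone names below are made up, and pick_nlb_subnet is a hypothetical helper, not part of the project.

```python
def pick_nlb_subnet(subnet_1, subnet_2, rds_az):
    """Return the id of subnet_1 if it is in the RDS AZ, otherwise the id of subnet_2."""
    if subnet_1["availability_zone"] == rds_az:
        return subnet_1["id"]
    return subnet_2["id"]


# Hypothetical values standing in for public-rds-1 and public-rds-2.
public_rds_1 = {"id": "subnet-aaa111", "availability_zone": "eu-west-1a"}
public_rds_2 = {"id": "subnet-bbb222", "availability_zone": "eu-west-1b"}

if __name__ == "__main__":
    print(pick_nlb_subnet(public_rds_1, public_rds_2, "eu-west-1b"))
```

Note that the conditional falls back to the second subnet without checking its zone, so it assumes the RDS instance always lives in one of the two zones.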
Create the file infra/plan/nlb.tf
locals {
subnet_id = aws_subnet.public["public-rds-1"].availability_zone == aws_db_instance.postgresql.availability_zone ? aws_subnet.public["public-rds-1"].id : aws_subnet.public["public-rds-2"].id
}
resource "aws_lb" "rds" {
name = "nlb-expose-rds-${var.env}"
internal = false
load_balancer_type = "network"
subnets = [local.subnet_id]
enable_deletion_protection = false
tags = {
Environment = var.env
}
}
resource "aws_lb_listener" "rds" {
load_balancer_arn = aws_lb.rds.id
port = var.rds_port
protocol = "TCP"
default_action {
target_group_arn = aws_lb_target_group.rds.id
type = "forward"
}
}
resource "aws_lb_target_group" "rds" {
name = "expose-rds-${var.env}"
port = var.rds_port
protocol = "TCP"
target_type = "ip"
vpc_id = aws_vpc.main.id
health_check {
enabled = true
protocol = "TCP"
}
tags = {
Environment = var.env
}
}
resource "aws_cloudwatch_metric_alarm" "rds-access" {
alarm_name = "rds-external-access-status"
comparison_operator = "GreaterThanOrEqualToThreshold"
evaluation_periods = "1"
metric_name = "UnHealthyHostCount"
namespace = "AWS/NetworkELB"
period = "60"
statistic = "Maximum"
threshold = 1
alarm_description = "Monitoring RDS External Access"
treat_missing_data = "breaching"
dimensions = {
TargetGroup = aws_lb_target_group.rds.arn_suffix
LoadBalancer = aws_lb.rds.arn_suffix
}
}
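The alarm settings above can be read as a small decision rule. The following Python sketch is a simplified model of one evaluation period, not CloudWatch's actual evaluation algorithm. The function name and sample data are hypothetical.

```python
def alarm_state(unhealthy_host_counts, threshold=1):
    """Model one 60-second period of the alarm defined above.

    unhealthy_host_counts holds the UnHealthyHostCount samples for the period.
    """
    if not unhealthy_host_counts:
        # treat_missing_data = "breaching": no data counts as a breach
        return "ALARM"
    # statistic = "Maximum", comparison_operator = "GreaterThanOrEqualToThreshold"
    if max(unhealthy_host_counts) >= threshold:
        return "ALARM"
    return "OK"


if __name__ == "__main__":
    print(alarm_state([0, 0, 0]))
    print(alarm_state([0, 1, 0]))
    print(alarm_state([]))
```

With a single evaluation period, one unhealthy sample or one period without data is enough to move the modeled alarm into the ALARM state.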
Complete the file infra/plan/output
output "public-rds-endpoint" {
value = "${element(split("/", aws_lb.rds.arn), 2)}-${element(split("/", aws_lb.rds.arn), 3)}.elb.${var.region}.amazonaws.com"
}
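The expression in this output rebuilds the NLB DNS name from the load balancer ARN. Here is a small Python sketch of the same string manipulation, using a made-up ARN (the account id, region and trailing id are placeholders).

```python
def nlb_dns_name(lb_arn, region):
    """Rebuild '<name>-<id>.elb.<region>.amazonaws.com' from a load balancer ARN."""
    parts = lb_arn.split("/")
    # parts -> ["arn:...:loadbalancer", "net", "<name>", "<id>"]
    return f"{parts[2]}-{parts[3]}.elb.{region}.amazonaws.com"


# Hypothetical ARN used only for illustration.
EXAMPLE_ARN = (
    "arn:aws:elasticloadbalancing:eu-west-1:123456789012:"
    "loadbalancer/net/nlb-expose-rds-dev/0123456789abcdef"
)

if __name__ == "__main__":
    print(nlb_dns_name(EXAMPLE_ARN, "eu-west-1"))
    # nlb-expose-rds-dev-0123456789abcdef.elb.eu-west-1.amazonaws.com
```

The aws_lb resource also exports a dns_name attribute, which may be a simpler way to get the same value without parsing the ARN.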
Now we need to register a target. A Lambda function can be used to perform the task. An Amazon CloudWatch event rule is added to invoke the Lambda function every 15 minutes.
Lambda function
Create the file infra/plan/lambda.tf
data "archive_file" "lambda_zip" {
type = "zip"
source_file = "${path.module}/lambda/populate-nlb-tg-with-rds-private-ip.py"
output_path = "lambda_function_payload.zip"
}
resource "aws_lambda_function" "rds" {
filename = "lambda_function_payload.zip"
function_name = "populate-nlb-tg-with-rds-private-ip"
role = aws_iam_role.iam_for_lambda.arn
handler = "populate-nlb-tg-with-rds-private-ip.handler"
source_code_hash = data.archive_file.lambda_zip.output_base64sha256
runtime = "python3.8"
timeout = 300
environment {
variables = {
RDS_PORT = var.rds_port
NLB_TG_ARN = aws_lb_target_group.rds.arn
RDS_SG_ID = aws_security_group.sg.id
RDS_ID = aws_db_instance.postgresql.id
}
}
tags = {
Environment = var.env
}
}
resource "aws_iam_role" "iam_for_lambda" {
name = "iam_for_lambda"
assume_role_policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Action": "sts:AssumeRole",
"Principal": {
"Service": "lambda.amazonaws.com"
},
"Effect": "Allow",
"Sid": ""
}
]
}
EOF
}
resource "aws_iam_role_policy" "lambda_nlb" {
name = "nlb-tg-access"
role = aws_iam_role.iam_for_lambda.id
policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Action = [
"ec2:DescribeNetworkInterfaces",
"elasticloadbalancing:DeregisterTargets",
"elasticloadbalancing:DescribeTargetHealth",
"elasticloadbalancing:RegisterTargets",
"rds:DescribeDBInstances"
]
Effect = "Allow"
Resource = "*"
},
]
})
}
resource "aws_iam_role_policy" "lambda_logging" {
name = "lambda_logging"
role = aws_iam_role.iam_for_lambda.id
policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Action = [
"logs:CreateLogGroup",
"logs:CreateLogStream",
"logs:PutLogEvents"
]
Effect = "Allow"
Resource = "arn:aws:logs:*:*:*"
},
]
})
}
resource "aws_cloudwatch_log_group" "lambda" {
name = "/aws/lambda/${aws_lambda_function.rds.function_name}"
retention_in_days = 1
}
resource "aws_cloudwatch_event_rule" "lambda" {
name = "populate-nlb-tg-with-rds-private-ip"
description = "Populate NLB tg with RDS private IP"
schedule_expression = "rate(15 minutes)"
}
resource "aws_cloudwatch_event_target" "lambda" {
rule = aws_cloudwatch_event_rule.lambda.name
target_id = "Lambda"
arn = aws_lambda_function.rds.arn
}
resource "aws_lambda_permission" "cloudwatch" {
statement_id = "AllowExecutionFromCloudWatch"
action = "lambda:InvokeFunction"
function_name = aws_lambda_function.rds.function_name
principal = "events.amazonaws.com"
source_arn = aws_cloudwatch_event_rule.lambda.arn
}
The Lambda function is written in Python. The process is as follows:
- Get the currently registered IPs using the describe_target_health function.
- Get the current RDS instance Availability Zone using the describe_db_instances function.
- Look up the current RDS private IP using the describe_network_interfaces function.
- If no target has been registered yet, register a new one. If the registered target is stale, deregister it and register a new target with the new RDS private IP address.
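The last step is a reconciliation between two lists of IP addresses. As a minimal sketch, independent of boto3, it can be written as a pure function. Using set differences is one way to express it, and the function name and addresses here are hypothetical.

```python
def plan_target_changes(registered_ips, active_ips):
    """Return (to_register, to_deregister) for the NLB target group.

    registered_ips: IPs currently in the target group.
    active_ips: private IPs currently attached to the RDS instance.
    """
    registered = set(registered_ips)
    active = set(active_ips)
    to_register = sorted(active - registered)    # active but not registered yet
    to_deregister = sorted(registered - active)  # registered but no longer active
    return to_register, to_deregister


if __name__ == "__main__":
    # First run: nothing registered yet
    print(plan_target_changes([], ["10.0.1.12"]))
    # After a failover: the private IP has changed
    print(plan_target_changes(["10.0.1.12"], ["10.0.2.34"]))
    # Steady state: nothing to do
    print(plan_target_changes(["10.0.2.34"], ["10.0.2.34"]))
```

Comparing sets rather than lists keeps the result independent of the order in which the AWS APIs return addresses, and an IP that is already registered is never deregistered by mistake.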
Create the file infra/plan/lambda/populate-nlb-tg-with-rds-private-ip.py
import json
import os
import random
import sys
import boto3
import logging
from datetime import datetime
from botocore.exceptions import ClientError

logger = logging.getLogger()
logger.setLevel(logging.INFO)

'''
This function populates a Network Load Balancer's target group with RDS IP addresses
Configure these environment variables in your Lambda environment
1. NLB_TG_ARN - The ARN of the Network Load Balancer's target group
2. RDS_PORT
3. RDS_SG_ID - RDS VPC Security Group Id
4. RDS_ID - RDS Identifier
'''
NLB_TG_ARN = os.environ['NLB_TG_ARN']
RDS_PORT = int(os.environ['RDS_PORT'])
RDS_SG_ID = os.environ['RDS_SG_ID']
RDS_ID = os.environ['RDS_ID']

try:
    elbv2client = boto3.client('elbv2')
except ClientError as e:
    logger.error(e.response['Error']['Message'])
    sys.exit(1)

try:
    rdsclient = boto3.client('rds')
except ClientError as e:
    logger.error(e.response['Error']['Message'])
    sys.exit(1)

try:
    ec2client = boto3.client('ec2')
except ClientError as e:
    logger.error(e.response['Error']['Message'])
    sys.exit(1)


def register_target(tg_arn, new_target_list):
    logger.info(f"INFO: Register new_target_list:{new_target_list}")
    try:
        elbv2client.register_targets(
            TargetGroupArn=tg_arn,
            Targets=new_target_list
        )
    except ClientError as e:
        logger.error(e.response['Error']['Message'])


def deregister_target(tg_arn, new_target_list):
    try:
        logger.info(f"INFO: Deregistering targets: {new_target_list}")
        elbv2client.deregister_targets(
            TargetGroupArn=tg_arn,
            Targets=new_target_list
        )
    except ClientError as e:
        logger.error(e.response['Error']['Message'])


def target_group_list(ip_list):
    target_list = []
    for ip in ip_list:
        target = {
            'Id': ip,
            'Port': RDS_PORT,
        }
        target_list.append(target)
    return target_list


def get_registered_ips(tg_arn):
    registered_ip_list = []
    try:
        response = elbv2client.describe_target_health(
            TargetGroupArn=tg_arn)
        registered_ip_count = len(response['TargetHealthDescriptions'])
        logger.info(f"INFO: Number of currently registered IP: {registered_ip_count}")
        for target in response['TargetHealthDescriptions']:
            registered_ip = target['Target']['Id']
            registered_ip_list.append(registered_ip)
    except ClientError as e:
        logger.error(e.response['Error']['Message'])
    return registered_ip_list


def get_rds_private_ips(rds_az):
    resp = ec2client.describe_network_interfaces(Filters=[{
        'Name': 'group-id',
        'Values': [RDS_SG_ID]
    }, {
        'Name': 'availability-zone',
        'Values': [rds_az]
    }])
    private_ip_address = []
    for interface in resp['NetworkInterfaces']:
        private_ip_address.append(interface['PrivateIpAddress'])
    return private_ip_address


def get_rds_az():
    logger.info(f"INFO: Get RDS current AZ: {RDS_ID}")
    az = None
    try:
        response = rdsclient.describe_db_instances(
            DBInstanceIdentifier=RDS_ID
        )
        if len(response['DBInstances']) > 0:
            az = response['DBInstances'][0]['AvailabilityZone']
            logger.info(f"INFO: RDS AZ is: {az}")
    except ClientError as e:
        logger.error(e.response['Error']['Message'])
    return az
def handler(event, context):
    registered_ips = set(get_registered_ips(NLB_TG_ARN))
    current_rds_az = get_rds_az()
    if current_rds_az is None:
        # Without the AZ we cannot look up the RDS network interface; keep the targets as they are
        logger.error("ERROR: Could not determine the RDS AZ, leaving targets unchanged")
        return
    active_ips = set(get_rds_private_ips(current_rds_az))

    # Register only the active IPs that are not registered yet
    registration_ip_list = sorted(active_ips - registered_ips)
    if registration_ip_list:
        logger.info(f"INFO: Registering {registration_ip_list}")
        register_target(NLB_TG_ARN, target_group_list(registration_ip_list))
    else:
        logger.info("INFO: No new target registered")

    # Deregister only the registered IPs that are no longer active,
    # so an IP that is still valid is never removed by mistake
    deregistration_ip_list = sorted(registered_ips - active_ips)
    if deregistration_ip_list:
        logger.info(f"INFO: Deregistering {deregistration_ip_list}")
        deregister_target(NLB_TG_ARN, target_group_list(deregistration_ip_list))
    else:
        logger.info("INFO: No old target deregistered")
Complete the file infra/plan/variable.tf:
variable "rds_port" {
type = number
default = 5432
}
Let's deploy our RDS instance
cd infra/envs/dev
terraform apply ../../plan/
Before going to the next part, we need to create the metabase database on the Amazon RDS instance:
PGPASSWORD=$(terraform output rds-password) psql --host $(terraform output public-rds-endpoint) --port 5432 --user $(terraform output rds-username) --dbname postgres
CREATE USER metabase;
GRANT rds_iam TO metabase;
CREATE DATABASE metabase;
GRANT ALL ON DATABASE metabase TO metabase;
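Because the metabase user is granted rds_iam, it logs in with a short-lived IAM token instead of a static password. As a hedged sketch of what a client could do, the helper below builds connection parameters from a token returned by boto3's generate_db_auth_token. The helper name, the stub client, and the host are hypothetical. The stub only exists so the example runs without AWS credentials.

```python
def iam_connection_params(rds_client, host, port, user, dbname):
    """Build connection parameters that use an IAM auth token as the password."""
    token = rds_client.generate_db_auth_token(
        DBHostname=host, Port=port, DBUsername=user
    )
    return {
        "host": host,
        "port": port,
        "user": user,
        "dbname": dbname,
        "password": token,
        "sslmode": "require",  # IAM database authentication requires an SSL connection
    }


class StubRdsClient:
    """Stand-in for boto3.client("rds"), used here for illustration only."""

    def generate_db_auth_token(self, DBHostname, Port, DBUsername):
        return f"stub-token-for-{DBUsername}@{DBHostname}:{Port}"


if __name__ == "__main__":
    # With real credentials, replace the stub with: boto3.client("rds", region_name=<your region>)
    params = iam_connection_params(
        StubRdsClient(), "postgresql-dev.example.internal", 5432, "metabase", "metabase"
    )
    print(params["user"], params["sslmode"])
```

The resulting dictionary uses libpq-style keys, so it could be handed to a PostgreSQL driver that accepts them. The caller's IAM identity also needs the rds-db:connect permission for the database user, and a token is only valid for a limited time, so it should be generated right before connecting.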
Let's check if all the resources have been created and are working correctly
RDS Instance
VPC Security Group
Lambda
NLB Target Group
Conclusion
Our RDS instance is now available. In the next part, we'll establish a connection between a container deployed in Amazon EKS and a database created in an Amazon RDS instance.
Top comments (1)
I'm not sure how the external NLB exposes the RDS instance only to a specific IP address range. If I understand correctly, a request to the DB from outside the VPC passes through the internet gateway associated with the public subnets, which lets any IP through ("0.0.0.0/0"). Once in the public subnets (i.e. the external zone), it reaches the external load balancer, which listens for TCP on port 5432 and routes it to the RDS instance.
However, traffic doesn't seem to be routed properly from outside the VPC, and psql times out when reaching the NLB public endpoint. Adding ingress rules on the security group does not solve the issue. Any suggestions on where to look to make the NLB work as expected?