The infrastructure overhaul of the service I'm currently working on is quite a challenge...
Even when reviewing the Security Groups (SGs) alone, we found similar SGs attached multiple times to a single resource, and the review took more time than we anticipated.
The service could have kept running as it was, but the infrastructure also needs to be managed with security in mind. After discussing it with the team, we decided to codify the infrastructure.
I usually work on frontend and AI development, but I took this opportunity to introduce Terraform.
The Purpose and Necessity of Infrastructure as Code (IaC)
- Reduction of resource creation errors due to operational mistakes
- Ensuring consistency in future test environment builds
- Version control enables tracking of change history and quick rollbacks when problems occur
Why We Chose Terraform and Its Advantages
- High readability of Terraform's code (HCL) compared to other tools (e.g. CloudFormation)
- Ease of codifying existing manually created infrastructure
- Capability to accommodate the potential introduction of GCP resources (BigQuery) in the future
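As a rough illustration of codifying an existing, manually created resource, Terraform 1.5 and later support import blocks. The sketch below is not taken from our project: the resource name existing_sg, the security group ID, and the VPC ID are all placeholder assumptions.

```hcl
# Hypothetical example: bring a manually created SG under Terraform management.
# The ID below is a placeholder, not a real resource.
import {
  to = aws_security_group.existing_sg
  id = "sg-0123456789abcdef0"
}

# The resource block has to describe the SG as it already exists.
# All values here are assumptions for illustration.
resource "aws_security_group" "existing_sg" {
  name        = "existing-sg"
  description = "Manually created security group"
  vpc_id      = "vpc-123456"
}
```

With a block like this in place, terraform plan reports the resource as "will be imported" rather than created. The name and description should match the existing SG, because changing either one forces the SG to be replaced.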
Implementation Process
I'll walk through installing Terraform, implementing the security groups, and applying them, in that order.
Installing Terraform
On macOS, you can install it with Homebrew.
$ brew install terraform
Creating a New Project
Create a new Terraform project (e.g., terraform-project).
Directory Structure
The directory structure is as follows:
-- terraform-project/
    -- environments/
        -- dev/
            -- backend.tf
            -- main.tf
        -- stg/
            -- backend.tf
            -- main.tf
        -- prod/
            -- backend.tf
            -- main.tf
    -- modules/
        -- <service-name>/
            -- main.tf
            -- variables.tf
            -- outputs.tf
            -- provider.tf
            -- README.md
        -- security_group/
            -- main.tf
            -- variables.tf
            -- outputs.tf
            -- provider.tf
            -- README.md
        -- ...other…
    -- docs/
        -- architecture.drawio
        -- architecture.png
- environments/: Stores settings for each environment (dev, stg, prod). Each environment may have different settings (for example, using smaller EC2 instances in the dev environment, and larger instances in the prod environment).
- environments/{environment}/backend.tf: Contains the backend settings for Terraform. It determines where the Terraform state file (.tfstate) is stored.
- environments/{environment}/main.tf: Defines the resources and modules used in each environment. It mainly contains module calls and their settings rather than individual resource definitions.
- modules/: Stores the Terraform modules. Modules are reusable blocks of Terraform code; using them keeps the code organized and avoids writing the same code repeatedly.
- modules/{service-name}/main.tf: Sets up the Terraform resources related to a specific service.
- modules/{service-name}/variables.tf: Defines variables used in the module.
- modules/{service-name}/outputs.tf: Defines the values outputted by the module.
- modules/{service-name}/provider.tf: Stores provider configurations. Typically, it includes provider settings shared across the entire project, such as provider version information.
- modules/{service-name}/README.md: Describes the module and its usage.
- docs/: Manages architecture diagrams in drawio and png formats.
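The "provider version information" mentioned for provider.tf could look roughly like the following. This is a minimal sketch: the version constraints are assumptions, not the values pinned in our project.

```hcl
# modules/<service-name>/provider.tf (sketch; version numbers are assumptions)
terraform {
  # Minimum Terraform CLI version this module expects.
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # allow any 5.x release of the AWS provider
    }
  }
}
```

Pinning versions this way keeps every environment on a compatible provider release, which supports the consistency goal described earlier.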
Creating Security Group (SG) Rules
As an example of creating an SG, let's consider a Lambda in a private subnet and an RDS Proxy in a different private subnet.
Lambda's SG
The Lambda has outbound traffic to the RDS Proxy, interface VPC endpoints, and the internet (via the NAT gateway of the public subnet). For these connections, first define an aws_security_group named lambda_sg, and then associate the required aws_security_group_rule resources with it.
An important point is that aws_security_group needs the vpc_id of the VPC that contains the resource the SG will be attached to. This VPC was already created manually, and since these settings are managed in Git, the vpc_id is read from the SSM Parameter Store rather than hard-coded.
Because this Lambda needs no inbound rules, only outbound rules (type = "egress") are created. Setting security_group_id to aws_security_group.lambda_sg.id in each aws_security_group_rule specifies which SG the rule belongs to. When the destination is another SG, specify its ID in source_security_group_id; despite the name, for egress rules this argument identifies the destination SG.
Here is the Terraform code reflecting these settings (./modules/security_group/main.tf).
# Read the ID of the manually created VPC from the SSM Parameter Store
data "aws_ssm_parameter" "lambda_vpc_id" {
  name = "/vpc/id/lambda"
}

resource "aws_security_group" "lambda_sg" {
  name        = "lambda-sg"
  description = "Security group for Lambda"
  vpc_id      = data.aws_ssm_parameter.lambda_vpc_id.value
}

# Egress: Lambda -> interface VPC endpoints (all protocols)
resource "aws_security_group_rule" "lambda_sg_egress_vpc" {
  type                     = "egress"
  from_port                = 0
  to_port                  = 0
  protocol                 = "-1"
  security_group_id        = aws_security_group.lambda_sg.id
  source_security_group_id = aws_security_group.vpc_endpoint_sg.id
}

# Egress: Lambda -> RDS Proxy (all protocols)
resource "aws_security_group_rule" "lambda_sg_egress_rds_proxy" {
  type                     = "egress"
  from_port                = 0
  to_port                  = 0
  protocol                 = "-1"
  security_group_id        = aws_security_group.lambda_sg.id
  source_security_group_id = aws_security_group.rds_proxy_sg.id
}

# Egress: Lambda -> internet over HTTPS (via the NAT gateway)
resource "aws_security_group_rule" "lambda_sg_egress_https" {
  type              = "egress"
  from_port         = 443
  to_port           = 443
  protocol          = "tcp"
  security_group_id = aws_security_group.lambda_sg.id
  cidr_blocks       = ["0.0.0.0/0"]
}
*For security reasons, in actual projects, apply access restrictions as much as possible and open only the minimum necessary ports and protocols.
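To make that note concrete, here is one way the Lambda-to-RDS-Proxy rule could be narrowed. It is a sketch rather than our production configuration, and it assumes the proxy only needs MySQL traffic on port 3306.

```hcl
# Alternative to the lambda_sg_egress_rds_proxy rule above
# (use one or the other, not both, since the resource names collide).
# Assumption: the RDS Proxy only needs MySQL traffic on port 3306.
resource "aws_security_group_rule" "lambda_sg_egress_rds_proxy" {
  type                     = "egress"
  from_port                = 3306
  to_port                  = 3306
  protocol                 = "tcp"
  security_group_id        = aws_security_group.lambda_sg.id
  source_security_group_id = aws_security_group.rds_proxy_sg.id
  description              = "Lambda to RDS Proxy (MySQL)"
}
```

Compared with protocol = "-1", this limits the Lambda to a single TCP port toward the proxy, and the description makes the rule's intent visible in the AWS console.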
RDS Proxy's SG
The RDS Proxy has inbound traffic from the Lambda and outbound traffic to the RDS.
As in the Lambda example, first define an aws_security_group named rds_proxy_sg, and then associate the required aws_security_group_rule resources with it.
The inbound rule (type = "ingress") allows connections from the Lambda, and the outbound rule (type = "egress") allows connections to the RDS.
Set security_group_id to aws_security_group.rds_proxy_sg.id in each aws_security_group_rule, and specify the ID of the peer SG (the source for ingress, the destination for egress) in source_security_group_id.
Here is the Terraform code reflecting these settings (./modules/security_group/main.tf).
resource "aws_security_group" "rds_proxy_sg" {
  name        = "rds-proxy-sg"
  description = "Security group for RDS Proxy"
  vpc_id      = data.aws_ssm_parameter.lambda_vpc_id.value
}

# Ingress: Lambda -> RDS Proxy on the MySQL port
resource "aws_security_group_rule" "rds_proxy_sg_ingress" {
  type                     = "ingress"
  from_port                = 3306
  to_port                  = 3306
  protocol                 = "tcp"
  security_group_id        = aws_security_group.rds_proxy_sg.id
  source_security_group_id = aws_security_group.lambda_sg.id
}

# Egress: RDS Proxy -> RDS (all protocols)
resource "aws_security_group_rule" "rds_proxy_sg_egress" {
  type                     = "egress"
  from_port                = 0
  to_port                  = 0
  protocol                 = "-1"
  security_group_id        = aws_security_group.rds_proxy_sg.id
  source_security_group_id = aws_security_group.rds_sg.id
}
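The rules above reference aws_security_group.rds_sg, which this article does not show. A minimal sketch of what it might look like follows. The name, description, and the choice of the same VPC are assumptions for illustration; the port matches the 3306 used for the proxy's ingress rule.

```hcl
# Hypothetical definition of rds_sg; not shown in the article.
# Assumes RDS lives in the same VPC as the Lambda and the RDS Proxy.
resource "aws_security_group" "rds_sg" {
  name        = "rds-sg"
  description = "Security group for RDS"
  vpc_id      = data.aws_ssm_parameter.lambda_vpc_id.value
}

# Ingress: allow MySQL traffic into RDS only from the RDS Proxy's SG.
resource "aws_security_group_rule" "rds_sg_ingress" {
  type                     = "ingress"
  from_port                = 3306
  to_port                  = 3306
  protocol                 = "tcp"
  security_group_id        = aws_security_group.rds_sg.id
  source_security_group_id = aws_security_group.rds_proxy_sg.id
}
```

Referencing SGs by ID on both sides keeps the chain Lambda, RDS Proxy, RDS explicit without hard-coding any CIDR ranges.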
Setting Region Information
Specify the region where the SGs are created in ./modules/security_group/provider.tf.
provider "aws" {
  region = "ap-northeast-1"
}
Setting the Destination to Save the State File
The Terraform state file (.tfstate) is an important file that records the current state of the infrastructure. Where it is stored is configured in ./environments/{environment}/backend.tf. Storing the state file in remote storage is generally recommended: a remote backend makes it easy to share state among team members and to recover if the state file is lost or damaged.
Below is an example that uses S3 as backend storage. In this example, the state file is saved in an S3 bucket named terraform-state-1234. S3 bucket names must be globally unique across all AWS accounts, and the bucket must be created in advance.
terraform {
  backend "s3" {
    bucket = "terraform-state-1234"
    key    = "dev/terraform.tfstate"
    region = "ap-northeast-1"
  }
}
With this configuration, running terraform init sets up the specified backend, and from then on Terraform reads the state from the S3 bucket. When you run terraform apply, the updated state file is automatically saved there.
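Each environment has its own backend.tf, so the same bucket can hold one state file per environment under a different key. The sketch below shows what the stg version might look like. The key is an assumption mirroring the dev example, and encrypt = true is an optional addition that is not part of the configuration shown above.

```hcl
# environments/stg/backend.tf (sketch)
terraform {
  backend "s3" {
    bucket  = "terraform-state-1234"
    key     = "stg/terraform.tfstate" # assumed key, mirroring the dev example
    region  = "ap-northeast-1"
    encrypt = true # enable server-side encryption of the state file
  }
}
```

Keeping the keys separate means a plan or apply in stg never touches the dev or prod state.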
Defining Modules for Each Environment
When setting up the security group (SG) for the dev environment, we load the module we created in ./environments/dev/main.tf.
module "security_group" {
  source = "../../modules/security_group"
}
Loading Variables (Not required this time)
In this case, we're loading lambda_vpc_id from the SSM Parameter Store, but it's also possible to pass the value in as a variable from environments/{environment}/main.tf.
First, we define the variable in ./modules/security_group/variables.tf.
variable "lambda_vpc_id" {
  description = "The ID of the lambda-vpc where the security group will be created"
  type        = string
}
Next, we specify the value of the variable in ./environments/{environment}/main.tf.
module "security_group" {
  source        = "../../modules/security_group"
  lambda_vpc_id = "vpc-123456"
}
With this, we can reference the variable as var.lambda_vpc_id in ./modules/security_group/main.tf.
resource "aws_security_group" "lambda_sg" {
  name        = "lambda-sg"
  description = "Security group for Lambda"
  vpc_id      = var.lambda_vpc_id
}
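If the VPC ID is passed in as a variable, a validation block can catch obviously malformed values at plan time. This is an optional sketch that would replace the plain variable definition above. The regular expression is an assumption about what a VPC ID looks like, not an official format check.

```hcl
# ./modules/security_group/variables.tf (alternative sketch with validation)
variable "lambda_vpc_id" {
  description = "The ID of the lambda-vpc where the security group will be created"
  type        = string

  validation {
    # Assumed pattern: "vpc-" followed by lowercase hexadecimal characters.
    condition     = can(regex("^vpc-[0-9a-f]+$", var.lambda_vpc_id))
    error_message = "The lambda_vpc_id value must look like a VPC ID, for example vpc-123456."
  }
}
```

A value such as "vpc-123456" passes this check, while an empty string or a subnet ID would be rejected before any resource is planned.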
Outputting Values (Not required this time)
By outputting things like the ID of a resource created in ./modules/security_group/main.tf, we can reference the value from the resources of other modules.
Here's an example of an output.
output "vpc_endpoint_sg_id" {
  value       = aws_security_group.vpc_endpoint_sg.id
  description = "The ID of vpc-endpoint-sg"
}
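On the consuming side, the output is available as module.security_group.vpc_endpoint_sg_id. The sketch below passes it to a second module. The vpc_endpoint module, its path, and its security_group_id variable are hypothetical and only illustrate the wiring.

```hcl
# environments/dev/main.tf (sketch)
module "security_group" {
  source = "../../modules/security_group"
}

# Hypothetical module that needs the endpoint SG ID.
# It would have to declare a "security_group_id" variable itself.
module "vpc_endpoint" {
  source            = "../../modules/vpc_endpoint" # assumed path, not in this article
  security_group_id = module.security_group.vpc_endpoint_sg_id
}
```

Because the second module references the first module's output, Terraform creates the security group before anything that depends on it.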
Checking Differences and Applying Changes
With Terraform, you can check what will change before any resource is touched. At this stage, we review the plan Terraform proposes before the actual resources are created or changed.
Execute the following command in the ./environments/{environment} directory.
$ terraform plan
When you run this command, Terraform compares the current state with the defined configuration and displays the changes it would apply. Reviewing this output helps prevent unintended changes.
If the changes look correct, run the following command to apply them.
$ terraform apply
The apply command shows the plan again and, once you confirm, applies the changes to the actual resources. Because Terraform carries out the changes itself, the potential for manual errors is greatly reduced.
In this way, Terraform makes infrastructure easier to manage: it tracks the current state of the infrastructure, then proposes and applies the changes needed to bring it in line with the desired state.
Conclusion
While creating the security groups, I found several mistakes in the rule configuration. Thanks to Terraform, correcting them was much simpler. As a next step, I'm thinking about introducing GitHub Actions workflows that automatically run terraform plan when a PR is created. I also plan to build workflows that perform security checks when pushing to Git. By automating these processes, I aim to further enhance security and work more efficiently.