In today's world of cloud computing, managing infrastructure manually is becoming a thing of the past. Infrastructure as Code (IaC) has emerged as a best practice, allowing developers to define and provision infrastructure using code rather than manual processes. In this blog post, I'll share my experience creating a fully automated deployment pipeline for a Django blog application using Terraform, GitHub Actions, and AWS. The complete code is available at https://github.com/Harivelu0/e2e-django-infra-pipeline
Project Overview
Our goal was to build a production-ready Django blog with:
- Automated infrastructure provisioning
- Secure networking architecture
- CI/CD pipeline for continuous deployment
- Container-based application deployment
Let's dive into how we made this happen and the valuable lessons learned along the way.
Architecture: Security by Design
Our application architecture follows AWS best practices:
- VPC with public and private subnets across multiple availability zones
- Public resources: Load balancer and NAT gateway in public subnets
- Private resources: ECS containers and RDS database in private subnets
- Security controls: IAM roles, security groups, and secrets management
This design ensures our application is both secure and scalable. The load balancer handles inbound traffic, while the NAT gateway enables outbound internet access for our private resources.
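To make the layout concrete, here is a minimal Terraform sketch of the network, assuming two availability zones; the CIDR ranges and resource names are illustrative, and the internet gateway and route tables are omitted for brevity:

data "aws_availability_zones" "available" {}

# VPC spanning two availability zones (CIDRs are illustrative)
resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
}

# Public subnets host the load balancer and NAT gateway
resource "aws_subnet" "public" {
  count                   = 2
  vpc_id                  = aws_vpc.main.id
  cidr_block              = "10.0.${count.index}.0/24"
  availability_zone       = data.aws_availability_zones.available.names[count.index]
  map_public_ip_on_launch = true
}

# Private subnets host the ECS tasks and the RDS instance
resource "aws_subnet" "private" {
  count             = 2
  vpc_id            = aws_vpc.main.id
  cidr_block        = "10.0.${count.index + 10}.0/24"
  availability_zone = data.aws_availability_zones.available.names[count.index]
}

# A NAT gateway in a public subnet gives private resources outbound internet access
resource "aws_eip" "nat" {
  domain = "vpc"
}

resource "aws_nat_gateway" "main" {
  allocation_id = aws_eip.nat.id
  subnet_id     = aws_subnet.public[0].id
}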
CI/CD Pipeline: Infrastructure First, Then Application
We created two separate GitHub Actions workflows:
- Infrastructure Pipeline:
  - Initializes Terraform
  - Plans and applies infrastructure changes
  - Stores outputs in Parameter Store
- Application Pipeline:
  - Builds Docker image with Django code
  - Pushes image to ECR
  - Updates ECS task definition
  - Deploys to ECS
The application pipeline is configured to run only after the infrastructure pipeline completes successfully, ensuring infrastructure is ready before deployment.
Terraform Best Practices Implemented
Terraform formed the backbone of our infrastructure provisioning. Here are the key best practices we implemented:
1. State Management
terraform {
  backend "s3" {
    bucket         = "terraform-state-bucket"
    key            = "ecs-app/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"
    encrypt        = true
  }
}
- Remote state storage: Used S3 for storing state, enabling team collaboration
- State locking: Implemented DynamoDB for state locking to prevent concurrent modifications
- Encryption: Enabled encryption for state files to protect sensitive information
2. Resource Organization
We organized resources logically, separating concerns:
- Network infrastructure (VPC, subnets, gateways)
- Security components (IAM roles, security groups)
- Database resources
- Container infrastructure
This makes the codebase more maintainable and easier to understand.
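One way to express this kind of separation in Terraform is with modules. The sketch below is illustrative only; the module names, paths, and inputs are assumptions rather than the exact layout of the project:

module "network" {
  source       = "./modules/network"
  project_name = var.project_name
}

module "database" {
  source             = "./modules/database"
  project_name       = var.project_name
  private_subnet_ids = module.network.private_subnet_ids
  db_password        = var.db_password
}

module "ecs" {
  source             = "./modules/ecs"
  project_name       = var.project_name
  public_subnet_ids  = module.network.public_subnet_ids
  private_subnet_ids = module.network.private_subnet_ids
}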
3. Variable Management
variable "project_name" {
description = "Name of the project, used for resource naming"
type = string
}
variable "db_password" {
description = "Password for the database"
type = string
sensitive = true
}
- Descriptive variables: Created well-documented variables
- Sensitive data marking: Used the sensitive flag for credentials
- Default values: Provided sensible defaults where appropriate
- Variable validation: Implemented validation for critical values (see the sketch below)
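A validation block rejects bad input before any resources are touched. The variable below is hypothetical, shown only to illustrate the pattern:

variable "environment" {
  description = "Deployment environment, used in resource names and tags"
  type        = string
  default     = "production"

  validation {
    condition     = contains(["staging", "production"], var.environment)
    error_message = "The environment must be either staging or production."
  }
}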
4. Dependency Management
resource "aws_ecs_service" "app" {
# ...
depends_on = [aws_lb_listener.http]
}
- Explicit dependencies: Used depends_on to ensure resources are created in the correct order
- Implicit dependencies: Leveraged Terraform's resource references for automatic dependency management (see the example below)
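As an example of an implicit dependency, simply referencing another resource's attribute is enough for Terraform to order creation correctly; the target group below is illustrative:

resource "aws_lb_target_group" "app" {
  name        = "${var.project_name}-tg"
  port        = 8000
  protocol    = "HTTP"
  target_type = "ip"
  # Referencing aws_vpc.main.id creates an implicit dependency:
  # Terraform knows the VPC must exist before this target group
  vpc_id      = aws_vpc.main.id
}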
5. Output Management
resource "aws_ssm_parameter" "app_secret_name" {
name = "/${var.project_name}/app_secret_name"
description = "Name of the secret in AWS Secrets Manager"
type = "String"
value = aws_secretsmanager_secret.app_secrets.name
}
- Parameter Store integration: Stored infrastructure outputs in AWS Parameter Store
- Descriptive outputs: Provided clear descriptions for all outputs
- Secure output handling: Used secure methods for sensitive outputs
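Alongside the Parameter Store entries, regular Terraform outputs carried descriptions as well; a small illustrative example (the output name and resource reference are assumptions):

output "alb_dns_name" {
  description = "Public DNS name of the application load balancer"
  value       = aws_lb.app.dns_name
}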
6. Resource Naming
resource "aws_security_group" "rds_sg" {
name = "${var.project_name}-rds-sg"
description = "Allow database traffic"
vpc_id = aws_vpc.main.id
# ...
}
- Consistent naming convention: Used the project name as a prefix for all resources
- Descriptive naming: Added resource type suffixes for clarity
- Tags: Applied tags for better resource management and cost allocation
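A convenient way to apply tags consistently is the AWS provider's default_tags block, which stamps every taggable resource the provider creates. The tag keys and var.aws_region below are illustrative assumptions:

provider "aws" {
  region = var.aws_region

  # Applied automatically to every taggable resource this provider creates
  default_tags {
    tags = {
      Project   = var.project_name
      ManagedBy = "terraform"
    }
  }
}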
Security Best Practices
Security was a top priority throughout our implementation:
1. Network Segmentation
- Private subnets: Placed database and application containers in private subnets
- Public access control: Only the load balancer is accessible from the internet
2. Least Privilege Principle
resource "aws_iam_policy" "ecs_execution_secrets_policy" {
name = "${var.project_name}-ecs-execution-secrets-policy"
policy = jsonencode({
Version = "2012-10-17"
Statement = [{
Effect = "Allow"
Action = ["secretsmanager:GetSecretValue", "secretsmanager:DescribeSecret"]
Resource = aws_secretsmanager_secret.app_secrets.arn
}]
})
}
- Specific IAM roles: Created role-specific IAM policies
- Resource-level permissions: Limited permissions to specific resources
- Separate roles: Used different roles for execution and task permissions
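To make the distinction between the two roles concrete, here is a sketch of how they attach to an ECS task definition; the role names, container port, and var.app_image are assumptions:

resource "aws_ecs_task_definition" "app" {
  family                   = "${var.project_name}-app"
  requires_compatibilities = ["FARGATE"]
  network_mode             = "awsvpc"
  cpu                      = 256
  memory                   = 512

  # Execution role: used by ECS itself to pull the image from ECR and fetch secrets at startup
  execution_role_arn = aws_iam_role.ecs_execution_role.arn
  # Task role: assumed by the running Django application for any AWS calls it makes
  task_role_arn = aws_iam_role.ecs_task_role.arn

  container_definitions = jsonencode([{
    name         = "${var.project_name}-app"
    image        = var.app_image
    essential    = true
    portMappings = [{ containerPort = 8000, protocol = "tcp" }]
  }])
}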
3. Secrets Management
resource "aws_secretsmanager_secret" "app_secrets" {
name = "${var.project_name}-secrets-${random_string.secret_suffix.result}"
}
resource "aws_secretsmanager_secret_version" "app_secret_version" {
secret_id = aws_secretsmanager_secret.app_secrets.id
secret_string = jsonencode({
DB_NAME = var.db_name
DB_USER = var.db_username
DB_PASSWORD = var.db_password
# ...
})
}
- AWS Secrets Manager: Stored sensitive information in Secrets Manager
- Randomized names: Used random suffixes for secret names
- JSON structure: Organized secrets in structured JSON
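The random suffix referenced above comes from a small random_string resource. One reason for the suffix is that Secrets Manager keeps deleted secrets in a recovery window, so recreating a secret with the exact same name can fail; a minimal sketch:

resource "random_string" "secret_suffix" {
  length  = 6
  special = false
  upper   = false
}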
4. Security Groups
resource "aws_security_group" "rds_sg" {
# ...
ingress {
from_port = 3306
to_port = 3306
protocol = "tcp"
security_groups = [aws_security_group.ecs_sg.id]
}
}
- Precise rules: Defined specific inbound/outbound rules
- Security group references: Used security group references instead of CIDR blocks
- Default deny: Followed "deny by default" approach
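As a sketch of the deny-by-default approach, the ECS tasks' security group accepts inbound traffic only from the load balancer's security group and states its outbound rules explicitly; the alb_sg name and container port are assumptions:

resource "aws_security_group" "ecs_sg" {
  name   = "${var.project_name}-ecs-sg"
  vpc_id = aws_vpc.main.id

  # Only the load balancer may reach the application port; nothing else gets in
  ingress {
    from_port       = 8000
    to_port         = 8000
    protocol        = "tcp"
    security_groups = [aws_security_group.alb_sg.id]
  }

  # Outbound access for image pulls (via NAT), the database, and Secrets Manager
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}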
Challenges and Solutions
During implementation, we faced several challenges:
1. Database Migrations in Private Subnets
Initially, we tried running migrations from GitHub Actions, but that would have required exposing the database publicly, which posed a security risk and still left us with connectivity issues.
Solution: We created an entrypoint script in the Docker container that runs migrations at startup. Since the containers run inside the VPC, they can securely access the private database.
#!/bin/bash
# Container entrypoint script
set -e  # fail fast if migrations break instead of starting a broken app
# Apply migrations from inside the VPC, where the private RDS instance is reachable
python manage.py migrate --noinput
# Create the admin user on first boot; ignore the error if one already exists
python manage.py createsuperuser --noinput || echo "Admin already exists"
# Hand PID 1 to gunicorn so ECS stop signals reach it directly
exec gunicorn myapp.wsgi:application --bind 0.0.0.0:8000
2. Image Reference Management
We initially set the default Docker image to "nginx:latest" as a placeholder in our Terraform code. This caused issues: the CI/CD pipeline built and pushed a Django image to ECR, but the ECS service kept running the Nginx image.
Solution: We implemented a task definition update step in our deployment pipeline that:
- Gets the current task definition
- Updates the image reference
- Registers a new task definition
- Updates the service to use this new definition
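A complementary Terraform-side pattern, sketched here as an assumption rather than a confirmed part of this project, is to have Terraform ignore the task definition revision on the service so a later terraform apply doesn't roll the service back to the placeholder image:

resource "aws_ecs_service" "app" {
  # ... (same service resource shown earlier)

  # The deployment pipeline owns task definition revisions after the first apply,
  # so Terraform should not revert the service to the placeholder image
  lifecycle {
    ignore_changes = [task_definition]
  }
}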
3. IAM Role Confusion
Our containers couldn't access secrets from AWS Secrets Manager due to missing permissions. We had correctly configured the task role but forgotten about the execution role.
Solution: Added appropriate permissions to both roles:
- Execution role: For container setup (pulling images, accessing secrets)
- Task role: For the running application
4. Terraform State Management
We occasionally encountered state lock issues when pipelines were cancelled or failed.
Solution: Implemented proper state management with S3 and DynamoDB, and learned how to handle locked states:
terraform force-unlock <lock-id>
Lessons Learned
This project taught us several valuable lessons:
- Container-based migrations are superior: Running migrations within the application container is more secure and reliable than external migration scripts.
- Placeholder patterns require careful handling: Using placeholder values in infrastructure code is necessary but requires proper updates during deployment.
- IAM roles are easily confused: Always remember both the execution role (for container startup) and the task role (for the running application).
- State management is critical: Proper Terraform state management with S3 and DynamoDB prevents headaches when working in teams.
- Pipeline coordination matters: Ensure infrastructure is ready before deploying applications that depend on it.
Best Practices Worth Following
These practices made the project successful:
- Keep databases private: Never expose databases to the public internet, even for migrations.
- Use entrypoint scripts for initialization: Handle database migrations and initial setup when containers start.
- Separate infrastructure from application deployment: Use different pipelines for infrastructure provisioning and application deployment.
- Use Parameter Store for configuration: Store infrastructure outputs in Parameter Store for use by application pipelines.
- Implement proper security groups: Control traffic between components with precise security group rules.
Why Infrastructure as Code via Pipelines?
Using Terraform and GitHub Actions for infrastructure deployment provides numerous benefits:
1. Consistency and Reproducibility
Every environment is deployed exactly the same way, eliminating "it works on my machine" problems. We can recreate the entire infrastructure identically at any time.
2. Version Control for Infrastructure
Infrastructure changes are tracked in Git, providing history, rollback capabilities, and collaboration features like pull requests.
3. Automated Testing and Validation
The pipeline can validate infrastructure changes before applying them, catching errors early.
4. Reduced Human Error
Automation eliminates manual steps, reducing the chance of configuration mistakes.
5. Self-Documenting Infrastructure
The Terraform code serves as documentation for the infrastructure, showing exactly what resources exist and how they're configured.
6. Faster Deployment and Recovery
In case of issues, we can quickly redeploy the entire infrastructure without manual intervention.
Conclusion
Implementing a Django blog application on AWS using Terraform and GitHub Actions has been a rewarding journey. By following infrastructure as code principles and implementing security best practices, we've created a robust, secure, and maintainable deployment pipeline.
The combination of Terraform for infrastructure provisioning and GitHub Actions for CI/CD provides a powerful foundation for modern application deployment. While we encountered challenges along the way, each one taught us valuable lessons about cloud architecture, security, and DevOps practices.
I hope this blog post helps you in your own infrastructure automation journey. Remember, the goal isn't just to automate for automation's sake, but to create infrastructure that is reliable, secure, and maintainable.
What challenges have you faced when implementing infrastructure as code? Share your experiences in the comments!
Note: The diagram shown in this post is a simplified representation of the actual architecture. The complete code for this project is available on GitHub.