In today's world of cloud computing, managing infrastructure manually is becoming a thing of the past. Infrastructure as Code (IaC) has emerged as a best practice, allowing developers to define and provision infrastructure using code rather than manual processes. In this blog post, I'll share my experience creating a fully automated deployment pipeline for a Django blog application using Terraform, GitHub Actions, and AWS. The complete code is available at https://github.com/Harivelu0/e2e-django-infra-pipeline
Project Overview
Our goal was to build a production-ready Django blog with:
- Automated infrastructure provisioning
- Secure networking architecture
- CI/CD pipeline for continuous deployment
- Container-based application deployment
Let's dive into how we made this happen and the valuable lessons learned along the way.
Architecture: Security by Design
Our application architecture follows AWS best practices:
- VPC with public and private subnets across multiple availability zones
- Public resources: Load balancer and NAT gateway in public subnets
- Private resources: ECS containers and RDS database in private subnets
- Security controls: IAM roles, security groups, and secrets management
This design ensures our application is both secure and scalable. The load balancer handles inbound traffic, while the NAT gateway enables outbound internet access for our private resources.
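To make the layout concrete, here is a minimal Terraform sketch of the network, assuming two availability zones; the CIDR ranges and resource names are illustrative, and the internet gateway and route tables are omitted for brevity:

data "aws_availability_zones" "available" {}

# VPC spanning two availability zones (CIDRs are illustrative)
resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
}

# Public subnets host the load balancer and NAT gateway
resource "aws_subnet" "public" {
  count                   = 2
  vpc_id                  = aws_vpc.main.id
  cidr_block              = "10.0.${count.index}.0/24"
  availability_zone       = data.aws_availability_zones.available.names[count.index]
  map_public_ip_on_launch = true
}

# Private subnets host the ECS tasks and the RDS instance
resource "aws_subnet" "private" {
  count             = 2
  vpc_id            = aws_vpc.main.id
  cidr_block        = "10.0.${count.index + 10}.0/24"
  availability_zone = data.aws_availability_zones.available.names[count.index]
}

# A NAT gateway in a public subnet gives private resources outbound internet access
resource "aws_eip" "nat" {
  domain = "vpc"
}

resource "aws_nat_gateway" "main" {
  allocation_id = aws_eip.nat.id
  subnet_id     = aws_subnet.public[0].id
}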
CI/CD Pipeline: Infrastructure First, Then Application
We created two separate GitHub Actions workflows:
- Infrastructure Pipeline:
  - Initializes Terraform
  - Plans and applies infrastructure changes
  - Stores outputs in Parameter Store
- Application Pipeline:
  - Builds Docker image with Django code
  - Pushes image to ECR
  - Updates ECS task definition
  - Deploys to ECS
The application pipeline is configured to run only after the infrastructure pipeline completes successfully, ensuring infrastructure is ready before deployment.
Terraform Best Practices Implemented
Terraform formed the backbone of our infrastructure provisioning. Here are the key best practices we implemented:
1. State Management
terraform {
  backend "s3" {
    bucket         = "terraform-state-bucket"
    key            = "ecs-app/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"
    encrypt        = true
  }
}
- Remote state storage: Used S3 for storing state, enabling team collaboration
- State locking: Implemented DynamoDB for state locking to prevent concurrent modifications
- Encryption: Enabled encryption for state files to protect sensitive information
2. Resource Organization
We organized resources logically, separating concerns:
- Network infrastructure (VPC, subnets, gateways)
- Security components (IAM roles, security groups)
- Database resources
- Container infrastructure
This makes the codebase more maintainable and easier to understand.
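One way to express this kind of separation in Terraform is with modules. The sketch below is illustrative only; the module names, paths, and inputs are assumptions rather than the exact layout of the project:

module "network" {
  source       = "./modules/network"
  project_name = var.project_name
}

module "database" {
  source             = "./modules/database"
  project_name       = var.project_name
  private_subnet_ids = module.network.private_subnet_ids
  db_password        = var.db_password
}

module "ecs" {
  source             = "./modules/ecs"
  project_name       = var.project_name
  public_subnet_ids  = module.network.public_subnet_ids
  private_subnet_ids = module.network.private_subnet_ids
}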
3. Variable Management
variable "project_name" {
description = "Name of the project, used for resource naming"
type = string
}
variable "db_password" {
description = "Password for the database"
type = string
sensitive = true
}
- Descriptive variables: Created well-documented variables
- Sensitive data marking: Used the sensitive flag for credentials
- Default values: Provided sensible defaults where appropriate
- Variable validation: Implemented validation for critical values (see the sketch below)
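A validation block rejects bad input before any resources are touched. The variable below is hypothetical, shown only to illustrate the pattern:

variable "environment" {
  description = "Deployment environment, used in resource names and tags"
  type        = string
  default     = "production"

  validation {
    condition     = contains(["staging", "production"], var.environment)
    error_message = "The environment must be either staging or production."
  }
}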
4. Dependency Management
resource "aws_ecs_service" "app" {
# ...
depends_on = [aws_lb_listener.http]
}
- Explicit dependencies: Used depends_on to ensure resources are created in the correct order
- Implicit dependencies: Leveraged Terraform's resource references for automatic dependency management (see the example below)
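As an example of an implicit dependency, simply referencing another resource's attribute is enough for Terraform to order creation correctly; the target group below is illustrative:

resource "aws_lb_target_group" "app" {
  name        = "${var.project_name}-tg"
  port        = 8000
  protocol    = "HTTP"
  target_type = "ip"
  # Referencing aws_vpc.main.id creates an implicit dependency:
  # Terraform knows the VPC must exist before this target group
  vpc_id      = aws_vpc.main.id
}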
5. Output Management
resource "aws_ssm_parameter" "app_secret_name" {
name = "/${var.project_name}/app_secret_name"
description = "Name of the secret in AWS Secrets Manager"
type = "String"
value = aws_secretsmanager_secret.app_secrets.name
}
- Parameter Store integration: Stored infrastructure outputs in AWS Parameter Store
- Descriptive outputs: Provided clear descriptions for all outputs
- Secure output handling: Used secure methods for sensitive outputs
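Alongside the Parameter Store entries, regular Terraform outputs carried descriptions as well; a small illustrative example (the output name and resource reference are assumptions):

output "alb_dns_name" {
  description = "Public DNS name of the application load balancer"
  value       = aws_lb.app.dns_name
}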
6. Resource Naming
resource "aws_security_group" "rds_sg" {
name = "${var.project_name}-rds-sg"
description = "Allow database traffic"
vpc_id = aws_vpc.main.id
# ...
}
- Consistent naming convention: Used the project name as a prefix for all resources
- Descriptive naming: Added resource type suffixes for clarity
- Tags: Applied tags for better resource management and cost allocation
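A convenient way to apply tags consistently is the AWS provider's default_tags block, which stamps every taggable resource the provider creates. The tag keys and var.aws_region below are illustrative assumptions:

provider "aws" {
  region = var.aws_region

  # Applied automatically to every taggable resource this provider creates
  default_tags {
    tags = {
      Project   = var.project_name
      ManagedBy = "terraform"
    }
  }
}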
Security Best Practices
Security was a top priority throughout our implementation:
1. Network Segmentation
- Private subnets: Placed database and application containers in private subnets
- Public access control: Only the load balancer is accessible from the internet
2. Least Privilege Principle
resource "aws_iam_policy" "ecs_execution_secrets_policy" {
name = "${var.project_name}-ecs-execution-secrets-policy"
policy = jsonencode({
Version = "2012-10-17"
Statement = [{
Effect = "Allow"
Action = ["secretsmanager:GetSecretValue", "secretsmanager:DescribeSecret"]
Resource = aws_secretsmanager_secret.app_secrets.arn
}]
})
}
- Specific IAM roles: Created role-specific IAM policies
- Resource-level permissions: Limited permissions to specific resources
- Separate roles: Used different roles for execution and task permissions
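To make the distinction between the two roles concrete, here is a sketch of how they attach to an ECS task definition; the role names, container port, and var.app_image are assumptions:

resource "aws_ecs_task_definition" "app" {
  family                   = "${var.project_name}-app"
  requires_compatibilities = ["FARGATE"]
  network_mode             = "awsvpc"
  cpu                      = 256
  memory                   = 512

  # Execution role: used by ECS itself to pull the image from ECR and fetch secrets at startup
  execution_role_arn = aws_iam_role.ecs_execution_role.arn
  # Task role: assumed by the running Django application for any AWS calls it makes
  task_role_arn = aws_iam_role.ecs_task_role.arn

  container_definitions = jsonencode([{
    name         = "${var.project_name}-app"
    image        = var.app_image
    essential    = true
    portMappings = [{ containerPort = 8000, protocol = "tcp" }]
  }])
}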
3. Secrets Management
resource "aws_secretsmanager_secret" "app_secrets" {
name = "${var.project_name}-secrets-${random_string.secret_suffix.result}"
}
resource "aws_secretsmanager_secret_version" "app_secret_version" {
secret_id = aws_secretsmanager_secret.app_secrets.id
secret_string = jsonencode({
DB_NAME = var.db_name
DB_USER = var.db_username
DB_PASSWORD = var.db_password
# ...
})
}
- AWS Secrets Manager: Stored sensitive information in Secrets Manager
- Randomized names: Used random suffixes for secret names
- JSON structure: Organized secrets in structured JSON
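The random suffix referenced above comes from a small random_string resource. One reason for the suffix is that Secrets Manager keeps deleted secrets in a recovery window, so recreating a secret with the exact same name can fail; a minimal sketch:

resource "random_string" "secret_suffix" {
  length  = 6
  special = false
  upper   = false
}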
4. Security Groups
resource "aws_security_group" "rds_sg" {
# ...
ingress {
from_port = 3306
to_port = 3306
protocol = "tcp"
security_groups = [aws_security_group.ecs_sg.id]
}
}
- Precise rules: Defined specific inbound/outbound rules
- Security group references: Used security group references instead of CIDR blocks
- Default deny: Followed "deny by default" approach
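As a sketch of the deny-by-default approach, the ECS tasks' security group accepts inbound traffic only from the load balancer's security group and states its outbound rules explicitly; the alb_sg name and container port are assumptions:

resource "aws_security_group" "ecs_sg" {
  name   = "${var.project_name}-ecs-sg"
  vpc_id = aws_vpc.main.id

  # Only the load balancer may reach the application port; nothing else gets in
  ingress {
    from_port       = 8000
    to_port         = 8000
    protocol        = "tcp"
    security_groups = [aws_security_group.alb_sg.id]
  }

  # Outbound access for image pulls (via NAT), the database, and Secrets Manager
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}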
Challenges and Solutions
During implementation, we faced several challenges:
1. Database Migrations in Private Subnets
Initially, we tried running migrations from GitHub Actions, but that would have required exposing the database publicly, which posed a security risk and still left us with connectivity issues.
Solution: We created an entrypoint script in the Docker container that runs migrations at startup. Since the containers run inside the VPC, they can securely access the private database.
#!/bin/bash
# Container entrypoint script
set -e  # fail fast if migrations break instead of starting a broken app
# Apply migrations from inside the VPC, where the private RDS instance is reachable
python manage.py migrate --noinput
# Create the admin user on first boot; ignore the error if one already exists
python manage.py createsuperuser --noinput || echo "Admin already exists"
# Hand PID 1 to gunicorn so ECS stop signals reach it directly
exec gunicorn myapp.wsgi:application --bind 0.0.0.0:8000
2. Image Reference Management
We initially set the default Docker image to "nginx:latest" as a placeholder in our Terraform code. This caused issues: the CI/CD pipeline built and pushed a Django image to ECR, but the ECS service kept running the Nginx image.
Solution: We implemented a task definition update step in our deployment pipeline that:
- Gets the current task definition
- Updates the image reference
- Registers a new task definition
- Updates the service to use this new definition
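A complementary Terraform-side pattern, sketched here as an assumption rather than a confirmed part of this project, is to have Terraform ignore the task definition revision on the service so a later terraform apply doesn't roll the service back to the placeholder image:

resource "aws_ecs_service" "app" {
  # ... (same service resource shown earlier)

  # The deployment pipeline owns task definition revisions after the first apply,
  # so Terraform should not revert the service to the placeholder image
  lifecycle {
    ignore_changes = [task_definition]
  }
}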
3. IAM Role Confusion
Our containers couldn't access secrets from AWS Secrets Manager due to missing permissions. We had correctly configured the task role but forgotten about the execution role.
Solution: Added appropriate permissions to both roles:
- Execution role: For container setup (pulling images, accessing secrets)
- Task role: For the running application
4. Terraform State Management
We occasionally encountered state lock issues when pipelines were cancelled or failed.
Solution: Implemented proper state management with S3 and DynamoDB, and learned how to handle locked states:
terraform force-unlock <lock-id>
Lessons Learned
This project taught us several valuable lessons:
- Container-based migrations are superior: Running migrations within the application container is more secure and reliable than external migration scripts.
- Placeholder patterns require careful handling: Using placeholder values in infrastructure code is necessary but requires proper updates during deployment.
- IAM roles are easily confused: Always remember both the execution role (for container startup) and the task role (for the running application).
- State management is critical: Proper Terraform state management with S3 and DynamoDB prevents headaches when working in teams.
- Pipeline coordination matters: Ensure infrastructure is ready before deploying applications that depend on it.
Best Practices Worth Following
These practices made the project successful:
- Keep databases private: Never expose databases to the public internet, even for migrations.
- Use entrypoint scripts for initialization: Handle database migrations and initial setup when containers start.
- Separate infrastructure from application deployment: Use different pipelines for infrastructure provisioning and application deployment.
- Use Parameter Store for configuration: Store infrastructure outputs in Parameter Store for use by application pipelines.
- Implement proper security groups: Control traffic between components with precise security group rules.
Why Infrastructure as Code via Pipelines?
Using Terraform and GitHub Actions for infrastructure deployment provides numerous benefits:
1. Consistency and Reproducibility
Every environment is deployed exactly the same way, eliminating "it works on my machine" problems. We can recreate the entire infrastructure identically at any time.
2. Version Control for Infrastructure
Infrastructure changes are tracked in Git, providing history, rollback capabilities, and collaboration features like pull requests.
3. Automated Testing and Validation
The pipeline can validate infrastructure changes before applying them, catching errors early.
4. Reduced Human Error
Automation eliminates manual steps, reducing the chance of configuration mistakes.
5. Self-Documenting Infrastructure
The Terraform code serves as documentation for the infrastructure, showing exactly what resources exist and how they're configured.
6. Faster Deployment and Recovery
In case of issues, we can quickly redeploy the entire infrastructure without manual intervention.
Conclusion
Implementing a Django blog application on AWS using Terraform and GitHub Actions has been a rewarding journey. By following infrastructure as code principles and implementing security best practices, we've created a robust, secure, and maintainable deployment pipeline.
The combination of Terraform for infrastructure provisioning and GitHub Actions for CI/CD provides a powerful foundation for modern application deployment. While we encountered challenges along the way, each one taught us valuable lessons about cloud architecture, security, and DevOps practices.
I hope this blog post helps you in your own infrastructure automation journey. Remember, the goal isn't just to automate for automation's sake, but to create infrastructure that is reliable, secure, and maintainable.
What challenges have you faced when implementing infrastructure as code? Share your experiences in the comments!
Note: The diagram shown in this post is a simplified representation of the actual architecture. The complete code for this project is available on GitHub.