Bayajid Alam Juyel
From Monolith to Scalable: Building a Cost-Effective Auto-Scaling Architecture on AWS

How I transformed a simple Todo app into a fault-tolerant, auto-scaling system with zero-downtime deployment

The Challenge

You've built a simple monolithic application—let's say a Todo app called "SimplyDone"—and suddenly, your user base explodes. Your single server is struggling, users are experiencing downtime, and you're manually scrambling to deploy updates. Sound familiar?

This was exactly the scenario I faced when tasked with transforming a basic Notes/Todo application into a production-ready, scalable system that could handle sudden traffic spikes while maintaining high availability and fault tolerance. The catch? Everything needed to be automated—from infrastructure provisioning to deployment.

The Architecture Decision: Why ALB Over NGINX

When designing the load balancing strategy, I had to choose between NGINX and AWS Application Load Balancer (ALB). While NGINX is a fantastic reverse proxy, deploying it on a single instance would create a single point of failure—exactly what we're trying to avoid.

ALB, on the other hand, is:

  • Inherently fault-tolerant across multiple Availability Zones
  • Managed by AWS (no maintenance overhead)
  • Intelligent with health checks and traffic distribution
  • Cost-effective at scale

The choice was clear: ALB would handle traffic distribution while I focused on application scalability.
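To make that concrete, here's a minimal Pulumi sketch of how an ALB with a health-checked target group might be wired up. The resource names, the `vpc`/subnet/security-group variables, and the `/health` path are illustrative assumptions, not the project's exact code:

```typescript
import * as aws from "@pulumi/aws";

// Assumes `vpc`, `publicSubnet1`, `publicSubnet2`, `albSecurityGroup` exist
const alb = new aws.lb.LoadBalancer("todo-alb", {
    loadBalancerType: "application",
    subnets: [publicSubnet1.id, publicSubnet2.id],  // public subnets in two AZs
    securityGroups: [albSecurityGroup.id],
});

const targetGroup = new aws.lb.TargetGroup("todo-tg", {
    port: 3000,
    protocol: "HTTP",
    vpcId: vpc.id,
    healthCheck: {
        path: "/health",          // assumed health endpoint
        healthyThreshold: 2,
        unhealthyThreshold: 3,
        interval: 15,
    },
});

new aws.lb.Listener("todo-listener", {
    loadBalancerArn: alb.arn,
    port: 80,
    defaultActions: [{ type: "forward", targetGroupArn: targetGroup.arn }],
});
```

The health check is what makes the ALB "intelligent": instances failing `/health` are pulled out of rotation automatically, with no operator action.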

Infrastructure as Code: The Pulumi Approach

Instead of clicking through the AWS console, I used Pulumi with TypeScript to define the entire infrastructure. Here's why this approach rocks:

import * as aws from "@pulumi/aws";

// Multi-AZ VPC setup for high availability
const vpc = new aws.ec2.Vpc("todo-infra-vpc", {
    cidrBlock: "10.10.0.0/16",
    enableDnsHostnames: true,
    enableDnsSupport: true,
});

// Auto Scaling Group with intelligent scaling
// (subnet, target group, and launch template definitions omitted for brevity)
const asg = new aws.autoscaling.Group("node-app-asg", {
    vpcZoneIdentifiers: [privateSubnet1.id, privateSubnet2.id],
    targetGroupArns: [targetGroup.arn],
    healthCheckType: "ELB",
    desiredCapacity: 2,
    minSize: 1,
    maxSize: 5,
});

Key Architectural Decisions:

  • Multi-AZ Deployment: Spread across multiple Availability Zones for maximum uptime
  • Private/Public Subnet Isolation: Backend instances in private subnets for security
  • Auto Scaling: CPU-based scaling (scale up at 80%, scale down at 10%)
  • Health Check Integration: ELB health checks ensure only healthy instances receive traffic
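The CPU thresholds above translate into CloudWatch alarms wired to scaling policies. Here's a hedged Pulumi sketch of that pairing — alarm names, cooldowns, and periods are my assumptions, not the project's exact values:

```typescript
import * as aws from "@pulumi/aws";

// Scale out by one instance when average CPU exceeds 80%
const scaleUp = new aws.autoscaling.Policy("scale-up", {
    autoscalingGroupName: asg.name,
    adjustmentType: "ChangeInCapacity",
    scalingAdjustment: 1,
    cooldown: 120,
});

new aws.cloudwatch.MetricAlarm("cpu-high", {
    comparisonOperator: "GreaterThanThreshold",
    threshold: 80,
    metricName: "CPUUtilization",
    namespace: "AWS/EC2",
    statistic: "Average",
    period: 60,
    evaluationPeriods: 2,
    dimensions: { AutoScalingGroupName: asg.name },
    alarmActions: [scaleUp.arn],
});

// Scale in by one instance when average CPU drops below 10%
const scaleDown = new aws.autoscaling.Policy("scale-down", {
    autoscalingGroupName: asg.name,
    adjustmentType: "ChangeInCapacity",
    scalingAdjustment: -1,
    cooldown: 300,
});

new aws.cloudwatch.MetricAlarm("cpu-low", {
    comparisonOperator: "LessThanThreshold",
    threshold: 10,
    metricName: "CPUUtilization",
    namespace: "AWS/EC2",
    statistic: "Average",
    period: 60,
    evaluationPeriods: 2,
    dimensions: { AutoScalingGroupName: asg.name },
    alarmActions: [scaleDown.arn],
});
```

The longer scale-in cooldown is deliberate: scaling out fast and in slowly avoids flapping when traffic hovers near a threshold.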

The Automation Pipeline: One Command Deployment

The magic happens in the Makefile. One command (make auto-deploy) orchestrates the entire deployment:

# Complete zero-touch deployment
auto-deploy: setup-infrastructure setup-backend deploy-frontend-with-alb
    @echo "🎉 DEPLOYMENT COMPLETE! 🎉"

What happens under the hood:

  1. Image Building: Docker images for frontend and backend are built and pushed to Docker Hub
  2. Infrastructure Provisioning: Pulumi deploys VPC, subnets, ALB, Auto Scaling Groups
  3. Dynamic Configuration: ALB DNS is automatically extracted and injected into frontend config
  4. Ansible Automation: Instances are provisioned and containers deployed via bastion host
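Step 3 — injecting the ALB DNS into the frontend — boils down to templating an env file from a Pulumi stack output (e.g. the JSON printed by `pulumi stack output --json`). A minimal sketch of that glue; the output key `albDnsName` and the `VITE_API_URL` variable name are assumptions about this project, not confirmed names:

```typescript
// Render a frontend .env file from Pulumi stack outputs.
interface StackOutputs {
    albDnsName: string;  // assumed stack output key
}

function renderFrontendEnv(outputs: StackOutputs): string {
    // The frontend talks to the backend through the ALB, so its API base
    // URL is only known after `pulumi up` finishes — hence the templating.
    return `VITE_API_URL=http://${outputs.albDnsName}/api\n`;
}
```

A Makefile target can then pipe the stack output through this helper and write the result next to the frontend Dockerfile before the image is built.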

Smart Network Architecture

Since backend instances live in private subnets (security first!), direct SSH access isn't possible. Instead of manual bastion jumping, I automated everything with Ansible:

# Ansible automatically handles bastion proxy
ansible_ssh_common_args: "-o ProxyCommand=\"ssh -W %h:%p -i keyfile ubuntu@bastion-host\""

The system automatically:

  • Installs Docker on all instances
  • Configures MongoDB in the private subnet
  • Deploys backend containers with proper environment variables
  • Sets up frontend with dynamic ALB DNS configuration
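Those four tasks map naturally onto Ansible plays. A sketch of what one such play might look like — host group names, the image tag, and variables here are illustrative, not the project's actual playbook:

```yaml
# Illustrative play: deploy the backend container on private-subnet hosts
- hosts: backend
  become: true
  vars:
    backend_image: "myuser/simplydone-backend:latest"   # assumed image name
  tasks:
    - name: Ensure Docker is installed
      ansible.builtin.apt:
        name: docker.io
        state: present
        update_cache: true

    - name: Run backend container
      community.docker.docker_container:
        name: backend
        image: "{{ backend_image }}"
        ports:
          - "3000:3000"
        env:
          MONGO_URL: "mongodb://{{ mongo_private_ip }}:27017/todos"
        restart_policy: always
```

Because the bastion `ProxyCommand` lives in the inventory, the play itself stays completely unaware of the network topology.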

Cost Optimization Strategies

Smart Scaling: The Auto Scaling Group starts with 2 instances but can scale to 5 during peak traffic, then scale back down to 1 during low usage periods.

Resource Right-Sizing: Using t2.micro instances keeps costs minimal while providing adequate performance for this workload.

Infrastructure Automation: No manual intervention means no idle developer time spent on deployments.

Real-World Performance

The deployed system achieved:

  • Zero manual intervention for deployments
  • 90-second infrastructure provisioning time
  • Automatic health checking and failover
  • Cost-effective scaling based on actual demand

The Database Strategy: Pragmatic Choices

For this demonstration, I deployed MongoDB on a single EC2 instance in the private subnet. While this works for the demo scope, production deployments should consider:

  • AWS DocumentDB - MongoDB-compatible, fully managed by AWS with built-in scaling and backup
  • ElastiCache for Redis - Managed caching layer for improved performance
  • Multi-AZ deployment - Automatic failover and high availability
  • VPC endpoint integration - Secure, private connectivity without internet gateway dependency

Why DocumentDB over MongoDB Atlas?
Since we're already invested in the AWS ecosystem with VPC, ALB, and EC2, DocumentDB offers:

  • Seamless VPC integration - No cross-cloud networking complexity
  • Consistent billing - Single AWS invoice vs. multiple vendors
  • Native AWS IAM integration - Unified access management
  • Lower data transfer costs - No egress charges between AWS services
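Provisioning DocumentDB alongside the rest of the stack is a small Pulumi addition. A sketch under the same caveats as before — names, the instance class, and the `dbPassword` variable are placeholders:

```typescript
import * as aws from "@pulumi/aws";

// Assumes `privateSubnet1`/`privateSubnet2` and `dbSecurityGroup` exist
const subnetGroup = new aws.docdb.SubnetGroup("todo-db-subnets", {
    subnetIds: [privateSubnet1.id, privateSubnet2.id],
});

const cluster = new aws.docdb.Cluster("todo-docdb", {
    masterUsername: "todoadmin",
    masterPassword: dbPassword,          // e.g. a pulumi.Config secret
    dbSubnetGroupName: subnetGroup.name,
    vpcSecurityGroupIds: [dbSecurityGroup.id],
    skipFinalSnapshot: true,             // demo setting; keep snapshots in prod
});

// At least one instance; add more across AZs for automatic failover
new aws.docdb.ClusterInstance("todo-docdb-0", {
    clusterIdentifier: cluster.id,
    instanceClass: "db.t3.medium",
});
```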

Lessons Learned: What I'd Do Differently

  1. Container Health Checks: Implementing more sophisticated health endpoints beyond the basic /health route.
  2. Monitoring Integration: Adding CloudWatch dashboards and alerts for better observability.
  3. Blue-Green Deployments: For truly zero-downtime updates in production environments.
  4. Infrastructure Testing: Automated infrastructure validation before deployment.
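On point 1: a richer `/health` endpoint typically aggregates dependency checks rather than returning a bare 200. A small sketch of that idea — the check names and response shape are assumptions, not this app's actual endpoint:

```typescript
// Aggregate dependency checks into one health verdict.
type CheckResult = { name: string; ok: boolean };

function healthReport(checks: CheckResult[]): { status: "ok" | "degraded"; failing: string[] } {
    const failing = checks.filter(c => !c.ok).map(c => c.name);
    // Report degraded (letting the ALB fail the instance) if anything is down
    return { status: failing.length === 0 ? "ok" : "degraded", failing };
}

// In an Express handler this might back `GET /health`, e.g.:
// res.status(report.status === "ok" ? 200 : 503).json(report);
```

Returning 503 on degradation is what lets the ELB health check (configured earlier) actually route traffic away from an instance whose database connection has died.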

The Deployment Experience

The entire system can be deployed with three simple commands:

# Build and push containers
make build-all push-all

# Deploy everything
make auto-deploy

# Test the deployment
make test-deployment

The automation handles ALB DNS extraction, environment variable injection, and container orchestration seamlessly.

Case Study: E-commerce API Scaling

Let me share how these same principles apply to a different scenario: scaling an e-commerce API.

Imagine you're running a flash sale for a popular product. Traffic spikes from 100 to 10,000 concurrent users in minutes. With this architecture:

  1. Auto Scaling Group automatically launches new backend instances
  2. ALB distributes traffic across healthy instances
  3. MongoDB (or preferably managed DocumentDB) handles the data layer
  4. CloudWatch alarms trigger scaling events based on CPU/memory metrics

The system scales horizontally, maintaining response times while handling the traffic surge. When the sale ends, instances automatically scale back down, optimizing costs.
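For request-driven spikes like a flash sale, target-tracking on ALB request count per target is often a better fit than CPU alarms. A hedged Pulumi sketch, assuming the `asg`, `alb`, and `targetGroup` resources from earlier and a target value chosen for illustration:

```typescript
import * as aws from "@pulumi/aws";
import * as pulumi from "@pulumi/pulumi";

// Keep each instance near ~500 requests per minute; the ASG adds or
// removes capacity automatically as load drifts from that target.
new aws.autoscaling.Policy("request-count-tracking", {
    autoscalingGroupName: asg.name,
    policyType: "TargetTrackingScaling",
    targetTrackingConfiguration: {
        predefinedMetricSpecification: {
            predefinedMetricType: "ALBRequestCountPerTarget",
            // AWS expects "<alb-arn-suffix>/<target-group-arn-suffix>"
            resourceLabel: pulumi.interpolate`${alb.arnSuffix}/${targetGroup.arnSuffix}`,
        },
        targetValue: 500,
    },
});
```

Unlike the threshold alarms, target tracking handles both the scale-out and the scale-in side with a single policy, which suits bursty traffic well.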

Key Takeaways

For Infrastructure Teams:

  • Invest in automation early—it pays dividends quickly
  • Choose managed services over self-hosted when possible
  • Design for failure from day one

For Development Teams:

  • Containerize applications for consistent deployment
  • Implement proper health checks
  • Build with horizontal scaling in mind

For Business Teams:

  • Automated scaling reduces operational costs
  • High availability improves customer experience
  • Infrastructure as Code enables rapid iteration

Next Steps: Taking It Further

Ready to implement something similar? Here's your roadmap:

  1. Start Small: Begin with a simple containerized application
  2. Automate Early: Use Infrastructure as Code from the beginning
  3. Monitor Everything: Implement observability before you need it
  4. Test Scaling: Regularly validate your scaling assumptions
  5. Optimize Costs: Review and adjust scaling parameters based on actual usage

The beauty of this architecture lies in its simplicity and automation. With proper implementation, you can handle traffic spikes gracefully while keeping costs under control.


Have questions about implementing auto-scaling architectures? Drop a comment below or connect with me for more detailed discussions about cloud-native scaling strategies.

Tech Stack Used: React, Node.js, Express, MongoDB, Docker, AWS (EC2, ALB, VPC), Pulumi, Ansible, TypeScript

GitHub Repository: Check out the complete implementation


Follow me for more content on cloud architecture, DevOps automation, and cost-effective scaling strategies! 🚀
