Bayajid Alam Juyel
From Monolith to Scalable: Building a Cost-Effective Auto-Scaling Architecture on AWS

How I transformed a simple Todo app into a fault-tolerant, auto-scaling system with zero-downtime deployment

The Challenge

You've built a simple monolithic application—let's say a Todo app called "SimplyDone"—and suddenly, your user base explodes. Your single server is struggling, users are experiencing downtime, and you're manually scrambling to deploy updates. Sound familiar?

This was exactly the scenario I faced when tasked with transforming a basic Notes/Todo application into a production-ready, scalable system that could handle sudden traffic spikes while maintaining high availability and fault tolerance. The catch? Everything needed to be automated—from infrastructure provisioning to deployment.

The Architecture Decision: Why ALB Over NGINX

When designing the load balancing strategy, I had to choose between NGINX and AWS Application Load Balancer (ALB). While NGINX is a fantastic reverse proxy, deploying it on a single instance would create a single point of failure—exactly what we're trying to avoid.

ALB, on the other hand, is:

  • Inherently fault-tolerant across multiple Availability Zones
  • Managed by AWS (no maintenance overhead)
  • Intelligent with health checks and traffic distribution
  • Cost-effective at scale

The choice was clear: ALB would handle traffic distribution while I focused on application scalability.
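To make that concrete, here's a minimal Pulumi sketch of how an ALB with a health-checked target group might be wired up. The resource names, the `vpc`/subnet/security-group variables, and the `/health` path are illustrative assumptions, not the project's exact code:

```typescript
import * as aws from "@pulumi/aws";

// Assumes `vpc`, `publicSubnet1`, `publicSubnet2`, `albSecurityGroup` exist
const alb = new aws.lb.LoadBalancer("todo-alb", {
    loadBalancerType: "application",
    subnets: [publicSubnet1.id, publicSubnet2.id],  // public subnets in two AZs
    securityGroups: [albSecurityGroup.id],
});

const targetGroup = new aws.lb.TargetGroup("todo-tg", {
    port: 3000,
    protocol: "HTTP",
    vpcId: vpc.id,
    healthCheck: {
        path: "/health",          // assumed health endpoint
        healthyThreshold: 2,
        unhealthyThreshold: 3,
        interval: 15,
    },
});

new aws.lb.Listener("todo-listener", {
    loadBalancerArn: alb.arn,
    port: 80,
    defaultActions: [{ type: "forward", targetGroupArn: targetGroup.arn }],
});
```

The health check is what makes the ALB "intelligent": instances failing `/health` are pulled out of rotation automatically, with no operator action.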

Infrastructure as Code: The Pulumi Approach

Instead of clicking through the AWS console, I used Pulumi with TypeScript to define the entire infrastructure. Here's why this approach rocks:

import * as aws from "@pulumi/aws";

// Multi-AZ VPC setup for high availability
const vpc = new aws.ec2.Vpc("todo-infra-vpc", {
    cidrBlock: "10.10.0.0/16",
    enableDnsHostnames: true,
    enableDnsSupport: true,
});

// Auto Scaling Group with intelligent scaling
// (subnet, target group, and launch template definitions omitted for brevity)
const asg = new aws.autoscaling.Group("node-app-asg", {
    vpcZoneIdentifiers: [privateSubnet1.id, privateSubnet2.id],
    targetGroupArns: [targetGroup.arn],
    healthCheckType: "ELB",
    desiredCapacity: 2,
    minSize: 1,
    maxSize: 5,
});

Key Architectural Decisions:

  • Multi-AZ Deployment: Spread across multiple Availability Zones for maximum uptime
  • Private/Public Subnet Isolation: Backend instances in private subnets for security
  • Auto Scaling: CPU-based scaling (scale up at 80%, scale down at 10%)
  • Health Check Integration: ELB health checks ensure only healthy instances receive traffic
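The CPU thresholds above translate into CloudWatch alarms wired to scaling policies. Here's a hedged Pulumi sketch of that pairing — alarm names, cooldowns, and periods are my assumptions, not the project's exact values:

```typescript
import * as aws from "@pulumi/aws";

// Scale out by one instance when average CPU exceeds 80%
const scaleUp = new aws.autoscaling.Policy("scale-up", {
    autoscalingGroupName: asg.name,
    adjustmentType: "ChangeInCapacity",
    scalingAdjustment: 1,
    cooldown: 120,
});

new aws.cloudwatch.MetricAlarm("cpu-high", {
    comparisonOperator: "GreaterThanThreshold",
    threshold: 80,
    metricName: "CPUUtilization",
    namespace: "AWS/EC2",
    statistic: "Average",
    period: 60,
    evaluationPeriods: 2,
    dimensions: { AutoScalingGroupName: asg.name },
    alarmActions: [scaleUp.arn],
});

// Scale in by one instance when average CPU drops below 10%
const scaleDown = new aws.autoscaling.Policy("scale-down", {
    autoscalingGroupName: asg.name,
    adjustmentType: "ChangeInCapacity",
    scalingAdjustment: -1,
    cooldown: 300,
});

new aws.cloudwatch.MetricAlarm("cpu-low", {
    comparisonOperator: "LessThanThreshold",
    threshold: 10,
    metricName: "CPUUtilization",
    namespace: "AWS/EC2",
    statistic: "Average",
    period: 60,
    evaluationPeriods: 2,
    dimensions: { AutoScalingGroupName: asg.name },
    alarmActions: [scaleDown.arn],
});
```

The longer scale-in cooldown is deliberate: scaling out fast and in slowly avoids flapping when traffic hovers near a threshold.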

The Automation Pipeline: One Command Deployment

The magic happens in the Makefile. One command (make auto-deploy) orchestrates the entire deployment:

# Complete zero-touch deployment
auto-deploy: setup-infrastructure setup-backend deploy-frontend-with-alb
    @echo "🎉 DEPLOYMENT COMPLETE! 🎉"

What happens under the hood:

  1. Image Building: Docker images for frontend and backend are built and pushed to Docker Hub
  2. Infrastructure Provisioning: Pulumi deploys VPC, subnets, ALB, Auto Scaling Groups
  3. Dynamic Configuration: ALB DNS is automatically extracted and injected into frontend config
  4. Ansible Automation: Instances are provisioned and containers deployed via bastion host
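Step 3 — injecting the ALB DNS into the frontend — boils down to templating an env file from a Pulumi stack output (e.g. the JSON printed by `pulumi stack output --json`). A minimal sketch of that glue; the output key `albDnsName` and the `VITE_API_URL` variable name are assumptions about this project, not confirmed names:

```typescript
// Render a frontend .env file from Pulumi stack outputs.
interface StackOutputs {
    albDnsName: string;  // assumed stack output key
}

function renderFrontendEnv(outputs: StackOutputs): string {
    // The frontend talks to the backend through the ALB, so its API base
    // URL is only known after `pulumi up` finishes — hence the templating.
    return `VITE_API_URL=http://${outputs.albDnsName}/api\n`;
}
```

A Makefile target can then pipe the stack output through this helper and write the result next to the frontend Dockerfile before the image is built.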

Smart Network Architecture

Since backend instances live in private subnets (security first!), direct SSH access isn't possible. Instead of manual bastion jumping, I automated everything with Ansible:

# Ansible automatically handles bastion proxy
ansible_ssh_common_args: "-o ProxyCommand=\"ssh -W %h:%p -i keyfile ubuntu@bastion-host\""

The system automatically:

  • Installs Docker on all instances
  • Configures MongoDB in the private subnet
  • Deploys backend containers with proper environment variables
  • Sets up frontend with dynamic ALB DNS configuration
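Those four tasks map naturally onto Ansible plays. A sketch of what one such play might look like — host group names, the image tag, and variables here are illustrative, not the project's actual playbook:

```yaml
# Illustrative play: deploy the backend container on private-subnet hosts
- hosts: backend
  become: true
  vars:
    backend_image: "myuser/simplydone-backend:latest"   # assumed image name
  tasks:
    - name: Ensure Docker is installed
      ansible.builtin.apt:
        name: docker.io
        state: present
        update_cache: true

    - name: Run backend container
      community.docker.docker_container:
        name: backend
        image: "{{ backend_image }}"
        ports:
          - "3000:3000"
        env:
          MONGO_URL: "mongodb://{{ mongo_private_ip }}:27017/todos"
        restart_policy: always
```

Because the bastion `ProxyCommand` lives in the inventory, the play itself stays completely unaware of the network topology.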

Cost Optimization Strategies

Smart Scaling: The Auto Scaling Group starts with 2 instances but can scale to 5 during peak traffic, then scale back down to 1 during low usage periods.

Resource Right-Sizing: Using t2.micro instances keeps costs minimal while providing adequate performance for this workload.

Infrastructure Automation: No manual intervention means no idle developer time spent on deployments.

Real-World Performance

The deployed system achieved:

  • Zero manual intervention for deployments
  • 90-second infrastructure provisioning time
  • Automatic health checking and failover
  • Cost-effective scaling based on actual demand

The Database Strategy: Pragmatic Choices

For this demonstration, I deployed MongoDB on a single EC2 instance in the private subnet. While this works for the demo scope, production deployments should consider:

  • AWS DocumentDB - MongoDB-compatible, fully managed by AWS with built-in scaling and backup
  • ElastiCache for Redis - Managed caching layer for improved performance
  • Multi-AZ deployment - Automatic failover and high availability
  • VPC endpoint integration - Secure, private connectivity without internet gateway dependency

Why DocumentDB over MongoDB Atlas?
Since we're already invested in the AWS ecosystem with VPC, ALB, and EC2, DocumentDB offers:

  • Seamless VPC integration - No cross-cloud networking complexity
  • Consistent billing - Single AWS invoice vs. multiple vendors
  • Native AWS IAM integration - Unified access management
  • Lower data transfer costs - No egress charges between AWS services
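Provisioning DocumentDB alongside the rest of the stack is a small Pulumi addition. A sketch under the same caveats as before — names, the instance class, and the `dbPassword` variable are placeholders:

```typescript
import * as aws from "@pulumi/aws";

// Assumes `privateSubnet1`/`privateSubnet2` and `dbSecurityGroup` exist
const subnetGroup = new aws.docdb.SubnetGroup("todo-db-subnets", {
    subnetIds: [privateSubnet1.id, privateSubnet2.id],
});

const cluster = new aws.docdb.Cluster("todo-docdb", {
    masterUsername: "todoadmin",
    masterPassword: dbPassword,          // e.g. a pulumi.Config secret
    dbSubnetGroupName: subnetGroup.name,
    vpcSecurityGroupIds: [dbSecurityGroup.id],
    skipFinalSnapshot: true,             // demo setting; keep snapshots in prod
});

// At least one instance; add more across AZs for automatic failover
new aws.docdb.ClusterInstance("todo-docdb-0", {
    clusterIdentifier: cluster.id,
    instanceClass: "db.t3.medium",
});
```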

Lessons Learned: What I'd Do Differently

  1. Container Health Checks: Implementing more sophisticated health endpoints beyond the basic /health route.
  2. Monitoring Integration: Adding CloudWatch dashboards and alerts for better observability.
  3. Blue-Green Deployments: For truly zero-downtime updates in production environments.
  4. Infrastructure Testing: Automated infrastructure validation before deployment.
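On point 1: a richer `/health` endpoint typically aggregates dependency checks rather than returning a bare 200. A small sketch of that idea — the check names and response shape are assumptions, not this app's actual endpoint:

```typescript
// Aggregate dependency checks into one health verdict.
type CheckResult = { name: string; ok: boolean };

function healthReport(checks: CheckResult[]): { status: "ok" | "degraded"; failing: string[] } {
    const failing = checks.filter(c => !c.ok).map(c => c.name);
    // Report degraded (letting the ALB fail the instance) if anything is down
    return { status: failing.length === 0 ? "ok" : "degraded", failing };
}

// In an Express handler this might back `GET /health`, e.g.:
// res.status(report.status === "ok" ? 200 : 503).json(report);
```

Returning 503 on degradation is what lets the ELB health check (configured earlier) actually route traffic away from an instance whose database connection has died.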

The Deployment Experience

The entire system can be deployed with three simple commands:

# Build and push containers
make build-all push-all

# Deploy everything
make auto-deploy

# Test the deployment
make test-deployment

The automation handles ALB DNS extraction, environment variable injection, and container orchestration seamlessly.

Case Study: E-commerce API Scaling

Let me share how these same principles apply to a different scenario: scaling an e-commerce API.

Imagine you're running a flash sale for a popular product. Traffic spikes from 100 to 10,000 concurrent users in minutes. With this architecture:

  1. Auto Scaling Group automatically launches new backend instances
  2. ALB distributes traffic across healthy instances
  3. MongoDB (or preferably managed DocumentDB) handles the data layer
  4. CloudWatch alarms trigger scaling events based on CPU/memory metrics

The system scales horizontally, maintaining response times while handling the traffic surge. When the sale ends, instances automatically scale back down, optimizing costs.
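For request-driven spikes like a flash sale, target-tracking on ALB request count per target is often a better fit than CPU alarms. A hedged Pulumi sketch, assuming the `asg`, `alb`, and `targetGroup` resources from earlier and a target value chosen for illustration:

```typescript
import * as aws from "@pulumi/aws";
import * as pulumi from "@pulumi/pulumi";

// Keep each instance near ~500 requests per minute; the ASG adds or
// removes capacity automatically as load drifts from that target.
new aws.autoscaling.Policy("request-count-tracking", {
    autoscalingGroupName: asg.name,
    policyType: "TargetTrackingScaling",
    targetTrackingConfiguration: {
        predefinedMetricSpecification: {
            predefinedMetricType: "ALBRequestCountPerTarget",
            // AWS expects "<alb-arn-suffix>/<target-group-arn-suffix>"
            resourceLabel: pulumi.interpolate`${alb.arnSuffix}/${targetGroup.arnSuffix}`,
        },
        targetValue: 500,
    },
});
```

Unlike the threshold alarms, target tracking handles both the scale-out and the scale-in side with a single policy, which suits bursty traffic well.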

Key Takeaways

For Infrastructure Teams:

  • Invest in automation early—it pays dividends quickly
  • Choose managed services over self-hosted when possible
  • Design for failure from day one

For Development Teams:

  • Containerize applications for consistent deployment
  • Implement proper health checks
  • Build with horizontal scaling in mind

For Business Teams:

  • Automated scaling reduces operational costs
  • High availability improves customer experience
  • Infrastructure as Code enables rapid iteration

Next Steps: Taking It Further

Ready to implement something similar? Here's your roadmap:

  1. Start Small: Begin with a simple containerized application
  2. Automate Early: Use Infrastructure as Code from the beginning
  3. Monitor Everything: Implement observability before you need it
  4. Test Scaling: Regularly validate your scaling assumptions
  5. Optimize Costs: Review and adjust scaling parameters based on actual usage

The beauty of this architecture lies in its simplicity and automation. With proper implementation, you can handle traffic spikes gracefully while keeping costs under control.


Have questions about implementing auto-scaling architectures? Drop a comment below or connect with me for more detailed discussions about cloud-native scaling strategies.

Tech Stack Used: React, Node.js, Express, MongoDB, Docker, AWS (EC2, ALB, VPC), Pulumi, Ansible, TypeScript

GitHub Repository: Check out the complete implementation


Follow me for more content on cloud architecture, DevOps automation, and cost-effective scaling strategies! 🚀
