DEV Community: Bayajid Alam Juyel

Building a Scalable Video Streaming PoC from Scratch

Bayajid Alam Juyel — Tue, 07 Oct 2025 07:49:03 +0000

When you want to build a project that goes beyond CRUD, the usual candidates are real-time chat, calling, or maps. But I thought: we use YouTube, Facebook, and Instagram almost every day and watch videos on them, so why not work on video streaming? So, with scaling in mind, I read some articles and documentation, did the system design, and started building a PoC.

In video streaming, most people watch videos (consumers) and very few upload them. So let's start from the consumer's side and look at how to make their experience smoother. Suppose you are watching a video that is 10 minutes long with a file size of 500 MB. If the whole file has to be loaded from the server at once, the video will not start until all 500 MB has been downloaded (buffering). That makes for a bad user experience. So the whole video is not sent at once; it is sent in small chunks instead, and that is streaming. For example, suppose a 10-second chunk is 5 MB. It takes much less time to download, and while you watch those 10 seconds the next chunk is loaded. The white bar you see on YouTube's timeline shows how much of the video has been loaded so far.
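
To put rough numbers on the example above, here is a small arithmetic sketch. The 10 Mbps connection speed is an assumption for illustration; the file and chunk sizes are the ones from the example.

```typescript
// Rough startup-delay arithmetic: whole-file download vs. chunked streaming.
// The 10 Mbps link speed is an assumed figure, not a measurement.

/** Seconds needed to download `sizeMB` megabytes over a `linkMbps` megabit/s link. */
function downloadSeconds(sizeMB: number, linkMbps: number): number {
  const megabits = sizeMB * 8; // 1 byte = 8 bits
  return megabits / linkMbps;
}

const linkMbps = 10;         // assumed connection speed
const fullFileMB = 500;      // the whole 10-minute video
const chunkMB = 5;           // one 10-second chunk
const chunkDurationSec = 10; // playback time covered by one chunk

const waitWholeFile = downloadSeconds(fullFileMB, linkMbps);
const waitFirstChunk = downloadSeconds(chunkMB, linkMbps);

// Playback stays smooth as long as a chunk downloads faster than it plays.
const keepsUp = waitFirstChunk <= chunkDurationSec;

console.log(`whole file: wait ${waitWholeFile}s before playback`);
console.log(`chunked: wait ${waitFirstChunk}s for the first chunk, keeps up: ${keepsUp}`);
```

Under these assumptions the viewer waits minutes for the whole file but only a few seconds for the first chunk, which is the whole point of chunking.
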
This solves the initial problem, but in villages and other areas with weak internet, problems remain. Adaptive bitrate streaming addresses this: it adjusts video quality to the user's internet speed, serving low quality on a slow connection and high quality on a fast one. DASH is used for this, and it largely resolves the end user's biggest problem.
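
The core idea of adaptive bitrate selection can be sketched as a simple throughput rule. This is a simplified illustration, not how any particular DASH player decides; the bitrate ladder and the 0.8 safety factor are assumptions.

```typescript
interface Rendition {
  label: string;
  bitrateKbps: number;
}

// Hypothetical bitrate ladder; real values depend on the encoder settings.
const ladder: Rendition[] = [
  { label: "1080p", bitrateKbps: 5000 },
  { label: "720p", bitrateKbps: 2800 },
  { label: "480p", bitrateKbps: 1400 },
];

/**
 * Pick the highest-bitrate rendition that fits inside the measured throughput,
 * keeping some headroom (safety factor). Falls back to the lowest rendition
 * when nothing fits.
 */
function pickRendition(measuredKbps: number, renditions: Rendition[], safety: number = 0.8): Rendition {
  if (renditions.length === 0) {
    throw new Error("at least one rendition is required");
  }
  // Sort a copy from highest to lowest bitrate.
  const sorted = renditions.slice().sort((a, b) => b.bitrateKbps - a.bitrateKbps);
  const budget = measuredKbps * safety;
  for (const candidate of sorted) {
    if (candidate.bitrateKbps <= budget) {
      return candidate;
    }
  }
  return sorted[sorted.length - 1];
}

console.log(pickRendition(8000, ladder).label);
console.log(pickRendition(3000, ladder).label);
console.log(pickRendition(500, ladder).label);
```

A real player re-runs a decision like this for every chunk, which is why quality can change in the middle of a video.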

Now to uploading. When an uploader uploads a video, it is not sent directly to the server. First the client sends a request to the server with the video's metadata. The server verifies it (checking the video's format, size, and so on), then generates an S3 pre-signed URL and returns it to the client. The client uses that URL to upload the video directly to S3. This way the server never has to handle large video files.
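
The verification step before issuing the URL might look roughly like this. It is a minimal sketch: the allowed formats, the size cap, and the `uploads/<userId>/...` key layout are all assumptions, and the actual pre-signing call to S3 is left out.

```typescript
interface UploadRequest {
  fileName: string;
  contentType: string;
  sizeBytes: number;
}

interface ValidationResult {
  ok: boolean;
  reason?: string;    // set when ok is false
  objectKey?: string; // set when ok is true
}

// Assumed limits for illustration only.
const ALLOWED_TYPES = ["video/mp4", "video/quicktime", "video/webm"];
const MAX_SIZE_BYTES = 5 * 1024 * 1024 * 1024; // 5 GiB

/** Check the metadata the client sent and decide where the object would live. */
function validateUpload(req: UploadRequest, userId: string): ValidationResult {
  if (ALLOWED_TYPES.indexOf(req.contentType) === -1) {
    return { ok: false, reason: "unsupported format" };
  }
  if (req.sizeBytes <= 0 || req.sizeBytes > MAX_SIZE_BYTES) {
    return { ok: false, reason: "invalid size" };
  }
  // Keep only characters that are safe in an object key.
  const safeName = req.fileName.replace(/[^a-zA-Z0-9._-]/g, "_");
  return { ok: true, objectKey: `uploads/${userId}/${safeName}` };
}

console.log(validateUpload({ fileName: "my video.mp4", contentType: "video/mp4", sizeBytes: 1024 }, "u1"));
```

Only when `ok` is true would the server go on to generate a pre-signed URL for `objectKey` and hand it to the client.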

Once the upload is done, storing and streaming the video becomes expensive, especially if the video is large (say 1 GB). So the video is compressed, which brings the file size down to roughly 300–400 MB while preserving quality. Then the video is converted into several resolutions (1080p, 720p, 480p, and so on), and each resolution is split into small chunks. Finally a manifest (MPD) file is generated, which holds references to every resolution and its chunks.
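
One way to express this step is to build the ffmpeg argument list in code. The sketch below is an assumption about how such a command could be put together, not the command used in the PoC: the bitrates, the 10-second segment length, the file names, and the presence of exactly one audio track in the input are all illustrative.

```typescript
interface VideoRendition {
  height: number;      // output height in pixels, e.g. 720
  bitrateKbps: number; // target video bitrate
}

/**
 * Build ffmpeg arguments that encode one input into several video renditions
 * plus one audio stream, and package them as DASH segments with an MPD manifest.
 */
function buildDashArgs(
  input: string,
  manifest: string,
  renditions: VideoRendition[],
  segmentSeconds: number,
): string[] {
  const args: string[] = ["-i", input];

  // Map the source video once per rendition, then the first audio track.
  for (let i = 0; i < renditions.length; i++) {
    args.push("-map", "0:v:0");
  }
  args.push("-map", "0:a:0");

  // H.264 video and AAC audio for all output streams.
  args.push("-c:v", "libx264", "-c:a", "aac");

  // Per-rendition scaling (width follows the aspect ratio) and bitrate.
  renditions.forEach((r, i) => {
    args.push(`-filter:v:${i}`, `scale=-2:${r.height}`);
    args.push(`-b:v:${i}`, `${r.bitrateKbps}k`);
  });

  // DASH packaging: segment length, templated segment names, one adaptation
  // set for video and one for audio.
  args.push("-seg_duration", String(segmentSeconds));
  args.push("-use_template", "1", "-use_timeline", "1");
  args.push("-adaptation_sets", "id=0,streams=v id=1,streams=a");
  args.push("-f", "dash", manifest);
  return args;
}

const dashArgs = buildDashArgs(
  "input.mp4",
  "manifest.mpd",
  [
    { height: 1080, bitrateKbps: 5000 },
    { height: 720, bitrateKbps: 2800 },
    { height: 480, bitrateKbps: 1400 },
  ],
  10,
);
console.log(["ffmpeg"].concat(dashArgs).join(" "));
```

Returning the arguments as an array (rather than one shell string) makes it straightforward to pass them to a process-spawning API inside the container without quoting problems.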

The steps of the whole pipeline look like this: when a user uploads a video, the client gets a pre-signed URL from the server and uploads to S3. When the upload finishes, S3 sends a message to SQS. When the message arrives from SQS, a Lambda is triggered. Rather than processing the video itself, the Lambda launches a task on ECS. That task pulls a custom container image from ECR, built to process video with ffmpeg. It then downloads the video from S3, compresses it, converts it into the different resolutions, creates the chunks, and generates the MPD manifest. Everything is uploaded back to S3, and then the task exits. Throughout, the user receives real-time status updates at every step of upload and processing (over sockets).
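
The first thing the Lambda has to do is work out which object was uploaded. Below is a minimal sketch of that parsing step, assuming S3 event notifications are delivered to SQS unchanged, so each SQS message body is the S3 notification JSON. The interface and function names are hypothetical, and launching the ECS task is not shown.

```typescript
// Only the fields this sketch reads; real events carry many more.
interface SqsEvent {
  Records: { body: string }[];
}

interface UploadedObject {
  bucket: string;
  key: string;
  sizeBytes: number;
}

/** Pull bucket, key, and size out of every ObjectCreated record in an SQS batch. */
function extractUploads(sqsEvent: SqsEvent): UploadedObject[] {
  const uploads: UploadedObject[] = [];
  for (const message of sqsEvent.Records) {
    const s3Event = JSON.parse(message.body);
    // S3 test messages have no Records array, so default to an empty list.
    const records = s3Event.Records || [];
    for (const record of records) {
      if (String(record.eventName).indexOf("ObjectCreated:") !== 0) {
        continue; // ignore deletes and other event types
      }
      uploads.push({
        bucket: record.s3.bucket.name,
        // Keys in S3 notifications are URL-encoded, with spaces sent as "+".
        key: decodeURIComponent(String(record.s3.object.key).replace(/\+/g, " ")),
        sizeBytes: record.s3.object.size,
      });
    }
  }
  return uploads;
}
```

Each returned entry carries what the processing task needs (where the video is and how big it is), so the size can also feed a decision about where to run the job.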

Some optimizations are in place as well. For example, S3 lifecycle policies are used for cost optimization, so that a video's storage class changes after a set period and costs go down. A CDN is used for hot data. A custom policy was written for ECS: if a video is smaller than 1 GB, it is processed on ECS Fargate. The ratio of Spot to Regular instances is configurable, and if a Spot instance fails (for example, because resources are unavailable), the task is automatically retried on a Regular instance, which preserves reliability. If the video is larger than 1 GB, Regular Fargate is used directly. For small jobs, ECS batch mode is used so that lightweight work finishes quickly and cheaply.
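
The size-based policy described above can be sketched as a small decision function. This is an illustration of the rule, not the PoC's code: treating exactly 1 GiB as "large", the weight values, and all names are assumptions. `FARGATE` and `FARGATE_SPOT` are the capacity provider names ECS uses for Fargate.

```typescript
interface StrategyItem {
  capacityProvider: "FARGATE" | "FARGATE_SPOT";
  weight: number;
}

interface SpotRatio {
  spotWeight: number;    // relative share of tasks placed on Spot
  regularWeight: number; // relative share placed on regular Fargate
}

const ONE_GIB = 1024 * 1024 * 1024;

/**
 * Decide where a processing task should run.
 * - Large videos, or retries after a failed Spot attempt, go to regular Fargate.
 * - Smaller videos use the configured Spot/regular mix.
 */
function chooseCapacity(videoBytes: number, ratio: SpotRatio, spotAttemptFailed: boolean): StrategyItem[] {
  if (videoBytes >= ONE_GIB || spotAttemptFailed) {
    return [{ capacityProvider: "FARGATE", weight: 1 }];
  }
  return [
    { capacityProvider: "FARGATE_SPOT", weight: ratio.spotWeight },
    { capacityProvider: "FARGATE", weight: ratio.regularWeight },
  ];
}

const ratio: SpotRatio = { spotWeight: 3, regularWeight: 1 }; // assumed 3:1 mix
console.log(chooseCapacity(200 * 1024 * 1024, ratio, false));
console.log(chooseCapacity(2 * ONE_GIB, ratio, false));
```

Passing `spotAttemptFailed` explicitly keeps the retry path visible: the caller retries once with the flag set, and the task lands on regular capacity.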

Redis is used for the socket backplane, caching, and rate limiting. The backend sits in an Auto Scaling Group, which scales in and out automatically with load. The MongoDB cluster has one primary and two read replicas. The backend instances, MongoDB, and Redis are all placed in private subnets. The entire infrastructure is provisioned with Pulumi, and much of the automation, including the MongoDB cluster setup, the Redis installation, and pulling and running the frontend and backend images from ECR, is done with Ansible.
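
As an illustration of the rate-limiting part, here is a fixed-window limiter. A plain in-memory object stands in for Redis so the sketch runs on its own; with Redis the same idea is typically an `INCR` on a per-window key plus an `EXPIRE`. The limit, window, and names are assumptions, and the PoC may use a different algorithm.

```typescript
class FixedWindowLimiter {
  // Stand-in for Redis: counter per "client:window" key.
  // Unlike Redis keys with a TTL, old entries here are never evicted.
  private counts: { [key: string]: number } = {};
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  /** Returns true if this request is within the client's limit for the current window. */
  allow(clientId: string, nowMs: number): boolean {
    const windowIndex = Math.floor(nowMs / this.windowMs);
    const key = `${clientId}:${windowIndex}`;
    const count = (this.counts[key] || 0) + 1;
    this.counts[key] = count;
    return count <= this.limit;
  }
}

// Assumed policy: at most 2 requests per client per second.
const limiter = new FixedWindowLimiter(2, 1000);
console.log(limiter.allow("client-a", 0));
console.log(limiter.allow("client-a", 10));
console.log(limiter.allow("client-a", 20));
```

Keeping the counters in Redis rather than in process memory is what lets every backend instance in the Auto Scaling Group enforce the same limit.
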
Building a project like this does cost some money in the cloud, but the joy I get from working on infrastructure provisioning, automation, system design, scaling, and the rest makes it all worth it.

This is how I collect happiness—by building stuff!

From Monolith to Scalable: Building a Cost-Effective Auto-Scaling Architecture on AWS

Bayajid Alam Juyel — Tue, 29 Jul 2025 20:17:29 +0000

How I transformed a simple Todo app into a fault-tolerant, auto-scaling system with zero-downtime deployment

The Challenge

You've built a simple monolithic application—let's say a Todo app called "SimplyDone"—and suddenly, your user base explodes. Your single server is struggling, users are experiencing downtime, and you're manually scrambling to deploy updates. Sound familiar?

This was exactly the scenario I faced when tasked with transforming a basic Notes/Todo application into a production-ready, scalable system that could handle sudden traffic spikes while maintaining high availability and fault tolerance. The catch? Everything needed to be automated—from infrastructure provisioning to deployment.

The Architecture Decision: Why ALB Over NGINX

When designing the load balancing strategy, I had to choose between NGINX and AWS Application Load Balancer (ALB). While NGINX is a fantastic reverse proxy, deploying it on a single instance would create a single point of failure—exactly what we're trying to avoid.

ALB, on the other hand, is:

Inherently fault-tolerant across multiple Availability Zones
Managed by AWS (no maintenance overhead)
Intelligent with health checks and traffic distribution
Cost-effective at scale

The choice was clear: ALB would handle traffic distribution while I focused on application scalability.

Infrastructure as Code: The Pulumi Approach

Instead of clicking through the AWS console, I used Pulumi with TypeScript to define the entire infrastructure. Here's why this approach rocks:

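import * as aws from "@pulumi/aws";

// Multi-AZ VPC setup for high availability
const vpc = new aws.ec2.Vpc("todo-infra-vpc", {
    cidrBlock: "10.10.0.0/16",
    enableDnsHostnames: true, // instances get DNS hostnames
    enableDnsSupport: true,   // enable DNS resolution inside the VPC
});

// Auto Scaling Group with intelligent scaling.
// Excerpt: privateSubnet1, privateSubnet2, and targetGroup are defined elsewhere
// in the program, and the launch template/configuration is omitted here.
const asg = new aws.autoscaling.Group("node-app-asg", {
    vpcZoneIdentifiers: [privateSubnet1.id, privateSubnet2.id], // launch into the private subnets
    targetGroupArns: [targetGroup.arn], // register instances with the ALB target group
    healthCheckType: "ELB",             // replace instances that fail load balancer health checks
    desiredCapacity: 2,
    minSize: 1,
    maxSize: 5,
});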

Key Architectural Decisions:

Multi-AZ Deployment: Spread across three Availability Zones for maximum uptime
Private/Public Subnet Isolation: Backend instances in private subnets for security
Auto Scaling: CPU-based scaling (scale up at 80%, scale down at 10%)
Health Check Integration: ELB health checks ensure only healthy instances receive traffic
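
The CPU-based rule can be sketched as a pure function. The thresholds and the size limits come from the text and the snippet above; changing capacity by one instance per evaluation is an assumption, as are the names.

```typescript
interface ScalingConfig {
  minSize: number;
  maxSize: number;
  scaleOutAtCpu: number; // add capacity at or above this average CPU %
  scaleInAtCpu: number;  // remove capacity at or below this average CPU %
}

const scalingConfig: ScalingConfig = { minSize: 1, maxSize: 5, scaleOutAtCpu: 80, scaleInAtCpu: 10 };

/** Next desired instance count, moving one step at a time and staying within min/max. */
function nextDesiredCapacity(avgCpuPercent: number, current: number, cfg: ScalingConfig): number {
  if (avgCpuPercent >= cfg.scaleOutAtCpu) {
    return Math.min(current + 1, cfg.maxSize);
  }
  if (avgCpuPercent <= cfg.scaleInAtCpu) {
    return Math.max(current - 1, cfg.minSize);
  }
  return current; // between the thresholds: leave capacity alone
}

console.log(nextDesiredCapacity(85, 2, scalingConfig)); // busy: grow
console.log(nextDesiredCapacity(5, 2, scalingConfig));  // idle: shrink
console.log(nextDesiredCapacity(50, 2, scalingConfig)); // steady: no change
```

The wide gap between the two thresholds matters: it keeps the group from flapping between sizes when load hovers around a single value.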

The Automation Pipeline: One Command Deployment

The magic happens in the Makefile. One command (make auto-deploy) orchestrates the entire deployment:

# Complete zero-touch deployment
auto-deploy: setup-infrastructure setup-backend deploy-frontend-with-alb
    @echo "🎉 DEPLOYMENT COMPLETE! 🎉"

What happens under the hood:

Image Building: Docker images for frontend and backend are built and pushed to Docker Hub
Infrastructure Provisioning: Pulumi deploys VPC, subnets, ALB, Auto Scaling Groups
Dynamic Configuration: ALB DNS is automatically extracted and injected into frontend config
Ansible Automation: Instances are provisioned and containers deployed via bastion host
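
The dynamic configuration step could look something like the sketch below: take the stack outputs as JSON (for example from `pulumi stack output --json`) and render an environment file for the frontend. The output name `albDnsName` and the variable name `REACT_APP_API_URL` are hypothetical; the actual project may use different names and a different mechanism.

```typescript
/**
 * Turn Pulumi stack outputs (as a JSON string) into a one-line env file
 * that points the frontend at the load balancer.
 */
function renderFrontendEnv(stackOutputsJson: string): string {
  const outputs = JSON.parse(stackOutputsJson);
  const dns = outputs.albDnsName; // hypothetical output name
  if (typeof dns !== "string" || dns.length === 0) {
    throw new Error("albDnsName output is missing");
  }
  return `REACT_APP_API_URL=http://${dns}\n`;
}

const exampleOutputs = JSON.stringify({ albDnsName: "example-alb-123.us-east-1.elb.amazonaws.com" });
console.log(renderFrontendEnv(exampleOutputs));
```

Failing loudly when the output is missing is deliberate: a frontend built with an empty API URL is harder to debug than a deployment that stops early.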

Smart Network Architecture

Since backend instances live in private subnets (security first!), direct SSH access isn't possible. Instead of manual bastion jumping, I automated everything with Ansible:

# Ansible automatically handles bastion proxy
ansible_ssh_common_args: "-o ProxyCommand=\"ssh -W %h:%p -i keyfile ubuntu@bastion-host\""

The system automatically:

Installs Docker on all instances
Configures MongoDB in the private subnet
Deploys backend containers with proper environment variables
Sets up frontend with dynamic ALB DNS configuration

Cost Optimization Strategies

Smart Scaling: The Auto Scaling Group starts with 2 instances but can scale to 5 during peak traffic, then scale back down to 1 during low usage periods.

Resource Right-Sizing: Using t2.micro instances keeps costs minimal while providing adequate performance for this workload.

Infrastructure Automation: No manual intervention means no idle developer time spent on deployments.

Real-World Performance

The deployed system achieved:

Zero manual intervention for deployments
90-second infrastructure provisioning time
Automatic health checking and failover
Cost-effective scaling based on actual demand

The Database Strategy: Pragmatic Choices

For this demonstration, I deployed MongoDB on a single EC2 instance in the private subnet. While this works for the demo scope, production deployments should consider:

Amazon DocumentDB - MongoDB-compatible, fully managed by AWS with built-in scaling and backup
ElastiCache for Redis - Managed caching layer for improved performance
Multi-AZ deployment - Automatic failover and high availability
VPC endpoint integration - Secure, private connectivity without internet gateway dependency

Why DocumentDB over MongoDB Atlas?
Since we're already invested in the AWS ecosystem with VPC, ALB, and EC2, DocumentDB offers:

Seamless VPC integration - No cross-cloud networking complexity
Consistent billing - Single AWS invoice vs. multiple vendors
Native AWS IAM integration - Unified access management
Lower data transfer costs - Traffic stays inside AWS, so there is no internet egress to an external provider

Lessons Learned: What I'd Do Differently

Container Health Checks: Implementing more sophisticated health endpoints beyond the basic /health route.
Monitoring Integration: Adding CloudWatch dashboards and alerts for better observability.
Blue-Green Deployments: For truly zero-downtime updates in production environments.
Infrastructure Testing: Automated infrastructure validation before deployment.
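
As a sketch of what a more sophisticated health endpoint could report, the function below combines several dependency checks into one response. The check names and the critical/non-critical split are assumptions; the wiring into an Express route is not shown.

```typescript
interface CheckResult {
  checkName: string;
  healthy: boolean;
  critical: boolean; // if true, a failure should take the instance out of rotation
}

interface HealthSummary {
  statusCode: number; // what the /health route would return to the load balancer
  state: "ok" | "degraded" | "unhealthy";
  failing: string[];
}

/** Combine individual dependency checks into a single health response. */
function summarizeHealth(checks: CheckResult[]): HealthSummary {
  const failing = checks.filter((c) => !c.healthy);
  const criticalDown = failing.some((c) => c.critical);
  let state: "ok" | "degraded" | "unhealthy" = "ok";
  if (criticalDown) {
    state = "unhealthy";
  } else if (failing.length > 0) {
    state = "degraded";
  }
  return {
    // A non-200 response makes the load balancer health check fail.
    statusCode: criticalDown ? 503 : 200,
    state,
    failing: failing.map((c) => c.checkName),
  };
}

console.log(
  summarizeHealth([
    { checkName: "mongodb", healthy: true, critical: true },
    { checkName: "cache", healthy: false, critical: false },
  ]),
);
```

Separating critical from non-critical checks lets an instance keep serving traffic in a degraded state instead of being replaced for a problem a new instance would not fix.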

The Deployment Experience

The entire system can be deployed with three simple commands:

# Build and push containers
make build-all push-all

# Deploy everything
make auto-deploy

# Test the deployment
make test-deployment

The automation handles ALB DNS extraction, environment variable injection, and container orchestration seamlessly.

Case Study: E-commerce API Scaling

Let me share how these same principles apply to a different scenario: scaling an e-commerce API.

Imagine you're running a flash sale for a popular product. Traffic spikes from 100 to 10,000 concurrent users in minutes. With this architecture:

Auto Scaling Group automatically launches new backend instances
ALB distributes traffic across healthy instances
MongoDB (or preferably managed DocumentDB) handles the data layer
CloudWatch alarms trigger scaling events based on CPU/memory metrics

The system scales horizontally, maintaining response times while handling the traffic surge. When the sale ends, instances automatically scale back down, optimizing costs.
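
A quick way to sanity-check a scenario like this is to estimate how many instances the spike needs and compare that with the group's limits. The per-instance capacity used below is a made-up figure for illustration; it has to come from load testing the real application.

```typescript
interface CapacityEstimate {
  required: number; // instances needed to serve the load
  desired: number;  // what the group can actually run, given min/max
  capped: boolean;  // true when maxSize is too small for the load
}

/** Estimate instance count for a given number of concurrent users. */
function estimateCapacity(
  concurrentUsers: number,
  usersPerInstance: number,
  minSize: number,
  maxSize: number,
): CapacityEstimate {
  const required = Math.ceil(concurrentUsers / usersPerInstance);
  const desired = Math.min(Math.max(required, minSize), maxSize);
  return { required, desired, capped: required > maxSize };
}

const assumedUsersPerInstance = 2500; // hypothetical; measure this with a load test

console.log(estimateCapacity(100, assumedUsersPerInstance, 1, 5));
console.log(estimateCapacity(10000, assumedUsersPerInstance, 1, 5));
```

If `capped` comes back true for the expected peak, the maximum group size (or the instance type) needs revisiting before the sale, not during it.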

Key Takeaways

For Infrastructure Teams:

Invest in automation early—it pays dividends quickly
Choose managed services over self-hosted when possible
Design for failure from day one

For Development Teams:

Containerize applications for consistent deployment
Implement proper health checks
Build with horizontal scaling in mind

For Business Teams:

Automated scaling reduces operational costs
High availability improves customer experience
Infrastructure as Code enables rapid iteration

Next Steps: Taking It Further

Ready to implement something similar? Here's your roadmap:

Start Small: Begin with a simple containerized application
Automate Early: Use Infrastructure as Code from the beginning
Monitor Everything: Implement observability before you need it
Test Scaling: Regularly validate your scaling assumptions
Optimize Costs: Review and adjust scaling parameters based on actual usage

The beauty of this architecture lies in its simplicity and automation. With proper implementation, you can handle traffic spikes gracefully while keeping costs under control.

Have questions about implementing auto-scaling architectures? Drop a comment below or connect with me for more detailed discussions about cloud-native scaling strategies.

Tech Stack Used: React, Node.js, Express, MongoDB, Docker, AWS (EC2, ALB, VPC), Pulumi, Ansible, TypeScript

GitHub Repository: Check out the complete implementation

Follow me for more content on cloud architecture, DevOps automation, and cost-effective scaling strategies! 🚀

What Are the Differences Between Docker and a VM?

Bayajid Alam Juyel — Fri, 08 Nov 2024 17:47:40 +0000

In the SDLC, deployment is the process of getting an application's code running for its users. Both Docker and virtual machines (VMs) are used for deployment. So what are the differences between Docker and a VM?

Docker: An open-source platform that lets developers package their software into a standard unit known as a container. A container typically consists of the application's code, a specific environment, and the required libraries, and has its own file system, dependency structure, and networking capabilities. Containers use the host OS's resources directly, and multiple containers can share the resources of the same OS.

VM (Virtual Machine): A computer typically consists of a processor, RAM, a hard drive, networking, an OS, and various software. A virtual machine is software that behaves like hardware; that is, it emulates the hardware components of a physical machine. Each VM has its own kernel and operating system. This is why we can run Linux inside a Windows computer.

Containers share the host OS's kernel and resources, which makes it possible to run many containers on the same OS while using fewer resources. A container does not need its own operating system, which is what makes containers lightweight. With virtual machines, on the other hand, every VM has to run a separate OS and kernel. As a result, each VM must be pre-allocated a fixed amount of resources (such as CPU and RAM), which limits how many VMs can run on a single system.