DEV Community

Aisalkyn Aidarova

Part 2: Decoupled Architecture

🎬 SCENARIO: “You Upload a Movie to Netflix”

You upload:

Avengers.mp4

Now what happens behind the scenes?


🎥 STEP 1: Upload Goes to Storage

Service used:
Amazon S3

Flow:

User → S3

Explain:

S3 is like Netflix’s big digital warehouse.
It stores the raw movie file.

At this point:
Nothing is processed yet.


📢 STEP 2: S3 Shouts “New Video Arrived!”

Service used:
Amazon SNS

Explain:

When the movie is uploaded:

S3 sends an event notification to SNS.

SNS is like:

📣 A loudspeaker announcement:

“New movie uploaded!”
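
To make the "shout" concrete, here is a minimal sketch of the notification configuration that tells S3 to publish upload events to an SNS topic. The dict follows the shape that boto3's `put_bucket_notification_configuration` accepts. The helper name, the topic ARN, and the `uploads/` prefix filter are assumptions for illustration, not values from the lab.

```python
import json


def build_topic_notification(topic_arn, prefix="uploads/"):
    """Return an S3 notification configuration that publishes to an SNS topic.

    The result can be passed as NotificationConfiguration to boto3's
    s3.put_bucket_notification_configuration(Bucket=..., ...).
    """
    return {
        "TopicConfigurations": [
            {
                "TopicArn": topic_arn,
                # Fire on every kind of object creation (PUT, POST, copy, multipart).
                "Events": ["s3:ObjectCreated:*"],
                # Only objects under the given prefix trigger a notification.
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}
                },
            }
        ]
    }


# Hypothetical topic ARN: replace the account ID and topic name with your own.
config = build_topic_notification("arn:aws:sns:us-east-2:123456789012:video-uploads")
print(json.dumps(config, indent=2))
```

The prefix filter matters when processed files land in the same bucket: without it, writing to processed/ would raise another event and the worker could end up reprocessing its own output.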


📬 STEP 3: SNS Sends Task to Workers

Service used:
Amazon SQS

Explain:

SNS delivers the message to SQS.

SQS is like:

📬 A task mailbox.

Inside the mailbox:

Process Avengers.mp4

Why a mailbox?

Because:

  • Maybe 1 video
  • Maybe 10,000 videos

The queue stores tasks safely until a worker is ready for them.
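
What actually sits in the mailbox is JSON, not the plain text "Process Avengers.mp4". Assuming the SNS subscription uses the default (non-raw) delivery, the SQS message body is an SNS envelope whose `Message` field holds the S3 event as a JSON string. The sketch below unwraps it; the function name and the sample file name are made up for illustration.

```python
import json
from urllib.parse import unquote_plus


def extract_s3_objects(sqs_body):
    """Return (bucket, key) pairs from an SQS message body delivered via SNS."""
    envelope = json.loads(sqs_body)
    # Without raw message delivery, SNS wraps the S3 event as a string in "Message".
    event = json.loads(envelope["Message"]) if "Message" in envelope else envelope
    objects = []
    # S3 test events carry no "Records", so default to an empty list.
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded (a space becomes "+").
        key = unquote_plus(record["s3"]["object"]["key"])
        objects.append((bucket, key))
    return objects


# Hand-built sample that mimics the nesting; real events carry many more fields.
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "student-upload-bucket-aj"},
                "object": {"key": "uploads/My+Movie.mp4"},
            }
        }
    ]
}
sample_body = json.dumps({"Type": "Notification", "Message": json.dumps(sample_event)})
print(extract_s3_objects(sample_body))
```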


🏭 STEP 4: Workers Process the Movie

Service used:
Amazon EC2

Explain:

Workers (EC2 machines):

  • Convert video to 1080p
  • Create 720p version
  • Add subtitles
  • Compress file

They read tasks from SQS.
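
As a rough sketch of what "reading tasks" can look like, the function below handles one polling round. It is not the lab's worker.py; it only assumes the worker moves each file from uploads/ to processed/, as the demo output suggests. `sqs` and `s3` are expected to be boto3 clients (for example `boto3.client("sqs", region_name="us-east-2")`), passed in as parameters so the logic can be tried without AWS. The function names are hypothetical.

```python
import json
from urllib.parse import unquote_plus


def keys_from_body(body):
    """Unwrap the SNS envelope and return (bucket, key) pairs from the S3 event."""
    envelope = json.loads(body)
    event = json.loads(envelope["Message"]) if "Message" in envelope else envelope
    return [
        (r["s3"]["bucket"]["name"], unquote_plus(r["s3"]["object"]["key"]))
        for r in event.get("Records", [])
    ]


def poll_once(sqs, s3, queue_url):
    """Receive one batch, move each object to processed/, then delete the message."""
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,  # SQS returns at most 10 messages per call
        WaitTimeSeconds=20,      # long polling: wait up to 20 s for work
    )
    handled = []
    for message in response.get("Messages", []):
        for bucket, key in keys_from_body(message["Body"]):
            # uploads/test1.txt -> processed/test1.txt
            target = "processed/" + key.split("/", 1)[-1]
            print("Processing:", key)
            s3.copy_object(
                Bucket=bucket, Key=target, CopySource={"Bucket": bucket, "Key": key}
            )
            s3.delete_object(Bucket=bucket, Key=key)
            print("Moved to:", target)
            handled.append(target)
        # Delete only after the work succeeded, so a crash leaves the task queued.
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])
    return handled
```

A real worker would call `poll_once` inside a `while True:` loop. Deleting the message last is the design choice that lets the queue protect the task if the worker dies mid-job.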


💥 LIVE DEMO MOMENT (What You Did)

You show:

1️⃣ Worker running
2️⃣ Upload file
3️⃣ File moves from uploads → processed

Students see:

System works.


🔴 DRAMATIC MOMENT: Worker Dies

You run:

pkill -f worker.py

Now upload a new video.

What happens?

The file stays in uploads.

The queue still holds the message.

Processing has stopped.

Pause.

Ask students:

What if this was Netflix during a big movie release?

They understand the impact immediately.


🛡 WHY WE NEED AUTO SCALING

Service used:
Auto Scaling Group

Explain:

Instead of 1 worker:

We use 5 workers.

If 1 dies:
The ASG automatically launches a new one.

The queue keeps tasks safe.

Processing continues.

This is:

High Availability.


🎯 WHEN DO WE NEED ALB?

Service used:
Elastic Load Balancing (Application Load Balancer, ALB)

ALB is for:

Users watching movies.

User → ALB → Web servers

But workers pulling from SQS do NOT need ALB.

Two different traffic types:

  • Web traffic → ALB
  • Queue traffic → ASG only

🎓 SIMPLE FINAL STORY

Tell them this:

S3 stores the movie.
SNS announces it.
SQS remembers the task.
EC2 workers process it.
ASG keeps workers alive.
ALB serves users.

That’s it.


🏁 FINAL PRODUCTION NETFLIX ARCHITECTURE

Flow:

User
→ ALB
→ Web ASG
→ S3
→ SNS
→ SQS
→ Worker ASG

This is a pattern used in real enterprise architectures.

🚀 COMPLETE END-TO-END LAB

S3 → SNS → SQS → EC2 Worker Architecture


🟢 PART 1: Open Image in Browser from S3

Service used:
Amazon S3


STEP 1: Bucket Setup

Bucket name:

student-upload-bucket-aj

Region:

us-east-2

STEP 2: Disable Block Public Access (Demo Only)

Go to:

S3 → Bucket → Permissions → Block Public Access → Edit

Uncheck:

Block all public access

Save.


STEP 3: Add Bucket Policy (Public Read)

S3 → Bucket → Permissions → Bucket Policy

Paste:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicReadObjects",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::student-upload-bucket-aj/*"
    }
  ]
}

Save.


STEP 4: Upload Image

aws s3 cp myimage.jpg s3://student-upload-bucket-aj/uploads/myimage.jpg

STEP 5: Open in Browser

Use:

https://student-upload-bucket-aj.s3.us-east-2.amazonaws.com/uploads/myimage.jpg

The image now loads.
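
The browser URL follows S3's virtual-hosted-style pattern: bucket name, then region, then the object key. A small helper (the name is hypothetical) shows how it is assembled and why keys with spaces need encoding:

```python
from urllib.parse import quote


def public_object_url(bucket, region, key):
    """Build the virtual-hosted-style URL for an S3 object."""
    # quote() keeps "/" intact and percent-encodes characters such as spaces.
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key)}"


print(public_object_url("student-upload-bucket-aj", "us-east-2", "uploads/myimage.jpg"))
print(public_object_url("student-upload-bucket-aj", "us-east-2", "uploads/my image.jpg"))
```

The URL only works from a browser because the bucket policy above allows public `s3:GetObject`; otherwise S3 answers with Access Denied.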


Viewing flow:

Browser → S3

No EC2 involved.

🟒 PART 6 β€” Demonstrate System Working

On EC2 #2:

echo "demo test" > test1.txt
aws s3 cp test1.txt s3://student-upload-bucket-aj/uploads/test1.txt

On EC2 #1 (Worker):

You should see:

Processing: uploads/test1.txt
Moved to: processed/test1.txt

Check:

aws s3 ls s3://student-upload-bucket-aj/processed/

File exists.


🔴 PART 7: Demonstrate Failure

On EC2 #1:

pkill -f worker.py

Worker is stopped.

Now upload from EC2 #2:

echo "after kill demo" > fail.txt
aws s3 cp fail.txt s3://student-upload-bucket-aj/uploads/fail.txt

Check uploads:

aws s3 ls s3://student-upload-bucket-aj/uploads/

File stays there.

Check SQS:

aws sqs receive-message \
  --queue-url https://sqs.us-east-2.amazonaws.com/ACCOUNT-ID/devops-sqs-queue \
  --region us-east-2 \
  --max-number-of-messages 1

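A message is waiting. Note that receive-message hides the returned message from other consumers until its visibility timeout expires (30 seconds by default), so a worker started right after this check may have to wait that long before it sees the message.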


🎯 Explain What Happened

  • S3 worked
  • SNS worked
  • SQS stored message
  • Worker EC2 failed
  • Processing stopped

This is:

Single Point of Failure


🟒 PART 8 β€” Restart Worker

On EC2 #1:

python3 worker.py

It immediately processes:

Processing: uploads/fail.txt
Moved to: processed/fail.txt

This proves:

SQS keeps messages safe until a worker processes and deletes them (or the queue's retention period expires).
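
The kill-and-restart demo can be replayed in miniature without AWS. In this toy simulation a `deque` stands in for the SQS queue and a list stands in for the processed/ prefix; all names are illustrative and nothing here calls a real service.

```python
from collections import deque

task_queue = deque()  # stands in for the SQS queue
processed = []        # stands in for the processed/ prefix


def upload(key):
    """An upload always enqueues a task, whether or not a worker is alive."""
    task_queue.append(key)


def run_worker():
    """Drain every waiting task, as a running (or restarted) worker would."""
    while task_queue:
        key = task_queue.popleft()
        processed.append(key.replace("uploads/", "processed/", 1))


upload("uploads/test1.txt")
run_worker()                    # worker alive: the task is handled right away
upload("uploads/fail.txt")      # worker "killed": nothing consumes the task
print("waiting:", list(task_queue))
run_worker()                    # worker restarted: the backlog is processed
print("processed:", processed)
```

The upload side never needs to know whether a worker exists. That independence is what "decoupled" means in this architecture.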


🏗 High Availability Discussion

For queue-based systems:

Use
Auto Scaling Group

Architecture becomes:

S3 → SNS → SQS → ASG (2+ workers)

If one worker dies:

The ASG launches a new one automatically, and the remaining workers keep pulling from the queue in the meantime.

No downtime.


❓ When Do We Use ALB?

Use
Elastic Load Balancing

Only when users connect to EC2 via HTTP.

Example:

User → ALB → EC2 Web Servers

Your lab is queue-based.

So:

✔ Need ASG
✖ Do NOT need ALB


🎓 Final Architecture

Viewing Flow:
Browser → S3

Processing Flow:
S3 → SNS → SQS → EC2 Worker

High Availability:
S3 → SNS → SQS → Auto Scaling Group Workers

🔥 FIRST: Important Understanding

Your system is queue-based, not web-based.

So:

✔ ASG is required
❌ ALB is NOT required for workers

ALB is only needed if users send HTTP traffic to EC2.


🏗 Final Production Architecture

Flow:

Browser → S3
S3 → SNS → SQS
SQS → Auto Scaling Group (2+ EC2 workers)


🟒 PART 1 β€” Convert Worker EC2 into Launch Template

Service used:
Amazon EC2


Step 1: Stop the current worker EC2

We will use its settings as the template.


Step 2: Create Launch Template

EC2 → Launch Templates → Create launch template

Use:

  • Same AMI
  • Same instance type
  • Same IAM Role
  • Same Security Group
  • Same Key pair

Step 3: Add User Data (VERY IMPORTANT)

The worker must start automatically when an instance launches.

In Launch Template → User Data → paste:

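#!/bin/bash
# Install Python tooling and boto3 (user data runs as root on first boot)
apt-get update -y
apt-get install -y python3-pip
pip3 install boto3

# Quote the heredoc delimiter so the shell does not expand $ or backticks inside worker.py
cat <<'EOF' > /home/ubuntu/worker.py
# (paste your full worker.py code here)
EOF

chown ubuntu:ubuntu /home/ubuntu/worker.py
# Start the worker in the background as the ubuntu user and log to its home directory
su - ubuntu -c "nohup python3 /home/ubuntu/worker.py > /home/ubuntu/worker.log 2>&1 &"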

Now every new EC2 instance runs the worker automatically.


🟒 PART 2 β€” Create Auto Scaling Group

Service used:
Auto Scaling Group


Step 1: Create ASG

EC2 → Auto Scaling Groups → Create

Select:

Launch Template you created

Choose:

  • At least 2 Availability Zones
  • Min: 2
  • Desired: 2
  • Max: 4

No load balancer needed.

Create.


Step 2: Test It

Upload file:

aws s3 cp test.txt s3://student-upload-bucket-aj/uploads/test.txt

Check the worker log on each EC2 instance:

Either instance may pick up a given file, so the work is spread across both.
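
Why "either instance" rather than "both"? Workers compete for messages on one shared queue, and each message goes to whichever worker asks first. The toy example below imitates that with two threads and an in-memory queue. It is only an analogy: the names are made up, and a real SQS standard queue delivers at least once, so a worker should tolerate the occasional duplicate.

```python
import queue
import threading

tasks = queue.Queue()  # stands in for the shared SQS queue
for i in range(6):
    tasks.put(f"uploads/file{i}.txt")

results = []                      # (worker name, key) pairs
results_lock = threading.Lock()   # protects the shared results list


def worker(name):
    """Pull tasks until the queue is empty, recording which worker took each one."""
    while True:
        try:
            key = tasks.get_nowait()
        except queue.Empty:
            return
        with results_lock:
            results.append((name, key))


threads = [threading.Thread(target=worker, args=(f"worker-{n}",)) for n in (1, 2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every file was handled exactly once; which worker took which file varies per run.
print(sorted(key for _, key in results))
```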


🔴 PART 3: Demonstrate Self-Healing

Terminate one instance manually.

The ASG will:

  • Detect that the instance is gone or unhealthy
  • Launch a new one automatically

Upload a file again.

Processing continues.

This proves:

High availability.


🟑 When Do We Add ALB?

Service used:
Elastic Load Balancing

Add ALB only if:

You create a web application like:

User → EC2 web server

Then architecture becomes:

User → ALB → ASG → EC2 Web Servers


🎯 Example If You Want Web + Worker Combined

Architecture:

User → ALB → ASG (Web Servers)
S3 → SNS → SQS → ASG (Worker Servers)

Two separate ASGs.


🧠 Explain To Students

Queue-based system:

  • No ALB required
  • Workers pull messages

Web-based system:

  • ALB required
  • Traffic distributed

🚀 Advanced (Optional)

Add a scaling policy:

Scale based on the SQS metric:

ApproximateNumberOfMessagesVisible

If queue > 10 → Add instance

This is how production systems commonly scale their workers.
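
The rule itself is simple enough to write down. In AWS it would be implemented with a CloudWatch alarm on the queue metric plus an ASG scaling policy; the function below only sketches the decision, using the lab's ASG bounds (min 2, max 4) and the threshold of 10 from above. The function name and the add-one-instance step are assumptions for illustration.

```python
def desired_workers(visible_messages, current, minimum=2, maximum=4, threshold=10):
    """Add one worker when the backlog exceeds the threshold, within ASG bounds."""
    target = current + 1 if visible_messages > threshold else current
    # Never go below the ASG minimum or above its maximum.
    return max(minimum, min(maximum, target))


for backlog, current in [(3, 2), (25, 2), (25, 4)]:
    print(f"backlog={backlog} current={current} -> {desired_workers(backlog, current)}")
```

With a backlog of 25 the group grows from 2 to 3 workers, but it stays at 4 once the maximum is reached, no matter how long the queue gets.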


🎓 Interview-Level Explanation

We deployed worker instances inside an Auto Scaling Group across multiple AZs to ensure fault tolerance and scalability. Since the system was queue-driven and did not handle direct HTTP traffic, an Application Load Balancer was not required.
