DEV Community

BuildMintZ Media

We Migrated Our AI Agents from Azure to AWS: Here's the Minimal Setup You Actually Need

After migrating our agent-based document processing system from Azure to AWS, one thing became obvious:

Most cloud setups are wildly over-engineered for development.

You don’t need half the services people default to just to build and test a distributed system.

Here’s the minimal AWS architecture you actually need to develop and run an agent-based pipeline end-to-end — plus what each piece maps to if you’re coming from Azure.

The System

Before we even touch the cloud layer, here’s what’s actually running.

At its core, this is a pipeline of independent Python workers processing documents in stages. Each worker listens to its own queue, does one job well, and hands off to the next stage.

In front of that sits a hybrid FastAPI gateway exposing both REST and GraphQL endpoints, depending on the consumer.

Five containers total:

gateway
ai-core
collector
exporter
redis

[insert architecture diagram here]

Core Services — What You Actually Need

If you're building something similar on AWS, these are the essentials.

  1. RDS (PostgreSQL)

This is your source of truth: datasets, schemas, job states.

You can use SQLite for early local testing, but don’t let it leak into staging. SQLite allows only one writer at a time, so you’ll regret it the moment several workers start updating job states concurrently.
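One way to keep SQLite from leaking past local development is a startup check on the connection string. This is a minimal sketch, not the setup from our codebase: the function name `check_database_url`, the environment names in `LOCAL_ENVS`, and the URL formats are all assumptions chosen for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical environment names where SQLite is acceptable.
LOCAL_ENVS = {"local", "test"}


def check_database_url(env: str, url: str) -> str:
    """Return the URL unchanged, or fail fast if SQLite is used outside local envs."""
    # "sqlite+aiosqlite" and "sqlite" both count as SQLite; strip any driver suffix.
    backend = urlsplit(url).scheme.split("+")[0]
    if backend == "sqlite" and env not in LOCAL_ENVS:
        raise ValueError(f"SQLite is not allowed in the {env!r} environment")
    return url
```

Calling this once at service startup, with the environment name and the configured URL, turns a silent misconfiguration into an immediate failure.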

  2. S3

Raw data in. Clean datasets out. Exports delivered.

It’s cheap and durable, and at any meaningful scale it’s the default choice for object storage on AWS.

  3. SQS

This is the backbone of the entire system.

Each worker type gets its own queue:

scraper-jobs
validate-jobs
improve-jobs
refurbish-jobs
export-jobs

This gives you full decoupling:

Workers can scale independently
Crashes are isolated
Deployments don’t ripple across the system
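To make the queue-per-worker pattern concrete, here is a rough sketch of a single polling pass: receive a batch, process each message, hand the result to the next stage's queue, then delete the original. It is an illustration under assumptions rather than our actual worker code. The `sqs` argument is expected to behave like `boto3.client("sqs")` (only `receive_message`, `send_message`, and `delete_message` are used), and the function name `poll_once`, the queue URLs, and the `handle` callback are hypothetical.

```python
import json
from typing import Any, Callable, Optional


def poll_once(
    sqs: Any,
    in_queue_url: str,
    out_queue_url: Optional[str],
    handle: Callable[[dict], dict],
) -> int:
    """Process one batch from in_queue_url and return how many messages were handled."""
    response = sqs.receive_message(
        QueueUrl=in_queue_url,
        MaxNumberOfMessages=10,  # SQS returns at most 10 messages per call
        WaitTimeSeconds=20,      # long polling: wait up to 20 s for messages
    )
    processed = 0
    # The "Messages" key is absent when the queue had nothing to deliver.
    for message in response.get("Messages", []):
        result = handle(json.loads(message["Body"]))
        if out_queue_url is not None:
            # Hand off to the next stage (e.g. validate-jobs -> improve-jobs).
            sqs.send_message(QueueUrl=out_queue_url, MessageBody=json.dumps(result))
        # Delete only after the hand-off: if the worker crashes earlier,
        # SQS redelivers the message once its visibility timeout expires.
        sqs.delete_message(
            QueueUrl=in_queue_url, ReceiptHandle=message["ReceiptHandle"]
        )
        processed += 1
    return processed
```

A worker would call `poll_once` in a loop. Because a message can be redelivered after a crash, `handle` should be safe to run twice on the same job.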

  4. Cognito

Handles authentication for the gateway.

Don’t skip this. Retrofitting auth later is always worse than doing it properly from day one.

  5. VPC

Everything runs inside it.

Not optional. No shortcuts here.

What You Can Disable in Dev (and Save Real Money)

While you're still building, you don’t need the full production setup.

| Service | Why you don’t need it yet | Monthly saving |
| --- | --- | --- |
| ElastiCache (Redis) | Run Redis locally in Docker | ~$15 |
| Application Load Balancer | Not needed before production traffic | ~$18 |
| NAT Gateway | Only required for outbound traffic from private subnets | ~$32 |

That’s roughly $65/month saved — which adds up fast across a team.
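For a quick sanity check of that figure, the arithmetic can be sketched as follows. The monthly prices are the approximate ones listed above; the idea of one dev environment per developer and the team size of four are assumptions for illustration only.

```python
# Approximate monthly cost (USD) of each service you can leave off in dev.
DEV_SAVINGS_USD = {
    "ElastiCache (Redis)": 15,
    "Application Load Balancer": 18,
    "NAT Gateway": 32,
}

MONTHLY_SAVING_USD = sum(DEV_SAVINGS_USD.values())  # 15 + 18 + 32 = 65


def yearly_saving(environments: int) -> int:
    """Yearly saving if each of `environments` dev setups skips all three services."""
    return MONTHLY_SAVING_USD * 12 * environments


# Hypothetical team of four, one dev environment each: 65 * 12 * 4 = 3120
print(yearly_saving(4))
```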

Azure → AWS: The Mapping

[insert comparison table here]

The Key Insight

Cloud providers look very different — but under the hood, they all boil down to the same building blocks:

Identity
Messaging
Storage
Database
Networking

The names change. The SDKs change. The auth models definitely change.

But if your architecture is built around these primitives, switching clouds becomes a configuration problem, not a code problem.

In our case, moving from:

SERVICE_BUS_CONNECTION
AZURE_OPENAI_KEY

to:

IAM roles
Amazon Bedrock

…required zero changes to business logic.

All the seams were already at the infrastructure boundary — exactly where they should be.
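What a seam at the infrastructure boundary can look like in code is sketched below. This is a simplified illustration, not our actual implementation: the `Infra` dataclass, `improve_document`, `build_infra`, and the `CLOUD_PROVIDER` variable are hypothetical names. The point is that business logic depends only on two small callables, and the provider is picked from configuration.

```python
from dataclasses import dataclass
from typing import Callable, Mapping


@dataclass(frozen=True)
class Infra:
    """Everything the business logic is allowed to know about the cloud."""
    publish: Callable[[str, str], None]  # (queue name, message body)
    complete: Callable[[str], str]       # prompt -> model output


def improve_document(infra: Infra, text: str) -> str:
    """Business logic: no Azure or AWS imports, only the Infra callables."""
    improved = infra.complete("Improve this document:\n" + text)
    infra.publish("export-jobs", improved)
    return improved


InfraFactory = Callable[[Mapping[str, str]], Infra]


def build_infra(env: Mapping[str, str], factories: Mapping[str, InfraFactory]) -> Infra:
    """Pick the provider-specific factory from configuration (hypothetical variable)."""
    provider = env.get("CLOUD_PROVIDER", "aws")
    if provider not in factories:
        raise ValueError(f"unknown CLOUD_PROVIDER: {provider!r}")
    return factories[provider](env)
```

Under this layout, an `"azure"` factory would be the only place that reads `SERVICE_BUS_CONNECTION` and `AZURE_OPENAI_KEY`, and an `"aws"` factory would be the only place that talks to SQS and Amazon Bedrock using the IAM role. Swapping clouds then means registering a different factory, while `improve_document` stays untouched.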

Final Thought

If you're building in the cloud right now:

Start minimal — these five services are enough to ship something real
Add complexity only when you have a concrete reason
Optimize cost early — small monthly savings compound quickly

All five containers are now running healthy on AWS.

Curious — are you team AWS or Azure right now?
And has anyone done this migration the other way around?
