You know that feeling when you add something to your to-do list with every intention of knocking it out quickly... and then three years pass? Yeah, me too.
Back in June 2022, Lars Klint posted the Cloud Portfolio Challenge: Load Balancing and Content Delivery Network on Pluralsight. The challenge was straightforward: build an image delivery service that returns images matching search criteria, while learning about load balancing, CDN, compute, and storage fundamentals.
I bookmarked it. Added it to my list. And promptly let life happen.
Fast forward to October 2025, and I finally dusted off that bookmark. But here's the thing—I wouldn't have been able to build what I built if I had actually met that 2022 deadline. The tech stack I used simply didn't exist back then, or wasn't mature enough to use effectively.
So this is my "better late than never" submission, powered by 2025's tech stack, and honestly? I'm glad I waited.
The Original Challenge (And How I Completely Reimagined It)
The original challenge asked participants to build an image delivery service with four key cloud components:
- Compute - Running the application logic
- Storage - Storing and serving images
- Load Balancing - Distributing traffic across instances
- CDN - Delivering content globally with low latency
Simple enough, right? Well, I decided to add a twist: What if the images were AI-generated from a curated set of prompts, and users could vote on which images they prefer in head-to-head battles?
Enter: AI Image Battle Arena 🥊
Instead of serving pre-existing images, my application:
- Asynchronously generates images using multiple AI providers (Freepik, Google Imagen, Leonardo AI) via scheduled cron jobs
- Presents two random images side-by-side for comparison (both from the same provider to ensure fair comparison)
- Lets users vote on which image they prefer via swipe gestures
- Tracks statistics and winners in a Valkey (Redis-compatible) database
- Serves everything through a CDN with load-balanced full-stack droplets
It's like Hot or Not, but for AI-generated art. Each comparison uses images from a single provider to keep things fair—I didn't want the user experience to depend on multiple AI APIs being available simultaneously. It also makes the system more resilient, since ADK manages provider selection behind the scenes.
Why Digital Ocean? (A Love Letter to Simplicity)
Let me be upfront: I'm a major cloud provider person. I'm an AWS Community Builder, AWS User Group Leader, and AWS Gold Jacket holder. I've worked extensively with AWS, Azure, and GCP for years—and I genuinely love these platforms. They power the world's largest applications, and their breadth of services is unmatched.
(Sidebar: Yes, even with today's AWS outage on October 20, 2025—because let's be real, all cloud providers have bad days. The big three have earned their reliability reputations.)
But I kept hearing from other developers in the community: "You should try Digital Ocean. It's so much simpler. The developer experience is amazing."
And you know what? They were right.
After years with the hyperscalers, Digital Ocean felt like a breath of fresh air. The UI is clean, intuitive, and doesn't make you feel like you need a map just to find what you're looking for. Everything is straightforward—no endless service catalogs, no decision paralysis about which of the 17 database options to choose.
Some highlights from my DO experience:
1. Managed Databases (Valkey)
Setting up a Valkey cluster (Redis-compatible) via Pulumi was incredibly straightforward. The API is intuitive—just specify size, region, and VPC attachment. No complex subnet CIDR calculations or security group wizardry. It just works.
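Here's roughly what that looks like with Pulumi's Go SDK. A minimal sketch, assuming the current `valkey` engine identifier and a single-node cluster (check DO's docs for the exact engine and version strings):

```go
package main

import (
	"github.com/pulumi/pulumi-digitalocean/sdk/v4/go/digitalocean"
	"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
)

func main() {
	pulumi.Run(func(ctx *pulumi.Context) error {
		// A VPC to keep the database off the public internet
		vpc, err := digitalocean.NewVpc(ctx, "battle-vpc", &digitalocean.VpcArgs{
			Region: pulumi.String("nyc3"),
		})
		if err != nil {
			return err
		}
		// The whole Valkey cluster: size, region, and VPC attachment
		_, err = digitalocean.NewDatabaseCluster(ctx, "battle-valkey", &digitalocean.DatabaseClusterArgs{
			Engine:             pulumi.String("valkey"),
			Version:            pulumi.String("8"),
			Size:               pulumi.String("db-s-1vcpu-1gb"),
			Region:             pulumi.String("nyc3"),
			NodeCount:          pulumi.Int(1),
			PrivateNetworkUuid: vpc.ID(),
		})
		return err
	})
}
```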
2. Spaces + CDN
Digital Ocean Spaces (S3-compatible object storage) comes with built-in CDN. Not "you need to set up CloudFront and configure origins and behaviors"—it's just... included. Upload your images, they're automatically CDN-distributed. Chef's kiss. 👌
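In the same Pulumi program, the storage-plus-CDN story is just two resources. A sketch, with an illustrative bucket name:

```go
// S3-compatible bucket for generated images
bucket, err := digitalocean.NewSpacesBucket(ctx, "battle-images", &digitalocean.SpacesBucketArgs{
	Region: pulumi.String("nyc3"),
	Acl:    pulumi.String("public-read"),
})
if err != nil {
	return err
}
// The built-in CDN endpoint in front of the bucket: no origins,
// behaviors, or distributions to configure
cdn, err := digitalocean.NewCdn(ctx, "battle-cdn", &digitalocean.CdnArgs{
	Origin: bucket.BucketDomainName,
})
if err != nil {
	return err
}
ctx.Export("cdnEndpoint", cdn.Endpoint)
```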
3. DO Metrics Agent
Here's a cool feature I didn't expect: Digital Ocean offers an enhanced metrics agent you can install on droplets that measures things like memory usage—metrics that aren't included in the free tier on other providers. I configured it in my UserData script, and suddenly I had deep observability into my droplet performance without additional cost.
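The install is a one-liner in the bootstrap script each droplet runs at first boot. A sketch, where the curl command is DO's documented agent install and the rest of the script is abbreviated:

```go
// metricsUserData returns the bootstrap script for each droplet; it gets
// wired into DropletArgs.UserData in the infrastructure section below.
func metricsUserData() pulumi.StringInput {
	return pulumi.String(`#!/bin/bash
set -euo pipefail

# Enhanced DigitalOcean metrics agent (memory, disk, and more)
curl -sSL https://repos.insights.digitalocean.com/install.sh | sudo bash

# ...install Go/Node.js/nginx, clone the repo, configure systemd units...
`)
}
```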
4. Credits and Support
DO gave me $200 in promotional credits (60-day expiration) to test things out. And when I had questions about Droplet limits? Their support team responded quickly with helpful, non-robotic answers.
5. Domain Management with Namecheap + DO Nameservers
For this project, I grabbed a domain from Namecheap: wheeleraiduel.online for just $0.98/year. Since this is a side project I don't plan to keep live beyond a year, why spend $12+ on a .com?
The setup is beautifully simple:
- Register domain on Namecheap (~$1)
- Point Namecheap to Digital Ocean's nameservers
- Manage all DNS records directly in the DO console
- Let's Encrypt SSL certificates auto-provision via Pulumi
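Those last two bullets amount to a few more Pulumi resources. A sketch, assuming the certificate is later attached to the load balancer's HTTPS forwarding rule:

```go
// DO-managed DNS zone; Namecheap just points at ns1-3.digitalocean.com
domain, err := digitalocean.NewDomain(ctx, "apex", &digitalocean.DomainArgs{
	Name: pulumi.String("wheeleraiduel.online"),
})
if err != nil {
	return err
}
// Auto-provisioned, auto-renewing Let's Encrypt certificate
_, err = digitalocean.NewCertificate(ctx, "tls", &digitalocean.CertificateArgs{
	Type:    pulumi.String("lets_encrypt"),
	Domains: pulumi.StringArray{domain.Name},
})
if err != nil {
	return err
}
```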
This hybrid approach gives me Namecheap's pricing with DO's DNS management UX. Best of both worlds for a temporary project.

To be clear: AWS, Azure, and GCP are incredible platforms that I'll continue using for enterprise work. But for side projects, learning, and rapid prototyping? Digital Ocean's simplicity is genuinely refreshing. Different tools for different jobs.
Why Random Prompts? (Safety First)
You might notice the application generates images from a curated set of prompts rather than taking user input. This was a deliberate design choice for three reasons:
1. Async Generation Architecture
Since images are generated via scheduled cron jobs (not on-demand), there's no user session to capture input from. The generation happens in the background, building up a library of images that the frontend randomly serves.
2. User Input Requires Special Care
Accepting user input means:
- Input validation and sanitization
- Rate limiting to prevent abuse
- Moderation to filter inappropriate prompts
- Storage and management of user data
- Potential GDPR/privacy concerns
For a side project focused on cloud architecture and AI orchestration? That's scope creep I didn't need.
3. Prompt Injection Is Real
Generative AI is susceptible to prompt injection attacks where malicious users craft inputs to bypass safety filters or generate harmful content. By using a curated set of prompts (generated by Claude and Gemini during development), I completely eliminate this attack vector.
Example curated prompts:
- "Robot holding a red skateboard"
- "Astronaut riding a bicycle on the moon"
- "Cat wearing sunglasses at a coffee shop"
Safe, fun, and focused on the technical infrastructure—not content moderation.
The Tech Stack That Made This Possible
Here's where 2025 tech really shines:
Backend: Go + Gin Framework
I built the API server in Go using the Gin web framework. Why Go? Fast, statically typed, great concurrency support, and perfect for cloud-native applications. The backend handles:
- Image generation orchestration
- Provider fallback logic
- Vote tracking and statistics
- Health checks for the load balancer
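That last item is what makes load balancing possible in the first place. A minimal version looks like this (the route path and payload are my own convention):

```go
package main

import (
	"net/http"

	"github.com/gin-gonic/gin"
)

func main() {
	r := gin.Default()

	// The DO load balancer probes this endpoint to decide whether a
	// droplet should stay in the rotation
	r.GET("/health", func(c *gin.Context) {
		c.JSON(http.StatusOK, gin.H{"status": "ok"})
	})

	r.Run(":8080") // the backend port on each droplet
}
```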
Frontend: Next.js 14
The image comparison interface is built with Next.js, featuring:
- Mobile-first responsive design with swipe gestures
- Framer Motion animations with spring physics
- Real-time vote feedback
- Server-side rendering for SEO
Infrastructure as Code: Pulumi (with Go)
I've used Terraform extensively, and I've worked with CloudFormation and AWS CDK. But for this project, I went with Pulumi—and I'm genuinely impressed.
Pulumi lets you write infrastructure code in real programming languages (I used Go for consistency with my backend). No HCL to learn, no YAML templating gymnastics—just actual code with real loops, conditionals, and type safety.
Here's what my infrastructure deploys:
```go
// Simplified example from hosting/main.go
droplets := make([]*digitalocean.Droplet, dropletCount)
for i := 0; i < dropletCount; i++ {
	// Derive a unique name per instance (needs "fmt" imported)
	name := fmt.Sprintf("battle-droplet-%d", i)
	droplet, err := digitalocean.NewDroplet(ctx, name, &digitalocean.DropletArgs{
		Name:     pulumi.String(name),
		Image:    pulumi.String("ubuntu-22-04-x64"),
		Size:     pulumi.String("s-2vcpu-2gb"),
		Region:   pulumi.String("nyc3"),
		VpcUuid:  vpc.ID(),
		UserData: getFullStackUserData(config),
	})
	if err != nil {
		return err
	}
	droplets[i] = droplet
}
```
Each droplet runs:
- Backend Go API server (port 8080)
- Frontend Next.js app (port 3000)
- Nginx reverse proxy (port 80)
- Automated log uploads to Spaces (hourly, gzip compressed)
- DigitalOcean metrics agent
The Pulumi console gives you real-time visibility into deployments as they run.
Why Pulumi over Terraform? A few reasons:
- Type Safety: Compiler catches errors before deployment
- Loops & Logic: Native language constructs instead of `count` hacks
- Single Language: Same language as my backend (Go)
- Better Error Messages: Actually tells you what's wrong
Pulumi might not have Terraform's community size yet, but for greenfield projects, it's a compelling choice.
Google Agent Development Kit (ADK)
Here's the real 2025 magic: Google's Agent Development Kit.
I learned about ADK from my connection Kelby Enevold, who told me how cool it was. ADK is Google's framework for building AI agents that can orchestrate tasks, use tools, and handle complex workflows.
In my application, ADK powers the "orchestrator agent" that:
- Randomly selects an AI image provider
- Calls the provider's API to generate an image
- Detects quota limits, rate limits, or errors
- Automatically falls back to a different provider
- Handles retries and error scenarios
This pattern—intelligent provider selection with automatic fallback—would have been manual spaghetti code without ADK. Instead, it's a clean agent-based architecture that "just works."
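I won't reproduce the ADK wiring here, but the fallback behavior the agent implements boils down to this pattern. A plain-Go sketch with illustrative names, not the ADK API:

```go
package orchestrator

import (
	"context"
	"fmt"
	"math/rand/v2"
	"slices"
)

// Provider is implemented by each image backend (Freepik, Imagen, Leonardo)
type Provider interface {
	Name() string
	Generate(ctx context.Context, prompt string) ([]byte, error)
}

// generateWithFallback tries providers in random order and returns the
// first success, skipping any that error out or hit a quota limit
func generateWithFallback(ctx context.Context, providers []Provider, prompt string) ([]byte, string, error) {
	order := slices.Clone(providers)
	rand.Shuffle(len(order), func(i, j int) { order[i], order[j] = order[j], order[i] })

	var lastErr error
	for _, p := range order {
		img, err := p.Generate(ctx, prompt)
		if err != nil {
			lastErr = err // rate limit, quota, outage: move on
			continue
		}
		return img, p.Name(), nil
	}
	return nil, "", fmt.Errorf("all providers failed: %w", lastErr)
}
```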
This literally wouldn't have been possible in 2022. Google ADK was announced at Google Cloud Next 2025, with the stable Python v1.0.0 release following shortly after. It's built on the same foundation powering Google's own products like Agentspace and their Customer Engagement Suite.
Hot Storage vs. Cold Storage: Why Valkey Is Optional
One architectural decision I'm particularly proud of: Digital Ocean Spaces is my single source of truth.
Here's how the data architecture works:
Cold Storage: DO Spaces (Source of Truth)
Every generated image is stored in DO Spaces with rich metadata:
- Image file: The actual PNG/JPEG
- Object metadata:
  - `provider`: Which AI service generated it (freepik, google-imagen, leonardo-ai)
  - `prompt`: The text prompt used for generation
  - Other generation details

This metadata lives directly on the S3-compatible object storage. No separate database required.
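Writing that metadata is plain S3 API usage, since Spaces speaks the S3 protocol. A sketch with the AWS SDK for Go v2, assuming `s3Client` is an `*s3.Client` configured with the Spaces endpoint and illustrative bucket/key names:

```go
import (
	"bytes"
	"context"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/service/s3"
)

// uploadImage stores the image and its generation details together,
// so Spaces alone can reconstruct the whole system
func uploadImage(ctx context.Context, s3Client *s3.Client, img []byte) error {
	_, err := s3Client.PutObject(ctx, &s3.PutObjectInput{
		Bucket:      aws.String("battle-images"),
		Key:         aws.String("freepik/abc123.png"),
		Body:        bytes.NewReader(img),
		ContentType: aws.String("image/png"),
		Metadata: map[string]string{
			"provider": "freepik",
			"prompt":   "Robot holding a red skateboard",
		},
	})
	return err
}
```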
Hot Storage: Valkey (Performance Cache)
Valkey stores:
- Vote counts and statistics
- Side win tracking (left vs. right)
- Winning image references
- Real-time leaderboard data
But here's the key insight: Valkey is purely for performance. It's a cache, not the source of truth.
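The hot path is correspondingly simple: recording a vote is a handful of counter bumps. A sketch using the go-redis client (Valkey is wire-compatible), with illustrative key names:

```go
import (
	"context"

	"github.com/redis/go-redis/v9"
)

// recordVote bumps the global, per-image, and per-side counters atomically
func recordVote(ctx context.Context, rdb *redis.Client, winnerID, side string) error {
	pipe := rdb.TxPipeline()
	pipe.Incr(ctx, "votes:total")
	pipe.Incr(ctx, "wins:"+winnerID)
	pipe.Incr(ctx, "side:"+side) // "left" or "right"
	_, err := pipe.Exec(ctx)
	return err
}
```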
The `recreate_valkey` Flag
In my GitHub Actions deploy workflow, there's a boolean parameter called `recreate_valkey`. When set to `true`, it:
- Scans all objects in DO Spaces
- Reads metadata from each image
- Rebuilds Valkey indexes from scratch
- Repopulates provider statistics
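The rebuild itself is a straightforward scan. A sketch under the same assumptions as above (`s3Client` for Spaces, `rdb` for Valkey, illustrative key names):

```go
// rebuildIndexes walks every object in Spaces and repopulates Valkey
func rebuildIndexes(ctx context.Context, s3Client *s3.Client, rdb *redis.Client) error {
	paginator := s3.NewListObjectsV2Paginator(s3Client, &s3.ListObjectsV2Input{
		Bucket: aws.String("battle-images"),
	})
	for paginator.HasMorePages() {
		page, err := paginator.NextPage(ctx)
		if err != nil {
			return err
		}
		for _, obj := range page.Contents {
			// Metadata travels with the object, so HeadObject is enough
			head, err := s3Client.HeadObject(ctx, &s3.HeadObjectInput{
				Bucket: aws.String("battle-images"),
				Key:    obj.Key,
			})
			if err != nil {
				return err
			}
			provider := head.Metadata["provider"]
			// Re-index: one set per provider, keyed by object name
			if err := rdb.SAdd(ctx, "images:"+provider, *obj.Key).Err(); err != nil {
				return err
			}
		}
	}
	return nil
}
```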
This means I can:
- Delete the entire Valkey cluster to save costs
- Recover from Valkey data corruption
- Rebuild after accidental data loss
- Migrate to a different caching solution
The images, prompts, and generation history are never lost. They live permanently in Spaces.
Why This Matters
Many applications tightly couple their database and storage layers. If the database fails, critical metadata is gone forever. By storing metadata with the objects themselves, I've created a resilient architecture where:
- Valkey failure = temporary performance degradation, not data loss
- Spaces backup = complete system backup, including all metadata
- Cost optimization = optional caching layer when budget is tight
For the "lite" version I'm planning in Part 2, I could potentially run without Valkey entirely—serving slightly slower but still functional. That's architectural flexibility.
The Build Process: Research First, Deploy Last
People sometimes ask me: "How do you even start large side projects like this?"
My answer: Start small. Start with research.
Here's how I structured this project:
Phase 1: Research (`research/` directory)
Before writing a single line of business logic, I created small Proof of Concepts for each AI provider:
```
research/
├── freepik/        # Test Freepik API integration
├── google-imagen/  # Test Google Imagen API
├── leonardo-ai/    # Test Leonardo AI API
└── craiyon/        # Test Craiyon (spoiler: broken)
```
Each research folder had its own README documenting:
- How to authenticate
- How to make requests
- Response formats
- Pricing/quota limits
- Gotchas and workarounds
This meant when I started building the actual backend, I wasn't debugging API integration issues—I already knew exactly how each provider worked.
Key lesson: Spend time in research mode. It pays dividends later.
Phase 2: Backend (`backend/`)
Once I understood the providers, I built the Go API server:
- Provider abstraction layer
- ADK orchestrator integration
- Valkey vote tracking
- Image storage to DO Spaces
Phase 3: Frontend (`frontend/`)
Next.js app with mobile-optimized voting interface:
- Swipe left/right to vote
- Real-time animations
- Statistics display
Phase 4: Infrastructure (`hosting/`)
Last step: Deploy to the cloud.
Why last? Because cloud providers bill you from the moment you provision resources. By building locally first, I:
- Avoided weeks of unnecessary billing
- Used my DO credits efficiently (60-day expiration)
- Deployed a complete, tested application—not a half-baked experiment
This order—research → backend → frontend → hosting—is how I approach every side project. It keeps costs down and reduces cloud debugging headaches.
GitHub Actions: The Full Deployment Pipeline
Once infrastructure code was ready, I built three GitHub Actions workflows to manage everything:
1. Deploy Workflow
One-click deployment via GitHub Actions:
- Provisions all infrastructure (load balancer, droplets, database, CDN)
- Configurable droplet count (2-10 instances)
- Auto-deploys applications via UserData script
- Sets up monitoring and log shipping
2. Teardown Workflow
Safe infrastructure destruction:
- Requires typing "DESTROY" to confirm (safety first!)
- Cleans up auto-created DNS records
- Removes all resources to stop billing
- Saves ~$68/month when not in use
3. Refresh Workflow
State synchronization:
- Syncs Pulumi state with actual cloud resources
- Useful after manual changes in DO console
- Detects and resolves state drift
Full transparency: This automation is a game-changer. Push code, watch it deploy, tear it down when done. No manual server configuration, no SSH debugging sessions.
What I Learned (And What Surprised Me)
1. Simplicity Has Value
AWS's breadth of services is powerful, but DO's focused simplicity meant I spent more time building features and less time reading documentation about VPC peering topologies.
2. IaC Language Matters
Using Go for both backend and infrastructure code created a cohesive developer experience. Context switching between languages is mentally taxing—Pulumi eliminated that.
3. AI Agents Are Production-Ready
Google ADK isn't just a toy—it handles real production workflows with fallback logic, error handling, and reliability. This is the future of orchestration.
4. UserData Is Underrated
My droplets auto-deploy everything via UserData scripts:
- Install dependencies (Go, Node.js, nginx)
- Clone and build applications
- Configure systemd services
- Set up cron jobs for log uploads
- Install monitoring agents
No manual SSH configuration. No Ansible playbooks. Just boot and go.
5. Research Time Is Never Wasted
Those small POCs in the `research/` folder saved me countless debugging hours later. Investing in understanding your dependencies upfront is time well spent.
The Challenge Requirement Checklist ✅
So, did I actually complete the original challenge requirements?
- ✅ Compute: Full-stack droplets running Go + Next.js + nginx
- ✅ Storage: DO Spaces for images and logs
- ✅ Load Balancing: DO Load Balancer distributing traffic across instances
- ✅ CDN: Built-in CDN via DO Spaces
- ✅ Image Delivery: Returns images (AI-generated) matching search criteria
- ✅ Learning Outcome: Deep understanding of how these components work together
Bonus achievements:
- ✅ Infrastructure as Code (Pulumi)
- ✅ AI orchestration (Google ADK)
- ✅ Automated deployment (GitHub Actions)
- ✅ Production-grade monitoring (DO Metrics Agent)
- ✅ Security hardening (non-root service user, VPC-isolated database)
- ✅ Cost optimization (automated teardown, log compression)
Final Thoughts: Better Late Than Never
Did I miss the 2022 deadline? Absolutely.
Do I regret waiting? Not even a little.
This project showcases technology that didn't exist three years ago:
- Google ADK for AI orchestration (released 2025)
- Modern generative AI APIs (Imagen 3.0, Leonardo AI)
- Pulumi's matured Go SDK
- Next.js 14's App Router
- Digital Ocean's enhanced features
Sometimes the best time to tackle a challenge is when you have the right tools for the job.
If you're sitting on a dusty to-do item from years ago, consider this your sign: maybe now is actually the perfect time. The tools have gotten better. Your skills have improved. And that "overdue" project might turn into your best work yet.
Thanks to Lars Klint for the original challenge, Kelby Enevold for introducing me to Google ADK, and the Digital Ocean team for making cloud infrastructure genuinely enjoyable to work with.
What's Next? Stay Tuned for Part 2 📅
As my DO promotional credits approach expiration near the end of the year, I'm planning to deploy a "lite" version of this project—a cost-optimized configuration designed to avoid bill shock while keeping the core functionality intact.
Part 2 will cover:
- Cost optimization strategies for production
- Scaling down gracefully without losing features
- Balancing cloud costs vs. capabilities
- Real-world lessons from running AI infrastructure on a budget
Follow me on LinkedIn to catch Part 2 when it drops!
Want to Build This Yourself?
The entire project is open source on GitHub:
- Repository: wheeleruniverse/cgc-lb-and-cdn
- Live Demo: wheeleraiduel.online (when deployed)
The README includes:
- Complete setup instructions
- Architecture diagrams
- API documentation
- Cost breakdowns
- Deployment guides
Give it a star if you found this interesting, and feel free to fork it for your own experiments!
What cloud challenges are sitting on your dusty to-do list? Drop them in the comments—let's hold each other accountable! 👇
Enjoyed this post? Buy me a coffee ☕ to support more cloud adventures!