What I've Been Building for the Last Several Months and Why I'm Finally Writing About It

#aws #serverless #ai #distributedsystems

I've been quietly heads-down building something outside of work for the past several months. No posts, no updates, no "excited to share" announcements. Just building.
Today I'm breaking that silence, and this is the first post in a series where I'll share everything I've learned.

The Thing I Built
Autowired.ai — an AI-powered document extraction SaaS.
The idea is straightforward: businesses deal with mountains of documents — invoices, purchase orders, contracts, insurance forms — and extracting structured data from them is still mostly manual or brittle rule-based OCR that breaks the moment the template changes.

Autowired lets you define a visual extraction template on a canvas (you draw fields over a sample document), submit a batch of documents, and get back structured JSON with the extracted values. No code, no regex, no fragile parsers.

Sounds simple. The engineering is not.

Why I Built It, Solo
I have 11 years of software engineering experience. Enterprise Java, government systems, insurance platforms, cloud architecture. I've worked on large teams, gone through full SDLC processes, dealt with change advisory boards and SLA contracts.

What I hadn't done was build something entirely from scratch, make every architectural decision myself, and take it all the way to production — solo.

So that's what I did. This project is my proving ground for everything I know about cloud-native architecture and AI systems, applied without the safety net of a team.

It's ~90% complete and currently in beta. And the lessons have been hard-earned.

The Stack (and Why)
Before I dive into any specific post, here's the full picture of what's running under the hood:

Infrastructure: AWS CDK (TypeScript) — everything is code, nothing is clicked into existence in the console. Six separate CDK stacks: database, storage, processing, Bedrock, API, and monitoring.

Database: DynamoDB single-table design with three GSIs. Multi-tenant data isolation baked into the partition key structure — not enforced by application code.
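
To make the partition key idea concrete, here is a minimal sketch of tenant-scoped keys in a single table. The entity names, the `TENANT#` / `BATCH#` / `DOC#` prefixes, and the helper functions are all hypothetical and chosen for illustration; they are not the actual Autowired key schema. The point is only that every key and every query has to name a tenant partition, so a read cannot span tenants by accident.

```typescript
// Hypothetical key layout for illustration only; the real schema may differ.
type Key = { PK: string; SK: string };

// Every item lives under its tenant's partition.
const tenantPk = (tenantId: string): string => `TENANT#${tenantId}`;

function batchKey(tenantId: string, batchId: string): Key {
  return { PK: tenantPk(tenantId), SK: `BATCH#${batchId}` };
}

function documentKey(tenantId: string, batchId: string, documentId: string): Key {
  return { PK: tenantPk(tenantId), SK: `BATCH#${batchId}#DOC#${documentId}` };
}

// Builds the input for a DynamoDB Query that lists one batch's documents.
// The shape matches what a document-client style Query call accepts.
function documentsInBatchQuery(tableName: string, tenantId: string, batchId: string) {
  return {
    TableName: tableName,
    KeyConditionExpression: 'PK = :pk AND begins_with(SK, :sk)',
    ExpressionAttributeValues: {
      ':pk': tenantPk(tenantId),
      ':sk': `BATCH#${batchId}#DOC#`,
    },
  };
}

const exampleKey = documentKey('acme', 'b-001', 'd-042');
const exampleQuery = documentsInBatchQuery('app-table', 'acme', 'b-001');
console.log(exampleKey, exampleQuery);
```

Because the sort key for a document starts with its batch prefix, one `begins_with` query returns all documents for a batch without a scan or an extra index.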

Document processing pipeline: S3 event notifications trigger a Lambda, which starts a Step Functions state machine. The state machine runs up to 10 documents in parallel, handles per-document failures independently, updates batch status, and optionally delivers webhook notifications via a separate SQS queue.
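
As a rough illustration of the fan-out described above, the sketch below builds an Amazon States Language definition as a plain TypeScript object. The state names, the `$.documents` input path, and the Lambda ARNs are hypothetical placeholders, and the webhook step is left out for brevity. It shows a `Map` state capped at 10 concurrent iterations, with a `Catch` inside each iteration so that one failing document does not fail the whole batch.

```typescript
// A sketch of a state machine definition in Amazon States Language (ASL).
// All names and ARNs are assumptions for illustration.
const MAX_PARALLEL_DOCUMENTS = 10;

interface PipelineArns {
  extract: string;
  markFailed: string;
  updateBatch: string;
}

function buildBatchStateMachine(arns: PipelineArns) {
  return {
    StartAt: 'ProcessDocuments',
    States: {
      ProcessDocuments: {
        Type: 'Map',
        ItemsPath: '$.documents',
        MaxConcurrency: MAX_PARALLEL_DOCUMENTS,
        ResultPath: '$.results',
        ItemProcessor: {
          ProcessorConfig: { Mode: 'INLINE' },
          StartAt: 'ExtractDocument',
          States: {
            ExtractDocument: {
              Type: 'Task',
              Resource: arns.extract,
              // Failure is caught inside the iteration, so the Map state keeps going.
              Catch: [
                { ErrorEquals: ['States.ALL'], ResultPath: '$.error', Next: 'MarkDocumentFailed' },
              ],
              End: true,
            },
            MarkDocumentFailed: { Type: 'Task', Resource: arns.markFailed, End: true },
          },
        },
        Next: 'UpdateBatchStatus',
      },
      UpdateBatchStatus: { Type: 'Task', Resource: arns.updateBatch, End: true },
    },
  };
}

const definition = buildBatchStateMachine({
  extract: 'arn:aws:lambda:us-east-1:123456789012:function:extract-document',
  markFailed: 'arn:aws:lambda:us-east-1:123456789012:function:mark-document-failed',
  updateBatch: 'arn:aws:lambda:us-east-1:123456789012:function:update-batch-status',
});
console.log(JSON.stringify(definition, null, 2));
```

Putting the `Catch` on the inner task rather than on the `Map` state itself is what keeps failures per-document: the iteration ends normally after recording the error, and the batch status update still runs.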

AI extraction: Amazon Bedrock Data Automation (BDA) for intelligent field extraction. Amazon Textract for OCR preprocessing. Bedrock Guardrails for safety filtering.

Auth: Clerk.

Frontend: Next.js product app and marketing site.

Runtime: ARM64 Lambdas on Node.js 20, X-Ray tracing across the pipeline, DLQs on every queue.

Every piece of that list came with decisions, tradeoffs, and at least one thing I got wrong the first time.

What's Coming in This Series
Over the next 10 weeks, I'm writing about each layer of this system in depth — not tutorials, not beginner walkthroughs, but the actual engineering reasoning behind the decisions:

Step Functions vs EventBridge vs SQS — when to use each, and how I use all three in the same system for different jobs
Building the event-driven document processing pipeline — S3, SQS, Lambda, Step Functions, and the failure handling that makes it production-grade
How I reduced Bedrock AI costs by ~40% — prompt caching, model tiering, token optimisation, result caching
DynamoDB single-table design for multi-tenant SaaS — real partition key patterns, GSI design decisions, and the tradeoffs nobody mentions
The full multi-tenant SaaS architecture — tenant isolation, async processing, API design, and how all the stacks fit together
Terraform vs AWS CDK — a practical comparison from someone who's used both in production
RAG architecture on Amazon Bedrock — embeddings, chunking strategy, tenant-aware retrieval, hallucination reduction
Designing high-availability systems at enterprise scale - what I carried over from government and insurance engineering
AI architecture patterns from a real product — observability, confidence thresholds, prompt versioning, output validation
From Enterprise Java engineer to AI Platform engineer — what the transition actually looks like

If you work in cloud, AI infrastructure, or platform engineering, or you're just curious how a solo engineer structures a production-grade SaaS, this series is for you.

A Note on Why I'm Sharing This
I'm not building in public for the sake of building in public. I'm sharing this because the content I wish had existed when I was making these decisions (about DynamoDB key design, about when Step Functions is overkill, about how to actually reduce Bedrock costs in a real workload) mostly doesn't exist at the depth it should.

Most AWS content is either too beginner or too abstract. There's not enough "here's a real system, here's why it's designed this way, here's what broke."

That's what I'm trying to write.

— Yoganand (Yogi)

Follow me here on Dev.to or connect on LinkedIn to get each post as it drops.