<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: akilesh thuniki</title>
    <description>The latest articles on DEV Community by akilesh thuniki (@akilesh_thuniki).</description>
    <link>https://dev.to/akilesh_thuniki</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3759846%2Fbc4a6ae6-3447-4483-b067-4613dfae2dba.jpg</url>
      <title>DEV Community: akilesh thuniki</title>
      <link>https://dev.to/akilesh_thuniki</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/akilesh_thuniki"/>
    <language>en</language>
    <item>
      <title>How I Built Multi-Tenant SaaS on AWS (So You Don't Have To)</title>
      <dc:creator>akilesh thuniki</dc:creator>
      <pubDate>Sun, 08 Feb 2026 11:49:41 +0000</pubDate>
      <link>https://dev.to/akilesh_thuniki/how-i-built-multi-tenant-saas-on-aws-so-you-dont-have-to-2p3o</link>
      <guid>https://dev.to/akilesh_thuniki/how-i-built-multi-tenant-saas-on-aws-so-you-dont-have-to-2p3o</guid>
      <description>&lt;p&gt;&lt;em&gt;It was 2:17 AM when my phone lit up with a Slack alert.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Two enterprise customers were seeing each other’s data.&lt;/p&gt;

&lt;p&gt;Not all of it — just enough to trigger panic. The kind of bug that doesn’t just wake you up; it makes you question every infrastructure decision you’ve ever made.&lt;/p&gt;

&lt;p&gt;That night is why SaaSInfraLab exists.&lt;/p&gt;

&lt;p&gt;I was tired of rebuilding the same fragile multi-tenant infrastructure for every new SaaS project and hoping I didn’t miss something critical at 2 AM again.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem: Multi-Tenancy Breaks in Subtle, Expensive Ways
&lt;/h2&gt;

&lt;p&gt;Multi-tenant SaaS sounds straightforward until you’re running real workloads at scale.&lt;/p&gt;

&lt;p&gt;Here’s what broke for me repeatedly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Manual tenant onboarding took 2–3 hours per customer&lt;/li&gt;
&lt;li&gt;Namespace misconfigurations exposed data across tenants&lt;/li&gt;
&lt;li&gt;Copy-pasted Terraform modules drifted apart over time&lt;/li&gt;
&lt;li&gt;CI/CD pipelines were brittle and hard to reason about&lt;/li&gt;
&lt;li&gt;AWS costs grew with no per-tenant visibility&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At around 40–50 tenants, everything slowed down.&lt;/p&gt;

&lt;p&gt;One bad Helm change could impact everyone.&lt;br&gt;
One missed IAM permission could block a deployment.&lt;br&gt;
One rushed fix could leak data.&lt;/p&gt;

&lt;p&gt;The problem isn’t Kubernetes or AWS — it’s the lack of structure and repeatability.&lt;/p&gt;


&lt;h2&gt;
  
  
  The Solution: A Production-Ready, GitOps-Driven SaaS Stack
&lt;/h2&gt;

&lt;p&gt;Instead of patching the same problems again, I stepped back and designed a system with one rule:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Tenant isolation must exist at every layer.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;High-Level Approach&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I built a modular infrastructure stack with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AWS EKS as the compute foundation&lt;/li&gt;
&lt;li&gt;Terraform for deterministic infrastructure&lt;/li&gt;
&lt;li&gt;GitOps (ArgoCD) as the control plane&lt;/li&gt;
&lt;li&gt;PostgreSQL schema isolation for data&lt;/li&gt;
&lt;li&gt;Namespaces, quotas, RBAC, and network policies by default&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Everything is defined once, versioned, and reused.&lt;/p&gt;

&lt;p&gt;No click-ops. No snowflakes.&lt;/p&gt;
&lt;h2&gt;
  
  
  Core Design Decisions (and Why)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Kubernetes Namespaces per tenant&lt;/strong&gt;&lt;br&gt;
This gives clean workload isolation, quota enforcement, and blast-radius control.&lt;/p&gt;
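
&lt;p&gt;A minimal sketch of what a per-tenant baseline can look like, written as a TypeScript function that returns plain manifest objects. The &lt;code&gt;tenant-&lt;/code&gt; naming scheme, the label, and the quota values are illustrative assumptions, not the values SaaSInfraLab ships with.&lt;/p&gt;

```typescript
// Sketch: baseline Kubernetes objects for one tenant namespace.
// The naming scheme and quota values below are illustrative assumptions.
function tenantBaseline(tenantId: string) {
  const namespace = "tenant-" + tenantId;
  return [
    {
      apiVersion: "v1",
      kind: "Namespace",
      metadata: { name: namespace, labels: { tenant: tenantId } },
    },
    {
      // Caps the total CPU, memory, and pod count the tenant can request.
      apiVersion: "v1",
      kind: "ResourceQuota",
      metadata: { name: "tenant-quota", namespace },
      spec: { hard: { "requests.cpu": "4", "requests.memory": "8Gi", pods: "50" } },
    },
    {
      // Selects every pod in the namespace and allows ingress only from
      // pods in that same namespace.
      apiVersion: "networking.k8s.io/v1",
      kind: "NetworkPolicy",
      metadata: { name: "same-namespace-only", namespace },
      spec: {
        podSelector: {},
        policyTypes: ["Ingress"],
        ingress: [{ from: [{ podSelector: {} }] }],
      },
    },
  ];
}
```

&lt;p&gt;Serialized to YAML or JSON, these objects can live in Git next to the rest of the tenant definition. The NetworkPolicy only takes effect if the cluster's network plugin enforces network policies.&lt;/p&gt;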

&lt;p&gt;&lt;strong&gt;PostgreSQL schemas instead of separate databases&lt;/strong&gt;&lt;br&gt;
Lower cost and simpler operations than a database per tenant. Isolation holds only if every connection sets a strict &lt;code&gt;search_path&lt;/code&gt; before it runs tenant queries.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// tenantId must be validated before this call: it is interpolated into the SQL text.
await client.query(`SET search_path TO tenant_${tenantId}`);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
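
&lt;p&gt;Because the tenant ID is interpolated straight into the statement, it is worth validating before it reaches the database. Here is a minimal sketch, assuming tenant IDs are lowercase alphanumerics and underscores; &lt;code&gt;searchPathSql&lt;/code&gt; and the pattern are hypothetical names for illustration, not part of any library.&lt;/p&gt;

```typescript
// Hypothetical helper: validates a tenant ID and builds the SET statement.
// The allowed pattern and the "tenant_" schema prefix are assumptions.
const TENANT_ID_PATTERN = /^[a-z0-9_]{1,48}$/;

function searchPathSql(tenantId: string): string {
  if (!TENANT_ID_PATTERN.test(tenantId)) {
    throw new Error("invalid tenant id: " + JSON.stringify(tenantId));
  }
  // Double quotes make the schema name a quoted SQL identifier. The pattern
  // above already rejects quotes, spaces, and semicolons.
  return 'SET search_path TO "tenant_' + tenantId + '"';
}
```

&lt;p&gt;The call site then becomes &lt;code&gt;await client.query(searchPathSql(tenantId))&lt;/code&gt;. Restricting IDs to lowercase keeps the quoted schema name identical to the unquoted one used above.&lt;/p&gt;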



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9j78xr545r9d5htu02h6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9j78xr545r9d5htu02h6.png" alt=" " width="800" height="601"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GitOps for all deployments&lt;/strong&gt;&lt;br&gt;
ArgoCD watches tenant definitions and applies changes automatically. No manual deploys, no surprises.&lt;/p&gt;
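
&lt;p&gt;As a rough illustration, a tenant definition could map to one Argo CD &lt;code&gt;Application&lt;/code&gt; per tenant. The sketch below builds that manifest as a plain object; the repository URL, the &lt;code&gt;tenants/&lt;/code&gt; path layout, the &lt;code&gt;main&lt;/code&gt; branch, and the &lt;code&gt;default&lt;/code&gt; project are placeholders, not the project's actual layout.&lt;/p&gt;

```typescript
// Sketch: an Argo CD Application for one tenant. The repository URL, path
// layout, branch, and project name are placeholders chosen for illustration.
function tenantApplication(tenantId: string, repoURL: string) {
  return {
    apiVersion: "argoproj.io/v1alpha1",
    kind: "Application",
    metadata: { name: "tenant-" + tenantId, namespace: "argocd" },
    spec: {
      project: "default",
      source: { repoURL, path: "tenants/" + tenantId, targetRevision: "main" },
      destination: {
        // In-cluster API server address used by Argo CD for the local cluster.
        server: "https://kubernetes.default.svc",
        namespace: "tenant-" + tenantId,
      },
      // Automated sync applies Git changes without a manual deploy;
      // prune deletes removed resources, selfHeal reverts manual drift.
      syncPolicy: { automated: { prune: true, selfHeal: true } },
    },
  };
}
```

&lt;p&gt;With a definition like this in Git, onboarding a tenant reduces to committing one more generated manifest.&lt;/p&gt;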

&lt;p&gt;&lt;strong&gt;IRSA + RBAC everywhere&lt;/strong&gt;&lt;br&gt;
Every pod gets only the AWS permissions it needs — nothing more.&lt;/p&gt;
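
&lt;p&gt;With IRSA, the scoping comes from the IAM role's trust policy, which names the one Kubernetes service account allowed to assume the role. A minimal sketch of that policy document follows; &lt;code&gt;irsaTrustPolicy&lt;/code&gt; is a hypothetical helper, and every argument value is supplied by the caller rather than taken from the project.&lt;/p&gt;

```typescript
// Sketch: IAM trust policy that lets exactly one Kubernetes service account
// assume a role through the cluster's OIDC provider (IRSA).
function irsaTrustPolicy(
  accountId: string,
  oidcProvider: string, // issuer host and path without "https://"
  namespace: string,
  serviceAccount: string
) {
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Effect: "Allow",
        Principal: {
          Federated: "arn:aws:iam::" + accountId + ":oidc-provider/" + oidcProvider,
        },
        Action: "sts:AssumeRoleWithWebIdentity",
        Condition: {
          StringEquals: {
            // Token must be issued for STS and for this exact service account.
            [oidcProvider + ":aud"]: "sts.amazonaws.com",
            [oidcProvider + ":sub"]:
              "system:serviceaccount:" + namespace + ":" + serviceAccount,
          },
        },
      },
    ],
  };
}
```

&lt;p&gt;Pinning the &lt;code&gt;sub&lt;/code&gt; claim to a single namespace and service account is what keeps one tenant's pods from assuming another tenant's role.&lt;/p&gt;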

&lt;p&gt;&lt;strong&gt;CI/CD Flow&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CI (GitHub Actions): build images, run tests, push to ECR&lt;/li&gt;
&lt;li&gt;CD (ArgoCD): sync manifests, run per-tenant migrations, deploy safely&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcxlhgt9zc0p3fl57cuwp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcxlhgt9zc0p3fl57cuwp.png" alt=" " width="800" height="184"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Adding a tenant is a config change — not a weekend task.&lt;/p&gt;




&lt;h2&gt;
  
  
  Lessons Learned &amp;amp; What I’d Do Differently
&lt;/h2&gt;

&lt;p&gt;If I were starting again:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;I’d add cost attribution from day one&lt;/li&gt;
&lt;li&gt;I’d document network policies earlier&lt;/li&gt;
&lt;li&gt;I’d automate tenant-isolation tests sooner&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The biggest takeaway?&lt;/strong&gt;&lt;br&gt;
Tenant isolation isn’t a single feature.&lt;br&gt;
It’s defense in depth: IAM, network, compute, data, and deployment workflows all working together.&lt;/p&gt;

&lt;p&gt;That’s what SaaSInfraLab tries to encode.&lt;/p&gt;




&lt;h2&gt;
  
  
  Try It Yourself
&lt;/h2&gt;

&lt;p&gt;The entire stack is open source.&lt;/p&gt;

&lt;p&gt;Clone it, define your tenants, and deploy a real multi-tenant SaaS foundation in under 30 minutes.&lt;/p&gt;

&lt;p&gt;GitHub: &lt;a href="https://github.com/SaaSInfraLab" rel="noopener noreferrer"&gt;https://github.com/SaaSInfraLab&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Questions? I’m happy to discuss design decisions or help troubleshoot edge cases.&lt;/p&gt;

&lt;p&gt;What’s been your worst infrastructure deployment incident — and how did you prevent it from happening again?&lt;/p&gt;

</description>
      <category>devops</category>
      <category>kubernetes</category>
      <category>terraform</category>
      <category>aws</category>
    </item>
  </channel>
</rss>
