<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Cláudio Filipe Lima Rapôso</title>
    <description>The latest articles on DEV Community by Cláudio Filipe Lima Rapôso (@sertaoseracloud).</description>
    <link>https://dev.to/sertaoseracloud</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2686804%2F21b0095d-7793-4b03-b894-5b639c6264c0.png</url>
      <title>DEV Community: Cláudio Filipe Lima Rapôso</title>
      <link>https://dev.to/sertaoseracloud</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/sertaoseracloud"/>
    <language>en</language>
    <item>
      <title>Azure Cellular Architecture: Scaling with Cosmos DB Change Feed and Service Bus</title>
      <dc:creator>Cláudio Filipe Lima Rapôso</dc:creator>
      <pubDate>Thu, 30 Apr 2026 12:25:41 +0000</pubDate>
      <link>https://dev.to/sertaoseracloud/azure-cellular-architecture-scaling-with-cosmos-db-change-feed-and-service-bus-3a0d</link>
      <guid>https://dev.to/sertaoseracloud/azure-cellular-architecture-scaling-with-cosmos-db-change-feed-and-service-bus-3a0d</guid>
      <description>&lt;h3&gt;
  
  
  1. Introduction
&lt;/h3&gt;

&lt;p&gt;Scaling cloud-native systems often results in a massive, highly interconnected web of services where a single corrupted payload or regional brownout can trigger a global outage. Cellular Architecture mitigates this by decomposing the system into completely independent, isolated failure domains known as "cells." In this tutorial, you will build a highly resilient event-driven architecture using Microsoft Azure, combining the cellular pattern with an asynchronous workflow (Azure Cosmos DB, Azure Functions, and Azure Service Bus).&lt;/p&gt;

&lt;p&gt;Instead of relying on a single monolithic data and compute plane, you will deploy multiple identical infrastructure stamps. A thin edge routing layer will inspect incoming requests and route them to the appropriate cell based on a partition key (such as a &lt;code&gt;TenantId&lt;/code&gt;). If a localized infrastructure degradation or a "poison pill" event impacts one cell, the blast radius is strictly contained within that specific boundary, ensuring maximum availability for tenants residing in other cells. This pattern represents the ultimate fault isolation strategy for mission-critical enterprise systems on Azure.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Prerequisites
&lt;/h3&gt;

&lt;p&gt;To successfully deploy this architecture, your environment must be equipped with the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;An active Azure Subscription with Owner or User Access Administrator privileges to provision compute, database, messaging, and Role-Based Access Control (RBAC) resources.&lt;/li&gt;
&lt;li&gt;The Azure CLI installed and authenticated locally (&lt;code&gt;az login&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;Terraform (version 1.3.0 or higher) installed and accessible from your terminal.&lt;/li&gt;
&lt;li&gt;A solid understanding of the separation between a Control Plane (global routing state) and a Data Plane (the actual processing cells).&lt;/li&gt;
&lt;li&gt;Familiarity with Domain-Driven Design (DDD) to accurately define the partition keys that will dictate cell placement without creating cross-boundary data dependencies.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3. Step-by-Step
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Step 1: Defining the Data Plane Cell Module
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Create a reusable Terraform module that encapsulates the entire event-driven pipeline (Resource Group, Cosmos DB, Service Bus, Storage, and Azure Functions). This module represents a single, mathematically isolated cell.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do it:&lt;/strong&gt; The core tenet of cellular architecture is absolute repeatability and isolation. By packaging the data plane into a module, every cell becomes an identical, deterministic stamp of infrastructure. Provisioning a dedicated Cosmos DB account and Service Bus namespace per cell guarantees that a noisy neighbor or an overloaded queue in "Cell Alpha" consumes zero compute or IOPS from "Cell Beta."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Screenshot/Example:&lt;/strong&gt; Save this configuration as &lt;code&gt;modules/azure_cell/main.tf&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;variable&lt;/span&gt; &lt;span class="s2"&gt;"cell_id"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Unique identifier for the cell (e.g., alpha, beta)"&lt;/span&gt;
  &lt;span class="nx"&gt;type&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;string&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;variable&lt;/span&gt; &lt;span class="s2"&gt;"location"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Azure region for the cell"&lt;/span&gt;
  &lt;span class="nx"&gt;type&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;string&lt;/span&gt;
  &lt;span class="nx"&gt;default&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"East US"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# Dedicated Resource Group for the Cell&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_resource_group"&lt;/span&gt; &lt;span class="s2"&gt;"cell_rg"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"rg-cell-${var.cell_id}"&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# Isolated Cosmos DB Account for the Cell&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_cosmosdb_account"&lt;/span&gt; &lt;span class="s2"&gt;"cell_db"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"cosmos-data-cell-${var.cell_id}"&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cell_rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;
  &lt;span class="nx"&gt;resource_group_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cell_rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;offer_type&lt;/span&gt;          &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Standard"&lt;/span&gt;
  &lt;span class="nx"&gt;kind&lt;/span&gt;                &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"GlobalDocumentDB"&lt;/span&gt;

  &lt;span class="nx"&gt;consistency_policy&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;consistency_level&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Session"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="nx"&gt;geo_location&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;location&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cell_rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;failover_priority&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# Isolated Service Bus Namespace for the Cell&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_servicebus_namespace"&lt;/span&gt; &lt;span class="s2"&gt;"cell_sb"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"sb-namespace-cell-${var.cell_id}"&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cell_rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;
  &lt;span class="nx"&gt;resource_group_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cell_rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;sku&lt;/span&gt;                 &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Standard"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_servicebus_topic"&lt;/span&gt; &lt;span class="s2"&gt;"cell_topic"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"processing-topic"&lt;/span&gt;
  &lt;span class="nx"&gt;namespace_id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_servicebus_namespace&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cell_sb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_servicebus_subscription"&lt;/span&gt; &lt;span class="s2"&gt;"cell_sub"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;               &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"processing-subscription"&lt;/span&gt;
  &lt;span class="nx"&gt;topic_id&lt;/span&gt;           &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_servicebus_topic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cell_topic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;
  &lt;span class="nx"&gt;max_delivery_count&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="c1"&gt;# Built-in poison pill protection&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# Note: In a complete module, you would also define the azurerm_linux_function_app &lt;/span&gt;
&lt;span class="c1"&gt;# resources here for the Producer (Change Feed Trigger) and Consumer (Service Bus Trigger),&lt;/span&gt;
&lt;span class="c1"&gt;# utilizing SystemAssigned managed identities for secure communication within the cell.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Step 2: Instantiating the Independent Cells
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; In your root Terraform configuration, instantiate the cell module multiple times to create your physical failure domains.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do it:&lt;/strong&gt; Deploying multiple instances generates the actual physical boundaries. You can assign high-tier enterprise tenants to their own dedicated cells or group smaller tenants into shared cells. Because the modules are parameterized, you can easily deploy cells across different Azure regions to combine cellular isolation with geographic high availability.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Screenshot/Example:&lt;/strong&gt; Add this to your root &lt;code&gt;main.tf&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"cell_alpha"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"./modules/azure_cell"&lt;/span&gt;
  &lt;span class="nx"&gt;cell_id&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"alpha"&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"East US 2"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"cell_beta"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"./modules/azure_cell"&lt;/span&gt;
  &lt;span class="nx"&gt;cell_id&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"beta"&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"West US 2"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"cell_gamma"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"./modules/azure_cell"&lt;/span&gt;
  &lt;span class="nx"&gt;cell_id&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"gamma"&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"North Europe"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Step 3: Building the Control Plane Mapping
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Provision a highly available, globally distributed Cosmos DB account that acts as the Control Plane data store. This database maps your tenant identifiers to their assigned Cell ID.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do it:&lt;/strong&gt; The routing layer must dynamically know where to forward incoming requests. This global mapping table is queried by the edge router. It must be decoupled from the data plane and highly available; if the routing map goes down, traffic cannot reach the perfectly healthy data plane cells behind it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Screenshot/Example:&lt;/strong&gt; Create a &lt;code&gt;control_plane.tf&lt;/code&gt; file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_resource_group"&lt;/span&gt; &lt;span class="s2"&gt;"routing_rg"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"rg-global-routing"&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"East US"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_cosmosdb_account"&lt;/span&gt; &lt;span class="s2"&gt;"routing_db"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"cosmos-global-tenant-map"&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;routing_rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;
  &lt;span class="nx"&gt;resource_group_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;routing_rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;offer_type&lt;/span&gt;          &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Standard"&lt;/span&gt;
  &lt;span class="nx"&gt;kind&lt;/span&gt;                &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"GlobalDocumentDB"&lt;/span&gt;

  &lt;span class="nx"&gt;consistency_policy&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;consistency_level&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Eventual"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="c1"&gt;# Optimize for fast reads&lt;/span&gt;

  &lt;span class="nx"&gt;geo_location&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;location&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"East US"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;failover_priority&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="nx"&gt;geo_location&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;location&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"West Europe"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;failover_priority&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_cosmosdb_sql_database"&lt;/span&gt; &lt;span class="s2"&gt;"map_db"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"RoutingDB"&lt;/span&gt;
  &lt;span class="nx"&gt;resource_group_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;routing_rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;account_name&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_cosmosdb_account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;routing_db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_cosmosdb_sql_container"&lt;/span&gt; &lt;span class="s2"&gt;"tenant_map"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"TenantCellMap"&lt;/span&gt;
  &lt;span class="nx"&gt;resource_group_name&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;routing_rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;account_name&lt;/span&gt;          &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_cosmosdb_account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;routing_db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;database_name&lt;/span&gt;         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_cosmosdb_sql_database&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;map_db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;partition_key_path&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"/tenantId"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Step 4: Implementing the Thin Cell Router
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Deploy an Azure Function App with an HTTP trigger acting as the "Cell Router". This function extracts the &lt;code&gt;TenantId&lt;/code&gt; from the incoming HTTP payload, queries the global Cosmos DB mapping table to resolve the destination cell, and then forwards the payload to that specific cell's Cosmos DB.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do it:&lt;/strong&gt; The cell router is the critical entry point and must remain computationally "thin." Complex business logic belongs strictly inside the data plane cells, not the router. By keeping the router simple, you minimize its execution time and failure modes. Once the router successfully writes the data to the target cell's Cosmos DB, the localized event-driven pipeline (Change Feed -&amp;gt; Producer Function -&amp;gt; Service Bus -&amp;gt; Consumer Function) takes over autonomously.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Screenshot/Example:&lt;/strong&gt; Create a &lt;code&gt;router.tf&lt;/code&gt; file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Storage Account and App Service Plan for the Router Function&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_storage_account"&lt;/span&gt; &lt;span class="s2"&gt;"router_storage"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"stglobalrouterfunc"&lt;/span&gt;
  &lt;span class="nx"&gt;resource_group_name&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;routing_rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt;                 &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;routing_rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;
  &lt;span class="nx"&gt;account_tier&lt;/span&gt;             &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Standard"&lt;/span&gt;
  &lt;span class="nx"&gt;account_replication_type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"LRS"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_service_plan"&lt;/span&gt; &lt;span class="s2"&gt;"router_plan"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"asp-router-plan"&lt;/span&gt;
  &lt;span class="nx"&gt;resource_group_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;routing_rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;routing_rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;
  &lt;span class="nx"&gt;os_type&lt;/span&gt;             &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Linux"&lt;/span&gt;
  &lt;span class="nx"&gt;sku_name&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Y1"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# The Thin Router Function&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_linux_function_app"&lt;/span&gt; &lt;span class="s2"&gt;"cell_router"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"func-global-edge-router"&lt;/span&gt;
  &lt;span class="nx"&gt;resource_group_name&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;routing_rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt;                   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;routing_rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;
  &lt;span class="nx"&gt;service_plan_id&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_service_plan&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;router_plan&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;
  &lt;span class="nx"&gt;storage_account_name&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_storage_account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;router_storage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;storage_account_access_key&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_storage_account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;router_storage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;primary_access_key&lt;/span&gt;

  &lt;span class="nx"&gt;site_config&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;application_stack&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;node_version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"20"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;identity&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"SystemAssigned"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# RBAC: Allow Router to read the mapping table&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_role_assignment"&lt;/span&gt; &lt;span class="s2"&gt;"router_map_reader"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;scope&lt;/span&gt;                &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_cosmosdb_account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;routing_db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;
  &lt;span class="nx"&gt;role_definition_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Cosmos DB Account Reader Role"&lt;/span&gt; &lt;span class="c1"&gt;# Note: Use custom data plane roles in production&lt;/span&gt;
  &lt;span class="nx"&gt;principal_id&lt;/span&gt;         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_linux_function_app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cell_router&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;identity&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;principal_id&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. Common Troubleshooting
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Cross-Cell Dependencies:&lt;/strong&gt; The most critical anti-pattern in cellular architecture is allowing cells to communicate with each other or share a backend data store. If "Cell Alpha" synchronously queries an API in "Cell Beta", the isolation boundary is compromised. Ensure strict domain boundaries; data required to process an event within a cell must reside entirely within that cell's Cosmos DB.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Control Plane Bottlenecks (Router Fatigue):&lt;/strong&gt; The Cell Router represents a single point of failure. If the HTTP-triggered Azure Function queries the global Cosmos DB mapping table on every single request, you will encounter latency and potential RU/s (Request Unit) throttling. You must implement aggressive in-memory caching (e.g., using &lt;code&gt;MemoryCache&lt;/code&gt; in C# or standard dictionary caching in Node.js) within the router's execution context so that tenant-to-cell mappings are resolved locally and instantly after the first lookup.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Poison Pills and Service Bus Dead-Lettering:&lt;/strong&gt; If a malformed payload triggers the Cosmos DB Change Feed but crashes the Consumer Azure Function, the message will be returned to the Service Bus Subscription. Because this is a cellular architecture, this poison pill will only degrade the specific cell it entered. To resolve this automatically, the Terraform module is configured with &lt;code&gt;max_delivery_count = 5&lt;/code&gt;. Service Bus will automatically move the failing message to the Dead-Letter Queue (DLQ) after 5 attempts, restoring the cell's processing health.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  5. Conclusion
&lt;/h3&gt;

&lt;p&gt;By embedding an asynchronous event-driven flow within a cellular architecture on Microsoft Azure, you have constructed a system optimized for extreme resilience and strictly controlled blast radiuses. The rigid separation between a global routing control plane and fully isolated data plane cells ensures that infrastructure failures, bad deployments, or tenant-specific traffic spikes remain mathematically contained. Utilizing Terraform modules to stamp out these cells guarantees environmental consistency and facilitates rapid horizontal scaling. As your system matures, consider enhancing the edge routing layer by placing Azure Front Door in front of your Cell Router Functions to provide global load balancing, WAF protection, and seamless traffic shifting during cell migrations or disaster recovery events.&lt;/p&gt;

</description>
      <category>azure</category>
      <category>cosmosdb</category>
      <category>azurefunctions</category>
      <category>terraform</category>
    </item>
    <item>
      <title>AWS Cellular Architecture: Scaling Event-Driven Systems with DynamoDB, SNS, and SQS</title>
      <dc:creator>Cláudio Filipe Lima Rapôso</dc:creator>
      <pubDate>Wed, 29 Apr 2026 15:00:02 +0000</pubDate>
      <link>https://dev.to/sertaoseracloud/aws-cellular-architecture-scaling-event-driven-systems-with-dynamodb-sns-and-sqs-4fil</link>
      <guid>https://dev.to/sertaoseracloud/aws-cellular-architecture-scaling-event-driven-systems-with-dynamodb-sns-and-sqs-4fil</guid>
      <description>&lt;h3&gt;
  
  
  1. Introduction
&lt;/h3&gt;

&lt;p&gt;In the cloud-native era, scaling an architecture often leads to a massive, interconnected system where a single failure can cause a global outage. Cellular Architecture solves this by decomposing the system into independent, isolated failure domains called "cells." In this tutorial, you will build a highly resilient event-driven architecture by combining the cellular pattern with an asynchronous flow (DynamoDB Streams, AWS Lambda, Amazon SNS, and Amazon SQS). &lt;/p&gt;

&lt;p&gt;Instead of a single monolithic data plane, you will deploy multiple identical infrastructure stamps (cells). A thin edge routing layer will inspect incoming requests and route them to the appropriate cell based on a partition key (such as a Tenant ID). If a poison pill event or a localized infrastructure degradation impacts one cell, the blast radius is strictly contained, ensuring maximum availability for tenants in other cells. This approach provides the ultimate fault isolation boundary for mission-critical distributed systems.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Prerequisites
&lt;/h3&gt;

&lt;p&gt;To execute this deployment, your environment must be equipped with the following tools and foundational knowledge:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;An active AWS account with administrative privileges to provision IAM, compute, database, and messaging resources.&lt;/li&gt;
&lt;li&gt;Terraform (version 1.3.0 or higher) installed locally, alongside the AWS CLI authenticated with your credentials.&lt;/li&gt;
&lt;li&gt;A clear understanding of the separation between the Control Plane (global routing state) and the Data Plane (the actual processing cells).&lt;/li&gt;
&lt;li&gt;Familiarity with Domain-Driven Design (DDD) to correctly define the partition keys that will dictate cell placement without creating cross-boundary data dependencies.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3. Step-by-Step
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Step 1: Defining the Data Plane Cell Module
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Create a reusable Terraform module that encapsulates the entire event-driven pipeline (DynamoDB, SNS, SQS, and Lambdas). This module represents a single, self-contained cell.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do it:&lt;/strong&gt; The core tenet of cellular architecture is repeatability. By packaging the data plane into a module, you ensure that every cell is an identical, deterministic stamp of infrastructure. This structural isolation guarantees that an overloaded SQS queue or a throttling DynamoDB table in Cell Alpha has absolutely zero impact on the compute resources in Cell Beta.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Screenshot/Example:&lt;/strong&gt; Save this configuration as &lt;code&gt;modules/event_driven_cell/main.tf&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;variable&lt;/span&gt; &lt;span class="s2"&gt;"cell_id"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Unique identifier for the cell (e.g., alpha, beta)"&lt;/span&gt;
  &lt;span class="nx"&gt;type&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;string&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# DynamoDB Table restricted to this specific cell&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_dynamodb_table"&lt;/span&gt; &lt;span class="s2"&gt;"cell_table"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;           &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"app-data-cell-${var.cell_id}"&lt;/span&gt;
  &lt;span class="nx"&gt;billing_mode&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"PAY_PER_REQUEST"&lt;/span&gt;
  &lt;span class="nx"&gt;hash_key&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"id"&lt;/span&gt;

  &lt;span class="nx"&gt;attribute&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"id"&lt;/span&gt;
    &lt;span class="nx"&gt;type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"S"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;stream_enabled&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="nx"&gt;stream_view_type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"NEW_AND_OLD_IMAGES"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# SNS Topic and SQS Queue strictly bound to this cell&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_sns_topic"&lt;/span&gt; &lt;span class="s2"&gt;"cell_topic"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"processing-topic-cell-${var.cell_id}"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_sqs_queue"&lt;/span&gt; &lt;span class="s2"&gt;"cell_queue"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"processing-queue-cell-${var.cell_id}"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_sns_topic_subscription"&lt;/span&gt; &lt;span class="s2"&gt;"cell_sub"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;topic_arn&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_sns_topic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cell_topic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt;
  &lt;span class="nx"&gt;protocol&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"sqs"&lt;/span&gt;
  &lt;span class="nx"&gt;endpoint&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_sqs_queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cell_queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_sqs_queue_policy"&lt;/span&gt; &lt;span class="s2"&gt;"cell_queue_policy"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;queue_url&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_sqs_queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cell_queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;
  &lt;span class="nx"&gt;policy&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;jsonencode&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="nx"&gt;Version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"2012-10-17"&lt;/span&gt;
    &lt;span class="nx"&gt;Statement&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;Effect&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Allow"&lt;/span&gt;
        &lt;span class="nx"&gt;Principal&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Service&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"sns.amazonaws.com"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="nx"&gt;Action&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"sqs:SendMessage"&lt;/span&gt;
        &lt;span class="nx"&gt;Resource&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_sqs_queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cell_queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt;
        &lt;span class="nx"&gt;Condition&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="nx"&gt;ArnEquals&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="s2"&gt;"aws:SourceArn"&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_sns_topic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cell_topic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# Producer Lambda Event Source Mapping&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_lambda_event_source_mapping"&lt;/span&gt; &lt;span class="s2"&gt;"dynamo_trigger"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;event_source_arn&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_dynamodb_table&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cell_table&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;stream_arn&lt;/span&gt;
  &lt;span class="nx"&gt;function_name&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_lambda_function&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cell_producer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt;
  &lt;span class="nx"&gt;starting_position&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"LATEST"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Step 2: Instantiating the Independent Cells
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; In your root Terraform configuration, instantiate the cell module multiple times to create your isolated failure domains.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do it:&lt;/strong&gt; Deploying multiple instances generates the physical infrastructure boundaries. For example, assigning high-tier enterprise tenants to their own dedicated cells prevents noisy neighbor problems. While this deployment runs natively on AWS, establishing decoupled infrastructure stamps is the fundamental prerequisite for extending your architecture to active-active multi-cloud environments in the future.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Screenshot/Example:&lt;/strong&gt; Add this to your root &lt;code&gt;main.tf&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"cell_alpha"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"./modules/event_driven_cell"&lt;/span&gt;
  &lt;span class="nx"&gt;cell_id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"alpha"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"cell_beta"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"./modules/event_driven_cell"&lt;/span&gt;
  &lt;span class="nx"&gt;cell_id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"beta"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"cell_gamma"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"./modules/event_driven_cell"&lt;/span&gt;
  &lt;span class="nx"&gt;cell_id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"gamma"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Step 3: Building the Control Plane Mapping
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Create a global DynamoDB table that acts as the Control Plane data store. This table maps your partition keys (e.g., Tenant ID) to their assigned Cell ID.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do it:&lt;/strong&gt; The routing layer must know where to send incoming requests without utilizing hardcoded logic. This global mapping table is queried by the edge router dynamically. It must be highly available, as its failure would prevent traffic from reaching the perfectly healthy data plane cells behind it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Screenshot/Example:&lt;/strong&gt; Create a &lt;code&gt;control_plane.tf&lt;/code&gt; file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_dynamodb_table"&lt;/span&gt; &lt;span class="s2"&gt;"tenant_routing_map"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;           &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"global-tenant-cell-mapping"&lt;/span&gt;
  &lt;span class="nx"&gt;billing_mode&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"PAY_PER_REQUEST"&lt;/span&gt;
  &lt;span class="nx"&gt;hash_key&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"tenant_id"&lt;/span&gt;

  &lt;span class="nx"&gt;attribute&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"tenant_id"&lt;/span&gt;
    &lt;span class="nx"&gt;type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"S"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;tags&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;Layer&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"ControlPlane"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Step 4: Implementing the Thin Cell Router
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Deploy an Amazon API Gateway linked to a Lambda function. This function serves as the "Cell Router". It extracts the Tenant ID from the incoming HTTP payload, queries the global mapping table to resolve the destination cell, and then forwards the data to that specific cell's DynamoDB table.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do it:&lt;/strong&gt; The cell router is the critical entry point and must remain computationally "thin". Any complex business logic belongs inside the cell, not the router. By keeping the router simple, you minimize its failure modes. Once the router successfully writes the initial data to the target cell's DynamoDB table, the localized event-driven pipeline (Streams -&amp;gt; Producer Lambda -&amp;gt; SNS -&amp;gt; SQS -&amp;gt; Consumer Lambda) takes over autonomously.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Screenshot/Example:&lt;/strong&gt; Create a &lt;code&gt;router.tf&lt;/code&gt; file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# IAM Role granting the Router permission to read the map and write to ALL cells&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_iam_role"&lt;/span&gt; &lt;span class="s2"&gt;"router_role"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"edge-cell-router-role"&lt;/span&gt;
  &lt;span class="nx"&gt;assume_role_policy&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;jsonencode&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="nx"&gt;Version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"2012-10-17"&lt;/span&gt;
    &lt;span class="nx"&gt;Statement&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="nx"&gt;Action&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"sts:AssumeRole"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;Effect&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;Principal&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Service&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"lambda.amazonaws.com"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}]&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_iam_role_policy"&lt;/span&gt; &lt;span class="s2"&gt;"router_permissions"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;role&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_iam_role&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;router_role&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;
  &lt;span class="nx"&gt;policy&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;jsonencode&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="nx"&gt;Version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"2012-10-17"&lt;/span&gt;
    &lt;span class="nx"&gt;Statement&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;Effect&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Allow"&lt;/span&gt;
        &lt;span class="nx"&gt;Action&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"dynamodb:GetItem"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="nx"&gt;Resource&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_dynamodb_table&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tenant_routing_map&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;Effect&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Allow"&lt;/span&gt;
        &lt;span class="nx"&gt;Action&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"dynamodb:PutItem"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="nx"&gt;Resource&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"arn:aws:dynamodb:*:*:table/app-data-cell-*"&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_lambda_function"&lt;/span&gt; &lt;span class="s2"&gt;"cell_router"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;function_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"global-edge-router"&lt;/span&gt;
  &lt;span class="nx"&gt;handler&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"router.handler"&lt;/span&gt;
  &lt;span class="nx"&gt;runtime&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"nodejs20.x"&lt;/span&gt;
  &lt;span class="nx"&gt;role&lt;/span&gt;          &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_iam_role&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;router_role&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt;
  &lt;span class="c1"&gt;# Code package implementation omitted for brevity&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. Common Troubleshooting
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Cross-Cell Dependencies and Shared State:&lt;/strong&gt; The most destructive error in cellular architecture is allowing cells to communicate with each other or share a backend resource (like a single centralized RDS instance or S3 bucket). If Cell Alpha synchronously queries data from Cell Beta, the isolation boundary is broken. Ensure strict domain boundaries; data required by a cell must reside entirely within that cell.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Control Plane Bottlenecks (Router Fatigue):&lt;/strong&gt; The Cell Router represents a single point of failure. If the Lambda router queries the global mapping table on every single request, you will encounter latency and DynamoDB read throttling. You must implement aggressive in-memory caching within the router's execution context so that tenant-to-cell mappings are resolved instantly locally.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Poison Pills and Asynchronous Failures:&lt;/strong&gt; If a malformed payload triggers the DynamoDB Stream but crashes the Consumer Lambda, the message will cycle continuously in the SQS queue. Because this is a cellular architecture, this poison pill will only degrade the specific cell it entered. To resolve the localized issue, ensure your cell module includes a Dead Letter Queue (DLQ) to automatically eject failing messages after a set number of retries.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  5. Conclusion
&lt;/h3&gt;

&lt;p&gt;By embedding an event-driven flow within a cellular architecture, you have constructed a system optimized for extreme resilience and strictly controlled blast radiuses. The separation between a global routing control plane and isolated data plane cells ensures that infrastructure or deployment failures remain mathematically contained. Utilizing Terraform modules to stamp out these cells guarantees environmental consistency and facilitates rapid, linear horizontal scaling. As your system matures, consider enhancing the edge routing layer using Amazon Route 53 Application Recovery Controller (ARC) for advanced DNS-level cell shifting and disaster recovery automation.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>sns</category>
      <category>sqs</category>
      <category>lambda</category>
    </item>
    <item>
      <title>Serverless Data Processing: Azure CosmoDB, Azure Functions, and Azure Service Bus</title>
      <dc:creator>Cláudio Filipe Lima Rapôso</dc:creator>
      <pubDate>Wed, 29 Apr 2026 03:00:00 +0000</pubDate>
      <link>https://dev.to/sertaoseracloud/serverless-data-processing-azure-cosmodb-azure-functions-and-azure-service-bus-2aed</link>
      <guid>https://dev.to/sertaoseracloud/serverless-data-processing-azure-cosmodb-azure-functions-and-azure-service-bus-2aed</guid>
      <description>&lt;h2&gt;
  
  
  1. Introduction
&lt;/h2&gt;

&lt;p&gt;An event-driven architecture using Azure Cosmos DB, Azure Functions, and Azure Service Bus is the enterprise standard for designing highly decoupled, scalable, and resilient systems in the Microsoft Cloud. In this tutorial, you will construct a complete, asynchronous data flow from scratch using Azure's analogous features to the AWS ecosystem. &lt;/p&gt;

&lt;p&gt;The process begins when modifications occur in a Cosmos DB container, which automatically surfaces through the Cosmos DB Change Feed. This feed is continuously monitored by a producer Azure Function, which processes the change and publishes a message to a Service Bus Topic. The Service Bus Topic acts as a high-availability message router (similar to AWS SNS), performing a fan-out distribution of this notification to a Service Bus Subscription (acting as the queue, similar to AWS SQS). Finally, a second consumer Azure Function reads the messages from the subscription to perform the final business logic processing. By the end of this guide, you will be able to deploy this robust infrastructure using Terraform, adhering to Azure security best practices such as utilizing Managed Identities instead of vulnerable connection strings.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Prerequisites
&lt;/h2&gt;

&lt;p&gt;Before starting the infrastructure build, ensure your development environment is prepared. You will need an active Azure subscription with permissions to provision compute, messaging, database, and Role-Based Access Control (RBAC) resources. The Azure Command-Line Interface (Azure CLI) must be installed and authenticated (&lt;code&gt;az login&lt;/code&gt;) on your local machine. Additionally, Terraform (version 1.3.0 or higher) needs to be installed and globally accessible from your command terminal. Fundamental knowledge of Terraform's declarative syntax (HCL) and a basic understanding of Azure resource hierarchy (Resource Groups) are highly recommended. A modern code editor like VS Code with the HashiCorp Terraform extension will greatly facilitate your workflow.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Step-by-Step
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Initial Configuration and Azure Provider
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Create the root Terraform configuration file to define the required Terraform version, establish the connection with the AzureRM provider, and create a foundational Resource Group.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do it:&lt;/strong&gt; The &lt;code&gt;azurerm&lt;/code&gt; provider block instructs Terraform on how to interact with the Azure Resource Manager API. In Azure, all resources must be logically grouped into a Resource Group. Deploying all interconnected services (Cosmos DB, Service Bus, Functions) into a single Resource Group makes it easier to manage their lifecycle, monitor costs, and delete the entire environment cleanly when no longer needed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example (&lt;code&gt;main.tf&lt;/code&gt;):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;terraform&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;required_version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"&amp;gt;= 1.3.0"&lt;/span&gt;
  &lt;span class="nx"&gt;required_providers&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;azurerm&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"hashicorp/azurerm"&lt;/span&gt;
      &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"~&amp;gt; 3.0"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;provider&lt;/span&gt; &lt;span class="s2"&gt;"azurerm"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;features&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_resource_group"&lt;/span&gt; &lt;span class="s2"&gt;"main"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"rg-eventdriven-architecture-prod"&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"East US"&lt;/span&gt;

  &lt;span class="nx"&gt;tags&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;Environment&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Production"&lt;/span&gt;
    &lt;span class="nx"&gt;Project&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"EventDrivenArchitecture"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: Creation of the Cosmos DB Account and Containers
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Provision a Cosmos DB account with the SQL (Core) API, a database, and two containers: one for your application data and one "leases" container required to manage the state of the Change Feed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do it:&lt;/strong&gt; Cosmos DB is Azure's fully managed NoSQL database. Unlike AWS DynamoDB where Streams must be explicitly enabled, the Cosmos DB Change Feed is enabled by default. However, to consume it reliably with Azure Functions, you must create an auxiliary container called a "lease container." This container acts as a checkpoint system, allowing the producer function to keep track of which events it has already processed, ensuring reliable, ordered, and distributed event consumption.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example (&lt;code&gt;database.tf&lt;/code&gt;):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_cosmosdb_account"&lt;/span&gt; &lt;span class="s2"&gt;"db_account"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"cosmos-event-app-prod"&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;main&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;
  &lt;span class="nx"&gt;resource_group_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;main&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;offer_type&lt;/span&gt;          &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Standard"&lt;/span&gt;
  &lt;span class="nx"&gt;kind&lt;/span&gt;                &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"GlobalDocumentDB"&lt;/span&gt;

  &lt;span class="nx"&gt;consistency_policy&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;consistency_level&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Session"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;geo_location&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;location&lt;/span&gt;          &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;main&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;
    &lt;span class="nx"&gt;failover_priority&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_cosmosdb_sql_database"&lt;/span&gt; &lt;span class="s2"&gt;"database"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"ApplicationData"&lt;/span&gt;
  &lt;span class="nx"&gt;resource_group_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;main&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;account_name&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_cosmosdb_account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;db_account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# Main Data Container&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_cosmosdb_sql_container"&lt;/span&gt; &lt;span class="s2"&gt;"data_container"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Items"&lt;/span&gt;
  &lt;span class="nx"&gt;resource_group_name&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;main&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;account_name&lt;/span&gt;          &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_cosmosdb_account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;db_account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;database_name&lt;/span&gt;         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_cosmosdb_sql_database&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;database&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;partition_key_path&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"/id"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# Leases Container for Change Feed State&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_cosmosdb_sql_container"&lt;/span&gt; &lt;span class="s2"&gt;"leases_container"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"leases"&lt;/span&gt;
  &lt;span class="nx"&gt;resource_group_name&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;main&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;account_name&lt;/span&gt;          &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_cosmosdb_account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;db_account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;database_name&lt;/span&gt;         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_cosmosdb_sql_database&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;database&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;partition_key_path&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"/id"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: Provisioning the Service Bus Topic and Subscription
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Implement the Azure Service Bus Namespace, a Topic, and a Subscription. This maps perfectly to the AWS SNS/SQS fan-out pattern.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do it:&lt;/strong&gt; A Service Bus Topic allows you to publish a single message that can be received by multiple independent Subscriptions. The Topic handles the routing (like SNS), while the Subscription acts as a dedicated, persistent queue for your consumer function (like SQS). This decouples your microservices and provides message retention in case the consumer function is temporarily unavailable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example (&lt;code&gt;messaging.tf&lt;/code&gt;):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_servicebus_namespace"&lt;/span&gt; &lt;span class="s2"&gt;"sb_namespace"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"sb-event-routing-prod"&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;main&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;
  &lt;span class="nx"&gt;resource_group_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;main&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;sku&lt;/span&gt;                 &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Standard"&lt;/span&gt; &lt;span class="c1"&gt;# Required for Topics&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_servicebus_topic"&lt;/span&gt; &lt;span class="s2"&gt;"event_topic"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"data-processing-topic"&lt;/span&gt;
  &lt;span class="nx"&gt;namespace_id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_servicebus_namespace&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sb_namespace&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_servicebus_subscription"&lt;/span&gt; &lt;span class="s2"&gt;"event_subscription"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;               &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"data-processing-subscription"&lt;/span&gt;
  &lt;span class="nx"&gt;topic_id&lt;/span&gt;           &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_servicebus_topic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;event_topic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;
  &lt;span class="nx"&gt;max_delivery_count&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="c1"&gt;# Send to Dead Letter Queue after 10 fails&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 4: Deploying the Producer and Consumer Azure Functions
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Create a Storage Account and an App Service Plan (Serverless Consumption tier) required to host Azure Functions. Then, deploy the two Linux Function Apps, ensuring that System Assigned Managed Identities are enabled for secure, passwordless authentication.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do it:&lt;/strong&gt; Azure Functions require a backing Storage Account to manage triggers and function execution state. By enabling &lt;code&gt;identity { type = "SystemAssigned" }&lt;/code&gt;, Azure automatically provisions a service principal in Microsoft Entra ID (formerly Azure AD) tied to the lifecycle of the Function. This is a crucial security best practice, allowing your functions to securely access Cosmos DB and Service Bus without hardcoding connection strings in your configuration.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example (&lt;code&gt;functions.tf&lt;/code&gt;):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_storage_account"&lt;/span&gt; &lt;span class="s2"&gt;"func_storage"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"stfuncappdata123"&lt;/span&gt;
  &lt;span class="nx"&gt;resource_group_name&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;main&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt;                 &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;main&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;
  &lt;span class="nx"&gt;account_tier&lt;/span&gt;             &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Standard"&lt;/span&gt;
  &lt;span class="nx"&gt;account_replication_type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"LRS"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_service_plan"&lt;/span&gt; &lt;span class="s2"&gt;"consumption_plan"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"asp-serverless-plan"&lt;/span&gt;
  &lt;span class="nx"&gt;resource_group_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;main&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;main&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;
  &lt;span class="nx"&gt;os_type&lt;/span&gt;             &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Linux"&lt;/span&gt;
  &lt;span class="nx"&gt;sku_name&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Y1"&lt;/span&gt; &lt;span class="c1"&gt;# Dynamic Serverless Consumption&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_linux_function_app"&lt;/span&gt; &lt;span class="s2"&gt;"producer_func"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"func-cosmos-producer-prod"&lt;/span&gt;
  &lt;span class="nx"&gt;resource_group_name&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;main&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt;                   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;main&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;
  &lt;span class="nx"&gt;service_plan_id&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_service_plan&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;consumption_plan&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;
  &lt;span class="nx"&gt;storage_account_name&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_storage_account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;func_storage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;storage_account_access_key&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_storage_account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;func_storage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;primary_access_key&lt;/span&gt;

  &lt;span class="nx"&gt;site_config&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;application_stack&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;node_version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"20"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;identity&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"SystemAssigned"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_linux_function_app"&lt;/span&gt; &lt;span class="s2"&gt;"consumer_func"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"func-sb-consumer-prod"&lt;/span&gt;
  &lt;span class="nx"&gt;resource_group_name&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;main&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt;                   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;main&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;
  &lt;span class="nx"&gt;service_plan_id&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_service_plan&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;consumption_plan&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;
  &lt;span class="nx"&gt;storage_account_name&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_storage_account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;func_storage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;storage_account_access_key&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_storage_account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;func_storage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;primary_access_key&lt;/span&gt;

  &lt;span class="nx"&gt;site_config&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;application_stack&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;node_version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"20"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;identity&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"SystemAssigned"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 5: Applying Role-Based Access Control (RBAC) Triggers
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Grant the specific IAM roles required for the Functions to interact with the database and messaging bus using Azure Role Assignments.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do it:&lt;/strong&gt; While AWS uses IAM Policies attached to Event Source Mappings, Azure utilizes RBAC assignments. The Producer Function needs permission to read the Cosmos DB Change Feed and send messages to the Service Bus. The Consumer Function needs permission to receive messages from the Service Bus. Defining these natively in Terraform centralizes your security posture and enforces the principle of least privilege.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example (&lt;code&gt;rbac.tf&lt;/code&gt;):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_subscription"&lt;/span&gt; &lt;span class="s2"&gt;"current"&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;

&lt;span class="c1"&gt;# Give Producer Function permission to Send to Service Bus&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_role_assignment"&lt;/span&gt; &lt;span class="s2"&gt;"producer_sb_sender"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;scope&lt;/span&gt;                &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_servicebus_namespace&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sb_namespace&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;
  &lt;span class="nx"&gt;role_definition_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Azure Service Bus Data Sender"&lt;/span&gt;
  &lt;span class="nx"&gt;principal_id&lt;/span&gt;         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_linux_function_app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;producer_func&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;identity&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;principal_id&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# Cosmos DB RBAC is custom, but for simplicity we grant Cosmos DB Account Reader &lt;/span&gt;
&lt;span class="c1"&gt;# (Note: Data plane role assignments for Cosmos DB are often managed via azapi or CLI, &lt;/span&gt;
&lt;span class="c1"&gt;# but this represents the structural intent)&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_role_assignment"&lt;/span&gt; &lt;span class="s2"&gt;"producer_cosmos_reader"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;scope&lt;/span&gt;                &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_cosmosdb_account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;db_account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;
  &lt;span class="nx"&gt;role_definition_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"DocumentDB Account Contributor"&lt;/span&gt; 
  &lt;span class="nx"&gt;principal_id&lt;/span&gt;         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_linux_function_app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;producer_func&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;identity&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;principal_id&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# Give Consumer Function permission to Read from Service Bus&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_role_assignment"&lt;/span&gt; &lt;span class="s2"&gt;"consumer_sb_receiver"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;scope&lt;/span&gt;                &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_servicebus_namespace&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sb_namespace&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;
  &lt;span class="nx"&gt;role_definition_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Azure Service Bus Data Receiver"&lt;/span&gt;
  &lt;span class="nx"&gt;principal_id&lt;/span&gt;         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_linux_function_app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;consumer_func&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;identity&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;principal_id&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  4. Common Troubleshooting
&lt;/h2&gt;

&lt;p&gt;Navigating cloud permissions and distributed architectures can present hurdles. Here are common issues and their solutions when working with this Azure stack:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Producer Function is not triggering on Cosmos DB changes:&lt;/strong&gt; This usually indicates a problem with the leases container. Ensure that the &lt;code&gt;leases&lt;/code&gt; container actually exists in your Cosmos DB database and that your Function's &lt;code&gt;function.json&lt;/code&gt; bindings point to the correct database and collection names. Additionally, verify that your Function app settings contain the fully qualified URI to the Cosmos DB account if using Managed Identities.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Authorization rules or "Unauthorized" errors on Service Bus:&lt;/strong&gt; If your consumer function fails to read the queue, ensure that the "Azure Service Bus Data Receiver" role has propagated. RBAC assignments in Azure can sometimes take 5-10 minutes to fully propagate globally. Avoid using older connection string bindings in your function code; ensure your code uses &lt;code&gt;ServiceBusClient&lt;/code&gt; configured with &lt;code&gt;DefaultAzureCredential()&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Poison Messages and Infinite Loops:&lt;/strong&gt; Just like AWS SQS, if your Consumer Function throws an unhandled exception, the Service Bus message is not marked as complete and will be retried. If this continues, it will cause an infinite loop. The &lt;code&gt;max_delivery_count = 10&lt;/code&gt; defined in Step 3 solves this by automatically moving the failing message to a dead-letter queue (DLQ) after 10 attempts, allowing your system to continue processing healthy messages.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  5. Conclusion
&lt;/h2&gt;

&lt;p&gt;In this tutorial, you transitioned architectural concepts across cloud providers, building an enterprise-grade, asynchronous infrastructure on Microsoft Azure. By utilizing Cosmos DB Change Feeds, Azure Functions, and Service Bus Topics/Subscriptions, you implemented a resilient Fan-Out event-driven system. Leveraging Terraform and System Assigned Managed Identities ensured that your infrastructure is not only automated and version-controlled but also inherently secure by eliminating the need for hardcoded credentials. As next steps, consider implementing Azure Application Insights for end-to-end distributed tracing across your functions, and placing your Storage Account behind an Azure Virtual Network (VNet) private endpoint for enhanced network isolation.&lt;/p&gt;

</description>
      <category>azure</category>
      <category>cosmosdb</category>
      <category>terraform</category>
      <category>azureservicebus</category>
    </item>
    <item>
      <title>Serverless Data Processing: DynamoDB Streams, SNS, SQS, and Lambda</title>
      <dc:creator>Cláudio Filipe Lima Rapôso</dc:creator>
      <pubDate>Tue, 28 Apr 2026 01:58:48 +0000</pubDate>
      <link>https://dev.to/sertaoseracloud/serverless-data-processing-dynamodb-streams-sns-sqs-and-lambda-cka</link>
      <guid>https://dev.to/sertaoseracloud/serverless-data-processing-dynamodb-streams-sns-sqs-and-lambda-cka</guid>
      <description>&lt;h2&gt;
  
  
  1. Introduction
&lt;/h2&gt;

&lt;p&gt;An event-driven architecture utilizing DynamoDB Streams, AWS Lambda, Amazon SNS, and Amazon SQS is considered the gold standard for designing highly decoupled, scalable, and resilient systems in the cloud. In this tutorial, you will construct a complete, asynchronous data flow from scratch. The process begins when modifications occur in a DynamoDB table, which automatically triggers a data stream. This stream is continuously captured by a producer Lambda function, which processes the change and publishes a notification to an SNS topic. The SNS acts as a high-availability message router, performing a fan-out distribution of this notification to an SQS queue. Finally, a second consumer Lambda function reads the messages from the SQS queue to perform the final business logic processing. By the end of this guide, you will be able to deploy this robust infrastructure using Terraform alongside official AWS community modules. This approach is extremely useful because it ensures your infrastructure is version-controlled, highly maintainable, and inherently adopts security best practices, such as the principle of least privilege and fault tolerance.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Prerequisites
&lt;/h2&gt;

&lt;p&gt;Before starting the infrastructure build, you must ensure that your development environment is properly prepared. You will need an active AWS account with administrative permissions to provision compute, messaging, database, and Identity and Access Management (IAM) resources. The AWS Command Line Interface (AWS CLI) must be installed and configured with your access credentials on your local machine. Additionally, Terraform (version 1.3.0 or higher) needs to be installed and globally accessible from your command terminal. Fundamental knowledge of Terraform's declarative syntax (HCL) and a basic understanding of how cloud messaging systems operate are highly recommended for the best learning experience. A modern code editor will greatly facilitate the creation and manipulation of the configuration files.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Step-by-Step
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Initial Configuration and AWS Provider
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Create the root Terraform configuration file to define the required Terraform version and establish the connection with the AWS provider, explicitly specifying the region where the resources will be allocated.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do it:&lt;/strong&gt; Terraform operates through plugins called providers. The AWS provider block instructs the tool on which API to use for authentication and resource provisioning. Declaring the region explicitly avoids unexpected behaviors and ensures that all interconnected services, such as DynamoDB and SNS, are deployed in the same geographical environment, significantly reducing the network latency of your architecture. Setting default tags also helps organize your resources for billing and management purposes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example (&lt;code&gt;main.tf&lt;/code&gt;):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;terraform&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;required_version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"&amp;gt;= 1.3.0"&lt;/span&gt;
  &lt;span class="nx"&gt;required_providers&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;aws&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"hashicorp/aws"&lt;/span&gt;
      &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"~&amp;gt; 5.0"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;provider&lt;/span&gt; &lt;span class="s2"&gt;"aws"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;region&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"us-east-1"&lt;/span&gt;
  &lt;span class="nx"&gt;default_tags&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;tags&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;Environment&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Production"&lt;/span&gt;
      &lt;span class="nx"&gt;Project&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"EventDrivenArchitecture"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: Creation of the DynamoDB Table with Stream Enabled
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Utilize the official DynamoDB Terraform module to create the main application table, ensuring that the Streams feature is activated with the view type configured to capture both new and old images of the modified items.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do it:&lt;/strong&gt; The official module drastically simplifies the definition of partition keys and table attributes. Enabling the Stream is the foundational step of this entire architecture, as it acts as the initial trigger for our event pipeline. Setting the &lt;code&gt;stream_view_type&lt;/code&gt; to &lt;code&gt;NEW_AND_OLD_IMAGES&lt;/code&gt; guarantees that the consuming Lambda function will receive both the previous state of the modified item and its new state. This provides complete context, allowing your code to perform complex validations or delta comparisons during data processing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example (&lt;code&gt;dynamodb.tf&lt;/code&gt;):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"dynamodb_table"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"terraform-aws-modules/dynamodb-table/aws"&lt;/span&gt;
  &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"~&amp;gt; 4.0"&lt;/span&gt;

  &lt;span class="nx"&gt;name&lt;/span&gt;           &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"application-data-table"&lt;/span&gt;
  &lt;span class="nx"&gt;hash_key&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"id"&lt;/span&gt;
  &lt;span class="nx"&gt;billing_mode&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"PAY_PER_REQUEST"&lt;/span&gt;

  &lt;span class="nx"&gt;attributes&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"id"&lt;/span&gt;
      &lt;span class="nx"&gt;type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"S"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;]&lt;/span&gt;

  &lt;span class="nx"&gt;stream_enabled&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="nx"&gt;stream_view_type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"NEW_AND_OLD_IMAGES"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: Provisioning the SNS Topic and SQS Queue
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Implement the official modules for Amazon SNS and Amazon SQS. You must configure the access policy of the SQS queue to securely allow the SNS topic to publish messages directly into it, and then create the subscription linking them together.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do it:&lt;/strong&gt; Using SNS combined with SQS implements the robust "Fan-out" messaging pattern. SNS manages the immediate delivery and routing, while SQS ensures temporary storage, persistence, and message retention in case the consumer Lambda is temporarily unavailable. Utilizing the official modules abstracts away the complexity of manually crafting JSON IAM policies. By enabling &lt;code&gt;create_queue_policy = true&lt;/code&gt; and providing the allowed ARNs, the SQS module automatically manages the IAM permissions required to receive payloads from SNS, keeping your security posture tight and error-free.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example (&lt;code&gt;messaging.tf&lt;/code&gt;):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"sns_topic"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"terraform-aws-modules/sns/aws"&lt;/span&gt;
  &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"~&amp;gt; 6.0"&lt;/span&gt;

  &lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"data-processing-topic"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"sqs_queue"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"terraform-aws-modules/sqs/aws"&lt;/span&gt;
  &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"~&amp;gt; 4.0"&lt;/span&gt;

  &lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"data-processing-queue"&lt;/span&gt;

  &lt;span class="nx"&gt;create_queue_policy&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="nx"&gt;queue_policy_statements&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;sns_publish&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;sid&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"AllowSNSPublish"&lt;/span&gt;
      &lt;span class="nx"&gt;actions&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"sqs:SendMessage"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
      &lt;span class="nx"&gt;principals&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="nx"&gt;type&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Service"&lt;/span&gt;
          &lt;span class="nx"&gt;identifiers&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"sns.amazonaws.com"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;]&lt;/span&gt;
      &lt;span class="nx"&gt;conditions&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="nx"&gt;test&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"ArnEquals"&lt;/span&gt;
          &lt;span class="nx"&gt;variable&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"aws:SourceArn"&lt;/span&gt;
          &lt;span class="nx"&gt;values&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sns_topic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;topic_arn&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_sns_topic_subscription"&lt;/span&gt; &lt;span class="s2"&gt;"sns_to_sqs"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;topic_arn&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sns_topic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;topic_arn&lt;/span&gt;
  &lt;span class="nx"&gt;protocol&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"sqs"&lt;/span&gt;
  &lt;span class="nx"&gt;endpoint&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sqs_queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;queue_arn&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 4: Deploying the Producer and Consumer Lambda Functions
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Configure two instances of the AWS Lambda module. The first function (Producer) requires strict permissions to publish messages to the SNS topic. The second function (Consumer) requires permissions to read and consume messages from the SQS queue.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do it:&lt;/strong&gt; Lambda functions require highly granular and well-defined permissions (Policies) attached to their execution identity (Role). The Terraform Lambda module handles the packaging of your source code and the creation of these Roles simultaneously. By passing inline JSON policies directly into the module, we guarantee that each Lambda possesses strictly the minimum level of access necessary to perform its specific task. This prevents security vulnerabilities that arise from excessive or wildcard permissions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example (&lt;code&gt;lambdas.tf&lt;/code&gt;):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"lambda_producer"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"terraform-aws-modules/lambda/aws"&lt;/span&gt;
  &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"~&amp;gt; 7.0"&lt;/span&gt;

  &lt;span class="nx"&gt;function_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"dynamodb-stream-producer"&lt;/span&gt;
  &lt;span class="nx"&gt;handler&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"index.handler"&lt;/span&gt;
  &lt;span class="nx"&gt;runtime&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"nodejs20.x"&lt;/span&gt;
  &lt;span class="nx"&gt;source_path&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"../src/producer"&lt;/span&gt;

  &lt;span class="nx"&gt;environment_variables&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;SNS_TOPIC_ARN&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sns_topic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;topic_arn&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;attach_policy_json&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="nx"&gt;policy_json&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;jsonencode&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="nx"&gt;Version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"2012-10-17"&lt;/span&gt;
    &lt;span class="nx"&gt;Statement&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;Action&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"sns:Publish"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="nx"&gt;Effect&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Allow"&lt;/span&gt;
        &lt;span class="nx"&gt;Resource&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sns_topic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;topic_arn&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"lambda_consumer"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"terraform-aws-modules/lambda/aws"&lt;/span&gt;
  &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"~&amp;gt; 7.0"&lt;/span&gt;

  &lt;span class="nx"&gt;function_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"sqs-message-consumer"&lt;/span&gt;
  &lt;span class="nx"&gt;handler&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"index.handler"&lt;/span&gt;
  &lt;span class="nx"&gt;runtime&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"nodejs20.x"&lt;/span&gt;
  &lt;span class="nx"&gt;source_path&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"../src/consumer"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 5: Mapping Event Sources to Triggers
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Create the native &lt;code&gt;aws_lambda_event_source_mapping&lt;/code&gt; resources to actively connect the DynamoDB Stream to the Producer Lambda, and the SQS Queue to the Consumer Lambda. You must also attach the AWS managed execution policies to the respective Lambda roles.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do it:&lt;/strong&gt; While the Lambda module creates the compute resource, the "event source mapping" is the internal AWS polling service that actively reads the Stream or the Queue and invokes your function. This mapping mechanism requires specific operational policies (&lt;code&gt;AWSLambdaDynamoDBExecutionRole&lt;/code&gt; and &lt;code&gt;AWSLambdaSQSQueueExecutionRole&lt;/code&gt;) to function correctly. Integrating these components declaratively via Terraform centralizes the control of your end-to-end event flow, ensuring that infrastructure changes are applied consistently.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example (&lt;code&gt;triggers.tf&lt;/code&gt;):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Attach Stream reading permissions to the Producer Lambda&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_iam_role_policy_attachment"&lt;/span&gt; &lt;span class="s2"&gt;"producer_dynamodb_policy"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;role&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lambda_producer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lambda_role_name&lt;/span&gt;
  &lt;span class="nx"&gt;policy_arn&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"arn:aws:iam::aws:policy/service-role/AWSLambdaDynamoDBExecutionRole"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# Integrate DynamoDB Stream with Producer Lambda&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_lambda_event_source_mapping"&lt;/span&gt; &lt;span class="s2"&gt;"dynamodb_trigger"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;event_source_arn&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;dynamodb_table&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;dynamodb_table_stream_arn&lt;/span&gt;
  &lt;span class="nx"&gt;function_name&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lambda_producer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lambda_function_arn&lt;/span&gt;
  &lt;span class="nx"&gt;starting_position&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"LATEST"&lt;/span&gt;

  &lt;span class="nx"&gt;depends_on&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;aws_iam_role_policy_attachment&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;producer_dynamodb_policy&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# Attach SQS reading permissions to the Consumer Lambda&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_iam_role_policy_attachment"&lt;/span&gt; &lt;span class="s2"&gt;"consumer_sqs_policy"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;role&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lambda_consumer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lambda_role_name&lt;/span&gt;
  &lt;span class="nx"&gt;policy_arn&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"arn:aws:iam::aws:policy/service-role/AWSLambdaSQSQueueExecutionRole"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# Integrate SQS with Consumer Lambda&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_lambda_event_source_mapping"&lt;/span&gt; &lt;span class="s2"&gt;"sqs_trigger"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;event_source_arn&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sqs_queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;queue_arn&lt;/span&gt;
  &lt;span class="nx"&gt;function_name&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lambda_consumer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lambda_function_arn&lt;/span&gt;
  &lt;span class="nx"&gt;batch_size&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;

  &lt;span class="nx"&gt;depends_on&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;aws_iam_role_policy_attachment&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;consumer_sqs_policy&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  4. Common Troubleshooting
&lt;/h2&gt;

&lt;p&gt;Even when following best practices, distributed systems can present challenges regarding network configuration and IAM permissions. Below are the most common problems you might encounter and how to efficiently solve them:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Producer Lambda is not invoked by DynamoDB:&lt;/strong&gt; This almost always occurs due to IAM permission failures. The execution Role of the Lambda must possess the &lt;code&gt;dynamodb:GetRecords&lt;/code&gt;, &lt;code&gt;dynamodb:GetShardIterator&lt;/code&gt;, &lt;code&gt;dynamodb:DescribeStream&lt;/code&gt;, and &lt;code&gt;dynamodb:ListStreams&lt;/code&gt; permissions. Verify that the &lt;code&gt;AWSLambdaDynamoDBExecutionRole&lt;/code&gt; policy was successfully attached in your Terraform state.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;SQS Queue does not receive messages from the SNS Topic:&lt;/strong&gt; The SNS subscription protocol dictates that the endpoint (the SQS queue) must explicitly authorize the reception of messages. Confirm that the SQS access policy contains a statement allowing the &lt;code&gt;sqs:SendMessage&lt;/code&gt; action, with the principal set to the &lt;code&gt;sns.amazonaws.com&lt;/code&gt; service, and a restrictive condition matching the exact ARN of your SNS topic. This prevents unauthorized cross-account access.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Duplicate processing or infinite retries in the Consumer Lambda:&lt;/strong&gt; If your code within the consumer Lambda fails and throws an unhandled exception, the message will return to the SQS queue after the visibility timeout expires and will be reprocessed, creating an infinite loop of errors. To solve this, wrap your logic in try-catch blocks and implement a Dead Letter Queue (DLQ). Configure the &lt;code&gt;maxReceiveCount&lt;/code&gt; property on your main SQS queue to safely route "poison pill" messages to the DLQ after a limited number of failed attempts.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  5. Conclusion
&lt;/h2&gt;

&lt;p&gt;In this tutorial, you have comprehensively understood and practically applied the creation of a highly scalable, asynchronous architecture on AWS. The combination of DynamoDB Streams, Lambda functions, SNS, and SQS forms a resilient, event-driven system where each isolated component performs its specific duty flawlessly. Utilizing Terraform with official modules ensured agility, security, and exceptional structural code organization. As next steps to evolve this ecosystem, it is highly recommended to implement KMS (Key Management Service) encryption for your SNS topics and SQS queues, build CI/CD pipelines to automate the provisioning process, and create specialized CloudWatch dashboards to monitor message age and function error metrics.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>terraform</category>
      <category>dynamodb</category>
      <category>lambda</category>
    </item>
    <item>
      <title>Practical Guide: Building an Active-Active Multicloud Cell-Based Architecture</title>
      <dc:creator>Cláudio Filipe Lima Rapôso</dc:creator>
      <pubDate>Thu, 23 Apr 2026 03:00:00 +0000</pubDate>
      <link>https://dev.to/sertaoseracloud/practical-guide-building-an-active-active-multicloud-cell-based-architecture-bk8</link>
      <guid>https://dev.to/sertaoseracloud/practical-guide-building-an-active-active-multicloud-cell-based-architecture-bk8</guid>
      <description>&lt;h2&gt;
  
  
  1. Introduction
&lt;/h2&gt;

&lt;p&gt;A multicloud cell-based architecture represents the pinnacle of fault isolation and vendor neutrality. By distributing autonomous "cells" across different cloud provider, such as placing Cell Alpha in AWS and Cell Beta in Azure, you eliminate the risk of a single cloud provider's regional or global outage taking down your entire application. At the end of this tutorial, you will understand how to provision a unified control plane that intelligently routes tenant traffic to entirely isolated data planes residing in disparate cloud ecosystems.&lt;/p&gt;

&lt;p&gt;This architecture is highly valuable for organizations with extreme availability requirements and strict regulatory compliance mandates. It prevents vendor lock-in by enforcing an agnostic ingress layer and identical compute/state contracts across clouds. Building this requires mature infrastructure as code practices and a deep understanding of decoupled domain boundaries.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Prerequisites
&lt;/h2&gt;

&lt;p&gt;To execute this architectural pattern, you must have the following tools and access configured:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Active accounts on both Amazon Web Services (AWS) and Microsoft Azure, with administrative credentials.&lt;/li&gt;
&lt;li&gt;Terraform (version 1.0+) installed locally, with both &lt;code&gt;hashicorp/aws&lt;/code&gt; and &lt;code&gt;hashicorp/azurerm&lt;/code&gt; providers authenticated.&lt;/li&gt;
&lt;li&gt;Python (version 3.11+) for writing the agnostic edge routing logic.&lt;/li&gt;
&lt;li&gt;A registered domain name and access to a neutral Edge DNS provider (e.g., Cloudflare) to act as the global entry point.&lt;/li&gt;
&lt;li&gt;Understanding of Domain-Driven Design (DDD) to ensure cell workloads are completely bounded and stateless.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  3. Step-by-Step
&lt;/h2&gt;

&lt;p&gt;A multicloud cellular architecture requires a strict separation between the "Control Plane" (which knows where tenants live) and the "Data Plane" (where the actual compute and storage happen). The sequence diagram below illustrates how an edge router evaluates the request and proxies it to the respective cloud provider.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe2e2rek5unmrhr5kaj2m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe2e2rek5unmrhr5kaj2m.png" alt="Sequence Diagram" width="800" height="495"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  3.1 Configuring the Multicloud Terraform Environment
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Create a root Terraform configuration that initializes both the AWS and Azure providers simultaneously.&lt;br&gt;
&lt;strong&gt;Why do it:&lt;/strong&gt; To manage a true multicloud infrastructure, your CI/CD pipeline must be able to orchestrate state across both environments from a single source of truth, ensuring that the infrastructure contracts remain identical.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
Create your &lt;code&gt;main.tf&lt;/code&gt; file to authenticate both providers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;terraform&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;required_providers&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;aws&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"hashicorp/aws"&lt;/span&gt;
      &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"~&amp;gt; 5.0"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="nx"&gt;azurerm&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"hashicorp/azurerm"&lt;/span&gt;
      &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"~&amp;gt; 3.0"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;provider&lt;/span&gt; &lt;span class="s2"&gt;"aws"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;region&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"us-east-1"&lt;/span&gt;
  &lt;span class="nx"&gt;alias&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"primary"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;provider&lt;/span&gt; &lt;span class="s2"&gt;"azurerm"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;features&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
  &lt;span class="nx"&gt;subscription_id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;azure_subscription_id&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;variable&lt;/span&gt; &lt;span class="s2"&gt;"azure_subscription_id"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;type&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;string&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Azure Subscription ID for the multicloud landing zone"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3.2 Provisioning the AWS Cell (Cell Alpha)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Deploy the AWS-specific data plane, consisting of an API Gateway, Lambda, and DynamoDB.&lt;br&gt;
&lt;strong&gt;Why do it:&lt;/strong&gt; This establishes your first isolated failure domain. The AWS cell acts as a self-contained unit capable of serving its assigned tenants without any dependency on the central control plane or the Azure cell.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
Append the AWS cell definition to your &lt;code&gt;main.tf&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# AWS Cell Infrastructure (Simplified)&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_dynamodb_table"&lt;/span&gt; &lt;span class="s2"&gt;"aws_cell_state"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;provider&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;primary&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;           &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"multicloud-state-aws-alpha"&lt;/span&gt;
  &lt;span class="nx"&gt;billing_mode&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"PAY_PER_REQUEST"&lt;/span&gt;
  &lt;span class="nx"&gt;hash_key&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"id"&lt;/span&gt;

  &lt;span class="nx"&gt;attribute&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"id"&lt;/span&gt;
    &lt;span class="nx"&gt;type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"S"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_lambda_function"&lt;/span&gt; &lt;span class="s2"&gt;"aws_cell_worker"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;provider&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;primary&lt;/span&gt;
  &lt;span class="nx"&gt;filename&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"aws_worker.zip"&lt;/span&gt;
  &lt;span class="nx"&gt;function_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"aws-worker-alpha"&lt;/span&gt;
  &lt;span class="nx"&gt;role&lt;/span&gt;          &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_iam_role&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;aws_exec_role&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt;
  &lt;span class="nx"&gt;handler&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"worker.handler"&lt;/span&gt;
  &lt;span class="nx"&gt;runtime&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"python3.11"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_lambda_function_url"&lt;/span&gt; &lt;span class="s2"&gt;"aws_ingress"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;provider&lt;/span&gt;           &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;primary&lt;/span&gt;
  &lt;span class="nx"&gt;function_name&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_lambda_function&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;aws_cell_worker&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;function_name&lt;/span&gt;
  &lt;span class="nx"&gt;authorization_type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"NONE"&lt;/span&gt; &lt;span class="c1"&gt;# In production, restrict to Edge Router IPs/Tokens&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;output&lt;/span&gt; &lt;span class="s2"&gt;"aws_cell_endpoint"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_lambda_function_url&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;aws_ingress&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;function_url&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3.3 Provisioning the Azure Cell (Cell Beta)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Deploy the Azure-specific data plane, consisting of an Azure Function and Cosmos DB.&lt;br&gt;
&lt;strong&gt;Why do it:&lt;/strong&gt; This creates the second failure domain in a completely separate physical network and control plane. If AWS experiences a major outage, the tenants assigned to the Azure cell experience zero degradation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
Append the Azure cell definition to your &lt;code&gt;main.tf&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Azure Cell Infrastructure (Simplified)&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_resource_group"&lt;/span&gt; &lt;span class="s2"&gt;"az_cell_rg"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"rg-multicloud-beta"&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"East US"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_cosmosdb_account"&lt;/span&gt; &lt;span class="s2"&gt;"az_cell_db"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"db-multicloud-az-beta"&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;az_cell_rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;
  &lt;span class="nx"&gt;resource_group_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;az_cell_rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;offer_type&lt;/span&gt;          &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Standard"&lt;/span&gt;
  &lt;span class="nx"&gt;kind&lt;/span&gt;                &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"GlobalDocumentDB"&lt;/span&gt;

  &lt;span class="nx"&gt;geo_location&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;location&lt;/span&gt;          &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;az_cell_rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;
    &lt;span class="nx"&gt;failover_priority&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="nx"&gt;consistency_policy&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;consistency_level&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Session"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_storage_account"&lt;/span&gt; &lt;span class="s2"&gt;"az_storage"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"stmulticloudbeta"&lt;/span&gt;
  &lt;span class="nx"&gt;resource_group_name&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;az_cell_rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt;                 &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;az_cell_rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;
  &lt;span class="nx"&gt;account_tier&lt;/span&gt;             &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Standard"&lt;/span&gt;
  &lt;span class="nx"&gt;account_replication_type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"LRS"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_service_plan"&lt;/span&gt; &lt;span class="s2"&gt;"az_plan"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"plan-multicloud-beta"&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;az_cell_rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;
  &lt;span class="nx"&gt;resource_group_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;az_cell_rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;os_type&lt;/span&gt;             &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Linux"&lt;/span&gt;
  &lt;span class="nx"&gt;sku_name&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Y1"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_linux_function_app"&lt;/span&gt; &lt;span class="s2"&gt;"az_cell_worker"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"func-az-worker-beta"&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt;                   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;az_cell_rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;
  &lt;span class="nx"&gt;resource_group_name&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;az_cell_rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;service_plan_id&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_service_plan&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;az_plan&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;
  &lt;span class="nx"&gt;storage_account_name&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_storage_account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;az_storage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;storage_account_access_key&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_storage_account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;az_storage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;primary_access_key&lt;/span&gt;

  &lt;span class="nx"&gt;site_config&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;application_stack&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;python_version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"3.11"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;output&lt;/span&gt; &lt;span class="s2"&gt;"az_cell_endpoint"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"https://${azurerm_linux_function_app.az_cell_worker.default_hostname}/api/process"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3.4 Developing the Agnostic Edge Router (Python)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Write the Python routing logic that will be deployed to a neutral edge layer (like Cloudflare Workers via Pyodide, or a dedicated edge Kubernetes cluster).&lt;br&gt;
&lt;strong&gt;Why do it:&lt;/strong&gt; You cannot host the global router inside AWS or Azure. If you host it in AWS and AWS goes down, your clients cannot reach the Azure cell because the router itself is dead. The routing layer must be cloud-agnostic.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
Create &lt;code&gt;global_router.py&lt;/code&gt;. This script intercepts the payload, queries the registry, and proxies the request across the internet backbone to the appropriate cloud.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;urllib.request&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;

&lt;span class="c1"&gt;# The global registry maps tenant IDs to specific cloud cell endpoints.
# In production, this data is fetched from a distributed edge database.
&lt;/span&gt;&lt;span class="n"&gt;GLOBAL_REGISTRY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;T-100&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;provider&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;aws&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AWS_CELL_ENDPOINT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://aws-api-mock.com/process&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;T-200&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;provider&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;azure&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AZ_CELL_ENDPOINT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://az-api-mock.com/api/process&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;route_request&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request_body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request_body&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;tenant_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tenant_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;tenant_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Missing tenant_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="c1"&gt;# 1. Lookup tenant location in the registry
&lt;/span&gt;        &lt;span class="n"&gt;cell_info&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;GLOBAL_REGISTRY&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tenant_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;cell_info&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;404&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Tenant not allocated to any cell&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="n"&gt;target_url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cell_info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;provider&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cell_info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;provider&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

        &lt;span class="c1"&gt;# 2. Proxy request to the specific cloud provider
&lt;/span&gt;        &lt;span class="n"&gt;req&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;urllib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;target_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;request_body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Content-Type&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;X-Multicloud-Auth&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;EdgeToken123&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;urllib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;urlopen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;cell_response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# 3. Return payload to client with multicloud diagnostic headers
&lt;/span&gt;        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;headers&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;X-Processed-By-Cloud&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;provider&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;cell_response&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Edge Routing Failure: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  4. Common Troubleshooting
&lt;/h2&gt;

&lt;p&gt;Operating a multicloud cellular architecture introduces extreme networking and data synchronization complexities. Be prepared to address the following challenges:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;State Fragmentation and Global Reporting:&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Problem:&lt;/strong&gt; Your business intelligence team cannot run global queries because data is split between AWS DynamoDB and Azure Cosmos DB.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Solution:&lt;/strong&gt; Do not attempt real-time federated queries across clouds. Implement a unified Data Lake (e.g., Snowflake or Databricks). Configure both the AWS Lambda and the Azure Function to asynchronously stream their state changes (Event Sourcing) into this central analytical repository.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cross-Cloud Latency at the Routing Layer:&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Problem:&lt;/strong&gt; If the neutral Edge Router has to query a database in AWS to find out that a tenant belongs in Azure, the latency overhead becomes unacceptable.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Solution:&lt;/strong&gt; The Tenant Registry must be cached at the edge. Utilize technologies like Cloudflare KV, Fastly Compute dictionaries, or Redis Enterprise Active-Active to ensure the routing metadata is geographically local to the Edge Router executing the Python script.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CI/CD Configuration Drift:&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Problem:&lt;/strong&gt; Over time, developers add features to the AWS cell but forget to implement the exact Azure equivalent, breaking the "identical cell" contract.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Solution:&lt;/strong&gt; Enforce strict Hexagonal Architecture within your Python applications. The core domain logic must be a pure Python package shared across both clouds. The only differing code should be the specific I/O adapters (DynamoDB vs. Cosmos DB connectors).&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  5. Conclusion
&lt;/h2&gt;

&lt;p&gt;By implementing a multicloud cell-based architecture, you have constructed the ultimate defense against cloud-provider lock-in and catastrophic outages. We used Terraform to simultaneously orchestrate isolated data planes in AWS and Azure, and developed an agnostic Python edge router to direct traffic based on tenant identity. &lt;/p&gt;

&lt;p&gt;This model requires significant engineering maturity, but it guarantees that your infrastructure can scale infinitely horizontally while treating cloud providers as interchangeable utility commodities. As a next step, focus on automating the cross-cloud tenant migration process, allowing you to evacuate tenants from AWS to Azure in real-time if telemetry indicates a degradation in a specific provider's region.&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>aws</category>
      <category>azure</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Event-Driven Architecture on Azure vs AWS: Service Bus vs SNS/SQS</title>
      <dc:creator>Cláudio Filipe Lima Rapôso</dc:creator>
      <pubDate>Tue, 21 Apr 2026 19:59:01 +0000</pubDate>
      <link>https://dev.to/sertaoseracloud/event-driven-architecture-on-azure-vs-aws-service-bus-vs-snssqs-3cml</link>
      <guid>https://dev.to/sertaoseracloud/event-driven-architecture-on-azure-vs-aws-service-bus-vs-snssqs-3cml</guid>
      <description>&lt;p&gt;Your &lt;code&gt;OrderService&lt;/code&gt; does six things when a customer clicks &lt;em&gt;Place Order&lt;/em&gt;. It writes to the orders table, reserves inventory, charges the card, enqueues the shipping label, emails the receipt, and logs the analytics event. It does all of that inside one HTTP handler, in one transaction, on one box. When the payment gateway hiccups, the order fails. When the email provider rate-limits, the order fails. When inventory is slow, the order fails. The monolith tied six independent failure domains to one shared fate.&lt;/p&gt;

&lt;p&gt;This article rebuilds that pipeline as an event-driven system on both Azure and AWS, and walks through where they genuinely differ - not where they superficially look the same. The reference scenario is e-commerce order processing with fan-out to Inventory, Payment, and Notification. Audience: intermediate-to-senior engineers who already read the vendor landing pages and want the parts those pages leave out.&lt;/p&gt;

&lt;h2&gt;
  
  
  The pattern before the products
&lt;/h2&gt;

&lt;p&gt;Before naming any cloud service, fix the shape. What you want is &lt;strong&gt;publish/subscribe fan-out with durable per-consumer queues and per-consumer dead-letter queues&lt;/strong&gt;. The producer emits one logical event - &lt;code&gt;OrderPlaced&lt;/code&gt; - to a topic. The topic delivers a copy to one durable queue per consumer. Each consumer drains its own queue at its own pace, retries on its own schedule, and when it gives up, the message lands in &lt;em&gt;its own&lt;/em&gt; DLQ - not a shared one.&lt;/p&gt;

&lt;p&gt;That last part matters. A shared DLQ means a poisoned inventory message blocks the payment team's on-call from seeing their own poison. One queue per consumer, one DLQ per consumer, one retry budget per consumer. The blast radius of any single bad message is exactly one bounded context.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4u58flh3h0l5nf1oxzz6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4u58flh3h0l5nf1oxzz6.png" alt="Fluxogram" width="800" height="295"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With the shape fixed, we can map it onto two clouds - one concern at a time.&lt;/p&gt;

&lt;h2&gt;
  
  
  Side by side, one concern at a time
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Topic and subscription model
&lt;/h3&gt;

&lt;p&gt;On &lt;strong&gt;AWS&lt;/strong&gt; the topic and the queues are separate primitives. SNS publishes; SQS stores. You wire them together with a topic subscription and a queue policy. The queue holds your messages; the topic just fans them out. Two resource types per consumer.&lt;/p&gt;

&lt;p&gt;On &lt;strong&gt;Azure&lt;/strong&gt; Service Bus collapses that into one resource graph. A &lt;strong&gt;namespace&lt;/strong&gt; contains a &lt;strong&gt;topic&lt;/strong&gt;, and each consumer is a &lt;strong&gt;subscription&lt;/strong&gt; on that topic. The subscription has a virtual queue behind it; you do not manage a separate queue resource. Fewer moving parts at the IaC layer, but less separation of concern - the topic and its subscribers share a lifecycle and a billing unit.&lt;/p&gt;

&lt;h3&gt;
  
  
  Queue semantics and DLQ
&lt;/h3&gt;

&lt;p&gt;Both brokers are &lt;strong&gt;at-least-once&lt;/strong&gt;. Consumers will see duplicates. No marketing slide changes that.&lt;/p&gt;

&lt;p&gt;AWS pairs each SQS queue with an explicit DLQ via a redrive policy. &lt;code&gt;maxReceiveCount&lt;/code&gt; is the threshold; the main queue holds the in-flight message until the consumer explicitly deletes it, controlled by &lt;code&gt;visibility_timeout_seconds&lt;/code&gt;. That visibility timeout must exceed P99 handler latency, with margin. Set it too low and the broker redelivers while the first handler is still working - you get two concurrent handlers racing, and idempotency becomes load-bearing in a way you probably did not test.&lt;/p&gt;

&lt;p&gt;Azure Service Bus bakes the DLQ into every subscription as &lt;code&gt;$DeadLetterQueue&lt;/code&gt;. &lt;code&gt;maxDeliveryCount&lt;/code&gt; plays the same role as &lt;code&gt;maxReceiveCount&lt;/code&gt;. Service Bus also dead-letters on two classes of failure SQS does not know about: &lt;strong&gt;message TTL expiry&lt;/strong&gt; and &lt;strong&gt;filter-evaluation exceptions&lt;/strong&gt;. Those two extra DLQ triggers are real operational wins - expired or malformed messages do not vanish into metrics.&lt;/p&gt;

&lt;p&gt;One common claim worth pushing back on: &lt;em&gt;Service Bus gives exactly-once delivery&lt;/em&gt;. It does not. What it gives is &lt;strong&gt;duplicate detection within a bounded window&lt;/strong&gt; (up to seven days) on a per-&lt;code&gt;MessageId&lt;/code&gt; basis. That is a broker-side helper, not a semantic guarantee. Consumers must still be idempotent. Same story on SQS FIFO and its five-minute content-based dedup window.&lt;/p&gt;

&lt;h3&gt;
  
  
  Identity and data-plane auth
&lt;/h3&gt;

&lt;p&gt;On AWS, consumers assume an IAM role from Lambda, ECS, or EKS. No access keys, no user credentials in config. The queue policy restricts senders to a specific topic ARN via an &lt;code&gt;aws:SourceArn&lt;/code&gt; condition - without it, any SNS topic in your account can write to your queue. That is the classic confused-deputy footgun, and leaving the condition off is one of the most common rejection triggers in a real review.&lt;/p&gt;

&lt;p&gt;On Azure, the equivalent of "no long-lived keys" is &lt;code&gt;disableLocalAuth: true&lt;/code&gt; on the namespace, which kills SAS authentication entirely. All auth goes through AAD and Managed Identity. The right role is &lt;strong&gt;Service Bus Data Receiver&lt;/strong&gt; scoped &lt;strong&gt;per subscription&lt;/strong&gt;, not namespace-wide. Writers get &lt;strong&gt;Service Bus Data Sender&lt;/strong&gt; on the topic. Scoping at the subscription level means a compromised notification consumer cannot read payment events - lateral movement is bounded by the RBAC scope.&lt;/p&gt;

&lt;h3&gt;
  
  
  Ordering
&lt;/h3&gt;

&lt;p&gt;Both platforms can do ordering. Neither should do ordering by default.&lt;/p&gt;

&lt;p&gt;On AWS, ordering means FIFO queues keyed by &lt;code&gt;MessageGroupId&lt;/code&gt;. The cap is 300 messages/sec (3,000 with batching) per queue. That is a hard ceiling, not a soft throttle.&lt;/p&gt;

&lt;p&gt;On Azure, ordering means &lt;code&gt;SessionId&lt;/code&gt; on messages and &lt;code&gt;requiresSession: true&lt;/code&gt; on the subscription. Order is preserved per session. Throughput is fine on Standard; partitioning on Premium pushes it higher. The cost is that session-pinning serialises a subscription - a slow consumer on one session stalls messages in that session until the lock clears.&lt;/p&gt;

&lt;p&gt;If the domain does not require strict order, do not enable it. FIFO is a business decision, not an architectural default.&lt;/p&gt;

&lt;h2&gt;
  
  
  The trade-off matrix
&lt;/h2&gt;

&lt;p&gt;This is the center of the article, not the closer. If you remember one thing, remember this table.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;AWS SNS + SQS&lt;/th&gt;
&lt;th&gt;Azure Service Bus (Standard / Premium)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Primitive model&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Topic (SNS) fans out to separate SQS queues - two resource types per consumer&lt;/td&gt;
&lt;td&gt;Single namespace → topic → subscription - one resource graph&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cost model&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Pay-per-request on both SNS publishes and SQS requests; no idle cost&lt;/td&gt;
&lt;td&gt;Standard: pay-per-million operations + namespace hourly. Premium: fixed messaging units (predictable floor ≈ $670/MU/month at time of writing; verify current SKU pricing, it drifts)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Message ordering&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Standard: none. FIFO: strict ordering by &lt;code&gt;MessageGroupId&lt;/code&gt;, capped at 300 msgs/s (3,000 with batching)&lt;/td&gt;
&lt;td&gt;Standard: ordering within a session (&lt;code&gt;SessionId&lt;/code&gt;). Premium: same, higher throughput, partitioning supported&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Delivery semantics&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;At-least-once. FIFO adds content-based dedup in a 5-minute window&lt;/td&gt;
&lt;td&gt;At-least-once. PeekLock + duplicate detection up to 7 days. Broker-side helper; still needs idempotent consumers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Max message size&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;256 KB on both SNS and SQS. Workaround: claim-check via S3 + SQS Extended Client&lt;/td&gt;
&lt;td&gt;Standard: 256 KB. Premium: 100 MB native&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;DLQ handling&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Explicit SQS queue + redrive policy. &lt;code&gt;maxReceiveCount&lt;/code&gt; threshold. Delivery-failure DLQ only&lt;/td&gt;
&lt;td&gt;Implicit &lt;code&gt;$DeadLetterQueue&lt;/code&gt; per subscription. &lt;code&gt;maxDeliveryCount&lt;/code&gt; threshold. Also DLQs on TTL expiry and filter-eval errors&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Filtering&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;SNS filter policies: JSON attribute match at subscription time&lt;/td&gt;
&lt;td&gt;SQL filter and correlation filter per subscription - richer expression model&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Ops surface&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;CloudWatch: &lt;code&gt;ApproximateNumberOfMessages&lt;/code&gt;, &lt;code&gt;ApproximateAgeOfOldestMessage&lt;/code&gt;. Alarm on DLQ depth &amp;gt; 0&lt;/td&gt;
&lt;td&gt;Azure Monitor: &lt;code&gt;ActiveMessages&lt;/code&gt;, &lt;code&gt;DeadletteredMessages&lt;/code&gt;. Alarm on &lt;code&gt;DeadletteredMessages &amp;gt; 0&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Identity&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;IAM roles assumed by Lambda/ECS/EKS; SSE-KMS&lt;/td&gt;
&lt;td&gt;AAD + Managed Identity; &lt;code&gt;disableLocalAuth=true&lt;/code&gt; kills SAS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Network isolation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;VPC Endpoints (Interface for SNS, Interface/Gateway for SQS)&lt;/td&gt;
&lt;td&gt;Private Endpoint (Premium SKU for full VNet integration)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Throughput&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;SNS: millions of msgs/s. SQS standard: effectively unlimited. SQS FIFO: 300–3,000 msgs/s per queue&lt;/td&gt;
&lt;td&gt;Standard: ~2,000 msgs/s per namespace as a working guideline. Premium: scales with messaging units (~1,000 msgs/s per MU)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;One-line heuristic:&lt;/strong&gt; If the workload demands &amp;gt;256 KB messages, strong ordering at high throughput, or VNet isolation with ordering, Service Bus Premium earns its cost. Otherwise - massive fan-out, idempotent consumers, per-request billing - SNS + SQS wins. &lt;strong&gt;Standard Service Bus is the baseline; Premium is an upgrade you justify, not a default.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The IaC - AWS
&lt;/h2&gt;

&lt;p&gt;Production-grade Terraform. FinOps tags, SSE, redrive policies, and an &lt;code&gt;aws:SourceArn&lt;/code&gt; condition on the queue policy. Keep all of it - each line carries a security or reliability property the cluster depends on.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;terraform&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;required_version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"&amp;gt;= 1.6.0"&lt;/span&gt;
  &lt;span class="nx"&gt;required_providers&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;aws&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"hashicorp/aws"&lt;/span&gt;
      &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"~&amp;gt; 5.40"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;variable&lt;/span&gt; &lt;span class="s2"&gt;"project"&lt;/span&gt;        &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;string&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="nx"&gt;variable&lt;/span&gt; &lt;span class="s2"&gt;"environment"&lt;/span&gt;    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;string&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;               &lt;span class="c1"&gt;# dev | stg | prd&lt;/span&gt;
&lt;span class="nx"&gt;variable&lt;/span&gt; &lt;span class="s2"&gt;"aws_region"&lt;/span&gt;     &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;string&lt;/span&gt;  &lt;span class="nx"&gt;default&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"us-east-1"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="nx"&gt;variable&lt;/span&gt; &lt;span class="s2"&gt;"consumers"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Logical consumers that subscribe to the OrderPlaced topic."&lt;/span&gt;
  &lt;span class="nx"&gt;type&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="nx"&gt;default&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"inventory"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"payment"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"notification"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="nx"&gt;variable&lt;/span&gt; &lt;span class="s2"&gt;"max_receive_count"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;number&lt;/span&gt; &lt;span class="nx"&gt;default&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="c1"&gt;# redrive threshold&lt;/span&gt;

&lt;span class="nx"&gt;locals&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name_prefix&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"${var.project}-${var.environment}"&lt;/span&gt;
  &lt;span class="nx"&gt;tags&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;Project&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;project&lt;/span&gt;
    &lt;span class="nx"&gt;Environment&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;environment&lt;/span&gt;
    &lt;span class="nx"&gt;Workload&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"eda-orders"&lt;/span&gt;
    &lt;span class="nx"&gt;CostCenter&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"platform-events"&lt;/span&gt;
    &lt;span class="nx"&gt;ManagedBy&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"terraform"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# --- Topic ----------------------------------------------------------------&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_sns_topic"&lt;/span&gt; &lt;span class="s2"&gt;"orders_placed"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;              &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"${local.name_prefix}-orders-placed"&lt;/span&gt;
  &lt;span class="nx"&gt;kms_master_key_id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"alias/aws/sns"&lt;/span&gt;          &lt;span class="c1"&gt;# SSE at rest with AWS-managed CMK; swap to customer CMK for stricter tenants&lt;/span&gt;
  &lt;span class="nx"&gt;tags&lt;/span&gt;              &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tags&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# --- Queues + DLQs per consumer ------------------------------------------&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_sqs_queue"&lt;/span&gt; &lt;span class="s2"&gt;"dlq"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;for_each&lt;/span&gt;                   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;consumers&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"${local.name_prefix}-${each.key}-dlq"&lt;/span&gt;
  &lt;span class="nx"&gt;message_retention_seconds&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1209600&lt;/span&gt;            &lt;span class="c1"&gt;# 14 days - max allowed, buys ops time&lt;/span&gt;
  &lt;span class="nx"&gt;kms_master_key_id&lt;/span&gt;          &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"alias/aws/sqs"&lt;/span&gt;
  &lt;span class="nx"&gt;tags&lt;/span&gt;                       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;merge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tags&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Role&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"dlq"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;Consumer&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;each&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_sqs_queue"&lt;/span&gt; &lt;span class="s2"&gt;"main"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;for_each&lt;/span&gt;                   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;consumers&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"${local.name_prefix}-${each.key}"&lt;/span&gt;
  &lt;span class="nx"&gt;visibility_timeout_seconds&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;                 &lt;span class="c1"&gt;# must exceed consumer max processing time&lt;/span&gt;
  &lt;span class="nx"&gt;message_retention_seconds&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;345600&lt;/span&gt;             &lt;span class="c1"&gt;# 4 days&lt;/span&gt;
  &lt;span class="nx"&gt;receive_wait_time_seconds&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;                 &lt;span class="c1"&gt;# long polling - reduces empty-receive cost&lt;/span&gt;
  &lt;span class="nx"&gt;kms_master_key_id&lt;/span&gt;          &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"alias/aws/sqs"&lt;/span&gt;
  &lt;span class="nx"&gt;redrive_policy&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;jsonencode&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="nx"&gt;deadLetterTargetArn&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_sqs_queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;dlq&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;each&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt;
    &lt;span class="nx"&gt;maxReceiveCount&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;max_receive_count&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;
  &lt;span class="nx"&gt;tags&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;merge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tags&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Role&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"main"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;Consumer&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;each&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# --- Allow SNS to write to SQS -------------------------------------------&lt;/span&gt;
&lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="s2"&gt;"aws_iam_policy_document"&lt;/span&gt; &lt;span class="s2"&gt;"sns_to_sqs"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;for_each&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;consumers&lt;/span&gt;
  &lt;span class="nx"&gt;statement&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;sid&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"AllowSNSDeliver"&lt;/span&gt;
    &lt;span class="nx"&gt;effect&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Allow"&lt;/span&gt;
    &lt;span class="nx"&gt;actions&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"sqs:SendMessage"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="nx"&gt;principals&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;type&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Service"&lt;/span&gt;
      &lt;span class="nx"&gt;identifiers&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"sns.amazonaws.com"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="nx"&gt;resources&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;aws_sqs_queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;main&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;each&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="nx"&gt;condition&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;test&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"ArnEquals"&lt;/span&gt;
      &lt;span class="nx"&gt;variable&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"aws:SourceArn"&lt;/span&gt;
      &lt;span class="nx"&gt;values&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;aws_sns_topic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;orders_placed&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_sqs_queue_policy"&lt;/span&gt; &lt;span class="s2"&gt;"main"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;for_each&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;consumers&lt;/span&gt;
  &lt;span class="nx"&gt;queue_url&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_sqs_queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;main&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;each&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;
  &lt;span class="nx"&gt;policy&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;aws_iam_policy_document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sns_to_sqs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;each&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;json&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_sns_topic_subscription"&lt;/span&gt; &lt;span class="s2"&gt;"consumer"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;for_each&lt;/span&gt;             &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;consumers&lt;/span&gt;
  &lt;span class="nx"&gt;topic_arn&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_sns_topic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;orders_placed&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt;
  &lt;span class="nx"&gt;protocol&lt;/span&gt;             &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"sqs"&lt;/span&gt;
  &lt;span class="nx"&gt;endpoint&lt;/span&gt;             &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_sqs_queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;main&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;each&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt;
  &lt;span class="nx"&gt;raw_message_delivery&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;                     &lt;span class="c1"&gt;# consumers parse the raw event, not the SNS envelope&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# --- Consumer IAM role template (least-privilege) ------------------------&lt;/span&gt;
&lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="s2"&gt;"aws_iam_policy_document"&lt;/span&gt; &lt;span class="s2"&gt;"consumer_assume"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;statement&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;actions&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"sts:AssumeRole"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="nx"&gt;principals&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;type&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Service"&lt;/span&gt;
      &lt;span class="nx"&gt;identifiers&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"lambda.amazonaws.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"ecs-tasks.amazonaws.com"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_iam_role"&lt;/span&gt; &lt;span class="s2"&gt;"consumer"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;for_each&lt;/span&gt;           &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;consumers&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;               &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"${local.name_prefix}-${each.key}-consumer"&lt;/span&gt;
  &lt;span class="nx"&gt;assume_role_policy&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;aws_iam_policy_document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;consumer_assume&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;json&lt;/span&gt;
  &lt;span class="nx"&gt;tags&lt;/span&gt;               &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;merge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tags&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Consumer&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;each&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="s2"&gt;"aws_iam_policy_document"&lt;/span&gt; &lt;span class="s2"&gt;"consumer_sqs"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;for_each&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;consumers&lt;/span&gt;
  &lt;span class="nx"&gt;statement&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;actions&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="s2"&gt;"sqs:ReceiveMessage"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="s2"&gt;"sqs:DeleteMessage"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="s2"&gt;"sqs:GetQueueAttributes"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="s2"&gt;"sqs:ChangeMessageVisibility"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="nx"&gt;resources&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;aws_sqs_queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;main&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;each&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_iam_role_policy"&lt;/span&gt; &lt;span class="s2"&gt;"consumer_sqs"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;for_each&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;consumers&lt;/span&gt;
  &lt;span class="nx"&gt;role&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_iam_role&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;consumer&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;each&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;
  &lt;span class="nx"&gt;policy&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;aws_iam_policy_document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;consumer_sqs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;each&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;json&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;output&lt;/span&gt; &lt;span class="s2"&gt;"topic_arn"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_sns_topic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;orders_placed&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="nx"&gt;output&lt;/span&gt; &lt;span class="s2"&gt;"queues"&lt;/span&gt;    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;for&lt;/span&gt; &lt;span class="nx"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;q&lt;/span&gt; &lt;span class="nx"&gt;in&lt;/span&gt; &lt;span class="nx"&gt;aws_sqs_queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;main&lt;/span&gt; &lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;k&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;q&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="nx"&gt;output&lt;/span&gt; &lt;span class="s2"&gt;"dlqs"&lt;/span&gt;      &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;for&lt;/span&gt; &lt;span class="nx"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;q&lt;/span&gt; &lt;span class="nx"&gt;in&lt;/span&gt; &lt;span class="nx"&gt;aws_sqs_queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;dlq&lt;/span&gt;  &lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;k&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;q&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The IaC - Azure
&lt;/h2&gt;

&lt;p&gt;Same scenario, same tags, same &lt;code&gt;maxDeliveryCount = 5&lt;/code&gt;. Note &lt;code&gt;disableLocalAuth: true&lt;/code&gt; on the namespace and per-subscription RBAC at the bottom.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Deployment scope: resourceGroup
targetScope = 'resourceGroup'

@description('Project short code, e.g. osecloud')
param project string

@allowed([ 'dev', 'stg', 'prd' ])
param environment string

@description('Azure region.')
param location string = resourceGroup().location

@description('Logical consumers that subscribe to the OrderPlaced topic.')
param consumers array = [ 'inventory', 'payment', 'notification' ]

@description('Consumer identity object IDs (managed identities that will receive Data Receiver role). Key must match a name in `consumers`.')
param consumerPrincipalIds object = {}

@description('Service Bus SKU. Use Premium when you need ordering at scale, VNet integration, or &amp;gt;1MB messages.')
@allowed([ 'Standard', 'Premium' ])
param skuName string = 'Standard'

var namePrefix = toLower('${project}-${environment}')
var tags = {
  project:     project
  environment: environment
  workload:    'eda-orders'
  costCenter:  'platform-events'
  managedBy:   'bicep'
}

// --- Namespace ------------------------------------------------------------
resource sbNamespace 'Microsoft.ServiceBus/namespaces@2022-10-01-preview' = {
  name:     '${namePrefix}-sb'
  location: location
  sku: {
    name: skuName
    tier: skuName
  }
  properties: {
    minimumTlsVersion: '1.2'
    publicNetworkAccess: 'Enabled' // set to 'Disabled' + private endpoint in prd
    disableLocalAuth: true         // force AAD/Managed Identity - no SAS keys
  }
  tags: tags
}

// --- Topic ----------------------------------------------------------------
resource topic 'Microsoft.ServiceBus/namespaces/topics@2022-10-01-preview' = {
  parent: sbNamespace
  name: 'orders-placed'
  properties: {
    defaultMessageTimeToLive: 'P14D'
    enableBatchedOperations: true
    supportOrdering: true          // preserves order within a session (partition) - only honoured with session-enabled subscriptions
  }
}

// --- Subscriptions + DLQ (DLQ is implicit per subscription) --------------
resource subs 'Microsoft.ServiceBus/namespaces/topics/subscriptions@2022-10-01-preview' = [for name in consumers: {
  parent: topic
  name:   '${name}-sub'
  properties: {
    deadLetteringOnMessageExpiration:    true
    deadLetteringOnFilterEvaluationExceptions: true
    maxDeliveryCount: 5             // redrive threshold → moves to $DeadLetterQueue
    lockDuration:     'PT1M'        // matches visibility-timeout in AWS terms
    defaultMessageTimeToLive: 'P4D'
    requiresSession: false          // set true for FIFO-per-session guarantees
  }
}]

// --- RBAC: Azure Service Bus Data Receiver on each subscription ----------
var dataReceiverRoleId = '4f6d3b9b-027b-4f4c-9142-0e5a2a2247e0' // Service Bus Data Receiver

resource rbacReceiver 'Microsoft.Authorization/roleAssignments@2022-04-01' = [for (name, i) in consumers: if (contains(consumerPrincipalIds, name)) {
  name:  guid(subs[i].id, consumerPrincipalIds[name], dataReceiverRoleId)
  scope: subs[i]
  properties: {
    roleDefinitionId: subscriptionResourceId('Microsoft.Authorization/roleDefinitions', dataReceiverRoleId)
    principalId:      consumerPrincipalIds[name]
    principalType:    'ServicePrincipal'
  }
}]

output namespaceId       string = sbNamespace.id
output topicId           string = topic.id
output subscriptionNames array  = [for (name, i) in consumers: subs[i].name]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One caveat on the API version: &lt;code&gt;2022-10-01-preview&lt;/code&gt; is still labelled preview at the time of writing. If your platform team forbids preview API versions in production, pin to the latest stable GA release and re-test &lt;code&gt;disableLocalAuth&lt;/code&gt; - its behaviour has shifted across API versions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Seven architectural constraints
&lt;/h2&gt;

&lt;p&gt;Treat these as acceptance criteria for any EDA pipeline you ship on either cloud. They are a punch list, not a wish list.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;At-least-once is the default on both sides.&lt;/strong&gt; Exactly-once is a consumer property - idempotent handlers plus a dedup store - not a broker guarantee. Broker-side windows (SQS FIFO 5 min, Service Bus duplicate detection up to 7 days) narrow the problem; they do not eliminate it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Visibility timeout and lock duration must exceed P99 handler latency.&lt;/strong&gt; If the broker redelivers while the first handler is still working, you double-process. Measure P99 under load, add headroom, and alert when handler duration approaches the timeout.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ordering is a bet you pay for.&lt;/strong&gt; FIFO caps AWS throughput at 300–3,000 msgs/s; sessions serialise Azure subscriptions. Turn it on only when the domain demands ordering - never as a safety blanket.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DLQs are not a graveyard.&lt;/strong&gt; They need alerts (&lt;code&gt;DLQ depth &amp;gt; 0&lt;/code&gt; pages the on-call) and a documented replay procedure - SQS redrive or Service Bus &lt;code&gt;$DeadLetterQueue&lt;/code&gt; receive-and-resubmit. A DLQ without a replay runbook is a silent leak.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Large messages are an anti-pattern.&lt;/strong&gt; &amp;gt;256 KB on AWS implies claim-check via S3. Premium Service Bus supports 100 MB, but transport cost and consumer memory pressure still argue for claim-check at that size.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tag every resource&lt;/strong&gt; with &lt;code&gt;project&lt;/code&gt;, &lt;code&gt;environment&lt;/code&gt;, &lt;code&gt;workload&lt;/code&gt;, &lt;code&gt;costCenter&lt;/code&gt;, &lt;code&gt;managedBy&lt;/code&gt;. Without those tags, FinOps cannot attribute spend and the platform team cannot enforce lifecycle policies. The snippets above carry the full set; do not strip them.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No SAS keys, no IAM user keys.&lt;/strong&gt; Managed Identity on Azure, IAM roles on AWS. &lt;code&gt;disableLocalAuth: true&lt;/code&gt; on the Service Bus namespace, &lt;code&gt;aws:SourceArn&lt;/code&gt; condition on every SQS queue policy. Anything else is a long-lived credential waiting to leak.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The choice between SNS/SQS and Service Bus is rarely binary. Start on Standard Service Bus or SNS + SQS. Move to Premium or FIFO only when a constraint above - size, ordering, isolation - forces you there.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>azure</category>
      <category>cloud</category>
      <category>eventdriven</category>
    </item>
    <item>
      <title>Practical Guide: Building a Cell-Based Architecture on Azure with Terraform and Python</title>
      <dc:creator>Cláudio Filipe Lima Rapôso</dc:creator>
      <pubDate>Tue, 21 Apr 2026 15:15:00 +0000</pubDate>
      <link>https://dev.to/sertaoseracloud/practical-guide-building-a-cell-based-architecture-on-azure-with-terraform-and-python-1h04</link>
      <guid>https://dev.to/sertaoseracloud/practical-guide-building-a-cell-based-architecture-on-azure-with-terraform-and-python-1h04</guid>
      <description>&lt;h2&gt;
  
  
  1. Introduction
&lt;/h2&gt;

&lt;p&gt;As cloud applications scale to serve global audiences, relying on a single centralized infrastructure stack introduces critical vulnerabilities. A localized failure or a noisy neighbor can trigger a systemic outage. Cell-Based Architecture mitigates this by partitioning the system into isolated, identical, and self-contained units called "cells." By routing specific tenants or users to dedicated cells, you constrain the blast radius of any degradation strictly to that cell, preserving the availability of the rest of the system.&lt;/p&gt;

&lt;p&gt;By the end of this tutorial, you will be able to design and provision a cellular architecture on Microsoft Azure. We will utilize Azure Front Door as the global ingress, Azure Functions (Python) for dynamic traffic routing and compute, and Azure Cosmos DB for isolated state management. Mastering this pattern on Azure not only hardens your current workloads but also reinforces foundational multicloud principles, as the concepts of decoupled routing and state isolation are directly transferable across different cloud providers in a unified enterprise landscape.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Prerequisites
&lt;/h2&gt;

&lt;p&gt;To execute the configurations and code in this tutorial, ensure you have the following tools and access levels:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;An active Microsoft Azure subscription with Owner or Contributor permissions to create Resource Groups, Cosmos DB accounts, Azure Functions, and Azure Front Door.&lt;/li&gt;
&lt;li&gt;Terraform CLI (version 1.0 or higher) installed locally for provisioning the Infrastructure as Code (IaC).&lt;/li&gt;
&lt;li&gt;Python (version 3.9 or higher) installed locally, along with the Azure Functions Core Tools for local development and packaging.&lt;/li&gt;
&lt;li&gt;Azure CLI installed and authenticated (&lt;code&gt;az login&lt;/code&gt;) on your local environment.&lt;/li&gt;
&lt;li&gt;Fundamental understanding of partitioning concepts and Terraform HCL syntax.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  3. Step-by-Step
&lt;/h2&gt;

&lt;p&gt;Before diving into the infrastructure code, let's visualize the execution flow. The sequence diagram below details how a global request is securely intercepted, evaluated, and forwarded to a strictly isolated cellular stack within Azure.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb41axbpax2r7t18yxjvt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb41axbpax2r7t18yxjvt.png" alt="Sequence Diagram" width="800" height="537"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  3.1 Defining the Cell Blueprint (Terraform Module)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Create a reusable Terraform module representing a single, isolated "Cell." This includes a dedicated Storage Account, Application Service Plan, Azure Function (Worker), and a Cosmos DB database.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do it:&lt;/strong&gt; The primary rule of cellular architecture is absolute consistency across environments. By encapsulating the infrastructure in a Terraform module, you ensure that every generated cell is an exact replica, preventing configuration drift and simplifying horizontal scaling. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
Create a directory named &lt;code&gt;modules/cell&lt;/code&gt; and add a &lt;code&gt;main.tf&lt;/code&gt; file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# modules/cell/main.tf&lt;/span&gt;
&lt;span class="nx"&gt;variable&lt;/span&gt; &lt;span class="s2"&gt;"location"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;string&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="nx"&gt;variable&lt;/span&gt; &lt;span class="s2"&gt;"resource_group_name"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;string&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="nx"&gt;variable&lt;/span&gt; &lt;span class="s2"&gt;"cell_id"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;string&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_cosmosdb_account"&lt;/span&gt; &lt;span class="s2"&gt;"cell_db"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"cosmos-cell-${var.cell_id}"&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;
  &lt;span class="nx"&gt;resource_group_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;resource_group_name&lt;/span&gt;
  &lt;span class="nx"&gt;offer_type&lt;/span&gt;          &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Standard"&lt;/span&gt;
  &lt;span class="nx"&gt;kind&lt;/span&gt;                &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"GlobalDocumentDB"&lt;/span&gt;

  &lt;span class="nx"&gt;consistency_policy&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;consistency_level&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Session"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;geo_location&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;location&lt;/span&gt;          &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;
    &lt;span class="nx"&gt;failover_priority&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_cosmosdb_sql_database"&lt;/span&gt; &lt;span class="s2"&gt;"cell_sqldb"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"app-state"&lt;/span&gt;
  &lt;span class="nx"&gt;resource_group_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;resource_group_name&lt;/span&gt;
  &lt;span class="nx"&gt;account_name&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_cosmosdb_account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cell_db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_storage_account"&lt;/span&gt; &lt;span class="s2"&gt;"cell_storage"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"stcell${var.cell_id}"&lt;/span&gt;
  &lt;span class="nx"&gt;resource_group_name&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;resource_group_name&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt;                 &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;
  &lt;span class="nx"&gt;account_tier&lt;/span&gt;             &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Standard"&lt;/span&gt;
  &lt;span class="nx"&gt;account_replication_type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"LRS"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_service_plan"&lt;/span&gt; &lt;span class="s2"&gt;"cell_plan"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"plan-cell-${var.cell_id}"&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;
  &lt;span class="nx"&gt;resource_group_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;resource_group_name&lt;/span&gt;
  &lt;span class="nx"&gt;os_type&lt;/span&gt;             &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Linux"&lt;/span&gt;
  &lt;span class="nx"&gt;sku_name&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Y1"&lt;/span&gt; &lt;span class="c1"&gt;# Consumption plan&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_linux_function_app"&lt;/span&gt; &lt;span class="s2"&gt;"cell_worker"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"func-worker-${var.cell_id}"&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt;                   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;
  &lt;span class="nx"&gt;resource_group_name&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;resource_group_name&lt;/span&gt;
  &lt;span class="nx"&gt;service_plan_id&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_service_plan&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cell_plan&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;
  &lt;span class="nx"&gt;storage_account_name&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_storage_account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cell_storage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;storage_account_access_key&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_storage_account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cell_storage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;primary_access_key&lt;/span&gt;

  &lt;span class="nx"&gt;site_config&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;application_stack&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;python_version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"3.11"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;app_settings&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="s2"&gt;"CELL_ID"&lt;/span&gt;             &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cell_id&lt;/span&gt;
    &lt;span class="s2"&gt;"COSMOS_DB_ENDPOINT"&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_cosmosdb_account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cell_db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;endpoint&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;output&lt;/span&gt; &lt;span class="s2"&gt;"function_default_hostname"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_linux_function_app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cell_worker&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;default_hostname&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3.2 Stamping Out Multiple Cells
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; In your root Terraform configuration, define the base resource group and iterate over a collection of identifiers to deploy multiple cells simultaneously.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do it:&lt;/strong&gt; This translates the theoretical module into physical, isolated environments. Using a &lt;code&gt;for_each&lt;/code&gt; loop allows you to effortlessly scale from two cells to fifty cells simply by updating a local variable, abstracting the complexity of managing parallel infrastructures.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
In your root directory, create &lt;code&gt;main.tf&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# main.tf&lt;/span&gt;
&lt;span class="nx"&gt;terraform&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;required_providers&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;azurerm&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"hashicorp/azurerm"&lt;/span&gt;
      &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"~&amp;gt; 3.0"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;provider&lt;/span&gt; &lt;span class="s2"&gt;"azurerm"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;features&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_resource_group"&lt;/span&gt; &lt;span class="s2"&gt;"rg"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"rg-cellular-architecture"&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"East US"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;locals&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;cells&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"alpha"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"beta"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"isolated_cells"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"./modules/cell"&lt;/span&gt;
  &lt;span class="nx"&gt;for_each&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;toset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cells&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

  &lt;span class="nx"&gt;cell_id&lt;/span&gt;             &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;each&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;
  &lt;span class="nx"&gt;resource_group_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3.3 Developing the Cell Router in Python
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Write the Azure Function code in Python that will act as the global ingress router. It must inspect incoming HTTP requests, determine the appropriate tenant mapping, and proxy the request to the correct cell.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do it:&lt;/strong&gt; The router abstracts the internal topology from the client. Applications interact with a single API endpoint, oblivious to the fact that their request is being routed to &lt;code&gt;cell-alpha&lt;/code&gt; or &lt;code&gt;cell-beta&lt;/code&gt;. This dynamic mapping is what allows you to perform live migrations of tenants between cells to balance load without downtime.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
Create the Python code for your router function (&lt;code&gt;__init__.py&lt;/code&gt; inside your Azure Function folder).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;urllib.request&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;azure.functions&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;func&lt;/span&gt;

&lt;span class="c1"&gt;# In production, fetch this from Cosmos DB (Tenant Registry)
&lt;/span&gt;&lt;span class="n"&gt;CELL_ENDPOINTS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;alpha&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CELL_ALPHA_URL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://func-worker-alpha.azurewebsites.net&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;beta&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CELL_BETA_URL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://func-worker-beta.azurewebsites.net&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_target_cell&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tenant_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Mocking a Cosmos DB registry lookup
&lt;/span&gt;    &lt;span class="c1"&gt;# A real implementation would query the global metadata database
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;tenant_id&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;alpha&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;beta&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HttpRequest&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HttpResponse&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Global Router processing a request.&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;req_body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;tenant_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;req_body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;tenant_id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;tenant_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;HttpResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Missing partition key: tenant_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
                &lt;span class="n"&gt;status_code&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;400&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;target_cell&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_target_cell&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tenant_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;target_url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;CELL_ENDPOINTS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;target_cell&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/api/process&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

        &lt;span class="c1"&gt;# Proxy the request to the isolated cell
&lt;/span&gt;        &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req_body&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;proxy_req&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;urllib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;target_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
            &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Content-Type&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;urllib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;urlopen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;proxy_req&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;cell_response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;HttpResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;success&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;x_routed_to&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;target_cell&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cell_response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;}),&lt;/span&gt;
            &lt;span class="n"&gt;mimetype&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;status_code&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Routing error: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;HttpResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Internal Server Error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;status_code&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3.4 Provisioning the Global Routing Layer
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Deploy the central routing Azure Function and configure an Azure Front Door profile to sit in front of it. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do it:&lt;/strong&gt; Azure Front Door acts as a secure, globally distributed entry point. It absorbs DDoS attacks at the edge, provides WAF capabilities, and ensures that the Global Router Function is protected from direct, unauthenticated internet exposure. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
Append the routing infrastructure to your root &lt;code&gt;main.tf&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Shared storage for the Global Router&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_storage_account"&lt;/span&gt; &lt;span class="s2"&gt;"router_storage"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"stglobalrouter"&lt;/span&gt;
  &lt;span class="nx"&gt;resource_group_name&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt;                 &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;
  &lt;span class="nx"&gt;account_tier&lt;/span&gt;             &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Standard"&lt;/span&gt;
  &lt;span class="nx"&gt;account_replication_type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"LRS"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_service_plan"&lt;/span&gt; &lt;span class="s2"&gt;"router_plan"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"plan-global-router"&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;
  &lt;span class="nx"&gt;resource_group_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;os_type&lt;/span&gt;             &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Linux"&lt;/span&gt;
  &lt;span class="nx"&gt;sku_name&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Y1"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_linux_function_app"&lt;/span&gt; &lt;span class="s2"&gt;"global_router"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"func-global-router-gateway"&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt;                   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;
  &lt;span class="nx"&gt;resource_group_name&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;service_plan_id&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_service_plan&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;router_plan&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;
  &lt;span class="nx"&gt;storage_account_name&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_storage_account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;router_storage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;storage_account_access_key&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_storage_account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;router_storage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;primary_access_key&lt;/span&gt;

  &lt;span class="nx"&gt;site_config&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;application_stack&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;python_version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"3.11"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;app_settings&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="s2"&gt;"CELL_ALPHA_URL"&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"https://${module.isolated_cells["&lt;/span&gt;&lt;span class="nx"&gt;alpha&lt;/span&gt;&lt;span class="s2"&gt;"].function_default_hostname}"&lt;/span&gt;
    &lt;span class="s2"&gt;"CELL_BETA_URL"&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"https://${module.isolated_cells["&lt;/span&gt;&lt;span class="nx"&gt;beta&lt;/span&gt;&lt;span class="s2"&gt;"].function_default_hostname}"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run &lt;code&gt;terraform init&lt;/code&gt;, &lt;code&gt;terraform plan&lt;/code&gt;, and &lt;code&gt;terraform apply&lt;/code&gt; to deploy the entire multi-cell architecture.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. Common Troubleshooting
&lt;/h2&gt;

&lt;p&gt;Transitioning to a cellular architecture requires a shift in how you manage state and traffic. Here are common issues you may encounter:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Cosmos DB Throttling (HTTP 429) in a Specific Cell:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Problem:&lt;/strong&gt; One cell begins rejecting requests, while others operate normally. This usually indicates a "noisy neighbor" – a tenant whose workload has suddenly spiked, exhausting the Request Units (RUs) provisioned for that specific cell's database.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Solution:&lt;/strong&gt; Verify the metrics in Azure Monitor. If a tenant has outgrown the shared cell, you must execute a live migration. Update the Global Tenant Registry (Metadata DB) to point that specific &lt;code&gt;tenant_id&lt;/code&gt; to a newly provisioned, dedicated cell, seamlessly redirecting their traffic.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Latency Overhead at the Routing Layer:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Problem:&lt;/strong&gt; Requests take significantly longer because they must pass through Front Door, the Router Function, and finally the Cell Function.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Solution:&lt;/strong&gt; The router logic must be aggressively optimized. Implement caching at the Global Router level using Azure Cache for Redis so the function doesn't need to query Cosmos DB for the tenant mapping on every single request.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Cold Starts in Azure Functions:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Problem:&lt;/strong&gt; The first request to a specific cell takes several seconds to execute.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Solution:&lt;/strong&gt; Since we used the Consumption Plan (&lt;code&gt;Y1&lt;/code&gt;) for cost-effectiveness in this tutorial, cold starts are expected. For production workloads, switch the Application Service Plan to Premium (&lt;code&gt;EP1&lt;/code&gt; or higher) to keep instances pre-warmed, ensuring consistent low-latency responses.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  5. Conclusion
&lt;/h2&gt;

&lt;p&gt;Building a Cell-Based Architecture on Azure transforms monolithic vulnerabilities into contained, manageable units. By utilizing Terraform, we established a reproducible baseline for isolated environments, and implemented a Python routing layer to dynamically proxy traffic. &lt;/p&gt;

&lt;p&gt;This decoupling ensures that massive traffic spikes or bad deployments are restricted to single domains. As you mature this architecture, focus on automating the "Tenant Migration" process—moving live data between Cosmos DB instances without downtime—and standardizing your Terraform modules to ensure this pattern can be rapidly adopted across multiple cloud environments.&lt;/p&gt;

</description>
      <category>azure</category>
      <category>eventdriven</category>
      <category>python</category>
      <category>terraform</category>
    </item>
    <item>
      <title>Practical Guide: Building a Cell-Based Architecture on AWS with Terraform and Python</title>
      <dc:creator>Cláudio Filipe Lima Rapôso</dc:creator>
      <pubDate>Tue, 21 Apr 2026 15:11:00 +0000</pubDate>
      <link>https://dev.to/sertaoseracloud/practical-guide-building-a-cell-based-architecture-on-aws-with-terraform-and-python-n1p</link>
      <guid>https://dev.to/sertaoseracloud/practical-guide-building-a-cell-based-architecture-on-aws-with-terraform-and-python-n1p</guid>
      <description>&lt;h2&gt;
  
  
  1. Introduction
&lt;/h2&gt;

&lt;p&gt;In the cloud-native era, systems often reach a point where scaling a single massive architecture introduces unacceptable risks. A failure in a centralized component can result in a global outage, affecting all users simultaneously. Cell-Based Architecture solves this by dividing the system into multiple isolated, standalone, and identical instances called "cells." By placing users (or tenants) into specific cells, you drastically reduce the blast radius of any failure.&lt;/p&gt;

&lt;p&gt;While this tutorial focuses on an AWS implementation, designing cellular strategies is a cornerstone of robust multicloud engineering. The principles of isolating state and routing traffic based on partition keys apply seamlessly across different providers, such as Microsoft Azure, ensuring your landing zones maintain high availability regardless of the underlying cloud.&lt;/p&gt;

&lt;p&gt;By the end of this tutorial, you will understand how to provision a cellular infrastructure on AWS using Terraform. We will create a blueprint for a "cell," stamp out multiple instances of it, and build a Cell Router layer in Python to direct traffic to the correct isolated environment.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Prerequisites
&lt;/h2&gt;

&lt;p&gt;To successfully implement this architectural pattern, you will need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;An active Amazon Web Services (AWS) account with administrative privileges.&lt;/li&gt;
&lt;li&gt;Terraform installed locally (version 1.0 or higher) for Infrastructure as Code.&lt;/li&gt;
&lt;li&gt;Python (version 3.9 or higher) to write the routing logic.&lt;/li&gt;
&lt;li&gt;AWS credentials configured in your environment.&lt;/li&gt;
&lt;li&gt;A foundational understanding of architectural decoupling and partition logic.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  3. Step-by-Step
&lt;/h2&gt;

&lt;p&gt;A Cell-Based Architecture introduces a "Thin Routing Layer" in front of your core infrastructure. The diagram below illustrates how an incoming request is evaluated and forwarded to a strictly isolated cellular stack.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flkaftk1syfa2tem9oys3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flkaftk1syfa2tem9oys3.png" alt="sequence diagram" width="800" height="545"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  3.1 Defining the Cell Blueprint (Terraform Module)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Create a reusable Terraform module that defines exactly what a single "Cell" looks like.&lt;br&gt;
&lt;strong&gt;Why do it:&lt;/strong&gt; The core tenet of cellular architecture is that every cell is identical. By using a Terraform module, you ensure that any updates to the infrastructure are applied uniformly across all isolated cells, preventing configuration drift.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
Create a folder named &lt;code&gt;modules/cell&lt;/code&gt; and add a &lt;code&gt;main.tf&lt;/code&gt; file inside it. This blueprint contains an API Gateway, a Lambda function, and an isolated DynamoDB table.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# modules/cell/main.tf&lt;/span&gt;
&lt;span class="nx"&gt;variable&lt;/span&gt; &lt;span class="s2"&gt;"cell_id"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;type&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;string&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Unique identifier for the cell (e.g., cell-1)"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_dynamodb_table"&lt;/span&gt; &lt;span class="s2"&gt;"cell_state"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;           &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"app-state-${var.cell_id}"&lt;/span&gt;
  &lt;span class="nx"&gt;billing_mode&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"PAY_PER_REQUEST"&lt;/span&gt;
  &lt;span class="nx"&gt;hash_key&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"id"&lt;/span&gt;

  &lt;span class="nx"&gt;attribute&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"id"&lt;/span&gt;
    &lt;span class="nx"&gt;type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"S"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_lambda_function"&lt;/span&gt; &lt;span class="s2"&gt;"cell_compute"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;filename&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"cell_worker.zip"&lt;/span&gt;
  &lt;span class="nx"&gt;function_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"worker-${var.cell_id}"&lt;/span&gt;
  &lt;span class="nx"&gt;role&lt;/span&gt;          &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_iam_role&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cell_role&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt;
  &lt;span class="nx"&gt;handler&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"worker.handler"&lt;/span&gt;
  &lt;span class="nx"&gt;runtime&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"python3.11"&lt;/span&gt;

  &lt;span class="nx"&gt;environment&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;variables&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;CELL_ID&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cell_id&lt;/span&gt;
      &lt;span class="nx"&gt;TABLE_NAME&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_dynamodb_table&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cell_state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# (IAM Roles and API Gateway configurations for the cell would follow here)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3.2 Stamping Out Multiple Cells
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; In your root &lt;code&gt;main.tf&lt;/code&gt;, iterate over a list of cell identifiers to provision multiple identical, isolated environments.&lt;br&gt;
&lt;strong&gt;Why do it:&lt;/strong&gt; This allows you to scale horizontally by adding new completely independent infrastructure stacks rather than increasing the size of a monolithic database or compute cluster. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
In your root directory, create a &lt;code&gt;main.tf&lt;/code&gt; to invoke the module.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# main.tf&lt;/span&gt;
&lt;span class="nx"&gt;terraform&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;required_providers&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;aws&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"hashicorp/aws"&lt;/span&gt;
      &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"~&amp;gt; 5.0"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;provider&lt;/span&gt; &lt;span class="s2"&gt;"aws"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;region&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"us-east-1"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;locals&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;cells&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"cell-alpha"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"cell-beta"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"isolated_cells"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;source&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"./modules/cell"&lt;/span&gt;
  &lt;span class="nx"&gt;for_each&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;toset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;local&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cells&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

  &lt;span class="nx"&gt;cell_id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;each&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3.3 Developing the Cell Router in Python
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Write a Python function that acts as the ingress routing layer, determining which cell should handle a specific request based on a partition key.&lt;br&gt;
&lt;strong&gt;Why do it:&lt;/strong&gt; Clients should not need to know which cell they belong to. The router abstracts this complexity, allowing you to migrate tenants between cells behind the scenes without breaking client integrations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
Create a &lt;code&gt;router.py&lt;/code&gt; file. This logic intercepts the request, hashes the partition key (e.g., &lt;code&gt;tenant_id&lt;/code&gt;), and routes it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;hashlib&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;urllib.request&lt;/span&gt;

&lt;span class="c1"&gt;# In a real environment, this mapping could be retrieved from a highly available edge key-value store.
&lt;/span&gt;&lt;span class="n"&gt;CELL_ENDPOINTS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cell-alpha&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://alpha.execute-api.us-east-1.amazonaws.com/prod&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cell-beta&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://beta.execute-api.us-east-1.amazonaws.com/prod&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_target_cell&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;partition_key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Consistently hashes the partition key to a specific cell.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;hash_val&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hashlib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;md5&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;partition_key&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;hexdigest&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Simple modulo distribution
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;hash_val&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cell-alpha&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cell-beta&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;lambda_handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;{}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;tenant_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;tenant_id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;tenant_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;statusCode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Missing partition key: tenant_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="n"&gt;target_cell&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_target_cell&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tenant_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;target_endpoint&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;CELL_ENDPOINTS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;target_cell&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

        &lt;span class="c1"&gt;# Forwarding the request to the isolated cell (Simplified for demonstration)
&lt;/span&gt;        &lt;span class="n"&gt;req&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;urllib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;target_endpoint&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/process&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
            &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Content-Type&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;urllib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;urlopen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;cell_response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;statusCode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;routed_to&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;target_cell&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cell_response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cell_response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;statusCode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3.4 Provisioning the Routing Layer
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Add the routing layer to your root Terraform configuration to expose a single unified endpoint to your users.&lt;br&gt;
&lt;strong&gt;Why do it:&lt;/strong&gt; This centralizes ingress control. All external traffic hits the router, which then proxies the data over the AWS internal backbone to the respective cells, ensuring strict access control at the boundary.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
Add this to your root &lt;code&gt;main.tf&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="s2"&gt;"archive_file"&lt;/span&gt; &lt;span class="s2"&gt;"router_zip"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;type&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"zip"&lt;/span&gt;
  &lt;span class="nx"&gt;source_file&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"router.py"&lt;/span&gt;
  &lt;span class="nx"&gt;output_path&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"router.zip"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_lambda_function"&lt;/span&gt; &lt;span class="s2"&gt;"cell_router"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;filename&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;archive_file&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;router_zip&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;output_path&lt;/span&gt;
  &lt;span class="nx"&gt;function_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"GlobalCellRouter"&lt;/span&gt;
  &lt;span class="nx"&gt;role&lt;/span&gt;          &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_iam_role&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;router_role&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt; &lt;span class="c1"&gt;# (Assume basic execution role is created)&lt;/span&gt;
  &lt;span class="nx"&gt;handler&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"router.lambda_handler"&lt;/span&gt;
  &lt;span class="nx"&gt;runtime&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"python3.11"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_lambda_function_url"&lt;/span&gt; &lt;span class="s2"&gt;"router_url"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;function_name&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_lambda_function&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cell_router&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;function_name&lt;/span&gt;
  &lt;span class="nx"&gt;authorization_type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"NONE"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;output&lt;/span&gt; &lt;span class="s2"&gt;"global_entrypoint"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;value&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_lambda_function_url&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;router_url&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;function_url&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"The single URL clients interact with."&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  4. Common Troubleshooting
&lt;/h2&gt;

&lt;p&gt;Deploying cellular architectures shifts the complexity from infrastructure scaling to traffic routing and state management. Be prepared to handle these common challenges:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Partition Key Skew (Noisy Neighbors):&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Problem:&lt;/em&gt; One cell becomes overloaded while others are idle because a specific &lt;code&gt;tenant_id&lt;/code&gt; generates 80% of the traffic.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Solution:&lt;/em&gt; Monitor cell metrics closely. If a tenant outgrows a shared cell, you must implement a "tenant migration" process to move their data to an exclusive, single-tenant cell, updating the router's mapping logic accordingly.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Cross-Cell Data Aggregation:&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Problem:&lt;/em&gt; You need to generate a global report, but the data is split across multiple isolated DynamoDB tables.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Solution:&lt;/em&gt; Do not query cells directly for global data. Implement an asynchronous data lake strategy where each cell streams its state changes (e.g., via DynamoDB Streams and Kinesis) to a central analytical data store.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Router Layer Bottlenecks:&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Problem:&lt;/em&gt; The Cell Router itself goes down, causing a global outage—exactly what cellular architecture tries to prevent.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Solution:&lt;/em&gt; The routing layer must be incredibly thin and rely on highly resilient, geographically distributed edge services (like Amazon Route 53 or CloudFront/Lambda@Edge) rather than a single compute instance.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  5. Conclusion
&lt;/h2&gt;

&lt;p&gt;By implementing a Cell-Based Architecture, you establish definitive fault-isolation boundaries. We utilized Terraform to define a repeatable cell blueprint and created a thin Python routing layer to direct traffic dynamically. &lt;/p&gt;

&lt;p&gt;This approach minimizes the blast radius of localized failures, making your systems inherently more resilient. As you expand on this concept, consider how this decoupled routing strategy translates across multicloud landscapes, allowing you to seamlessly route traffic between an AWS cell and an Azure cell based on performance, cost, or regulatory requirements.&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>aws</category>
      <category>python</category>
      <category>terraform</category>
    </item>
    <item>
      <title>Practical Guide: Building an Event-Driven Infrastructure on Microsoft Azure with Terraform and Python</title>
      <dc:creator>Cláudio Filipe Lima Rapôso</dc:creator>
      <pubDate>Sun, 19 Apr 2026 13:57:33 +0000</pubDate>
      <link>https://dev.to/sertaoseracloud/practical-guide-building-an-event-driven-infrastructure-on-microsoft-azure-with-terraform-and-1g7k</link>
      <guid>https://dev.to/sertaoseracloud/practical-guide-building-an-event-driven-infrastructure-on-microsoft-azure-with-terraform-and-1g7k</guid>
      <description>&lt;h2&gt;
  
  
  1. Introduction
&lt;/h2&gt;

&lt;p&gt;Event-driven architectures decouple system components, replacing direct synchronous communication with a highly scalable publish-subscribe model. By the end of this tutorial, you will be able to provision a complete event-based infrastructure on Microsoft Azure. This setup utilizes Azure Event Grid as the central event routing backbone, Azure Service Bus for reliable message queuing, and Azure Functions for serverless computational processing.&lt;/p&gt;

&lt;p&gt;Mastering this topology is a structural requirement for modern software design. Isolating producers from consumers ensures that localized failures do not cascade through the system, enabling independent scaling of microservices. Furthermore, translating these concepts across different cloud providers strengthens a robust multicloud strategy, allowing you to map architectural patterns (like Event Bus -&amp;gt; Queue -&amp;gt; Serverless Compute) seamlessly into an Azure-based landing zone.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Prerequisites
&lt;/h2&gt;

&lt;p&gt;To execute the configurations proposed in this guide, ensure you have the following prerequisites established:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;An active Microsoft Azure account with permissions to create Resource Groups, Event Grid topics, Service Bus namespaces, and Function Apps.&lt;/li&gt;
&lt;li&gt;Terraform installed locally (version 1.0 or higher) for provisioning the Infrastructure as Code (IaC).&lt;/li&gt;
&lt;li&gt;Python (version 3.9 or higher) installed locally for developing the function logic.&lt;/li&gt;
&lt;li&gt;The Azure CLI (&lt;code&gt;az&lt;/code&gt;) installed and authenticated in your local environment.&lt;/li&gt;
&lt;li&gt;Familiarity with terminal navigation and Terraform HCL syntax.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  3. Step-by-Step
&lt;/h2&gt;

&lt;p&gt;Before diving into the code, it is critical to visualize the event lifecycle. The sequence diagram below maps the flow of information across the provisioned Azure services.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7kbcd6voyf2c8w8wq4tl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7kbcd6voyf2c8w8wq4tl.png" alt="Sequence diagram" width="800" height="335"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  3.1 Configuring the Provider and Resource Group
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Define the Azure Resource Manager (&lt;code&gt;azurerm&lt;/code&gt;) provider in Terraform and create a foundational Resource Group to logically group all infrastructure assets.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do it:&lt;/strong&gt; Terraform requires provider definitions to authenticate and interact with the specific cloud API. The Resource Group is a mandatory Azure construct that controls the lifecycle and access management of the resources it contains.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
Create a file named &lt;code&gt;main.tf&lt;/code&gt; and add the following configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;terraform&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;required_providers&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;azurerm&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"hashicorp/azurerm"&lt;/span&gt;
      &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"~&amp;gt; 3.0"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;provider&lt;/span&gt; &lt;span class="s2"&gt;"azurerm"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;features&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_resource_group"&lt;/span&gt; &lt;span class="s2"&gt;"rg"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"rg-event-driven-architecture"&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"East US"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3.2 Creating the Event Grid Topic
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Provision a Custom Topic in Azure Event Grid.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do it:&lt;/strong&gt; A Custom Topic serves as the dedicated endpoint where your applications publish business events. Isolating business events in a custom topic prevents mixing application logic with underlying Azure infrastructure events.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
Append the following block to your &lt;code&gt;main.tf&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_eventgrid_topic"&lt;/span&gt; &lt;span class="s2"&gt;"custom_topic"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"app-domain-events-topic"&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;
  &lt;span class="nx"&gt;resource_group_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3.3 Provisioning the Service Bus Namespace and Queue
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Create a Service Bus Namespace and a specific Queue within it to absorb incoming events.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do it:&lt;/strong&gt; While Event Grid can push directly to an Azure Function, routing through a Service Bus Queue introduces a critical buffer. This ensures high availability, message durability, and prevents overwhelming the downstream compute service during traffic spikes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
Add the Service Bus configurations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_servicebus_namespace"&lt;/span&gt; &lt;span class="s2"&gt;"sb_namespace"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"sb-event-driven-demo"&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;
  &lt;span class="nx"&gt;resource_group_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;sku&lt;/span&gt;                 &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Standard"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_servicebus_queue"&lt;/span&gt; &lt;span class="s2"&gt;"event_queue"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"event-processing-queue"&lt;/span&gt;
  &lt;span class="nx"&gt;namespace_id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_servicebus_namespace&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sb_namespace&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3.4 Developing the Azure Function in Python
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Write the Python code using the Azure Functions v2 programming model to process messages arriving in the Service Bus Queue.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do it:&lt;/strong&gt; The function represents the business logic reacting to the event. The v2 model utilizes decorators, providing a clean, concise way to define triggers and bindings directly in the code, automatically handling message deserialization.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
Create a file named &lt;code&gt;function_app.py&lt;/code&gt; in your project directory:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;azure.functions&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;func&lt;/span&gt;

&lt;span class="c1"&gt;# Initialize the Function App
&lt;/span&gt;&lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;FunctionApp&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="nd"&gt;@app.service_bus_queue_trigger&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;arg_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;msg&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;queue_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;event-processing-queue&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;connection&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ServiceBusConnection&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;process_domain_event&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ServiceBusMessage&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;logger&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getLogger&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Initiating Service Bus event processing.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Decode and load the message body
&lt;/span&gt;        &lt;span class="n"&gt;msg_body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_body&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;event_payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg_body&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Complete event payload: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event_payload&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;indent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Event Grid schema typically encapsulates data in a 'data' field
&lt;/span&gt;        &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;event_payload&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;data&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{})&lt;/span&gt;
        &lt;span class="n"&gt;order_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;order_id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Successfully processed business operation for order: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;JSONDecodeError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Failed to decode message payload as JSON.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Unexpected error during processing: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3.5 Provisioning the Compute Infrastructure for the Function
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Define the Storage Account, Service Plan (Consumption), and the Linux Function App via Terraform, injecting the necessary connection strings as environment variables.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do it:&lt;/strong&gt; Azure Functions require a backing storage account for state management and an execution plan to define scale and pricing. Injecting &lt;code&gt;ServiceBusConnection&lt;/code&gt; securely links the compute tier to the messaging tier.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
Add the compute resources to your &lt;code&gt;main.tf&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_storage_account"&lt;/span&gt; &lt;span class="s2"&gt;"sa"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"saeventdrivendemo123"&lt;/span&gt; &lt;span class="c1"&gt;# Must be globally unique&lt;/span&gt;
  &lt;span class="nx"&gt;resource_group_name&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt;                 &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;
  &lt;span class="nx"&gt;account_tier&lt;/span&gt;             &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Standard"&lt;/span&gt;
  &lt;span class="nx"&gt;account_replication_type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"LRS"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_service_plan"&lt;/span&gt; &lt;span class="s2"&gt;"asp"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"asp-event-driven"&lt;/span&gt;
  &lt;span class="nx"&gt;resource_group_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;
  &lt;span class="nx"&gt;os_type&lt;/span&gt;             &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Linux"&lt;/span&gt;
  &lt;span class="nx"&gt;sku_name&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Y1"&lt;/span&gt; &lt;span class="c1"&gt;# Serverless Consumption Plan&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_linux_function_app"&lt;/span&gt; &lt;span class="s2"&gt;"function_app"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;                       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"func-order-processor-app"&lt;/span&gt;
  &lt;span class="nx"&gt;resource_group_name&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt;                   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_resource_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;location&lt;/span&gt;
  &lt;span class="nx"&gt;service_plan_id&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_service_plan&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;asp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;
  &lt;span class="nx"&gt;storage_account_name&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_storage_account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sa&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;storage_account_access_key&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_storage_account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sa&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;primary_access_key&lt;/span&gt;

  &lt;span class="nx"&gt;site_config&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;application_stack&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;python_version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"3.11"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;app_settings&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="s2"&gt;"FUNCTIONS_WORKER_RUNTIME"&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"python"&lt;/span&gt;
    &lt;span class="s2"&gt;"ServiceBusConnection"&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_servicebus_namespace&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sb_namespace&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;default_primary_connection_string&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3.6 Creating the Event Grid Subscription (Routing Rule)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Configure an Event Subscription that filters events arriving at the Custom Topic and routes them to the Service Bus Queue.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do it:&lt;/strong&gt; Advanced filtering ensures that only relevant events reach the compute tier, preventing unnecessary executions and lowering costs. This acts as the intelligent router in the architecture.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
Complete the &lt;code&gt;main.tf&lt;/code&gt; file with the routing subscription:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"azurerm_eventgrid_event_subscription"&lt;/span&gt; &lt;span class="s2"&gt;"queue_subscription"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"route-order-created"&lt;/span&gt;
  &lt;span class="nx"&gt;scope&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_eventgrid_topic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;custom_topic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;

  &lt;span class="nx"&gt;service_bus_queue_endpoint_id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;azurerm_servicebus_queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;event_queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;

  &lt;span class="nx"&gt;advanced_filter&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;string_in&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;key&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"data.detail-type"&lt;/span&gt;
      &lt;span class="nx"&gt;values&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"OrderCreated"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To deploy the infrastructure, run &lt;code&gt;terraform init&lt;/code&gt;, &lt;code&gt;terraform plan&lt;/code&gt;, and &lt;code&gt;terraform apply&lt;/code&gt;. Note that while Terraform provisions the infrastructure, the actual deployment of the Python code is typically handled via Azure Functions Core Tools (&lt;code&gt;func azure functionapp publish func-order-processor-app&lt;/code&gt;) or a CI/CD pipeline like GitHub Actions.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. Common Troubleshooting
&lt;/h2&gt;

&lt;p&gt;Deploying distributed systems can introduce integration challenges. Here are the most common issues and how to resolve them:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Service Bus Connection String Missing or Invalid:&lt;/strong&gt;&lt;br&gt;
Issue: The Azure Function fails to trigger, and logs show binding errors.&lt;br&gt;
Solution: Verify the Application Settings in the Azure Portal for the Function App. Ensure &lt;code&gt;ServiceBusConnection&lt;/code&gt; exactly matches the primary connection string of the Service Bus Namespace and is spelled correctly in the &lt;code&gt;app.service_bus_queue_trigger&lt;/code&gt; decorator.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Event Grid Schema Mismatch:&lt;/strong&gt;&lt;br&gt;
Issue: Events are successfully published to the topic but never arrive in the Service Bus Queue.&lt;br&gt;
Solution: Inspect the payload structure. Azure Event Grid requires a specific schema (id, subject, data, eventType, etc.). If you are filtering by &lt;code&gt;data.detail-type&lt;/code&gt; in Terraform, ensure your published JSON explicitly contains a &lt;code&gt;data&lt;/code&gt; object with a &lt;code&gt;detail-type&lt;/code&gt; key matching &lt;code&gt;"OrderCreated"&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Storage Account Naming Conflicts:&lt;/strong&gt;&lt;br&gt;
Issue: Terraform fails during the &lt;code&gt;terraform apply&lt;/code&gt; phase when creating the &lt;code&gt;azurerm_storage_account&lt;/code&gt;.&lt;br&gt;
Solution: Storage account names in Azure must be globally unique across all Azure customers, purely lowercase, and between 3 to 24 characters. Adjust the &lt;code&gt;name&lt;/code&gt; attribute in the Terraform block to a highly unique string.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  5. Conclusion
&lt;/h2&gt;

&lt;p&gt;This tutorial established a resilient, decoupled architecture native to Microsoft Azure. By utilizing Terraform, we provisioned Event Grid for intelligent event routing, Service Bus for robust message queuing, and Azure Functions for scalable compute. &lt;/p&gt;

&lt;p&gt;Implementing these patterns provides a clear parallel to other cloud ecosystems, reinforcing a strong foundation for designing cellular architectures and multicloud strategies. As a next step, explore implementing Dead-Letter Queues (DLQ) within the Service Bus to handle poison messages systematically, ensuring your distributed application remains robust even when faced with unprocessable data.&lt;/p&gt;

</description>
      <category>azure</category>
      <category>eventdriven</category>
      <category>python</category>
      <category>terraform</category>
    </item>
    <item>
      <title>Practical Guide: Building an Event-Driven Infrastructure on AWS with Terraform and Python</title>
      <dc:creator>Cláudio Filipe Lima Rapôso</dc:creator>
      <pubDate>Sun, 19 Apr 2026 13:52:58 +0000</pubDate>
      <link>https://dev.to/sertaoseracloud/practical-guide-building-an-event-driven-infrastructure-on-aws-with-terraform-and-python-3lph</link>
      <guid>https://dev.to/sertaoseracloud/practical-guide-building-an-event-driven-infrastructure-on-aws-with-terraform-and-python-3lph</guid>
      <description>&lt;h2&gt;
  
  
  1. Introduction
&lt;/h2&gt;

&lt;p&gt;Event-driven architectures transform the way systems communicate by replacing direct synchronous calls with a highly decoupled publish-subscribe model. By the end of this tutorial, you will be able to provision a complete event-based infrastructure on AWS, using Amazon EventBridge as the central router, Amazon SQS for queuing, and AWS Lambda for computational processing. &lt;/p&gt;

&lt;p&gt;This approach is fundamental for creating scalable and resilient systems. Decoupling ensures that a failure in one component does not take down the entire application, and it allows different parts of the system to scale independently. Although the technical focus here is AWS, mastering event routing and asynchronous processing is a core structural skill for strategies in multicloud environments, where similar architectural patterns are applied using corresponding services from each provider within a unified landing zone.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Prerequisites
&lt;/h2&gt;

&lt;p&gt;To follow this tutorial and execute the proposed configurations, you need the following resources and prior knowledge:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;An active Amazon Web Services (AWS) account with administrative permissions to create IAM, Lambda, SQS, and EventBridge resources.&lt;/li&gt;
&lt;li&gt;Terraform installed on your local machine (version 1.0 or higher) for provisioning the Infrastructure as Code (IaC).&lt;/li&gt;
&lt;li&gt;Python (version 3.9 or higher) installed locally to package the Lambda function code.&lt;/li&gt;
&lt;li&gt;AWS credentials configured in your local environment (via AWS CLI or environment variables).&lt;/li&gt;
&lt;li&gt;Basic knowledge of terminal navigation and Terraform HCL syntax.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  3. Step-by-Step
&lt;/h2&gt;

&lt;p&gt;Before writing the code, it is important to visualize the event lifecycle in our infrastructure. The sequence diagram below illustrates the flow of information between the provisioned services.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F468cuf84sj0jebfx7lfj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F468cuf84sj0jebfx7lfj.png" alt="Sequence Diagram" width="800" height="348"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  3.1 Configuring the Provider and Main File
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Create the initial Terraform configuration file to define AWS as the cloud provider and establish the deployment region.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do it:&lt;/strong&gt; Terraform needs to know which cloud API it will interact with and the default region to allocate resources. This centralizes the basic configuration and prepares the infrastructure state.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
Create a file named &lt;code&gt;main.tf&lt;/code&gt; and add the following block:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;terraform&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;required_providers&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;aws&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;source&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"hashicorp/aws"&lt;/span&gt;
      &lt;span class="nx"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"~&amp;gt; 5.0"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;provider&lt;/span&gt; &lt;span class="s2"&gt;"aws"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;region&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"us-east-1"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3.2 Creating the Event Bus (EventBridge)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Provision a customized event bus in Amazon EventBridge.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do it:&lt;/strong&gt; Although AWS provides a default bus, creating a custom bus for applications is an excellent domain isolation practice. This prevents mixing AWS infrastructure events with your application's business events.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
Add to your &lt;code&gt;main.tf&lt;/code&gt; file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_cloudwatch_event_bus"&lt;/span&gt; &lt;span class="s2"&gt;"custom_bus"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"app-domain-events"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3.3 Creating the SQS Queue for Load Absorption
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Create a queue in Amazon SQS and configure the access policy to allow EventBridge to send messages to it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do it:&lt;/strong&gt; Sending events directly from EventBridge to Lambda can cause data loss in case of failures or traffic spikes. The SQS queue acts as a buffer, ensuring message delivery and allowing reprocessing in case of temporary errors.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
Add the queue configurations and its respective policy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_sqs_queue"&lt;/span&gt; &lt;span class="s2"&gt;"event_queue"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"event-processing-queue"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_sqs_queue_policy"&lt;/span&gt; &lt;span class="s2"&gt;"event_queue_policy"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;queue_url&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_sqs_queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;event_queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;
  &lt;span class="nx"&gt;policy&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;aws_iam_policy_document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sqs_policy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;json&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="s2"&gt;"aws_iam_policy_document"&lt;/span&gt; &lt;span class="s2"&gt;"sqs_policy"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;statement&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;effect&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Allow"&lt;/span&gt;
    &lt;span class="nx"&gt;actions&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"sqs:SendMessage"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="nx"&gt;resources&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;aws_sqs_queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;event_queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="nx"&gt;principals&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;type&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Service"&lt;/span&gt;
      &lt;span class="nx"&gt;identifiers&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"events.amazonaws.com"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="nx"&gt;condition&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;test&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"ArnEquals"&lt;/span&gt;
      &lt;span class="nx"&gt;variable&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"aws:SourceArn"&lt;/span&gt;
      &lt;span class="nx"&gt;values&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;aws_cloudwatch_event_rule&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;event_rule&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3.4 Developing the Lambda Function in Python
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Write the Python source code that will be executed when an event arrives in the queue.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do it:&lt;/strong&gt; The Python code represents the business or computational logic that reacts to the event. In this example, we will iterate over the SQS queue records and log the details for visual validation of the architecture's functionality.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
Create a file named &lt;code&gt;processor.py&lt;/code&gt; in the same directory:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt;

&lt;span class="n"&gt;logger&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getLogger&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setLevel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;INFO&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;lambda_handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Starting processing of SQS event batch.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Records&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[]):&lt;/span&gt;
        &lt;span class="n"&gt;body_string&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;{}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;event_detail&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;body_string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Received event detail: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event_detail&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;indent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="c1"&gt;# Extracting the payload from EventBridge
&lt;/span&gt;            &lt;span class="n"&gt;detail&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;event_detail&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;detail&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{})&lt;/span&gt;
            &lt;span class="n"&gt;order_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;detail&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;order_id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Successfully processed order: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;JSONDecodeError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error decoding message body as JSON.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;statusCode&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Processing completed successfully&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3.5 Provisioning Lambda with Terraform and Connecting to the Queue
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Create the ZIP package of the Python code, define the Lambda function in Terraform, create the execution permission (IAM Role), and map the SQS queue as the Lambda event source.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do it:&lt;/strong&gt; The infrastructure needs to allocate the code in AWS and provide the exact permissions so the Lambda can read messages from the queue and write logs to CloudWatch, following the principle of least privilege.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
Add the following code to your &lt;code&gt;main.tf&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="s2"&gt;"archive_file"&lt;/span&gt; &lt;span class="s2"&gt;"lambda_zip"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;type&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"zip"&lt;/span&gt;
  &lt;span class="nx"&gt;source_file&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"processor.py"&lt;/span&gt;
  &lt;span class="nx"&gt;output_path&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"processor.zip"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_iam_role"&lt;/span&gt; &lt;span class="s2"&gt;"lambda_exec_role"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"lambda_sqs_execution_role"&lt;/span&gt;

  &lt;span class="nx"&gt;assume_role_policy&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;jsonencode&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="nx"&gt;Version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"2012-10-17"&lt;/span&gt;
    &lt;span class="nx"&gt;Statement&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;
      &lt;span class="nx"&gt;Action&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"sts:AssumeRole"&lt;/span&gt;
      &lt;span class="nx"&gt;Effect&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Allow"&lt;/span&gt;
      &lt;span class="nx"&gt;Principal&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;Service&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"lambda.amazonaws.com"&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}]&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_iam_role_policy_attachment"&lt;/span&gt; &lt;span class="s2"&gt;"lambda_basic_execution"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;role&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_iam_role&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lambda_exec_role&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;policy_arn&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_iam_role_policy_attachment"&lt;/span&gt; &lt;span class="s2"&gt;"lambda_sqs_execution"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;role&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_iam_role&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lambda_exec_role&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;policy_arn&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"arn:aws:iam::aws:policy/service-role/AWSLambdaSQSQueueExecutionRole"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_lambda_function"&lt;/span&gt; &lt;span class="s2"&gt;"event_processor"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;filename&lt;/span&gt;         &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;archive_file&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lambda_zip&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;output_path&lt;/span&gt;
  &lt;span class="nx"&gt;function_name&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"OrderEventProcessor"&lt;/span&gt;
  &lt;span class="nx"&gt;role&lt;/span&gt;             &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_iam_role&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lambda_exec_role&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt;
  &lt;span class="nx"&gt;handler&lt;/span&gt;          &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"processor.lambda_handler"&lt;/span&gt;
  &lt;span class="nx"&gt;source_code_hash&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;archive_file&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lambda_zip&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;output_file_base64sha256&lt;/span&gt;
  &lt;span class="nx"&gt;runtime&lt;/span&gt;          &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"python3.11"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_lambda_event_source_mapping"&lt;/span&gt; &lt;span class="s2"&gt;"sqs_trigger"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;event_source_arn&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_sqs_queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;event_queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt;
  &lt;span class="nx"&gt;function_name&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_lambda_function&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;event_processor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt;
  &lt;span class="nx"&gt;batch_size&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3.6 Creating the Routing Rule in EventBridge
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Configure a rule that filters specific events published on the bus and directs them to the SQS queue.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do it:&lt;/strong&gt; The bus can receive thousands of different events. Rules ensure that the target system (your SQS queue) receives only the events that matter to it, optimizing processing and reducing costs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
Complete the &lt;code&gt;main.tf&lt;/code&gt; file with the routing configurations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_cloudwatch_event_rule"&lt;/span&gt; &lt;span class="s2"&gt;"event_rule"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;           &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"capture-order-created"&lt;/span&gt;
  &lt;span class="nx"&gt;event_bus_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_cloudwatch_event_bus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;custom_bus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Captures order created events"&lt;/span&gt;

  &lt;span class="nx"&gt;event_pattern&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;jsonencode&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="nx"&gt;source&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"com.mycompany.orders"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="nx"&gt;detail-type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"OrderCreated"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_cloudwatch_event_target"&lt;/span&gt; &lt;span class="s2"&gt;"sqs_target"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;rule&lt;/span&gt;           &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_cloudwatch_event_rule&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;event_rule&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;event_bus_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_cloudwatch_event_bus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;custom_bus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;target_id&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"SendToSQS"&lt;/span&gt;
  &lt;span class="nx"&gt;arn&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_sqs_queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;event_queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To deploy the entire infrastructure, run the commands &lt;code&gt;terraform init&lt;/code&gt;, &lt;code&gt;terraform plan&lt;/code&gt;, and &lt;code&gt;terraform apply&lt;/code&gt; in your terminal.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. Common Troubleshooting
&lt;/h2&gt;

&lt;p&gt;Even with a well-defined script, infrastructure provisioning can encounter obstacles. Below are the most frequent problems and their solutions:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Access Denied Permissions on the SQS Queue:&lt;/strong&gt;&lt;br&gt;
What is happening: EventBridge cannot deliver messages to the queue.&lt;br&gt;
How to solve it: Check the &lt;code&gt;aws_sqs_queue_policy&lt;/code&gt;. Ensure the EventBridge rule ARN is correct in the security condition. A misconfigured policy will result in the silent discard of events.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Lambda is not triggered by SQS messages:&lt;/strong&gt;&lt;br&gt;
What is happening: Messages arrive in the queue, but the Python function is not executed.&lt;br&gt;
How to solve it: Validate if the IAM Role linked to the Lambda (&lt;code&gt;lambda_sqs_execution_role&lt;/code&gt;) has the &lt;code&gt;AWSLambdaSQSQueueExecutionRole&lt;/code&gt; policy properly attached. Lambda needs read permissions (&lt;code&gt;sqs:ReceiveMessage&lt;/code&gt;, &lt;code&gt;sqs:DeleteMessage&lt;/code&gt;, &lt;code&gt;sqs:GetQueueAttributes&lt;/code&gt;) to consume the data.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Event Pattern Incompatibility:&lt;/strong&gt;&lt;br&gt;
What is happening: You publish an event to the bus, but it does not appear in the queue.&lt;br&gt;
How to solve it: Verify the accuracy of the injected JSON. The &lt;code&gt;source&lt;/code&gt; and &lt;code&gt;detail-type&lt;/code&gt; fields sent by the message producer must perfectly match the values stipulated in the &lt;code&gt;event_pattern&lt;/code&gt; block of the Terraform rule. EventBridge is case-sensitive.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  5. Conclusion
&lt;/h2&gt;

&lt;p&gt;In this tutorial, we built a robust backbone for an event-driven architecture. We configured the structure in code using Terraform, establishing EventBridge as the intelligent traffic router, SQS as the fault-tolerance layer, and Python Lambda as the scalable processing unit. &lt;/p&gt;

&lt;p&gt;Mastering this topology empowers you to integrate microservices safely and cleanly. As next steps, I recommend diving into the implementation of Dead Letter Queues (DLQ) for advanced handling of unresolved failures and exploring how this decoupling architectural pattern translates into the design of complex systems in multicloud scenarios.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>eventdriven</category>
      <category>python</category>
      <category>terraform</category>
    </item>
    <item>
      <title>Road to Golden Jacket: My Journey and Tips for the New AWS SOA-C03 Exam 🚀</title>
      <dc:creator>Cláudio Filipe Lima Rapôso</dc:creator>
      <pubDate>Mon, 30 Mar 2026 16:15:52 +0000</pubDate>
      <link>https://dev.to/sertaoseracloud/road-to-golden-jacket-my-journey-and-tips-for-the-new-aws-soa-c03-exam-i4j</link>
      <guid>https://dev.to/sertaoseracloud/road-to-golden-jacket-my-journey-and-tips-for-the-new-aws-soa-c03-exam-i4j</guid>
      <description>&lt;p&gt;When I sat down to take my AWS Certified SysOps Administrator - Associate exam on &lt;strong&gt;March 23&lt;/strong&gt;, I had prepared for a broad range of operational topics. But when the test began, one thing became immediately clear: &lt;strong&gt;Amazon CloudWatch was everywhere.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It felt like every other scenario revolved around monitoring, analyzing logs, or troubleshooting performance bottlenecks. Now that AWS has officially transitioned to the new &lt;strong&gt;AWS Certified CloudOps Engineer - Associate (SOA-C03)&lt;/strong&gt; exam, I realize that my heavy CloudWatch experience was actually the best preparation I could have asked for.&lt;/p&gt;

&lt;p&gt;Here is a personal look at my exam experience, why CloudWatch is your best friend, and my top tips for anyone tackling the new CloudOps certification.&lt;/p&gt;




&lt;h2&gt;
  
  
  📈 The CloudWatch Marathon
&lt;/h2&gt;

&lt;p&gt;During my exam on March 23, the monitoring and logging domain was a massive part of the test. I had to know exactly how to trigger alarms, collect logs, and automate responses.&lt;/p&gt;

&lt;p&gt;If you are studying for the new SOA-C03 exam, this is even more critical. The domain has been expanded to &lt;strong&gt;"Monitoring, Logging, Analysis, Remediation, and Performance Optimization"&lt;/strong&gt; and now carries the heaviest weight on the exam at &lt;strong&gt;22%&lt;/strong&gt;,.&lt;/p&gt;

&lt;p&gt;To pass, you must understand how to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Configure and manage the CloudWatch agent to collect metrics and logs not just from EC2 instances, but also from &lt;strong&gt;Amazon ECS and Amazon EKS clusters&lt;/strong&gt;,.&lt;/li&gt;
&lt;li&gt;  Set up composite alarms and configure them to invoke AWS services directly or through &lt;strong&gt;Amazon EventBridge&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;  Automate remediation tasks using &lt;strong&gt;AWS Systems Manager Automation runbooks&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🔄 Bridging the Gap: SysOps to CloudOps
&lt;/h2&gt;

&lt;p&gt;While my exam tested me heavily on traditional monitoring, the new "CloudOps" title reflects a modern shift in cloud operations. If I were taking the SOA-C03 today, here is what I would add to my study list to bridge the gap:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;The Container Ecosystem:&lt;/strong&gt; Containers are now officially in-scope. You need to understand Amazon Elastic Container Registry (ECR), Elastic Container Service (ECS), and Elastic Kubernetes Service (EKS),.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Infrastructure as Code (IaC):&lt;/strong&gt; The new exam emphasizes creating and managing stacks using &lt;strong&gt;AWS CloudFormation&lt;/strong&gt; and the &lt;strong&gt;AWS Cloud Development Kit (AWS CDK)&lt;/strong&gt;,.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Multi-Account Management:&lt;/strong&gt; Operating in a single account is a thing of the past. You must know how to enforce compliance and manage security across multiple accounts using &lt;strong&gt;AWS Organizations&lt;/strong&gt;, &lt;strong&gt;Service Control Policies (SCPs)&lt;/strong&gt;, and &lt;strong&gt;AWS Control Tower&lt;/strong&gt;,.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  💡 My Top Tips for Exam Day
&lt;/h2&gt;

&lt;p&gt;Based on my experience in the hot seat, here are my top tips for crushing the exam:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Build a Real Monitoring Lab:&lt;/strong&gt; Don't just read the docs. Spin up an EC2 instance, install the CloudWatch agent, and create a custom dashboard,. Set up an alarm to trigger an SNS notification or an EventBridge rule, so you see the automated flow in action.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Master Automation:&lt;/strong&gt; Understand how to use AWS Systems Manager to automate operational processes and remediate issues,.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Eliminate the Noise:&lt;/strong&gt; AWS questions are famously wordy and the distractors are designed to look plausible. Find the core requirement like "most cost-effective" or "highly available" and eliminate the answers that don't fit.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Embrace the AWS Well-Architected Framework:&lt;/strong&gt; The exam tests your ability to support workloads according to these best practices,. Always think about how to optimize performance and reliability.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Taking this exam validated my practical skills and proved that I could maintain and secure workloads in the cloud. If you are preparing for the SOA-C03, dive deep into CloudWatch and embrace automation!&lt;/p&gt;




&lt;h1&gt;
  
  
  💡 Enjoyed it?
&lt;/h1&gt;

&lt;p&gt;If you want to chat about AI, cloud, and architecture, follow me on my socials:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://olink.ai/sertaoseracloud" rel="noopener noreferrer"&gt;Social Media&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I post technical content straight from the front lines. And whenever I discover a time-saving tool that gets the job done right, like this one, you'll hear about it too.&lt;/p&gt;

&lt;p&gt;Have you taken the exam yet? Let me know your experience in the comments below! 👇&lt;/p&gt;

</description>
      <category>aws</category>
      <category>career</category>
      <category>devops</category>
      <category>monitoring</category>
    </item>
    <item>
      <title>From Infrastructure to Product: Pipelines into a Profitable SaaS</title>
      <dc:creator>Cláudio Filipe Lima Rapôso</dc:creator>
      <pubDate>Thu, 12 Mar 2026 19:12:31 +0000</pubDate>
      <link>https://dev.to/sertaoseracloud/from-infrastructure-to-product-pipelines-into-a-profitable-saas-44mc</link>
      <guid>https://dev.to/sertaoseracloud/from-infrastructure-to-product-pipelines-into-a-profitable-saas-44mc</guid>
      <description>&lt;p&gt;&lt;strong&gt;Open Journal Systems (OJS)&lt;/strong&gt; is the engine behind thousands of scientific publications worldwide. However, educational institutions, from high schools running literary magazines to universities managing peer-reviewed journals, share a massive pain point: they want to publish research, but they lack the internal IT resources to manage complex PHP monoliths, backups, and server security.&lt;/p&gt;

&lt;p&gt;Blending my experience in the classroom with software architecture, I realized this is a massive opportunity. We can move beyond just "hosting" OJS and transform it into a fully automated, &lt;strong&gt;Zero-Touch B2B SaaS platform&lt;/strong&gt; targeted globally at the education sector.&lt;/p&gt;

&lt;p&gt;This article details the design, infrastructure, and product strategy to build a highly available, multi-tenant OJS SaaS on &lt;strong&gt;Amazon Web Services (AWS)&lt;/strong&gt; using Terraform, Go, and GitHub Actions.&lt;/p&gt;




&lt;h2&gt;
  
  
  🏗️ 1. The Reference Architecture on AWS
&lt;/h2&gt;

&lt;p&gt;To sell this globally, we cannot afford manual interventions or local server disks. We must adopt a distributed, &lt;em&gt;stateless&lt;/em&gt;, and 100% managed topology.&lt;/p&gt;

&lt;h3&gt;
  
  
  Core Components:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Compute:&lt;/strong&gt; &lt;strong&gt;Amazon ECS with AWS Fargate&lt;/strong&gt;. We run the OJS Docker image in a &lt;em&gt;serverless&lt;/em&gt; manner. We pay only for the CPU/RAM consumed, which is crucial for maximizing profit margins.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Database:&lt;/strong&gt; &lt;strong&gt;Amazon RDS for MySQL&lt;/strong&gt;. Fully managed, automated backups, and isolated data tiers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;File Persistence:&lt;/strong&gt; OJS requires a persistent directory (&lt;code&gt;files_dir&lt;/code&gt;) for articles and PDFs. We mount &lt;strong&gt;Amazon EFS (Elastic File System)&lt;/strong&gt; directly to the ECS Task via the NFS protocol.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  ⚙️ 2. The "Zero-Touch" Provisioning Engine
&lt;/h2&gt;

&lt;p&gt;To turn this infrastructure into a product, the entire customer lifecycle must be automated. Here is how the provisioning engine works:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftro7h6xlpywnjfbrj862.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftro7h6xlpywnjfbrj862.png" alt="Engine" width="800" height="672"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;The Storefront:&lt;/strong&gt; A university administrator selects a plan on your landing page and checks out.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Orchestrator (Golang API):&lt;/strong&gt; For high concurrency and low footprint, a lightweight backend API built in &lt;strong&gt;Go&lt;/strong&gt; listens for the payment gateway webhook (e.g., Stripe).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Trigger:&lt;/strong&gt; The Go API automatically calls the &lt;strong&gt;GitHub Actions API&lt;/strong&gt;, passing the &lt;code&gt;tenant_name&lt;/code&gt;, &lt;code&gt;region&lt;/code&gt;, and &lt;code&gt;tier&lt;/code&gt; as JSON payloads.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Magic:&lt;/strong&gt; GitHub Actions runs the Terraform scripts. Within minutes, the isolated AWS infrastructure is provisioned, Route53 updates the DNS (&lt;code&gt;journal.university.edu&lt;/code&gt;), and the client receives an automated email with their admin credentials. &lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  🌍 Level 1: The System Context Diagram (Business View)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Who it's for:&lt;/strong&gt; Executives, Investors, University Directors, and non-technical Stakeholders.&lt;br&gt;
&lt;strong&gt;What it shows:&lt;/strong&gt; The "Big Picture." It places our &lt;strong&gt;OJS SaaS Platform&lt;/strong&gt; at the center of the universe and focuses exclusively on answering two questions: &lt;em&gt;Who uses the system?&lt;/em&gt; and &lt;em&gt;What other systems do we talk to?&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The Actors:&lt;/strong&gt; It makes it clear that we have different user profiles (the Admin who pays, the Author who publishes, the Reader who consumes).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The External Boundaries:&lt;/strong&gt; It shows that we didn't build a payment processor from scratch (we use &lt;strong&gt;Stripe&lt;/strong&gt;), we didn't reinvent infrastructure pipelines (we use &lt;strong&gt;GitHub Actions&lt;/strong&gt;), and we integrate with academic industry standards (&lt;strong&gt;Crossref&lt;/strong&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Value:&lt;/strong&gt; Here, technology doesn't matter. There is no AWS, Go, or PHP in this diagram. The focus is purely on the business value flow.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fierybgtw58y6oubbfa0v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fierybgtw58y6oubbfa0v.png" alt="Systems Context" width="800" height="565"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Level 2: The Container Diagram (Solution/Cloud View)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Who it's for:&lt;/strong&gt; Cloud Architects, DevOps Engineers, Tech Leads, and Security Teams.&lt;br&gt;
&lt;strong&gt;What it shows:&lt;/strong&gt; How the central system (the blue box from Level 1) is divided internally.&lt;br&gt;
&lt;em&gt;(Architectural Note: In the C4 Model, a "Container" means an independently deployable/runnable unit, not necessarily a Docker container, although in our case, they coincide).&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Separation of Concerns:&lt;/strong&gt; The division between the &lt;strong&gt;Storefront Web&lt;/strong&gt; (React Front-end), the &lt;strong&gt;Provisioning API&lt;/strong&gt; (our Go engine), and the &lt;strong&gt;Tenant Infrastructure&lt;/strong&gt; (the client's OJS) is crystal clear.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Technology Choices:&lt;/strong&gt; This is where the cloud "magic" appears. We show that OJS runs on &lt;strong&gt;ECS Fargate&lt;/strong&gt; (Compute), uses &lt;strong&gt;Amazon RDS&lt;/strong&gt; (Relational Database), and &lt;strong&gt;Amazon EFS&lt;/strong&gt; (Network persistence for PDFs).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Communication Flow:&lt;/strong&gt; It shows that the Admin doesn't access the database directly; the communication flows from the Browser to the Load Balancer, to the container, and only then to the database via port 3306.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fud5pyvu4a8vh42skqk10.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fud5pyvu4a8vh42skqk10.png" alt="Container Context" width="800" height="713"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  🧩 Level 3: The Component Diagram (Engineering/Code View)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Who it's for:&lt;/strong&gt; Software Developers (Backend).&lt;br&gt;
&lt;strong&gt;What it shows:&lt;/strong&gt; We take just one of the containers from Level 2 (in this case, our &lt;strong&gt;Go Provisioning API&lt;/strong&gt;) and open the hood to see how the code is structured internally.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Domain Isolation:&lt;/strong&gt; This is the perfect level to demonstrate &lt;strong&gt;Domain-Driven Design (DDD)&lt;/strong&gt; principles. The &lt;code&gt;Tenant Manager&lt;/code&gt; component represents the heart of our domain (where the business rules live).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dependency Inversion:&lt;/strong&gt; The diagram shows components like the &lt;code&gt;GitHub API Client&lt;/code&gt; and the &lt;code&gt;Metadata Repository&lt;/code&gt;. As architects, we want our domain (&lt;code&gt;Tenant Manager&lt;/code&gt;) to not depend on specific infrastructure implementations, but rather on interfaces (ports and adapters). The visual flow shows how the business rule delegates the heavy I/O lifting to external clients.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Webhook Lifecycle:&lt;/strong&gt; It explains the exact path of the data: The request hits the &lt;code&gt;Router&lt;/code&gt;, passes through validation in the &lt;code&gt;Payment Handler&lt;/code&gt;, updates the database via the &lt;code&gt;Metadata Repo&lt;/code&gt;, and finally triggers the infrastructure via the &lt;code&gt;GitHub Client&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiqdgo3vaqy1dd2p65kyq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiqdgo3vaqy1dd2p65kyq.png" alt="Component Diagram" width="513" height="1080"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here is the GitHub Actions workflow that your Go API triggers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Zero-Touch&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;SaaS&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Provisioning'&lt;/span&gt;
&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;workflow_dispatch&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;inputs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;tenant_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{&lt;/span&gt; &lt;span class="nv"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Tenant&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Name'&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;true&lt;/span&gt; &lt;span class="pi"&gt;}&lt;/span&gt;
      &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{&lt;/span&gt; &lt;span class="nv"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Tier&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;(basic/premium)'&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;true&lt;/span&gt; &lt;span class="pi"&gt;}&lt;/span&gt;
      &lt;span class="na"&gt;aws_region&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{&lt;/span&gt; &lt;span class="nv"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Target&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Region'&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;true&lt;/span&gt; &lt;span class="pi"&gt;}&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;deploy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Configure AWS Credentials&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aws-actions/configure-aws-credentials@v4&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;aws-access-key-id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.AWS_ACCESS_KEY_ID }}&lt;/span&gt;
          &lt;span class="na"&gt;aws-secret-access-key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.AWS_SECRET_ACCESS_KEY }}&lt;/span&gt;
          &lt;span class="na"&gt;aws-region&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ github.event.inputs.aws_region }}&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Terraform Init &amp;amp; Apply&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;terraform init -backend-config="key=${{ github.event.inputs.tenant_name }}.tfstate"&lt;/span&gt;
          &lt;span class="s"&gt;terraform apply -var-file="envs/${{ github.event.inputs.environment }}.tfvars.json" -auto-approve&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🌍 3. Global Scaling and Data Compliance
&lt;/h2&gt;

&lt;p&gt;Selling to the global education market means navigating strict data privacy laws. Because our infrastructure is codified with Terraform, deploying a compliant environment is just a matter of dynamically injecting the &lt;code&gt;aws_region&lt;/code&gt; variable:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GDPR (Europe):&lt;/strong&gt; If a German university signs up, the automation deploys to &lt;code&gt;eu-central-1&lt;/code&gt; (Frankfurt). The data never leaves Europe.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LGPD (Brazil):&lt;/strong&gt; For Brazilian institutions, the automation targets &lt;code&gt;sa-east-1&lt;/code&gt; (São Paulo).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;FERPA/HIPAA (USA):&lt;/strong&gt; Medical universities publishing clinical trials can be hosted in isolated, dedicated environments in the US.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  💎 4. Tiered Product Strategy (FinOps)
&lt;/h2&gt;

&lt;p&gt;We can structure the SaaS offerings to capture both ends of the academic spectrum, leveraging AWS FinOps strategies:&lt;/p&gt;

&lt;h3&gt;
  
  
  Tier 1: High School &amp;amp; Small College Plan (High Volume)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Use Case:&lt;/strong&gt; High school science fairs or small departmental newsletters.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Architecture:&lt;/strong&gt; Shared ECS resources and a single shared RDS instance with logical separation (different schemas per school).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;FinOps:&lt;/strong&gt; We use &lt;strong&gt;Fargate Spot&lt;/strong&gt; to utilize spare AWS compute capacity for up to a &lt;strong&gt;70% discount&lt;/strong&gt;. We also use EventBridge to scale the environments down to zero during weekends and school holidays.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Tier 2: University &amp;amp; Research Institute Plan (High Margin)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Use Case:&lt;/strong&gt; Official university peer-reviewed journals indexing globally with Crossref and DOIs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Architecture:&lt;/strong&gt; Dedicated ECS Tasks, isolated RDS instances powered by Graviton (ARM) processors for better price/performance, and custom domain mapping.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Price Point:&lt;/strong&gt; Premium recurring revenue with strict SLA guarantees and daily automated snapshots.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  🐳 5. Dockerizing OJS
&lt;/h2&gt;

&lt;p&gt;The Docker image remains very similar to any standard containerized environment, focusing heavily on the required PHP extensions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; php:8.2-apache&lt;/span&gt;

&lt;span class="c"&gt;# System dependencies and PHP extensions required for OJS on AWS&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;apt-get update &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; apt-get &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    libpng-dev libjpeg-dev libxml2-dev libzip-dev zlib1g-dev libicu-dev g++ &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; docker-php-ext-configure gd &lt;span class="nt"&gt;--with-jpeg&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; docker-php-ext-install gd mysqli opcache zip intl gettext

&lt;span class="c"&gt;# Upload Optimizations&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;&lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'opcache.memory_consumption=128'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'upload_max_filesize=64M'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'post_max_size=64M'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /usr/local/etc/php/conf.d/ojs-optimization.ini

&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; . /var/www/html/&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;&lt;span class="nb"&gt;chown&lt;/span&gt; &lt;span class="nt"&gt;-R&lt;/span&gt; www-data:www-data /var/www/html/

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🛠️ 6. Infrastructure as Code (Terraform)
&lt;/h2&gt;

&lt;p&gt;Governance across the AWS environment is managed via Terraform. The state (&lt;code&gt;.tfstate&lt;/code&gt;) is securely stored in &lt;strong&gt;Amazon S3&lt;/strong&gt; with State Locking handled by &lt;strong&gt;DynamoDB&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Below is the critical snippet that connects ECS (Fargate) to EFS to persist the PDFs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# 1. Amazon RDS MySQL&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_db_instance"&lt;/span&gt; &lt;span class="s2"&gt;"ojs_db"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;identifier&lt;/span&gt;           &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"ojs-db-${var.environment}"&lt;/span&gt;
  &lt;span class="nx"&gt;allocated_storage&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;
  &lt;span class="nx"&gt;engine&lt;/span&gt;               &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"mysql"&lt;/span&gt;
  &lt;span class="nx"&gt;engine_version&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"8.0"&lt;/span&gt;
  &lt;span class="nx"&gt;instance_class&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;db_instance_class&lt;/span&gt;
  &lt;span class="nx"&gt;username&lt;/span&gt;             &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;db_user&lt;/span&gt;
  &lt;span class="nx"&gt;password&lt;/span&gt;             &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;db_password&lt;/span&gt;
  &lt;span class="nx"&gt;vpc_security_group_ids&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;aws_security_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rds_sg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="nx"&gt;db_subnet_group_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_db_subnet_group&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;main&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;
  &lt;span class="nx"&gt;skip_final_snapshot&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# 2. Amazon EFS for the files directory&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_efs_file_system"&lt;/span&gt; &lt;span class="s2"&gt;"ojs_files"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;creation_token&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"ojs-efs-${var.environment}"&lt;/span&gt;
  &lt;span class="nx"&gt;encrypted&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# 3. ECS Task Definition (Fargate) with EFS Volume&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_ecs_task_definition"&lt;/span&gt; &lt;span class="s2"&gt;"ojs_task"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;family&lt;/span&gt;                   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"ojs-app-${var.environment}"&lt;/span&gt;
  &lt;span class="nx"&gt;requires_compatibilities&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"FARGATE"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="nx"&gt;network_mode&lt;/span&gt;             &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"awsvpc"&lt;/span&gt;
  &lt;span class="nx"&gt;cpu&lt;/span&gt;                      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;fargate_cpu&lt;/span&gt;
  &lt;span class="nx"&gt;memory&lt;/span&gt;                   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;fargate_memory&lt;/span&gt;

  &lt;span class="nx"&gt;container_definitions&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;jsonencode&lt;/span&gt;&lt;span class="p"&gt;([{&lt;/span&gt;
    &lt;span class="nx"&gt;name&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"ojs-container"&lt;/span&gt;
    &lt;span class="nx"&gt;image&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ecr_image_url&lt;/span&gt;
    &lt;span class="nx"&gt;mountPoints&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;
      &lt;span class="nx"&gt;sourceVolume&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"ojs-efs-volume"&lt;/span&gt;
      &lt;span class="nx"&gt;containerPath&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"/var/www/ojsdata"&lt;/span&gt;
    &lt;span class="p"&gt;}]&lt;/span&gt;
    &lt;span class="nx"&gt;environment&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"MYSQL_HOST"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_db_instance&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ojs_db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;address&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;}])&lt;/span&gt;

  &lt;span class="nx"&gt;volume&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"ojs-efs-volume"&lt;/span&gt;
    &lt;span class="nx"&gt;efs_volume_configuration&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;file_system_id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;aws_efs_file_system&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ojs_files&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;
      &lt;span class="nx"&gt;transit_encryption&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"ENABLED"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Variable Injection (FinOps)
&lt;/h3&gt;

&lt;p&gt;We define instance sizes via JSON (&lt;code&gt;prod.tfvars.json&lt;/code&gt;) to control costs per environment:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"environment"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"prod"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"region"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"us-east-1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"fargate_cpu"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1024"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"fargate_memory"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2048"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"db_instance_class"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"db.t4g.medium"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"db_user"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ojsadmin"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🚀 7. CI/CD with GitHub Actions
&lt;/h2&gt;

&lt;p&gt;The automated deployment pipeline checks out the code, configures AWS credentials, and updates the infrastructure via Terraform.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Deploy&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;OJS&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;AWS'&lt;/span&gt;
&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;workflow_dispatch&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;inputs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;tenant_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{&lt;/span&gt; &lt;span class="nv"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Tenant&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Name'&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;true&lt;/span&gt; &lt;span class="pi"&gt;}&lt;/span&gt;
      &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{&lt;/span&gt; &lt;span class="nv"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Environment&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;(dev/hom/prod)'&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;required&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;true&lt;/span&gt; &lt;span class="pi"&gt;}&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;deploy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Configure AWS Credentials&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aws-actions/configure-aws-credentials@v4&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;aws-access-key-id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.AWS_ACCESS_KEY_ID }}&lt;/span&gt;
          &lt;span class="na"&gt;aws-secret-access-key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.AWS_SECRET_ACCESS_KEY }}&lt;/span&gt;
          &lt;span class="na"&gt;aws-region&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;us-east-1&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Terraform Init (S3 Backend)&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;terraform init \&lt;/span&gt;
            &lt;span class="s"&gt;-backend-config="bucket=ojs-terraform-state-central" \&lt;/span&gt;
            &lt;span class="s"&gt;-backend-config="key=${{ github.event.inputs.tenant_name }}.${{ github.event.inputs.environment }}.tfstate" \&lt;/span&gt;
            &lt;span class="s"&gt;-backend-config="dynamodb_table=terraform-lock"&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Terraform Apply&lt;/span&gt;
        &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;TF_VAR_db_password&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.DB_PASSWORD }}&lt;/span&gt;
        &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;terraform apply \&lt;/span&gt;
            &lt;span class="s"&gt;-var-file="envs/${{ github.event.inputs.environment }}.tfvars.json" \&lt;/span&gt;
            &lt;span class="s"&gt;-auto-approve&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  💰 8. FinOps: Cost Optimization on AWS
&lt;/h2&gt;

&lt;p&gt;AWS provides fantastic mechanisms to slash costs in non-production environments:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Fargate Spot:&lt;/strong&gt; In &lt;strong&gt;Dev/Hom&lt;/strong&gt; environments, we configure the ECS &lt;em&gt;Capacity Provider&lt;/em&gt; to use &lt;strong&gt;Fargate Spot&lt;/strong&gt;, utilizing spare AWS compute capacity for up to a &lt;strong&gt;70% discount&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Graviton (ARM):&lt;/strong&gt; We use RDS instances powered by Graviton processors (&lt;code&gt;db.t4g&lt;/code&gt;), which deliver better performance at a lower price point than the x86 architecture.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scheduling Automation:&lt;/strong&gt; We leverage &lt;strong&gt;Amazon EventBridge&lt;/strong&gt; to trigger an &lt;strong&gt;AWS Lambda&lt;/strong&gt; function that stops the RDS instance and scales the ECS &lt;em&gt;Desired Count&lt;/em&gt; to &lt;code&gt;0&lt;/code&gt; outside of regular business hours.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  🎯 Conclusion
&lt;/h2&gt;

&lt;p&gt;By bridging the gap between software architecture and the needs of the classroom, we can transform Open Journal Systems from a complex IT burden into a seamless, globally scalable SaaS product. The combination of AWS ECS Fargate for horizontal scalability, Go for rapid API orchestration, and Terraform for compliant IaC delivers a true, zero-touch EdTech platform.&lt;/p&gt;

&lt;p&gt;Have you ever thought about productizing open-source software into a niche B2B SaaS? Share your thoughts and architectural approaches in the comments! 👇&lt;/p&gt;

</description>
      <category>aws</category>
      <category>terraform</category>
      <category>devops</category>
      <category>architecture</category>
    </item>
  </channel>
</rss>
