<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Raj Shah</title>
    <description>The latest articles on DEV Community by Raj Shah (@rajshahblog).</description>
    <link>https://dev.to/rajshahblog</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2525891%2Fec7fe2ca-e3f6-49c1-ae63-9655e2bb77ec.png</url>
      <title>DEV Community: Raj Shah</title>
      <link>https://dev.to/rajshahblog</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/rajshahblog"/>
    <language>en</language>
    <item>
      <title>Mastering Amazon EKS Auto Mode: A Deep Dive into Serverless Kubernetes</title>
      <dc:creator>Raj Shah</dc:creator>
      <pubDate>Mon, 15 Dec 2025 13:00:39 +0000</pubDate>
      <link>https://dev.to/rajshahblog/mastering-amazon-eks-auto-mode-a-deep-dive-into-serverless-kubernetes-26fk</link>
      <guid>https://dev.to/rajshahblog/mastering-amazon-eks-auto-mode-a-deep-dive-into-serverless-kubernetes-26fk</guid>
      <description>&lt;h2&gt;
  
  
  1. The Kubernetes Tax: The Hidden Cost of Cluster Management
&lt;/h2&gt;

&lt;p&gt;In the world of cloud-native development, Kubernetes is the undisputed king. Yet, for many engineering teams, the crown feels unexpectedly heavy. This weight is the "Kubernetes Tax": a significant operational cost teams must pay in the form of relentless cluster management. This undifferentiated work, while necessary, distracts skilled engineers from their primary goal: building innovative applications that drive business value.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Nearly everyone today wants to deploy their applications to Kubernetes. Initially they think it's easy, but they soon realize the challenges of Kubernetes management.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;These challenges begin on day one and persist throughout the entire lifecycle of a cluster, neatly falling into two categories.&lt;/p&gt;

&lt;h3&gt;
  
  
  Day 1 Operations (Provisioning):
&lt;/h3&gt;

&lt;p&gt;The initial setup is fraught with complex decisions and manual effort.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Initial capacity planning:&lt;/strong&gt; Teams struggle to determine the right number and size of nodes for workloads they have yet to run.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Instance selection:&lt;/strong&gt; Choosing the correct EC2 instance types from hundreds of options (e.g., memory-optimized, GPU-accelerated) is critical for performance and cost, but difficult to get right.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Manual networking setup:&lt;/strong&gt; Provisioning a Virtual Private Cloud (VPC) with the necessary subnets, route tables, internet gateways, and NAT gateways is time-consuming and prone to misconfiguration.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Infrastructure as Code (IaC) overhead:&lt;/strong&gt; Using tools like Terraform requires significant effort to manage state files, handle locking, and maintain complex configuration files.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Day 2 Operations (Ongoing Management):
&lt;/h3&gt;

&lt;p&gt;Once a cluster is live, the management burden becomes a relentless cycle of maintenance.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Constant node management:&lt;/strong&gt; Engineers frequently perform manual scaling, adding nodes for weekend traffic surges or flash sales, and then scaling down to control costs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security patching:&lt;/strong&gt; Teams are responsible for continuously applying patches and fixes for critical vulnerabilities across all worker nodes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cluster version upgrades:&lt;/strong&gt; Keeping the cluster up-to-date with the latest Kubernetes versions is a frequent and necessary task to access new features and bug fixes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Component compatibility:&lt;/strong&gt; With each cluster upgrade, core components like the CNI, CoreDNS, and CSI drivers must be checked and updated to ensure they remain compatible.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Amazon EKS Auto Mode is called “serverless Kubernetes” because AWS fully manages the underlying compute lifecycle — nodes exist, but operators never interact with them. Developers deploy pods, and AWS handles capacity, scaling, patching, and infrastructure automatically.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. The Shift to Serverless Kubernetes: Introducing EKS Auto Mode
&lt;/h2&gt;

&lt;p&gt;Having established the relentless operational tax of Kubernetes, we can now see the strategic shift it necessitates. Amazon EKS Auto Mode is AWS's direct response to this challenge, engineered to absorb the undifferentiated work of the data plane. It represents an evolutionary shift toward "Serverless Kubernetes" by automating the entire lifecycle of compute, networking, and storage components.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F719m5uewf9dseknoslb6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F719m5uewf9dseknoslb6.png" alt="Responsibility Stack Comparison between EKS Standard and EKS Auto Mode" width="800" height="900"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With EKS Auto Mode, the responsibility for managing the data plane moves from your team to AWS. This allows platform teams and developers to stop managing infrastructure and focus exclusively on deploying and running their applications. This functionality can be enabled on both new and existing EKS clusters, providing a direct path to reducing operational overhead.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. How It Works: The Technical Pillars of "Hands-Off" Kubernetes
&lt;/h2&gt;

&lt;p&gt;EKS Auto Mode is built on a foundation of managed, integrated components that work together to deliver a fully automated experience.&lt;/p&gt;

&lt;h3&gt;
  
  
  3.1 Automated Compute with Integrated Karpenter
&lt;/h3&gt;

&lt;p&gt;EKS Auto Mode integrates an upstream-compatible, AWS-managed version of the Karpenter controller directly into the cluster. This eliminates the need for manual node management and delivers intelligent, on-demand compute.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Automated Provisioning:&lt;/strong&gt; It automatically launches and consolidates nodes based on the real-time demands of your application workloads.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Intelligent Selection:&lt;/strong&gt; It intelligently selects the optimal and lowest-cost EC2 instance types, including Spot and Graviton, that precisely meet your application's resource requirements.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zero Overhead:&lt;/strong&gt; It removes the need to run and manage a dedicated node just for the Karpenter controller itself, further reducing cost and complexity.&lt;/li&gt;
&lt;/ul&gt;
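&lt;p&gt;To make the selection idea concrete, here is a small illustrative sketch. This is not Karpenter's actual algorithm (which also weighs Spot pricing, CPU architecture, and consolidation opportunities), and the instance catalog and prices below are hypothetical examples:&lt;/p&gt;

```python
# Illustrative sketch only (not Karpenter's real code): choose the
# lowest-cost instance type that satisfies the aggregate resource
# requests of pending pods. Catalog and prices are hypothetical.
CATALOG = [
    # (name, vCPU, memory GiB, hourly price USD)
    ("m5a.large",   2,  8, 0.086),
    ("m5a.xlarge",  4, 16, 0.172),
    ("c5.2xlarge",  8, 16, 0.340),
    ("r5.xlarge",   4, 32, 0.252),
]

def cheapest_fit(cpu_needed, mem_needed):
    """Return the cheapest catalog entry that fits the requests, or None."""
    fits = [i for i in CATALOG if i[1] >= cpu_needed and i[2] >= mem_needed]
    return min(fits, key=lambda i: i[3], default=None)

print(cheapest_fit(1.5, 6)[0])   # m5a.large: smallest, cheapest fit
print(cheapest_fit(3, 24)[0])    # r5.xlarge: memory-heavy workload
```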

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F80t0l589t51oncmxbyyu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F80t0l589t51oncmxbyyu.png" alt="Karpenter in AWS EKS Auto Mode" width="800" height="203"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  3.2 Zero-Touch Security and Upgrades with Bottlerocket
&lt;/h3&gt;

&lt;p&gt;EKS Auto Mode exclusively uses Bottlerocket AMIs for all worker nodes. Bottlerocket is a purpose-built, minimal Linux-based operating system designed for running containers. This approach provides significant security and operational benefits.&lt;/p&gt;

&lt;p&gt;AWS continuously patches, tests, and rolls out updates to these AMIs, removing the manual patching burden entirely. Worker nodes are automatically recycled after a maximum of 21 days (a configurable 20-day expiry plus a 1-day grace period). This mandatory lifecycle isn't a limitation; it's a core security feature. It guarantees that nodes are constantly replaced with the latest patched and validated Bottlerocket AMI, effectively eliminating configuration drift and ensuring vulnerabilities are purged from the cluster automatically.&lt;/p&gt;
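&lt;p&gt;As a worked example of the lifecycle arithmetic (illustrative only; the 20-day expiry is the configurable value described above):&lt;/p&gt;

```python
# Illustrative only: compute the latest replacement time for an Auto Mode
# node, assuming the 20-day expiry plus 1-day grace period (21 days total).
from datetime import datetime, timedelta, timezone

NODE_EXPIRY = timedelta(days=20)   # configurable expiry
GRACE_PERIOD = timedelta(days=1)   # drain/grace window

def latest_replacement(launched_at):
    """Upper bound on a node's lifetime before it is recycled."""
    return launched_at + NODE_EXPIRY + GRACE_PERIOD

launch = datetime(2025, 12, 1, tzinfo=timezone.utc)
print(latest_replacement(launch).date())  # 2025-12-22
```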

&lt;h3&gt;
  
  
  3.3 Managed Core Services and True Scale-to-Zero
&lt;/h3&gt;

&lt;p&gt;Essential cluster add-ons, including the EBS CSI driver, VPC CNI, and CoreDNS, are managed by AWS. Instead of running as daemonsets on your worker nodes, these core components are integrated directly into the control plane or baked into the Bottlerocket AMI as systemd processes.&lt;/p&gt;

&lt;p&gt;This architectural decision is the key to enabling a true scale-to-zero capability. Because essential services like the VPC CNI are not running as daemonsets requiring a persistent user-managed node, the data plane can completely vanish when no application workloads are running. Your compute footprint drops to zero, and you pay nothing for compute resources.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F779mi436tcf2via8oz2m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F779mi436tcf2via8oz2m.png" alt="Node Anatomy in AWS EKS Mode" width="800" height="655"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  4. The Critical Choice: EKS Auto Mode vs. Self-Managed Karpenter
&lt;/h2&gt;

&lt;p&gt;Choosing the right path for your cluster automation isn't about finding a "better" tool, but the right tool for your organization's needs. The decision boils down to a fundamental trade-off: embracing the superior simplicity of a fully managed solution or retaining the unparalleled flexibility of self-management.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Comparison table: EKS with self-managed Karpenter vs. Amazon EKS Auto Mode&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  4.1 Who Should Choose Which?
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Choose self-managed Karpenter if:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You have an in-house platform team with the expertise to manage Karpenter.&lt;/li&gt;
&lt;li&gt;You require custom AMIs (such as Ubuntu).&lt;/li&gt;
&lt;li&gt;Your workloads need nodes that must run longer than 21 days.&lt;/li&gt;
&lt;li&gt;You have nuanced custom networking requirements.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Choose EKS Auto Mode if:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You want to accelerate your time-to-market.&lt;/li&gt;
&lt;li&gt;You wish to completely eliminate node and add-on management.&lt;/li&gt;
&lt;li&gt;You need a serverless experience more powerful than EKS Fargate, with full support for daemonsets, service meshes, GPUs, and Spot instances.&lt;/li&gt;
&lt;li&gt;You are a startup without a dedicated platform team and want to focus on delivering business value.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  5. Seeing is Believing: A Walkthrough of EKS Auto in Action
&lt;/h2&gt;

&lt;p&gt;The power of EKS Auto Mode is best understood by seeing it respond to a real-world scenario, as shown in the demonstration using the &lt;code&gt;eks-node-viewer&lt;/code&gt; utility.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1:&lt;/strong&gt; &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Go to EKS -&amp;gt; Create a Cluster&lt;/li&gt;
&lt;li&gt;Create Cluster and Node IAM Roles&lt;/li&gt;
&lt;li&gt;Create the EKS Cluster&lt;/li&gt;
&lt;/ul&gt;
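&lt;p&gt;As a hedged sketch, the same Auto Mode cluster can also be described programmatically. The shape below mirrors the request one might pass to boto3's &lt;code&gt;eks.create_cluster&lt;/code&gt;; the role ARNs and subnet IDs are placeholders, and you should verify the field names against the EKS API reference before relying on them:&lt;/p&gt;

```python
# Hedged sketch: a request body one might pass to boto3's
# eks.create_cluster for an Auto Mode cluster. ARNs and subnet IDs
# are placeholders; verify field names against the EKS API reference.
create_cluster_request = {
    "name": "auto-mode-demo",
    "roleArn": "arn:aws:iam::111122223333:role/eks-cluster-role",      # placeholder
    "resourcesVpcConfig": {
        "subnetIds": ["subnet-aaaa1111", "subnet-bbbb2222"],            # placeholders
    },
    # The Auto Mode switches are expected to be enabled together:
    "computeConfig": {
        "enabled": True,
        "nodePools": ["general-purpose", "system"],
        "nodeRoleArn": "arn:aws:iam::111122223333:role/eks-node-role",  # placeholder
    },
    "kubernetesNetworkConfig": {"elasticLoadBalancing": {"enabled": True}},
    "storageConfig": {"blockStorage": {"enabled": True}},
}
print(create_cluster_request["computeConfig"]["enabled"])  # True
```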

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F52vicl2y3t72w28vvyou.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F52vicl2y3t72w28vvyou.png" alt="AWS EKS Auto Mode Cluster Configuration" width="800" height="492"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2:&lt;/strong&gt; &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Once the cluster is in the "Active" state, update the local kubeconfig and check the nodes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftfmnzcfjpy3o8mxqvvlk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftfmnzcfjpy3o8mxqvvlk.png" alt="EKS Cluster in Active State with 2 Nodes" width="800" height="307"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws eks update-kubeconfig --name &amp;lt;cluster-name&amp;gt; --region &amp;lt;aws-region&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Let's test scaling up by deploying the &lt;a href="https://opentelemetry.io/docs/demo/kubernetes-deployment/" rel="noopener noreferrer"&gt;OpenTelemetry demo&lt;/a&gt;, an application of over 20 microservices.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts

helm install my-otel-demo open-telemetry/opentelemetry-demo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxrok0aa2en99tdr84atb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxrok0aa2en99tdr84atb.png" alt="Install OpenTelemetry" width="800" height="583"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A crucial moment arrives: 24 pods sit in a "Pending" state, awaiting resources. This is where Auto Mode's intelligence becomes visible. The integrated Karpenter controller detects this demand in real-time and, after a swift calculation, provisions a perfectly right-sized m5a.large node. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcqr3alexj7wfx5cij8c3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcqr3alexj7wfx5cij8c3.png" alt="New Pods being deployed" width="800" height="570"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpo3lzol0ijhqpyq2m3wx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpo3lzol0ijhqpyq2m3wx.png" alt="New node deployed by EKS Auto Mode" width="800" height="341"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The controller calculates the required capacity and deploys a new node, demonstrating intelligent right-sizing as the pending pods quickly transition to a running state.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3:&lt;/strong&gt; &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Let's uninstall the OpenTelemetry Application.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm uninstall my-otel-demo open-telemetry/opentelemetry-demo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe5sfi78dldxrkc5equ66.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe5sfi78dldxrkc5equ66.png" alt="Uninstall OpenTelemetry application" width="800" height="535"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Scaling down for cost efficiency:&lt;/strong&gt; The application is uninstalled, and its 24 pods are terminated. Karpenter detects that the m5a.large node is now empty and underutilized. After a brief consolidation period of 30 seconds, it automatically terminates the node to eliminate waste. This demonstrates the solution's cost-effectiveness, ensuring you never pay for idle resources and achieving true scale-to-zero.&lt;/li&gt;
&lt;/ul&gt;
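&lt;p&gt;The scale-down decision can be pictured with a toy sketch (illustrative only; the 30-second delay is the value observed in this demo, not a universal constant):&lt;/p&gt;

```python
# Illustrative sketch of the consolidation decision (not Karpenter's
# source): an empty node is terminated once it has been idle past a
# consolidation delay, assumed here to be the 30 seconds seen in the demo.
CONSOLIDATE_AFTER_SECONDS = 30

def should_terminate(pod_count, idle_seconds):
    """Terminate only empty nodes that have stayed idle past the delay."""
    return pod_count == 0 and idle_seconds >= CONSOLIDATE_AFTER_SECONDS

print(should_terminate(24, 0))    # False: node is busy
print(should_terminate(0, 10))    # False: still within the delay
print(should_terminate(0, 45))    # True: empty and past the delay
```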

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8gl4a4bkyd8nim5ufxaz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8gl4a4bkyd8nim5ufxaz.png" alt="Nodes scale down" width="800" height="314"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
Let's recap:
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F64k603bofijpmkcfbnw1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F64k603bofijpmkcfbnw1.png" alt="Operational Flow: From Requests to Realization" width="800" height="372"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  6. The Business Impact: Beyond Technical Elegance
&lt;/h2&gt;

&lt;p&gt;The benefits of EKS Auto Mode extend directly to the bottom line and team productivity.&lt;/p&gt;

&lt;h3&gt;
  
  
  6.1 Continuous Cost Optimization
&lt;/h3&gt;

&lt;p&gt;EKS Auto Mode delivers continuous cost optimization out of the box. The integrated Karpenter automatically performs bin-packing to consolidate workloads onto fewer nodes, terminates underutilized instances, and always selects the lowest-cost EC2 instance types that meet your application's needs. This automation ensures your cluster is always right-sized, and you can continue to benefit from programs like AWS Savings Plans.&lt;/p&gt;
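&lt;p&gt;Bin-packing is the heart of this consolidation. A toy first-fit-decreasing pass shows the idea; Karpenter's real algorithm is far more sophisticated (it considers memory, price, disruption budgets, and more):&lt;/p&gt;

```python
# Toy first-fit-decreasing bin-packing, only to illustrate the idea
# behind workload consolidation; not Karpenter's actual algorithm.
def pack(pod_cpu_requests, node_capacity):
    """Place pods onto as few fixed-size nodes as a greedy FFD pass allows."""
    nodes = []  # remaining free capacity per node
    for request in sorted(pod_cpu_requests, reverse=True):
        for i, free in enumerate(nodes):
            if free >= request:          # pod fits on an existing node
                nodes[i] = free - request
                break
        else:                            # no fit: provision a new node
            nodes.append(node_capacity - request)
    return len(nodes)

# Eight pods totalling 7.0 vCPU consolidate onto two 4-vCPU nodes:
print(pack([2.0, 1.5, 1.0, 0.5, 0.5, 0.5, 0.5, 0.5], node_capacity=4))  # 2
```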

&lt;h3&gt;
  
  
  6.2 Reducing Operational Overhead
&lt;/h3&gt;

&lt;p&gt;The core value proposition of EKS Auto Mode is offloading the operational burden of Kubernetes. By automating cluster provisioning, scaling, patching, and upgrades, it eliminates the undifferentiated work associated with infrastructure management. This frees up engineers and platform teams to stop managing clusters and dedicate their time and talent to building the applications that drive business innovation.&lt;/p&gt;




&lt;h2&gt;
  
  
  7. Conclusion: The Dawn of Invisible Infrastructure
&lt;/h2&gt;

&lt;p&gt;Amazon EKS Auto Mode is a significant step toward making Kubernetes infrastructure management truly "invisible." It abstracts away the immense complexity of running production-grade clusters without sacrificing the power and conformance of the Kubernetes API. By taking on the heavy lifting of the data plane, AWS allows teams to treat Kubernetes as a true application platform.&lt;/p&gt;

&lt;p&gt;It's time for platform teams to audit their current management overhead. What could you build if that time was given back to innovation?&lt;/p&gt;

</description>
      <category>serverless</category>
      <category>kubernetes</category>
      <category>devops</category>
      <category>aws</category>
    </item>
    <item>
      <title>Building a Production-Ready AI Agent with Amazon Bedrock AgentCore: A Complete Hands-On Guide</title>
      <dc:creator>Raj Shah</dc:creator>
      <pubDate>Mon, 15 Dec 2025 12:59:43 +0000</pubDate>
      <link>https://dev.to/rajshahblog/building-a-production-ready-ai-agent-with-amazon-bedrock-agentcore-a-complete-hands-on-guide-pll</link>
      <guid>https://dev.to/rajshahblog/building-a-production-ready-ai-agent-with-amazon-bedrock-agentcore-a-complete-hands-on-guide-pll</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F025xe9v3po9fy3geoumu.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F025xe9v3po9fy3geoumu.jpeg" alt=" " width="800" height="368"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you’ve used frameworks like LangChain or LlamaIndex, you know the excitement of your first working agent locally. But turning that prototype into a production system quickly hits infrastructure complexity. You suddenly deal with scaling, security, and cloud components instead of just code. Amazon Bedrock AgentCore bridges this gap, taking agents from local scripts to production in minutes.&lt;/p&gt;




&lt;h2&gt;
  
  
  What is Amazon Bedrock AgentCore?
&lt;/h2&gt;

&lt;p&gt;Amazon Bedrock AgentCore is an enterprise-grade framework and managed hosting service that provides the "primitives" for generative AI operations. Think of it as the foundational infrastructure that handles the boring but critical parts of agentic systems—containerization, isolation, and compliance—so you can focus on your agent's reasoning logic.&lt;/p&gt;

&lt;p&gt;Architecturally, AgentCore is split into two primary layers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Control Plane API: Used at configuration time for resource management and setup.&lt;/li&gt;
&lt;li&gt;Data Plane API: Used at runtime for actual session invocation and operation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It is strictly framework-agnostic. Whether you are using Strands Agents, LangChain, or your own custom orchestration, AgentCore provides the managed environment to run those agents at AWS scale.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdg7dgwbqs9n57f2dhd8e.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdg7dgwbqs9n57f2dhd8e.jpeg" alt="Why AgentCore? The AWS Advantage" width="800" height="432"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The AgentCore Solution: Managed Infrastructure for Agents
&lt;/h2&gt;

&lt;p&gt;AgentCore introduces a serverless compute environment built specifically for the agentic loop. Think of the AgentCore Runtime not as a single function call, but as a dedicated "clean room" for your agent’s session.&lt;/p&gt;

&lt;p&gt;Unlike standard Lambda functions with a 15-minute cap, AgentCore provides a dedicated microVM for every session that can stay active for up to &lt;strong&gt;8 hours&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Session Isolation: Every session is cryptographically isolated.&lt;/li&gt;
&lt;li&gt;Persistent Connection: You can call the agent multiple times while the session is active, and it maintains its state.&lt;/li&gt;
&lt;li&gt;Streaming Support: The runtime supports streaming data, allowing for the low-latency, real-time responses that production users expect.&lt;/li&gt;
&lt;/ul&gt;
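&lt;p&gt;As a hedged sketch of how repeated calls share one session: the field names below follow the AgentCore data-plane invocation API as the article describes it, the runtime ARN is a placeholder, and the request dictionaries are shown instead of a live call:&lt;/p&gt;

```python
# Hedged sketch: two requests that reuse one AgentCore session. Field
# names follow the data-plane API as described in the article; the ARN
# is a placeholder and no live call is made here.
import json

SESSION_ID = "demo-session-0123456789-0123456789-0123456789"  # 33+ chars

def build_invocation(prompt):
    return {
        "agentRuntimeArn": "arn:aws:bedrock-agentcore:us-east-1:111122223333:runtime/demo",  # placeholder
        "runtimeSessionId": SESSION_ID,  # same ID keeps the session's state
        "payload": json.dumps({"prompt": prompt}),
    }

first = build_invocation("Remember the number 42.")
second = build_invocation("What number did I ask you to remember?")
# Both requests target the same isolated session:
print(first["runtimeSessionId"] == second["runtimeSessionId"])  # True
```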

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fopegvln5f2dae5w8fr61.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fopegvln5f2dae5w8fr61.jpeg" alt=" " width="800" height="381"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Agent Deployment Workflow
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Environment Setup&lt;/strong&gt;&lt;br&gt;
AgentCore uses &lt;em&gt;uv&lt;/em&gt; to ensure fast and reliable dependency management. A best practice is to separate your setup into two directories — one for development with full tooling, and another lightweight deployment folder containing only essential dependencies and your agent code. This keeps your runtime secure and improves performance.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
Production-Ready AI Agent for Amazon Bedrock AgentCore
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;strands&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Agent&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;strands_tools&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;calculator&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;bedrock_agentcore.runtime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BedrockAgentCoreApp&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;BedrockAgentCoreApp&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;MODEL_ID&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;us.anthropic.claude-4-5-sonnet-20250929-v1:0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="nd"&gt;@app.entrypoint&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;MODEL_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;system_prompt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are a helpful assistant that can perform calculations. Use the calculate tool for any math problems.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;calculator&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;prompt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Hello!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[{}])[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;p&gt;&lt;strong&gt;2. Entrypoint&lt;/strong&gt;&lt;br&gt;
The &lt;code&gt;@app.entrypoint&lt;/code&gt; decorator acts as the bridge between your local script and the AgentCore runtime. It defines how incoming requests are handled, making your agent cloud-compatible with minimal changes to your existing code.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 1. Project creation and install dependencies&lt;/span&gt;
&lt;span class="nb"&gt;mkdir &lt;/span&gt;agentcore-demo &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;cd &lt;/span&gt;agentcore-demo
uv init &lt;span class="nt"&gt;--no-workspace&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; uv add bedrock-agentcore-starter-toolkit

&lt;span class="c"&gt;# 2. Create a deployment folder and add required pyproject.toml:&lt;/span&gt;
&lt;span class="nb"&gt;mkdir &lt;/span&gt;agent_deployment
uv init &lt;span class="nt"&gt;--bare&lt;/span&gt; ./agent_deployment &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; uv &lt;span class="nt"&gt;--directory&lt;/span&gt; ./agent_deployment add strands-agents bedrock-agentcore strands-agents-tools

&lt;span class="c"&gt;# 3. Agent code should be saved in to the agent_deployment folder as agent.py&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;p&gt;&lt;strong&gt;3. CLI Workflow&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;agentcore configure&lt;/code&gt; → prepares infra (no deployment)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;agentcore launch&lt;/code&gt; → builds &amp;amp; deploys your agent&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;agentcore invoke&lt;/code&gt; → tests your live agent from the CLI
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 3. Configure and deploy&lt;/span&gt;
&lt;span class="c"&gt;# Use all default answers for now:&lt;/span&gt;
uv run agentcore configure &lt;span class="nt"&gt;-e&lt;/span&gt; ./agent_deployment/agent.py

uv run agentcore launch

&lt;span class="c"&gt;# 4. Test your deployed agent&lt;/span&gt;
uv run agentcore invoke &lt;span class="s1"&gt;'{"prompt": "What is 87 * 54 + 9?"}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Architecture Deep Dive: Runtime and Memory
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fasp54nwov3rvqeyjtk8q.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fasp54nwov3rvqeyjtk8q.jpeg" alt="AgentCore Architecture" width="800" height="341"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. The Runtime Lifecycle and the "33-Character Rule"&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When you invoke an agent, the Runtime spawns a microVM. For this to work securely, Session IDs must be 33+ characters long and sufficiently complex. This ID serves as the key for spawning the dedicated environment and prevents session hijacking. The environment automatically cleans up after 15 minutes of inactivity to optimize costs.&lt;/p&gt;
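&lt;p&gt;As a quick illustration (not AgentCore's own API, just a sketch using Python's standard library), a compliant session ID is easy to generate:&lt;/p&gt;

```python
import uuid

# AgentCore Runtime session IDs must be 33+ characters and hard to guess.
# A UUID4 rendered as 32 hex characters plus a short prefix satisfies both.
def make_session_id(prefix: str = "session") -> str:
    session_id = f"{prefix}-{uuid.uuid4().hex}"
    assert len(session_id) >= 33, "Runtime rejects shorter session IDs"
    return session_id
```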

&lt;p&gt;&lt;strong&gt;2. A Two-Tier Memory System&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;AgentCore provides managed memory that scales independently of your compute:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Short-term Memory (STM): Stores the exact conversation history within a single session.&lt;/li&gt;
&lt;li&gt;Long-term Memory (LTM): Uses intelligent extraction to store user facts and preferences that persist across different sessions over weeks or months.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;3. The Lazy Loading Pattern&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You’ll often see the &lt;code&gt;get_or_create_agent&lt;/code&gt; pattern in AgentCore code. This is necessary because the &lt;code&gt;actor_id&lt;/code&gt; and &lt;code&gt;session_id&lt;/code&gt; are passed in the request headers at the moment of invocation. Because you don’t have these IDs at module startup, you must "lazy load" the agent. This approach ensures the agent instance is initialized only once per session, avoiding the "cold start" cost of recreating the agent and re-connecting to memory on every request.&lt;/p&gt;
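&lt;p&gt;A minimal sketch of the pattern (the &lt;code&gt;build_agent&lt;/code&gt; helper is a stand-in for your framework's real setup code):&lt;/p&gt;

```python
_agents = {}  # cache keyed by session_id, so setup runs once per session

def build_agent(actor_id: str, session_id: str):
    # Stand-in for real setup: constructing the agent, connecting memory, etc.
    return {"actor": actor_id, "session": session_id}

def get_or_create_agent(actor_id: str, session_id: str):
    # actor_id/session_id arrive only with the request, never at import time,
    # so the agent must be created lazily on first use.
    if session_id not in _agents:
        _agents[session_id] = build_agent(actor_id, session_id)
    return _agents[session_id]
```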

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa9cw59ifenmymucemkzv.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa9cw59ifenmymucemkzv.jpeg" alt="Lazy Loading" width="800" height="385"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The AgentCore Toolbox: Key Capabilities
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa0dcfpzoih0sh452gqss.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa0dcfpzoih0sh452gqss.png" alt="Key Capabilities" width="800" height="349"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Observability: Seeing Inside the Agentic Loop
&lt;/h2&gt;

&lt;p&gt;Debugging an autonomous agent is notoriously difficult. AgentCore automatically enables a GenAI Observability Dashboard in Amazon CloudWatch.&lt;/p&gt;

&lt;p&gt;The standout feature here is the Service Map, which provides a visual representation of how your agent interacts with memory, tools, and the model. By using AWS X-Ray, you can perform end-to-end tracing to see exactly how long a model call took versus how long it took to hydrate state from memory. This transparency is vital for identifying bottlenecks in the agentic loop.&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;p&gt;Amazon Bedrock AgentCore offers a modular, professional path to scale:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Framework Agnostic: Whether you use LangChain, CrewAI, or LlamaIndex, the infrastructure remains the same.&lt;/li&gt;
&lt;li&gt;Production-Ready in Minutes: Automates the "plumbing" of ECR, IAM, and CodeBuild, allowing for deterministic deployments.&lt;/li&gt;
&lt;li&gt;Managed Security: Uses isolated microVMs for session compute and secure sandboxes for code execution.&lt;/li&gt;
&lt;li&gt;Modular "Bolt-on" Philosophy: You only use what you need. Need memory? Bolt it on. Need a browser? Add it in. You don’t pay the complexity tax for features you aren't using.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>agents</category>
      <category>aws</category>
      <category>ai</category>
      <category>aiops</category>
    </item>
    <item>
      <title>Serverless MongoDB Integration on AWS: A No-Bloat Lambda Approach</title>
      <dc:creator>Raj Shah</dc:creator>
      <pubDate>Fri, 30 May 2025 12:25:36 +0000</pubDate>
      <link>https://dev.to/rajshahblog/serverless-mongodb-integration-on-aws-a-no-bloat-lambda-approach-3o3k</link>
      <guid>https://dev.to/rajshahblog/serverless-mongodb-integration-on-aws-a-no-bloat-lambda-approach-3o3k</guid>
      <description>&lt;p&gt;Hey everyone! 👋&lt;/p&gt;

&lt;p&gt;I recently started building &lt;a href="//futurejobs.today"&gt;futurejobs.today&lt;/a&gt; — a job board platform that helps people find future-focused jobs in tech. While the main site is under development, I wanted to quickly set up a Coming Soon page to collect emails of people who are interested.&lt;/p&gt;

&lt;p&gt;Sounds simple, right? But there was a catch — I wanted to do it serverless, use MongoDB as my backend, and not bloat my Lambda function with extra MBs of dependencies. While MongoDB’s docs touch on this, they didn’t go deep enough. So here’s how I actually made it work — and I hope this blog helps someone avoid the detours I had to take.&lt;/p&gt;

&lt;h2&gt;
  
  
  🧩 The Stack
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Frontend: Vite (hosted on Vercel)&lt;/li&gt;
&lt;li&gt;Backend: AWS Lambda (Node.js)&lt;/li&gt;
&lt;li&gt;Database: MongoDB Atlas&lt;/li&gt;
&lt;li&gt;Deployment: Lambda Function URL&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Step 1: Create Your MongoDB Atlas Cluster
&lt;/h2&gt;

&lt;p&gt;First things first, set up your MongoDB cluster on MongoDB Atlas:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create a free cluster.&lt;/li&gt;
&lt;li&gt;Create a database and a subscribers collection.&lt;/li&gt;
&lt;li&gt;Whitelist your IP or allow access from anywhere (for testing).&lt;/li&gt;
&lt;li&gt;Grab the connection string (with your username and password embedded).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftyl8ntug949p2gtc8y1d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftyl8ntug949p2gtc8y1d.png" alt="MongoDB Network Access Config" width="800" height="378"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Write the Lambda Function
&lt;/h2&gt;

&lt;p&gt;Here's a basic Lambda handler in Node.js that connects to MongoDB and saves an email:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const { MongoClient } = require('mongodb');
const client = new MongoClient(process.env.MONGODB_URI);

exports.handler = async function (event) {
  const headers = {
    'Content-Type': 'application/json',
    // CORS headers so browsers accept cross-origin calls from the frontend
    'Access-Control-Allow-Origin': '*',
    'Access-Control-Allow-Headers': 'Content-Type',
    'Access-Control-Allow-Methods': 'OPTIONS,POST'
  };

  // Handle preflight OPTIONS request
  if (event.requestContext?.http?.method === 'OPTIONS') {
    return {
      statusCode: 200,
      headers,
      body: '',
    };
  }

  try {
    const email = event.queryStringParameters?.email || JSON.parse(event.body || '{}').email;

    if (!email) {
      return {
        statusCode: 400,
        headers,
        body: JSON.stringify({ error: 'Email is required' }),
      };
    }

    await client.connect();
    const db = client.db('emails');
    const collection = db.collection('emails');

    // Check if email already exists
    const existing = await collection.findOne({ email: email.toLowerCase() });

    if (existing) {
      return {
        statusCode: 200,
        headers,
        body: JSON.stringify({ message: 'Email already signed up' }),
      };
    }

    // Insert new email
    const result = await collection.insertOne({ email: email.toLowerCase(), createdAt: new Date() });

    return {
      statusCode: 200,
      headers,
      body: JSON.stringify({
        message: 'Email added successfully',
        insertedId: result.insertedId,
      }),
    };
  } catch (err) {
    console.error('Error inserting email:', err);
    return {
      statusCode: 500,
      headers,
      body: JSON.stringify({ error: err.message }),
    };
  }
};

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;📝 Note: Don’t forget to set the &lt;code&gt;MONGODB_URI&lt;/code&gt; environment variable in your Lambda function config.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: Add MongoDB Node.js Driver via Lambda Layer
&lt;/h2&gt;

&lt;p&gt;Here’s where things get a bit tricky.&lt;/p&gt;

&lt;p&gt;Lambda has a 50 MB limit for zipped deployment packages uploaded directly (250 MB unzipped), and the MongoDB Node.js driver is… kinda chunky. To keep things clean, we’ll use a Lambda Layer.&lt;/p&gt;

&lt;p&gt;On your local machine, run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mkdir -p layer/nodejs
cd layer/nodejs
npm init -y
npm install mongodb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Zip it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cd ..
zip -r mongodb-layer.zip nodejs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Upload it to AWS Lambda &amp;gt; Layers, and attach it to your function.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0eg36i57pg6xrxioeboh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0eg36i57pg6xrxioeboh.png" alt="Add a Lambda Layer" width="800" height="287"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In your Lambda, just make sure to &lt;code&gt;require('mongodb')&lt;/code&gt; — AWS will automatically resolve it from the Layer.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 4: Expose the Function Using Lambda Function URL
&lt;/h2&gt;

&lt;p&gt;You don’t have to set up API Gateway just to get a POST endpoint. Lambda Function URLs to the rescue!&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Go to your Lambda function&lt;/li&gt;
&lt;li&gt;Click on "Function URL"&lt;/li&gt;
&lt;li&gt;Enable it and choose “Auth: NONE” (or configure custom auth if needed)&lt;/li&gt;
&lt;li&gt;Copy the URL — that’s your API endpoint!&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Step 5: Test It
&lt;/h2&gt;

&lt;p&gt;Now you can make a simple POST request from your frontend:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;fetch("https://your-lambda-url.amazonaws.com", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ email: "test@example.com" }),
})
.then(res =&amp;gt; res.json())
.then(data =&amp;gt; console.log(data.message));

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Boom! 🎉 Now you're collecting emails without maintaining any servers.&lt;/p&gt;

&lt;h2&gt;
  
  
  🤔 Why Not Use API Gateway or a Framework?
&lt;/h2&gt;

&lt;p&gt;Great question. I wanted to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Avoid API Gateway setup (extra steps, more config)&lt;/li&gt;
&lt;li&gt;Keep it ultra lightweight for MVP&lt;/li&gt;
&lt;li&gt;Get to market fast&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This setup does exactly that. Fast, serverless, and minimal dependencies.&lt;/p&gt;

&lt;h2&gt;
  
  
  📝 Final Thoughts
&lt;/h2&gt;

&lt;p&gt;Even though MongoDB's docs mention Lambda support, there’s a surprising lack of complete real-world examples — especially when combining Lambda Layers, Function URLs, and MongoDB.&lt;/p&gt;

&lt;p&gt;If you're building a similar setup, I hope this post saves you a few hours of debugging 🙌&lt;/p&gt;

&lt;p&gt;Thanks for reading! Follow me here on dev.to or check out &lt;a href="//futurejobs.today"&gt;futurejobs.today&lt;/a&gt; to see the full platform once it’s live 🚀&lt;/p&gt;

</description>
      <category>aws</category>
      <category>lambda</category>
      <category>mongodb</category>
      <category>serverless</category>
    </item>
    <item>
      <title>Stop Worrying About EC2 Patching – Automate It Like a Pro!</title>
      <dc:creator>Raj Shah</dc:creator>
      <pubDate>Wed, 15 Jan 2025 07:42:20 +0000</pubDate>
      <link>https://dev.to/rajshahblog/stop-worrying-about-ec2-patching-automate-it-like-a-pro-3hmb</link>
      <guid>https://dev.to/rajshahblog/stop-worrying-about-ec2-patching-automate-it-like-a-pro-3hmb</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Let's be real—manually patching EC2 instances is about as fun as debugging a production outage on a Friday night. If you've ever had to SSH into dozens of instances just to run &lt;code&gt;yum update -y&lt;/code&gt; or &lt;code&gt;apt upgrade&lt;/code&gt;, you know the pain is real. But what if I told you there's a better way?&lt;/p&gt;

&lt;p&gt;AWS Systems Manager (SSM) Quick Setup and Custom Documents can automate this process, ensuring your Linux EC2 instances stay up to date without manual intervention. In this blog, I’ll walk you through setting up automated OS patching using AWS SSM, and we’ll also look at creating custom patch baselines. Let's dive in!&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 1: Setting Up AWS SSM Quick Setup for OS Patching
&lt;/h2&gt;

&lt;p&gt;AWS SSM Quick Setup provides a hassle-free way to manage patching at scale. Here’s how you can set it up:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Go to the AWS Console&lt;/strong&gt; and navigate to &lt;em&gt;Systems Manager &amp;gt; Quick Setup&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Create&lt;/strong&gt; and choose &lt;em&gt;Host Management&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;Select &lt;strong&gt;AWS-DefaultPatchBaseline&lt;/strong&gt; under &lt;em&gt;Patch Manager&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;Choose a schedule for automatic patching (e.g., weekly, daily).&lt;/li&gt;
&lt;li&gt;Ensure that SSM Agent is installed and running on all instances (it’s pre-installed on Amazon Linux, Ubuntu, and Windows Server AMIs).&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Create&lt;/strong&gt;, and you're done! 🎉&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frd8ge4ja8hd35w7etqbl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frd8ge4ja8hd35w7etqbl.png" alt="Image description" width="800" height="449"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flawgpzvw08hqnhno2l60.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flawgpzvw08hqnhno2l60.png" alt="Image description" width="800" height="376"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx4e6j3usonfx379sroc7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx4e6j3usonfx379sroc7.png" alt="Image description" width="800" height="402"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F82s4dbserf8kfje8fqca.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F82s4dbserf8kfje8fqca.png" alt="Image description" width="800" height="249"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With this setup, AWS will handle OS patching on a schedule, reducing the risk of security vulnerabilities without you lifting a finger.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 2: Creating a Custom Patch Baseline for a selected OS Type
&lt;/h2&gt;

&lt;p&gt;While the &lt;strong&gt;AWS-DefaultPatchBaseline&lt;/strong&gt; under &lt;em&gt;Patch Manager&lt;/em&gt; covers only the essential, mostly security-related updates, you might also want to update all installed packages (think security patches, bug fixes, and new features). Let’s create a custom SSM Patch Baseline to handle this:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Create an SSM Patch Baseline
&lt;/h3&gt;

&lt;p&gt;Go to &lt;em&gt;AWS Systems Manager &amp;gt; Patch Manager &amp;gt; Patch baselines&lt;/em&gt; and click &lt;strong&gt;Create Patch baseline&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvc03tkyn1kdi65ummhve.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvc03tkyn1kdi65ummhve.png" alt="Image description" width="800" height="334"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8mvsy5c91wtzq3riaar0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8mvsy5c91wtzq3riaar0.png" alt="Image description" width="800" height="492"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3eqddoid0thngsig52ih.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3eqddoid0thngsig52ih.png" alt="Image description" width="800" height="467"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Click on &lt;em&gt;Create Patch Baseline&lt;/em&gt; to create it.&lt;/p&gt;




&lt;h3&gt;
  
  
  2. Include the Custom Patch Baseline
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Run a CLI command to set the newly created Patch Baseline as the default for the respective OS type:
&lt;code&gt;aws ssm register-default-patch-baseline --baseline-id baseline-id-or-ARN&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Select the newly created Patch Baseline in the Quick Setup -&amp;gt; &lt;em&gt;Custom patch baseline&lt;/em&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqp5l8yuvsi8lf2g3n9sp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqp5l8yuvsi8lf2g3n9sp.png" alt="Image description" width="800" height="417"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Boom! Your instances will now update all installed packages automatically.&lt;/p&gt;

&lt;p&gt;And that's it! You’ve now automated EC2 package updates without having to log in ever again. 🏆&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;With AWS SSM Quick Setup and a custom document, you can automate OS patching and package updates across your EC2 instances like a pro. No more SSHing into instances or dealing with outdated software vulnerabilities. Set it up once, sit back, and let AWS do the work for you!&lt;/p&gt;

&lt;p&gt;Got any cool automation tricks for AWS EC2? Drop them in the comments below! 🚀&lt;/p&gt;




&lt;p&gt;Contributed By: &lt;a href="https://www.linkedin.com/in/rajshah001" rel="noopener noreferrer"&gt;Raj Shah&lt;/a&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>devops</category>
      <category>automation</category>
      <category>patch</category>
    </item>
    <item>
      <title>Automating VM Disaster Recovery Using AWS Elastic Disaster Recovery (DRS)</title>
      <dc:creator>Raj Shah</dc:creator>
      <pubDate>Thu, 12 Dec 2024 04:33:10 +0000</pubDate>
      <link>https://dev.to/rajshahblog/automating-disaster-recovery-using-aws-elastic-disaster-recovery-drs-815</link>
      <guid>https://dev.to/rajshahblog/automating-disaster-recovery-using-aws-elastic-disaster-recovery-drs-815</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Disaster recovery (DR) is a critical aspect of business continuity, ensuring that applications remain available in the face of unexpected failures such as hardware malfunctions, cyberattacks, or natural disasters. Traditional disaster recovery methods often involve complex manual processes, making them costly and error-prone.&lt;/p&gt;

&lt;p&gt;AWS &lt;strong&gt;Elastic Disaster Recovery (AWS DRS)&lt;/strong&gt; provides a &lt;strong&gt;fully managed, scalable, and automated disaster recovery solution&lt;/strong&gt;, allowing businesses to replicate workloads from on-premises or cloud environments to AWS with minimal downtime. &lt;/p&gt;

&lt;p&gt;In this blog, we’ll explore AWS DRS and walk through the &lt;strong&gt;step-by-step process&lt;/strong&gt; of setting up disaster recovery for a &lt;strong&gt;virtual machine (VM)&lt;/strong&gt;.&lt;/p&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;Step 2: Setting Up AWS Elastic Disaster Recovery&lt;/strong&gt;
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Navigate to AWS DRS Console&lt;/strong&gt;  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Open the &lt;strong&gt;AWS Management Console&lt;/strong&gt; → Search for &lt;strong&gt;Elastic Disaster Recovery&lt;/strong&gt;.
&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Get Started&lt;/strong&gt; if using AWS DRS for the first time.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Download and Install the AWS Replication Agent&lt;/strong&gt;  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Go to &lt;strong&gt;Source Servers&lt;/strong&gt; → Click &lt;strong&gt;Add Server&lt;/strong&gt;.
&lt;/li&gt;
&lt;li&gt;Refer to this documentation to get installation instructions as per OS Type - &lt;a href="https://docs.aws.amazon.com/drs/latest/userguide/adding-servers.html" rel="noopener noreferrer"&gt;https://docs.aws.amazon.com/drs/latest/userguide/adding-servers.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Run the command on the source VM (Example for Linux):
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;wget -O ./aws-replication-installer-init https://aws-elastic-disaster-recovery-us-east-1.s3.us-east-1.amazonaws.com/latest/linux/aws-replication-installer-init

chmod +x aws-replication-installer-init; sudo ./aws-replication-installer-init
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1kubmqxuzi4z6vz8k96e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1kubmqxuzi4z6vz8k96e.png" alt="Image description" width="800" height="148"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After the DRS agent installation and configuration is complete, the EC2 instance will appear under &lt;strong&gt;Source Servers&lt;/strong&gt;. It will then start the initial sync, after which it becomes "Ready for recovery".&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvalh2k2v2k1qscg7rt1i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvalh2k2v2k1qscg7rt1i.png" alt="Image description" width="800" height="139"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;Step 3: Configuring Replication Settings&lt;/strong&gt;
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Go to Replication Settings&lt;/strong&gt; in AWS DRS.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Define replication parameters&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Replication Server Instance Type&lt;/strong&gt; – Select an appropriate instance.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;EBS Volume Type&lt;/strong&gt; – Choose based on performance needs.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Encryption &amp;amp; Data Retention Settings&lt;/strong&gt; – Enable encryption for security.
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Save and Apply settings&lt;/strong&gt; – AWS is ready for replication.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3aasu9z0dgjv858niz39.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3aasu9z0dgjv858niz39.png" alt="Image description" width="800" height="299"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;Step 4: Performing a Recovery Drill (Non-Disruptive Test)&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Once the sync and snapshots are complete, we can proceed to initiate the recovery drill.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwrgpdwrxam8y7roh6jpw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwrgpdwrxam8y7roh6jpw.png" alt="Image description" width="800" height="101"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbh01vntmee09xl1i0rwz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbh01vntmee09xl1i0rwz.png" alt="Image description" width="800" height="323"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmweyzd3fwpeexxuefgm9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmweyzd3fwpeexxuefgm9.png" alt="Image description" width="800" height="393"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1utjcnciquh8eh13abdj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1utjcnciquh8eh13abdj.png" alt="Image description" width="800" height="188"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Select a VM in AWS DRS Console&lt;/strong&gt; → Click &lt;strong&gt;Launch Recovery Instances&lt;/strong&gt;.
&lt;/li&gt;
&lt;li&gt;Choose &lt;strong&gt;Test Recovery Mode&lt;/strong&gt; to avoid affecting production.
&lt;/li&gt;
&lt;li&gt;AWS will create a &lt;strong&gt;temporary recovery instance&lt;/strong&gt; in your target AWS region/AZ.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Validate application functionality&lt;/strong&gt; and ensure the data is consistent.
&lt;/li&gt;
&lt;li&gt;Once confirmed, &lt;strong&gt;terminate the test recovery instance&lt;/strong&gt;.&lt;/li&gt;
&lt;/ol&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;Step 5: Enabling Failback (Post-Disaster Recovery)&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Once the source environment is restored, you need to &lt;strong&gt;reverse replication&lt;/strong&gt; to return workloads to their original location.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Initiate Failback Process in AWS DRS&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;Select the &lt;strong&gt;failed over instance&lt;/strong&gt;.
&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Reverse Replication&lt;/strong&gt;.
&lt;/li&gt;
&lt;li&gt;AWS will sync the latest data back to the original VM.
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F714057c8c0leeh3yhl41.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F714057c8c0leeh3yhl41.png" alt="Image description" width="800" height="247"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;Cost Considerations for AWS DRS&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;AWS DRS pricing depends on:&lt;br&gt;
💰 &lt;strong&gt;Storage Costs&lt;/strong&gt; – Data stored in Amazon S3 and EBS snapshots.&lt;br&gt;
💰 &lt;strong&gt;Compute Costs&lt;/strong&gt; – Recovery instances running in AWS.&lt;br&gt;
💰 &lt;strong&gt;Data Transfer Costs&lt;/strong&gt; – Replication traffic from source to AWS.  &lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Cost Optimization Tips:&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;✅ &lt;strong&gt;Use lower-tier EBS volumes&lt;/strong&gt; for replication storage.&lt;br&gt;&lt;br&gt;
✅ &lt;strong&gt;Terminate unused recovery instances&lt;/strong&gt; to avoid charges.&lt;br&gt;&lt;br&gt;
✅ &lt;strong&gt;Perform periodic DR drills&lt;/strong&gt; to validate without excess costs.  &lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;Conclusion&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;AWS Elastic Disaster Recovery is a &lt;strong&gt;powerful, automated, and scalable&lt;/strong&gt; solution for &lt;strong&gt;VM disaster recovery&lt;/strong&gt;. With its continuous replication, fast failover, and automated recovery processes, AWS DRS helps minimize downtime and protect critical workloads.&lt;/p&gt;

&lt;p&gt;✅ &lt;strong&gt;Key Takeaways:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AWS DRS simplifies &lt;strong&gt;disaster recovery automation&lt;/strong&gt;.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Perform non-disruptive recovery drills&lt;/strong&gt; to validate failover readiness.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Failback support ensures business continuity&lt;/strong&gt; after an outage.
&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;Contributed By: &lt;a href="https://www.linkedin.com/in/rajshah001/" rel="noopener noreferrer"&gt;Raj Shah&lt;/a&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>drs</category>
      <category>virtualmachine</category>
    </item>
  </channel>
</rss>
