<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Ahmad Kanj</title>
    <description>The latest articles on DEV Community by Ahmad Kanj (@ahmadkanj).</description>
    <link>https://dev.to/ahmadkanj</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F831662%2Feb27849f-9238-4a8c-944e-e020e20b563a.jpeg</url>
      <title>DEV Community: Ahmad Kanj</title>
      <link>https://dev.to/ahmadkanj</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/ahmadkanj"/>
    <language>en</language>
    <item>
      <title>Using IAM Users in 2026 Is a Life Choice</title>
      <dc:creator>Ahmad Kanj</dc:creator>
      <pubDate>Mon, 29 Dec 2025 10:43:51 +0000</pubDate>
      <link>https://dev.to/aws-builders/using-iam-users-in-2026-is-a-life-choice-3lbk</link>
      <guid>https://dev.to/aws-builders/using-iam-users-in-2026-is-a-life-choice-3lbk</guid>
      <description>&lt;p&gt;Cloud incidents don’t start with breaches.&lt;br&gt;&lt;br&gt;
They start with &lt;strong&gt;archaeology&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;You open the IAM console.&lt;br&gt;&lt;br&gt;
You scroll.&lt;br&gt;&lt;br&gt;
And there it is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;legacy-service-migration&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Access keys: active&lt;br&gt;
Console access: Enabled&lt;br&gt;
Last rotation: &lt;em&gt;never&lt;/em&gt;&lt;br&gt;
Owner: &lt;em&gt;unknown&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;No one remembers why it exists.&lt;br&gt;&lt;br&gt;
No one knows what breaks if you delete it.&lt;br&gt;&lt;br&gt;
So it stays.&lt;/p&gt;

&lt;p&gt;This isn’t negligence.&lt;br&gt;&lt;br&gt;
It’s a &lt;strong&gt;trophic cascade&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  🐺 Apex Trigger: “We’ll Just Create an IAM User”
&lt;/h2&gt;

&lt;p&gt;Every cascade begins with a reasonable decision:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;“We need access for a script”&lt;/li&gt;
&lt;li&gt;“CI needs credentials”&lt;/li&gt;
&lt;li&gt;“It’s temporary”&lt;/li&gt;
&lt;li&gt;“We’ll clean it up later”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;An IAM user is created.&lt;br&gt;&lt;br&gt;
Access keys are generated.&lt;br&gt;&lt;br&gt;
The system moves on.&lt;/p&gt;

&lt;p&gt;Nothing breaks.&lt;br&gt;&lt;br&gt;
Nothing alerts.&lt;/p&gt;

&lt;p&gt;That’s how invasive species enter ecosystems.&lt;/p&gt;




&lt;h2&gt;
  
  
  🐗 Primary Impact: Long-Lived Identity Enters the System
&lt;/h2&gt;

&lt;p&gt;IAM users don’t expire.&lt;/p&gt;

&lt;p&gt;They outlive:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Scripts&lt;/li&gt;
&lt;li&gt;CI pipelines&lt;/li&gt;
&lt;li&gt;Teams&lt;/li&gt;
&lt;li&gt;Architectural decisions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Fast-forward a few years.&lt;/p&gt;

&lt;p&gt;The script is gone.&lt;br&gt;
The migration is done.&lt;br&gt;&lt;br&gt;
The pipeline changed.&lt;br&gt;&lt;br&gt;
The team rotated.&lt;br&gt;&lt;br&gt;
The IAM user remains.&lt;/p&gt;

&lt;p&gt;Now you have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Credentials with unclear ownership&lt;/li&gt;
&lt;li&gt;Permissions added “just in case”&lt;/li&gt;
&lt;li&gt;No confidence about blast radius&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is &lt;strong&gt;identity hygiene debt&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  🌿 Secondary Cascade: Hygiene Decay Becomes Normalized
&lt;/h2&gt;

&lt;p&gt;Eventually, IAM users stop feeling temporary.&lt;/p&gt;

&lt;p&gt;You start hearing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;“Don’t touch that one”&lt;/li&gt;
&lt;li&gt;“It’s probably used somewhere”&lt;/li&gt;
&lt;li&gt;“We’ll audit later”&lt;/li&gt;
&lt;li&gt;“It’s been there forever”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At this stage, security stops being &lt;strong&gt;declarative&lt;/strong&gt; and becomes &lt;strong&gt;historical&lt;/strong&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“We don’t know why this exists, but it must.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Unknown identity is worse than no identity.&lt;/p&gt;




&lt;h2&gt;
  
  
  🌊 Ecosystem Shock: A Real Incident (AWS IMDS)
&lt;/h2&gt;

&lt;p&gt;This fragility is exactly why real-world incidents hurt.&lt;/p&gt;

&lt;p&gt;In 2025, &lt;strong&gt;active exploitation attempts&lt;/strong&gt; were observed tied to &lt;strong&gt;CVE-2025-51591&lt;/strong&gt;, an SSRF vulnerability in the Pandoc document converter.&lt;/p&gt;

&lt;p&gt;Attackers submitted crafted HTML designed to force servers to make internal requests — specifically targeting the &lt;strong&gt;AWS Instance Metadata Service (IMDS)&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Why IMDS?&lt;/p&gt;

&lt;p&gt;Because it can return &lt;strong&gt;AWS credentials&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Wiz observed attackers probing metadata paths like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;/latest/meta-data/iam/info&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;/latest/meta-data/iam/security-credentials&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In many environments, the attack failed thanks to &lt;strong&gt;IMDSv2&lt;/strong&gt;, which requires session tokens and blocks blind SSRF.&lt;/p&gt;

&lt;p&gt;But here’s the uncomfortable question:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;What if those workloads relied on &lt;strong&gt;static IAM user keys&lt;/strong&gt; instead of roles?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That’s where the cascade completes.&lt;/p&gt;




&lt;h2&gt;
  
  
  🧨 When IAM User Hygiene Is Bad, Incidents Become Permanent
&lt;/h2&gt;

&lt;p&gt;There’s a critical difference:&lt;/p&gt;

&lt;h3&gt;
  
  
  If a role is compromised
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Credentials expire&lt;/li&gt;
&lt;li&gt;Sessions die&lt;/li&gt;
&lt;li&gt;Access collapses naturally&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  If an IAM user key is compromised
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Credentials persist&lt;/li&gt;
&lt;li&gt;Attackers can return later&lt;/li&gt;
&lt;li&gt;Rotation is manual (and often forgotten)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;An SSRF is just an entry point.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;IAM user hygiene determines the blast radius and lifespan.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🧹 What I Actually Found During an Audit
&lt;/h2&gt;

&lt;p&gt;During a routine IAM review, I found:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;IAM users created in &lt;strong&gt;2016–2018&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Active access keys&lt;/li&gt;
&lt;li&gt;Broad permissions (S3, EC2, IAM)&lt;/li&gt;
&lt;li&gt;No recent CloudTrail activity&lt;/li&gt;
&lt;li&gt;No documentation&lt;/li&gt;
&lt;li&gt;No owner&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Deleting them felt risky.&lt;/p&gt;

&lt;p&gt;That’s the real failure state:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Inaction feels safer than action.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;And that’s how ecosystems rot.&lt;/p&gt;




&lt;h2&gt;
  
  
  🛡️ The Missing Species: Ephemeral Identity
&lt;/h2&gt;

&lt;p&gt;Modern AWS identity assumes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Short-lived credentials&lt;/li&gt;
&lt;li&gt;Clear ownership&lt;/li&gt;
&lt;li&gt;Contextual access&lt;/li&gt;
&lt;li&gt;Automatic expiration&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;IAM roles&lt;/li&gt;
&lt;li&gt;OIDC&lt;/li&gt;
&lt;li&gt;SSO&lt;/li&gt;
&lt;li&gt;IMDSv2 only&lt;/li&gt;
&lt;li&gt;Explicit controls limiting IAM user creation&lt;/li&gt;
&lt;/ul&gt;
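
&lt;p&gt;As a concrete example of that last control, a Service Control Policy along these lines can deny new IAM users and access keys across an organization. This is a sketch only: before rolling it out you would add a condition exempting a break-glass role, and remember SCPs don't constrain the management account.&lt;/p&gt;

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyNewIamUsersAndKeys",
      "Effect": "Deny",
      "Action": [
        "iam:CreateUser",
        "iam:CreateAccessKey",
        "iam:CreateLoginProfile"
      ],
      "Resource": "*"
    }
  ]
}
```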

&lt;p&gt;IAM users should be:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Rare&lt;/li&gt;
&lt;li&gt;Documented&lt;/li&gt;
&lt;li&gt;Owned&lt;/li&gt;
&lt;li&gt;Audited&lt;/li&gt;
&lt;li&gt;Treated like radioactive material&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Not defaults.&lt;/p&gt;




&lt;h2&gt;
  
  
  🌱 Rewilding the System
&lt;/h2&gt;

&lt;p&gt;Fixing the cascade looks like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;List all IAM users
&lt;/li&gt;
&lt;li&gt;Identify owners
&lt;/li&gt;
&lt;li&gt;Review last usage
&lt;/li&gt;
&lt;li&gt;Remove unused keys
&lt;/li&gt;
&lt;li&gt;Replace users with roles
&lt;/li&gt;
&lt;li&gt;Block new IAM users where possible
&lt;/li&gt;
&lt;li&gt;Treat unknown identity as a defect
&lt;/li&gt;
&lt;/ol&gt;
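
&lt;p&gt;Steps 1, 3 and 4 can be scripted. Here's a minimal boto3 sketch (the 90-day threshold and function names are my own assumptions, not a standard); the staleness check is a pure function, so it can be tested without AWS access:&lt;/p&gt;

```python
from datetime import datetime, timedelta, timezone

# Threshold is an assumption: pick whatever fits your audit policy.
STALE_AFTER = timedelta(days=90)

def is_stale(last_used, now=None):
    """True if a key was never used, or not used within STALE_AFTER."""
    now = now or datetime.now(timezone.utc)
    if last_used is None:   # never used: worst case, flag it
        return True
    return (now - last_used) > STALE_AFTER

def audit_users(iam):
    """Yield (user_name, key_id) for access keys that look stale.
    `iam` is a boto3 IAM client, e.g. boto3.client("iam")."""
    for page in iam.get_paginator("list_users").paginate():
        for user in page["Users"]:
            keys = iam.list_access_keys(UserName=user["UserName"])
            for key in keys["AccessKeyMetadata"]:
                last = iam.get_access_key_last_used(
                    AccessKeyId=key["AccessKeyId"]
                )["AccessKeyLastUsed"].get("LastUsedDate")
                if is_stale(last):
                    yield user["UserName"], key["AccessKeyId"]
```

&lt;p&gt;Anything this yields is a candidate for cleanup: deactivate the key first, wait, then delete, so a false positive is reversible.&lt;/p&gt;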

&lt;p&gt;Yes, something might break.&lt;/p&gt;

&lt;p&gt;But something breaking is better than something silently owning your cloud.&lt;/p&gt;




&lt;h2&gt;
  
  
  🧠 Final Lesson: This Is a Life Choice
&lt;/h2&gt;

&lt;p&gt;Using IAM users in 2026 isn’t about ignorance.&lt;/p&gt;

&lt;p&gt;It’s a choice to accept:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Permanent credentials&lt;/li&gt;
&lt;li&gt;Unbounded blast radius&lt;/li&gt;
&lt;li&gt;Identity archaeology&lt;/li&gt;
&lt;li&gt;Fragile security posture&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Cloud failures aren’t sudden.&lt;br&gt;&lt;br&gt;
They’re ecological.&lt;/p&gt;

&lt;p&gt;And finding IAM users from 2017 that no one understands anymore isn’t just technical debt. It’s a warning sign that the ecosystem is already collapsing.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>security</category>
      <category>iam</category>
    </item>
    <item>
      <title>DynamoDB Outage: Why Multi-Cloud Fails Startups (And Real DR Wins)</title>
      <dc:creator>Ahmad Kanj</dc:creator>
      <pubDate>Fri, 24 Oct 2025 13:42:38 +0000</pubDate>
      <link>https://dev.to/aws-builders/dynamodb-outage-why-multi-cloud-fails-startups-and-real-dr-wins-15cb</link>
      <guid>https://dev.to/aws-builders/dynamodb-outage-why-multi-cloud-fails-startups-and-real-dr-wins-15cb</guid>
      <description>&lt;p&gt;If you felt like half the internet was broken this week, you weren't wrong. 📉 A massive, 15-hour outage in Amazon's &lt;code&gt;us-east-1&lt;/code&gt; region took down DynamoDB and with it, a huge chunk of the web.&lt;/p&gt;

&lt;p&gt;This wasn't just "a server went down." It was a complex, cascading failure that exposed the deep interconnectedness of cloud services. For startups and scaleups, the immediate reaction is often, "We need to be multi-cloud to prevent this!"&lt;/p&gt;

&lt;p&gt;Hold on!&lt;/p&gt;

&lt;p&gt;The real lesson here isn't about running from your cloud provider. It's about understanding &lt;em&gt;what&lt;/em&gt; failed, why &lt;code&gt;us-east-1&lt;/code&gt; is a special kind of dangerous and how to build a &lt;em&gt;realistic&lt;/em&gt; Disaster Recovery (DR) plan that won't bankrupt you.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Anatomy of a Cascading Failure
&lt;/h2&gt;

&lt;p&gt;This outage was a masterclass in how modern, automated systems can fail in spectacular ways. It wasn't one thing; it was a chain of dominoes.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;The Trigger: A DNS Race Condition&lt;/strong&gt;&lt;br&gt;
It all started with the system that manages the DNS for DynamoDB. Think of DNS as the Internet's phonebook. This automated system had a &lt;strong&gt;latent race condition&lt;/strong&gt;—a hidden bug. Two of its own processes tried to update the DynamoDB DNS record at the &lt;em&gt;exact same time&lt;/em&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;One process (let's call it "Slow-Worker") grabbed an &lt;em&gt;old&lt;/em&gt; plan.&lt;/li&gt;
&lt;li&gt;A second process ("Fast-Worker") grabbed a &lt;em&gt;new&lt;/em&gt; plan and applied it successfully.&lt;/li&gt;
&lt;li&gt;"Fast-Worker" then did its cleanup, deleting the &lt;em&gt;old&lt;/em&gt; plan that "Slow-Worker" was &lt;em&gt;still holding&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;"Slow-Worker" finally woke up and applied its plan... which was now empty.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Result:&lt;/strong&gt; The main DNS record for &lt;code&gt;dynamodb.us-east-1.amazonaws.com&lt;/code&gt; was wiped clean. All its IP addresses vanished.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The First Domino: DynamoDB Goes Offline&lt;/strong&gt;&lt;br&gt;
Instantly, any application (including AWS's own internal services) attempting to access DynamoDB in that region received a "does not exist" error. The service was unreachable.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;The Cascade: EC2, Lambda and IAM Fall Next&lt;/strong&gt;&lt;br&gt;
This is where it gets scary. Cloud services are built on top of &lt;em&gt;other&lt;/em&gt; cloud services. And DynamoDB is a &lt;strong&gt;Tier 0 service&lt;/strong&gt;—a foundational block.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;EC2&lt;/strong&gt; failed because its control plane (the "brain" that launches new servers) uses DynamoDB to track the state and leases of its physical hardware. No DynamoDB, no new EC2 instances.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lambda, ECS, EKS and Fargate&lt;/strong&gt; all failed because they all &lt;em&gt;run on&lt;/em&gt; EC2. They couldn't get new computing capacity.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Network Load Balancers&lt;/strong&gt; started failing health checks, causing connection errors for services that were &lt;em&gt;technically&lt;/em&gt; still running.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;IAM&lt;/strong&gt;, which handles authentication, was also impacted. This is critical: during the outage, some engineers were unable to log in to the console to fix the problem.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;The 15-Hour Recovery and "Congestive Collapse"&lt;/strong&gt;&lt;br&gt;
Engineers fixed the DNS record relatively quickly, but the outage lasted 15 hours. Why?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;DNS Caching:&lt;/strong&gt; The "empty" (and wrong) DNS record was cached by resolvers all over the internet. They had to wait for that cache to expire.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Congestive Collapse:&lt;/strong&gt; When the service finally came back, a "thundering herd" of &lt;em&gt;every single service&lt;/em&gt; retrying at once hammered DynamoDB. The system, in its weakened recovery state, was so overwhelmed by recovery work that it couldn't make forward progress. Engineers had to manually throttle traffic and drain backlogs to bring it back online safely.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
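
&lt;p&gt;The race in step 1 is easier to see in miniature. This toy model is my own simplification, not AWS's actual system: a slow worker applies a plan that a fast worker has already cleaned up, and the record ends up empty.&lt;/p&gt;

```python
# Toy model of the DNS-plan race: one shared record, two "enactor" workers.
record = {"ips": ["10.0.0.1"]}
plans = {"old": ["10.0.0.1"], "new": ["10.0.0.2"]}

def apply_plan(name):
    # Applies whatever the plan contains *at apply time*.
    record["ips"] = plans.get(name, [])

# Fast-Worker applies the new plan, then cleans up the old one.
apply_plan("new")
del plans["old"]

# Slow-Worker finally wakes up and applies the plan it grabbed earlier,
# which has since been deleted: the record is wiped clean.
apply_plan("old")
assert record["ips"] == []   # the DNS record now has no IP addresses
```

&lt;p&gt;The fix in real systems is usually a version or fencing check: refuse to apply a plan older than the one currently in effect.&lt;/p&gt;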




&lt;h2&gt;
  
  
  The Global Blast Radius: Why You Should &lt;em&gt;Never&lt;/em&gt; Host in &lt;code&gt;us-east-1&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;"But I don't even use &lt;code&gt;us-east-1&lt;/code&gt;!" you might say. "I'm in &lt;code&gt;eu-west-3&lt;/code&gt; (Paris)!"&lt;/p&gt;

&lt;p&gt;It didn't matter. This outage had a global impact and it exposes the dirty secret of AWS: &lt;strong&gt;&lt;code&gt;us-east-1&lt;/code&gt; (N. Virginia) is not just another region.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Because it's the &lt;em&gt;oldest&lt;/em&gt; AWS region, many "global" services have their control planes homed there by default.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Global IAM Console:&lt;/strong&gt; The main IAM dashboard and API are, by default, in &lt;code&gt;us-east-1&lt;/code&gt;. During the outage, users in other regions reported being unable to manage permissions or roles.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;S3 Management Console:&lt;/strong&gt; The "global" S3 console is also hosted there. You could still &lt;em&gt;access&lt;/em&gt; your data in a bucket in Frankfurt, but you couldn't &lt;em&gt;manage&lt;/em&gt; the bucket (e.g., change policies, create new buckets).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Global Services:&lt;/strong&gt; Services like DynamoDB Global Tables, which replicate data worldwide, saw massive replication lag to and from the failed region.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The Multi-Cloud Fallacy: Doubling Your Problems, Not Your Uptime
&lt;/h2&gt;

&lt;p&gt;When an event like this happens, the C-suite's first question is, "Why aren't we on GCP and Azure, too?"&lt;/p&gt;

&lt;p&gt;For a startup or scaleup, "multi-cloud" is a trap. It's a strategy for massive, risk-averse banks and Fortune 100s with regulatory requirements, not for a company that needs to move fast.&lt;/p&gt;

&lt;p&gt;Chasing multi-cloud to solve for availability is a terrible trade-off. Here’s why:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Exponential Complexity:&lt;/strong&gt; You think AWS IAM is hard? Now try to manage AWS IAM, Google Cloud IAM and Azure Entra ID and make them all talk to each other securely. Your 3-person DevOps team is now responsible for three entirely different networking stacks, security models and deployment pipelines.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;The "Lowest Common Denominator" Problem:&lt;/strong&gt; This is the &lt;em&gt;killer&lt;/em&gt;. The real power of AWS is in its managed services—DynamoDB, S3, Kinesis and Lambda. If you design your app to be "cloud-agnostic," you &lt;strong&gt;cannot use any of them.&lt;/strong&gt; You're forced to build on basic VMs and manage your own databases (PostgreSQL on EC2) and message queues (RabbitMQ on EC2). You've just sacrificed your biggest competitive advantage (velocity) for a false sense of security.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;The Talent Chasm:&lt;/strong&gt; Finding great AWS engineers is hard enough. Finding engineers who are &lt;em&gt;equally&lt;/em&gt; expert-level in AWS, GCP and Azure is finding a unicorn. 🦄 More likely, you'll have a team that is mediocre at all three.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;The Hidden Costs:&lt;/strong&gt; You won't save money. You'll lose all your volume discounts and you'll be hit with a constant stream of &lt;strong&gt;data egress fees&lt;/strong&gt; just to keep your data in sync between clouds. This cost alone can cripple a startup.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  The &lt;em&gt;Right&lt;/em&gt; Answer: A Real DR Plan (Multi-Region, Not Multi-Cloud)
&lt;/h2&gt;

&lt;p&gt;The problem this week wasn't that &lt;strong&gt;AWS failed&lt;/strong&gt;. The problem was that &lt;strong&gt;a single region, &lt;code&gt;us-east-1&lt;/code&gt;, failed&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The smart, resilient and cost-effective solution for a startup is not to go multi-cloud, but to go &lt;strong&gt;multi-region&lt;/strong&gt; within your primary cloud.&lt;/p&gt;

&lt;p&gt;This is where you must have an honest conversation about &lt;strong&gt;Cost vs. Availability&lt;/strong&gt;. Your availability is a business decision, not just a technical one. Here are your options, from cheapest to most expensive:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Cold DR: Backup &amp;amp; Restore
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;How it works:&lt;/strong&gt; You take regular backups (e.g., S3 snapshots, DynamoDB backups) and replicate them to another region using &lt;strong&gt;S3 Cross-Region Replication (CRR)&lt;/strong&gt;. If a disaster happens, you manually spin up a new environment from scratch in the new region and restore from the backup.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost:&lt;/strong&gt; Very low. Just storage costs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Availability (RTO/RPO):&lt;/strong&gt; Very poor. &lt;strong&gt;RTO&lt;/strong&gt; (Recovery Time Objective) is in &lt;strong&gt;hours or days&lt;/strong&gt;. &lt;strong&gt;RPO&lt;/strong&gt; (Recovery Point Objective) is high (e.g., "we lose the last 4 hours of data").&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use Case:&lt;/strong&gt; Good for non-critical systems, dev/test environments.&lt;/li&gt;
&lt;/ul&gt;
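
&lt;p&gt;The CRR piece of this can be expressed as a replication configuration for &lt;code&gt;aws s3api put-bucket-replication&lt;/code&gt;. A sketch: the bucket names and role ARN below are placeholders.&lt;/p&gt;

```json
{
  "Role": "arn:aws:iam::123456789012:role/s3-crr-role",
  "Rules": [
    {
      "ID": "backup-to-dr-region",
      "Status": "Enabled",
      "Priority": 1,
      "Filter": {},
      "DeleteMarkerReplication": { "Status": "Disabled" },
      "Destination": {
        "Bucket": "arn:aws:s3:::my-backups-dr",
        "StorageClass": "STANDARD_IA"
      }
    }
  ]
}
```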

&lt;h3&gt;
  
  
  2. Warm DR: Pilot Light (The Startup Sweet Spot 💡)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;How it works:&lt;/strong&gt; This is the best balance for most startups.

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Data:&lt;/strong&gt; Your critical data is actively replicated to the second region. Use &lt;strong&gt;DynamoDB Global Tables&lt;/strong&gt; or &lt;strong&gt;Aurora Global Databases&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Infra:&lt;/strong&gt; A &lt;em&gt;minimal&lt;/em&gt; copy of your core infrastructure (e.g., your container images in ECR, a tiny app server, your IaC scripts) is "on" but idle in the DR region. The "pilot light" is lit.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Failover:&lt;/strong&gt; When a disaster hits, you "turn up the gas." You run your scripts to scale up the app servers, promote the standby database to be the new primary and use &lt;strong&gt;Route 53 DNS Failover&lt;/strong&gt; to automatically redirect all traffic.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Cost:&lt;/strong&gt; Medium. You pay for data replication and minimal idle infrastructure.&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Availability (RTO/RPO):&lt;/strong&gt; Good. &lt;strong&gt;RTO&lt;/strong&gt; is in &lt;strong&gt;minutes&lt;/strong&gt;. &lt;strong&gt;RPO&lt;/strong&gt; is near-zero (you lose no data).&lt;/li&gt;

&lt;/ul&gt;
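
&lt;p&gt;The Route 53 failover piece of a pilot-light setup can be sketched as a change batch for &lt;code&gt;aws route53 change-resource-record-sets&lt;/code&gt;. All names, regions and the health-check ID below are placeholders; keep the TTL low so failover propagates quickly.&lt;/p&gt;

```json
{
  "Comment": "Failover pair for app.example.com (placeholder values)",
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "app.example.com",
        "Type": "CNAME",
        "SetIdentifier": "primary-eu-west-3",
        "Failover": "PRIMARY",
        "TTL": 60,
        "HealthCheckId": "your-health-check-id",
        "ResourceRecords": [{ "Value": "app.eu-west-3.example.com" }]
      }
    },
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "app.example.com",
        "Type": "CNAME",
        "SetIdentifier": "secondary-eu-central-1",
        "Failover": "SECONDARY",
        "TTL": 60,
        "ResourceRecords": [{ "Value": "app.eu-central-1.example.com" }]
      }
    }
  ]
}
```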

&lt;h3&gt;
  
  
  3. Hot DR: Active-Active
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;How it works:&lt;/strong&gt; You run your &lt;em&gt;full&lt;/em&gt; production stack in two or more regions simultaneously. Route 53 (or a global load balancer) splits traffic between them. If one region fails, it just takes on 100% of the traffic.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost:&lt;/strong&gt; Very high. You are paying for 2x (or more) of everything.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Availability (RTO/RPO):&lt;/strong&gt; Excellent. &lt;strong&gt;RTO&lt;/strong&gt; is in &lt;strong&gt;seconds&lt;/strong&gt; (or zero). &lt;strong&gt;RPO&lt;/strong&gt; is zero.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use Case:&lt;/strong&gt; Only for your absolute, mission-critical, "company-dies-if-it's-down-for-1-minute" services.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Your Survival Checklist
&lt;/h2&gt;

&lt;p&gt;Don't wait for the next outage. As a startup, you can survive this.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Move out of &lt;code&gt;us-east-1&lt;/code&gt;&lt;/strong&gt; for your primary workloads. Seriously.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Define your RTO/RPO.&lt;/strong&gt; Have the business conversation: "How long can we be down and how much data can we afford to lose?" This dictates your budget.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Implement a Pilot Light strategy&lt;/strong&gt; for your core services.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Use native replication:&lt;/strong&gt; Use DynamoDB Global Tables, Aurora Global DBs and S3 CRR.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Replicate your CI/CD assets:&lt;/strong&gt; Make sure your container images (ECR) and deployment scripts are in your DR region, too. You can't recover if your recovery tools are in the fire.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Test your plan.&lt;/strong&gt; A DR plan you've never tested is not a plan; it's a prayer.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This outage was a wake-up call. But the lesson isn't to flee AWS. It's to stop treating "the cloud" as one magic box and start treating a &lt;strong&gt;region&lt;/strong&gt; as your true failure domain.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>disasterrecovery</category>
      <category>architecture</category>
      <category>cloud</category>
    </item>
    <item>
      <title>We Had Scrum Masters. Get Ready for the Vibe Code Cleanup Specialist</title>
      <dc:creator>Ahmad Kanj</dc:creator>
      <pubDate>Sat, 18 Oct 2025 09:26:45 +0000</pubDate>
      <link>https://dev.to/aws-builders/we-had-scrum-masters-get-ready-for-the-code-vibe-checker-3na4</link>
      <guid>https://dev.to/aws-builders/we-had-scrum-masters-get-ready-for-the-code-vibe-checker-3na4</guid>
      <description>&lt;p&gt;Remember when every tech company suddenly needed a Scrum Master?&lt;/p&gt;

&lt;p&gt;They were the person with the sticky notes and the sharpies. Their job was to make sure everyone followed the rules of Agile. They ran the daily stand-ups, planned the sprints, and kept an eye on the "velocity" chart.&lt;/p&gt;

&lt;p&gt;The goal was to help us build software better and faster. It was all about the &lt;em&gt;process&lt;/em&gt;. The focus was always on &lt;em&gt;how&lt;/em&gt; we were working.&lt;/p&gt;

&lt;p&gt;Sometimes it helped. Other times, it felt like we were just having meetings about meetings.&lt;/p&gt;

&lt;p&gt;Well, that trend cooled off. But the tech world loves a new job title, and as Gen Z floods the workforce, I think I know what's coming next. Because for Gen Z, it's all about the vibes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Meet the "Vibe Code Cleanup Specialist" ✨
&lt;/h2&gt;

&lt;p&gt;Forget about rigid processes. The new hotness is all about the &lt;em&gt;feeling&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;The Vibe Code Cleanup Specialist (or "Code Vibe Checker") doesn't care about your Jira tickets. Their job is to make sure the codebase just &lt;em&gt;feels&lt;/em&gt; good. This is a role practically designed for a generation that trusts intuition and authenticity over everything else.&lt;/p&gt;

&lt;p&gt;What would they even do?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Run "Joy Checks":&lt;/strong&gt; They'd look at a function you wrote, turn to you, and ask with a straight face, "Does this code spark joy?" If not, you refactor it until it does.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fix the Code's Energy:&lt;/strong&gt; You know that part of the app that everyone hates working on? The Vibe Checker would say it has "bad energetic debt" and their job is to "cleanse" it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Organize Folders by Feeling:&lt;/strong&gt; They'd rearrange the project files and folders not just for logic, but for good "Feng Shui." So it just feels nice to look at.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Delete "Sad" Code:&lt;/strong&gt; They'd find code written during a stressful project launch and gently remove it, saying it "carries a negative energy."&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Basically, their main KPI is whether the codebase "passes the vibe check." Instead of daily stand-ups, they'd host "weekly code meditations" to help everyone get in sync with the project's spirit.&lt;/p&gt;

&lt;h2&gt;
  
  
  Is This Really So Different?
&lt;/h2&gt;

&lt;p&gt;It sounds silly, right? But think about it. The Scrum Master was trying to fix the human side of coding with process. The Vibe Checker is trying to fix it with feelings.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Scrum Master&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Vibe Code Cleanup Specialist&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Focus&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The &lt;em&gt;process&lt;/em&gt; of work.&lt;/td&gt;
&lt;td&gt;The &lt;em&gt;feeling&lt;/em&gt; of the code.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Big Question&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;"Are we working efficiently?"&lt;/td&gt;
&lt;td&gt;"Are the vibes off here?"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Tools&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Jira boards, velocity charts.&lt;/td&gt;
&lt;td&gt;Good feelings, nice folder names.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Goal&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Ship features on a schedule.&lt;/td&gt;
&lt;td&gt;Have a codebase that's a joy to work in.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The Scrum Master was a very Millennial solution to a management problem: add a process, add more meetings, and track everything. The Vibe Checker is the pure Gen Z approach: if the vibes are off, nothing else matters.&lt;/p&gt;

&lt;h2&gt;
  
  
  So, What's My Point?
&lt;/h2&gt;

&lt;p&gt;Okay, the "Vibe Code Cleanup Specialist" isn't a real job... yet.&lt;/p&gt;

&lt;p&gt;But it's a fun way to think about how our industry is always looking for a new solution to the same old problems. Each generation brings its own language to the workplace. We went from corporate "synergy" to Agile "velocity," so it's not a huge leap to get to "vibes."&lt;/p&gt;

&lt;p&gt;We're all just trying to find better ways to build cool things without burning out. And for a new generation of developers, the &lt;em&gt;feeling&lt;/em&gt; might just be the most important metric there is.&lt;/p&gt;

&lt;p&gt;What do you think? Would you want a Vibe Checker on your team? Let me know in the comments!&lt;/p&gt;

</description>
      <category>agile</category>
      <category>vibecoding</category>
      <category>jokes</category>
    </item>
    <item>
      <title>The Ripple Effect: How a Single Push Notification Brought Down Our Kubernetes Cluster</title>
      <dc:creator>Ahmad Kanj</dc:creator>
      <pubDate>Mon, 06 Jan 2025 21:17:41 +0000</pubDate>
      <link>https://dev.to/aws-builders/the-ripple-effect-how-a-single-push-notification-brought-down-our-kubernetes-cluster-c9i</link>
      <guid>https://dev.to/aws-builders/the-ripple-effect-how-a-single-push-notification-brought-down-our-kubernetes-cluster-c9i</guid>
      <description>&lt;p&gt;Ever notice how major system failures rarely start with major problems? That's exactly what happened to us when a simple push notification exposed the fragility of our Kubernetes infrastructure. But here's the twist: it wasn’t a bug that took us down—it was our own success.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Calm Before the Storm
&lt;/h2&gt;

&lt;p&gt;On January 28, 1986, a tiny rubber &lt;a href="https://en.wikipedia.org/wiki/O-ring" rel="noopener noreferrer"&gt;O-ring&lt;/a&gt; failed, leading to the devastating Challenger disaster. As a Kubernetes architect, this historical parallel haunts me daily. Why? Because in complex systems, there's no such thing as a "minor" decision. Every configuration choice ripples through your system like a stone dropped in a still pond. And just like that &lt;a href="https://en.wikipedia.org/wiki/O-ring" rel="noopener noreferrer"&gt;O-ring&lt;/a&gt;, our "small" product decision was about to create waves we never saw coming.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Incident That Changed Everything
&lt;/h2&gt;

&lt;p&gt;It started innocently enough. Our feature team had just rolled out a fancy new notification system, the kind of update that makes product managers smile and engineers sleep soundly, or so we thought.&lt;/p&gt;

&lt;p&gt;At exactly 4:00 PM, our new system did exactly what it was designed to do: send a push notification to our entire user base. What we hadn't considered was human psychology. When thousands of users receive the same notification simultaneously, guess what they do? They act simultaneously.&lt;/p&gt;

&lt;p&gt;Within seconds, our metrics painted a picture of digital chaos:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Traffic on some services exploded to 12x the normal requests per minute&lt;/li&gt;
&lt;li&gt;Our normal 110ms latency skyrocketed to 20 seconds&lt;/li&gt;
&lt;li&gt;Node CPU utilization surged from 45% to 95%&lt;/li&gt;
&lt;li&gt;Node memory pressure jumped from 50% to 87%&lt;/li&gt;
&lt;li&gt;Pods were being killed or restarted&lt;/li&gt;
&lt;li&gt;Pod scheduling failures cascaded throughout the cluster, with pods being evicted faster than we could stabilize them&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Our monitoring dashboards transformed into a sea of red. This wasn't just a scaling issue, it was a cascade of past decisions coming back to haunt us.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Technical Evolution
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Phase 1: Infrastructure Analysis
&lt;/h3&gt;

&lt;p&gt;Our initial platform setup revealed sobering limitations that would need to be addressed. Node provisioning was taking 4-6 minutes – an eternity in a crisis. Scale-up decision lag stretched to 2-3 minutes, while resource utilization languished at 35-40%. Average pod scheduling time crawled at 1.2 seconds. These numbers told a clear story: we needed a complete redesign.&lt;/p&gt;

&lt;p&gt;We set aggressive targets that would push our infrastructure to new levels:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Rapid scaling capability: 0-800% in 3 minutes&lt;/li&gt;
&lt;li&gt;Resource efficiency: 75%+ utilization&lt;/li&gt;
&lt;li&gt;Cost optimization: 40% reduction&lt;/li&gt;
&lt;li&gt;Reliability: 99.99% availability&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Phase 2: Control Plane Architecture
&lt;/h3&gt;

&lt;p&gt;The redesign of our EKS control plane architecture became the foundation of our recovery. We implemented a robust Multi-AZ Configuration, spreading our control plane across three Availability Zones with dedicated node groups for each workload type. Our custom node labeling strategy for workload affinity proved crucial, driving our availability from 99.95% to 99.99%.&lt;/p&gt;
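&lt;p&gt;As a sketch of that labeling strategy (the &lt;code&gt;workload-class&lt;/code&gt; label and its values are illustrative, not our exact production names), each dedicated node group carries a label and workloads pin themselves to it via node affinity:&lt;/p&gt;

```yaml
# Label the nodes in a dedicated node group, e.g.:
#   kubectl label node ip-10-0-1-23 workload-class=latency-sensitive
# Then pin pods to those nodes from the workload's pod template:
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: workload-class
              operator: In
              values: ["latency-sensitive"]
```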

&lt;p&gt;Our network design saw equally dramatic improvements. We established a dedicated VPC for cluster operations, implemented private API endpoints, and fine-tuned our CNI settings for improved pod density. The impact was immediate: pod networking latency dropped by 45%.&lt;/p&gt;

&lt;p&gt;Security wasn't forgotten either. We implemented a zero-trust security model, comprehensive pod security policies, and network policies for namespace isolation. The result? Zero security incidents since implementation.&lt;/p&gt;
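&lt;p&gt;The namespace isolation boiled down to policies of roughly this shape (the namespace name is illustrative): deny everything by default, then allow traffic only from pods in the same namespace:&lt;/p&gt;

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace-only
  namespace: notifications
spec:
  podSelector: {}          # applies to every pod in the namespace
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector: {}  # only pods from this same namespace
```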

&lt;h3&gt;
  
  
  Phase 3: The Great Node Flood
&lt;/h3&gt;

&lt;p&gt;Then came what we now call "The Great Node Flood", our first major test. The initial symptoms were severe: pod scheduling delays averaged 5 seconds, node boot times stretched to 240-360 seconds, CNI attachment delays ran 45-60 seconds, and image pull times consumed 30-45 seconds of precious time.&lt;/p&gt;

&lt;p&gt;Our investigation revealed multiple bottlenecks: CNI configuration issues, suboptimal route tables, and DNS resolution delays. We methodically tackled each issue, analyzing kubelet startup procedures, container runtime configurations, and node initialization scripts.&lt;/p&gt;

&lt;p&gt;The improvements were dramatic:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Node boot time dropped from 300s to 90s&lt;/li&gt;
&lt;li&gt;CNI setup improved from 45s to 15s&lt;/li&gt;
&lt;li&gt;Image pulls accelerated from 45s to 10s&lt;/li&gt;
&lt;li&gt;Pod scheduling time decreased from 5s to 0.8s&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Phase 4: Karpenter Integration
&lt;/h3&gt;

&lt;p&gt;Karpenter proved to be a game-changer. Our performance benchmarks told the story:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Node provisioning time plummeted from 270s to 75s&lt;/li&gt;
&lt;li&gt;Scale-up decisions accelerated from 180s to 20s&lt;/li&gt;
&lt;li&gt;Resource utilization jumped from 65% to 85%&lt;/li&gt;
&lt;li&gt;Cost per node hour dropped from $0.76 to $0.52&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These configurations validated our improvements: we could now scale to 2x the node count in 3 minutes, handle 800% workload increases without degradation, and maintain pod scheduling latency under 1 second with a 99.99% success rate.&lt;/p&gt;
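&lt;p&gt;For readers who want a concrete starting point, a Karpenter NodePool along these lines captures the idea (all values here are illustrative, not our exact production configuration):&lt;/p&gt;

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: general-purpose
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]  # spot-first helps cost per node hour
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
  limits:
    cpu: "1000"                          # hard cap on total provisioned vCPUs
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 1m
```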

&lt;h3&gt;
  
  
  Phase 5: KEDA Implementation
&lt;/h3&gt;

&lt;p&gt;KEDA's implementation transformed our scaling dynamics. Before KEDA, scale-up reactions took 3-5 minutes, scale-down reactions dragged for 10-15 minutes, and false positive scaling events plagued us at 12%. After KEDA, those numbers improved dramatically: 15-30 second scale-ups, 3-5 minute scale-downs, and just 2% false positives.&lt;/p&gt;

&lt;p&gt;Production validation exceeded expectations. We successfully handled 800% traffic increases while maintaining sub-250ms latency throughout the wave. Scaling-related incidents dropped by 90%, and cost efficiency improved by 35%.&lt;/p&gt;
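&lt;p&gt;A KEDA &lt;code&gt;ScaledObject&lt;/code&gt; of roughly this shape drives that behavior (the Prometheus address, query, and thresholds below are illustrative):&lt;/p&gt;

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: notification-api
spec:
  scaleTargetRef:
    name: notification-api        # the Deployment to scale
  minReplicaCount: 2
  maxReplicaCount: 50
  cooldownPeriod: 180             # seconds of quiet before scaling back down
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc:9090
        query: sum(rate(http_requests_total{service="notification-api"}[1m]))
        threshold: "100"          # target requests/sec per replica
```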

&lt;h2&gt;
  
  
  Current State and Future Directions
&lt;/h2&gt;

&lt;p&gt;Today, our platform runs with newfound confidence. Last quarter's metrics tell the story of our transformation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Average node provisioning time: 82 seconds&lt;/li&gt;
&lt;li&gt;P95 pod scheduling latency: 0.8 seconds&lt;/li&gt;
&lt;li&gt;Resource utilization: 82%&lt;/li&gt;
&lt;li&gt;Platform availability: 99.995%&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Looking Ahead
&lt;/h2&gt;

&lt;p&gt;Remember this: in Kubernetes, as in space flight, there are no minor decisions. Every setting, limit, and policy creates its own ripple effect. Success isn't about preventing these ripples—it's about understanding and harnessing them.&lt;/p&gt;

&lt;p&gt;Want to dive deeper? In my next post, we'll explore:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Component-level analysis that'll change how you think about system design&lt;/li&gt;
&lt;li&gt;Performance optimization techniques we learned the hard way&lt;/li&gt;
&lt;li&gt;Testing methodologies that catch problems before production&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Have you ever experienced a similar cascade of events in your infrastructure? Share your stories in the comments below, let's learn from each other's hard lessons. 🚀&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>devops</category>
      <category>performance</category>
      <category>eks</category>
    </item>
    <item>
      <title>Navigating the Vocabulary of Gen AI with GIFs</title>
      <dc:creator>Ahmad Kanj</dc:creator>
      <pubDate>Mon, 11 Nov 2024 19:26:23 +0000</pubDate>
      <link>https://dev.to/aws-builders/navigating-the-vocabulary-of-gen-ai-with-gifs-5ao5</link>
      <guid>https://dev.to/aws-builders/navigating-the-vocabulary-of-gen-ai-with-gifs-5ao5</guid>
      <description>&lt;p&gt;If there’s one thing I’ve truly mastered, it’s using GIFs (and yes, GenAI too). Generative AI is everywhere now, showing up in everything from customer support to adding creative twists to memes. But all that jargon? It can be overwhelming. So here’s the plan: I’m breaking down the world of GenAI in a way that’s clear, informative, and a little bit fun, with plenty of GIFs to keep things interesting. &lt;/p&gt;

&lt;p&gt;Whether you want to learn the basics or just want to outsmart your tech-savvy friend, stick around; you’ll get a lot out of it!&lt;/p&gt;

&lt;h2&gt;
  
  
  🎩 Artificial Intelligence (AI): More Than Just “Smarter Than Me”
&lt;/h2&gt;

&lt;p&gt;AI sounds like Hollywood robots, but it's actually software that mimics certain human abilities, like decision-making and learning from experience. Think of it as a really, &lt;em&gt;really&lt;/em&gt; smart version of your phone's autocorrect.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Futp5qsz0s0knlki8olfr.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Futp5qsz0s0knlki8olfr.gif" alt="Navigating the Vocabulary of Gen AI with GIFs - I understood that reference" width="480" height="260"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🧠 Machine Learning: The Fuel of AI Magic
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Machine Learning (ML)&lt;/strong&gt; is how AIs learn to do stuff. It’s like teaching a dog new tricks, only the "dog" is a model, and instead of treats, it gets data. ML has three styles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Supervised Learning&lt;/strong&gt;: You show it labeled data. It’s like training with flashcards.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fenk7v4g4flcijfm86d5h.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fenk7v4g4flcijfm86d5h.gif" alt="Navigating the Vocabulary of Gen AI with GIFs" width="500" height="281"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Unsupervised Learning&lt;/strong&gt;: No labels, just vibes. The model figures things out solo.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fefcfh5jyzej76ellmoyf.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fefcfh5jyzej76ellmoyf.gif" alt="Navigating the Vocabulary of Gen AI with GIFs" width="640" height="394"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Semi-supervised Learning&lt;/strong&gt;: A mix of both, like letting your dog run free but calling it back sometimes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0ihbel21bd41y0w3be3x.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0ihbel21bd41y0w3be3x.gif" alt="Navigating the Vocabulary of Gen AI with GIFs" width="594" height="640"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🕸️ Artificial Neural Networks (ANN): The Brain of AI
&lt;/h2&gt;

&lt;p&gt;Imagine neurons from the human brain but digital! In &lt;strong&gt;Artificial Neural Networks&lt;/strong&gt;, each "neuron" learns how to pass info to the next, forming the brain of AI.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbz6ceh6da111mgh88nvw.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbz6ceh6da111mgh88nvw.gif" alt="Navigating the Vocabulary of Gen AI with GIFs" width="498" height="405"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  📚 Deep Learning: More Layers, More Power
&lt;/h2&gt;

&lt;p&gt;When these networks get &lt;em&gt;thick&lt;/em&gt; with layers, they’re called &lt;strong&gt;Deep Learning&lt;/strong&gt;. Perfect for heavy-duty jobs like recognizing faces in photos or translating languages. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftlc7jtseyygvv5oqilgi.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftlc7jtseyygvv5oqilgi.gif" alt="Navigating the Vocabulary of Gen AI with GIFs" width="640" height="344"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🤖 Large Language Models and Foundation Models: The Big Brains of AI
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Foundation Models&lt;/strong&gt; like &lt;strong&gt;Large Language Models (LLMs)&lt;/strong&gt; are trained on massive amounts of data and can be tuned for specific tasks, like writing emails or understanding memes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkwuqolzaf6vbpk6yk8ev.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkwuqolzaf6vbpk6yk8ev.gif" alt="Navigating the Vocabulary of Gen AI with GIFs" width="498" height="498"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🔥 Transformer Models and GPT: The Buzzwords
&lt;/h2&gt;

&lt;p&gt;Thanks to &lt;strong&gt;Transformer Models&lt;/strong&gt;, AI can handle all words in a sentence simultaneously instead of one by one. This is what makes &lt;strong&gt;Generative Pretrained Transformers (GPT)&lt;/strong&gt; the star of text generation.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff6evwvyt1i9g2omngfgi.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff6evwvyt1i9g2omngfgi.gif" alt="Navigating the Vocabulary of Gen AI with GIFs" width="500" height="359"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🤹‍♀️ Prompt Engineering and Prompt Chaining: AI’s Command Line
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Prompt Engineering&lt;/strong&gt; is all about crafting the perfect question to get the right answer from the AI. And &lt;strong&gt;Prompt Chaining&lt;/strong&gt;? It’s like breadcrumbing AI through a maze. Fun for you; stressful for the AI.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fozp83y6590hqfu3nsiui.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fozp83y6590hqfu3nsiui.gif" alt="Navigating the Vocabulary of Gen AI with GIFs" width="478" height="640"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🔍 Retrieval-Augmented Generation (RAG): The Anti-Hallucination Technique
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;RAG&lt;/strong&gt; is like giving the AI a fact-checking buddy. It pulls in info from databases to keep the AI from “hallucinating” nonsense answers.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7uaapat2oki48dh8fm3m.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7uaapat2oki48dh8fm3m.gif" alt="Navigating the Vocabulary of Gen AI with GIFs" width="640" height="430"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🔧 Fine-Tuning and Parameters: Tweak ‘Til You Peak
&lt;/h2&gt;

&lt;p&gt;Fine-tuning gets your AI model hyper-specialized. In this stage, you adjust &lt;strong&gt;parameters&lt;/strong&gt;, the tiny dials that control how the model behaves. Think of it like tuning a car engine.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv8egakmbes51ne78xucv.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv8egakmbes51ne78xucv.gif" alt="Navigating the Vocabulary of Gen AI with GIFs" width="640" height="360"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🔥 Bias and Hallucinations in AI: When Things Go Weird
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Bias&lt;/strong&gt; is when the AI model’s data has blind spots. It might lean too far left, right, or just get things plain wrong. And &lt;strong&gt;Hallucinations&lt;/strong&gt;? That’s when AI decides to get &lt;em&gt;creative&lt;/em&gt;—making up facts that sound convincing but are 100% made up.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe0mx7rzrpj166e1ynqak.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe0mx7rzrpj166e1ynqak.gif" alt="Navigating the Vocabulary of Gen AI with GIFs" width="640" height="360"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  📏 Important Metrics: Temperature, Anthropomorphism, Completion
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Temperature&lt;/strong&gt;: Controls randomness. High = wild, low = safe. Adjust for the “surprise” level.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anthropomorphism&lt;/strong&gt;: Giving AI human traits. Let’s not forget: it’s not human.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Completion&lt;/strong&gt;: The output the model generates to finish a thought or sentence, the AI’s “period.”&lt;/li&gt;
&lt;/ul&gt;
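&lt;p&gt;To make temperature concrete, here is a minimal Python sketch (the logits are made up for illustration): dividing the logits by the temperature before the softmax makes low temperatures peaky (“safe”) and high temperatures flat (“wild”):&lt;/p&gt;

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Scale logits by 1/temperature, then normalize into probabilities."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
cautious = softmax_with_temperature(logits, temperature=0.5)  # peaky: "safe"
wild = softmax_with_temperature(logits, temperature=2.0)      # flatter: "wild"
```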

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbqj7hknq3q07dmw408sl.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbqj7hknq3q07dmw408sl.gif" alt="Navigating the Vocabulary of Gen AI with GIFs" width="600" height="363"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🧩 Tokens, Embeddings, and Emergence: AI Building Blocks
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Tokens&lt;/strong&gt;: Tiny chunks of text, whole words or pieces of words, that the model reads and writes one at a time.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Embeddings&lt;/strong&gt;: Vectors (math things) that give words meaning. Helps the AI understand language.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Emergence in AI&lt;/strong&gt;: When the model randomly learns new tricks, like a kid suddenly reciting Shakespeare.&lt;/li&gt;
&lt;/ul&gt;
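&lt;p&gt;Embeddings are easier to feel with numbers. In this toy Python sketch (three made-up 3-dimensional vectors; real models use hundreds of dimensions), words with related meanings score a higher cosine similarity:&lt;/p&gt;

```python
import math

# Toy embeddings: the words and numbers are illustrative, not from a real model.
embeddings = {
    "dog":   [0.9, 0.1, 0.0],
    "puppy": [0.8, 0.2, 0.1],
    "car":   [0.0, 0.9, 0.4],
}

def cosine_similarity(a, b):
    """How closely two vectors point in the same direction (1.0 = identical)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Related meanings sit closer together in embedding space.
print(cosine_similarity(embeddings["dog"], embeddings["puppy"]))
print(cosine_similarity(embeddings["dog"], embeddings["car"]))
```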

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F59rllgb74x3664vppqns.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F59rllgb74x3664vppqns.gif" alt="Navigating the Vocabulary of Gen AI with GIFs" width="496" height="280"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  📝 NLP and Text Classification: Generative AI in Action
&lt;/h2&gt;

&lt;p&gt;Natural Language Processing (NLP) is where AI shines in understanding and generating human-like text.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9pfejw7qdfqprquex854.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9pfejw7qdfqprquex854.gif" alt="Navigating the Vocabulary of Gen AI with GIFs" width="640" height="360"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🔒 Responsible AI: Keeping AI on a Leash
&lt;/h2&gt;

&lt;p&gt;Responsible AI ensures the models are fair, accurate, and trustworthy. Think of it as an AI ethics board, keeping things cool and accountable.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyx08716t7yyujmfa2ymk.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyx08716t7yyujmfa2ymk.gif" alt="Navigating the Vocabulary of Gen AI with GIFs" width="480" height="480"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;Did I forget any vocabulary? Feel free to drop it in the comments below.&lt;/p&gt;

</description>
      <category>aivocabulary</category>
      <category>machinelearningbasics</category>
      <category>aiwithgifs</category>
      <category>gpt3</category>
    </item>
    <item>
      <title>Optimizing Performance: A Comprehensive Guide to Choosing the Right T-Family Instance with Metrics and Amazon Q</title>
      <dc:creator>Ahmad Kanj</dc:creator>
      <pubDate>Sun, 04 Feb 2024 15:37:08 +0000</pubDate>
      <link>https://dev.to/aws-builders/optimizing-performance-a-comprehensive-guide-to-choosing-the-right-t-family-instance-with-metrics-and-amazon-q-16ij</link>
      <guid>https://dev.to/aws-builders/optimizing-performance-a-comprehensive-guide-to-choosing-the-right-t-family-instance-with-metrics-and-amazon-q-16ij</guid>
      <description>&lt;p&gt;Amazon Web Services (AWS) offers a diverse range of EC2 instances tailored to meet the specific needs of different workloads. Among these, the T-family instances, including previous generation T2, and latest generation: T3, T3a and T4g instances, are unique as they belong to the burstable performance category. In this detailed technical article, we will explore the key concepts, best practices, and features associated with these instances, shedding light on their inner workings and helping you make informed decisions for your applications.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Concepts and Definitions for Burstable Performance Instances:
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Earn CPU Credits:&lt;/strong&gt; Burstable performance instances operate on a credit-based system. Credits are earned during periods of low CPU utilization and spent when the CPU needs to burst to higher performance levels. A t3.nano instance, for example, earns 6 credits per hour with 2 vCPUs. These credits act as a currency that allows the instance to burst beyond its baseline capacity.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CPU Credit Earn Rate:&lt;/strong&gt; The rate at which credits are earned varies based on the instance type. It is crucial to understand this metric to estimate how quickly the instance can accumulate credits during low utilization periods. AWS provides detailed documentation on the earn rate for each instance type, aiding users in making informed decisions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CPU Credit Accrual Limit:&lt;/strong&gt; To prevent excessive accumulation of credits, AWS imposes a limit on the maximum number of credits an instance can accrue. This limit ensures fair usage and prevents instances from gaining an unfair advantage during burst periods. Users should be aware of this limit and plan accordingly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Accrued CPU Credits Lifespan:&lt;/strong&gt; CPU credits have a limited lifespan, and they expire if not used within that period. The accrual limit, therefore, becomes critical to avoid unnecessary credit wastage. By monitoring and understanding the lifespan of accrued credits, users can optimize the burstable performance of their instances.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Baseline Utilization:&lt;/strong&gt; The baseline utilization is the sustained CPU usage at which an instance earns credits exactly as fast as it spends them, calculated as &lt;code&gt;(credits earned per hour / number of vCPUs) / 60 minutes&lt;/code&gt;. For example, a t3.nano instance with 2 vCPUs earning 6 credits per hour has a baseline utilization of 5%.&lt;br&gt;
This metric helps users gauge the efficiency of their instances and determine if they are operating within the expected baseline. &lt;/p&gt;
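&lt;p&gt;The same arithmetic, as a tiny Python helper:&lt;/p&gt;

```python
def baseline_utilization(credits_per_hour, vcpus):
    """Baseline CPU utilization (%) per vCPU for a burstable instance."""
    return credits_per_hour / vcpus / 60 * 100

# t3.nano: 2 vCPUs earning 6 credits per hour
print(baseline_utilization(6, 2))  # prints 5.0
```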
&lt;h2&gt;
  
  
  Unlimited Mode for Burstable Performance Instances
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faf8z20qcppfn8cz0u2sv.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faf8z20qcppfn8cz0u2sv.gif" alt="Unlimited Mode for Burstable Performance Instances" width="256" height="254"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;AWS offers an "Unlimited" mode for burstable performance instances, allowing them to burst beyond their baseline capacity without the fear of credit depletion. This mode is useful for workloads with unpredictable or spiky CPU demands. When an instance operates in Unlimited mode, it incurs an additional charge for surplus credits beyond the maximum daily limit.&lt;/p&gt;

&lt;p&gt;Knowing when to use Unlimited mode versus Standard mode is crucial. For applications with consistent and predictable workloads, Standard mode may be more cost-effective, as it avoids the surplus-credit charges associated with Unlimited mode.&lt;/p&gt;
&lt;h2&gt;
  
  
  Standard Mode for Burstable Performance Instances
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzeel1olfqmlinqd9jsng.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzeel1olfqmlinqd9jsng.gif" alt="Standard Mode for Burstable Performance Instances" width="480" height="270"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In Standard mode, burstable performance instances operate within their baseline capacity, and burst behavior is bounded by the available launch credits. Launch credits are granted when an instance launches and are spent during burst periods.&lt;/p&gt;

&lt;p&gt;Understanding launch credit limits is essential for optimizing performance. Users should consider adjusting these limits based on the specific requirements of their workloads.&lt;/p&gt;
&lt;h2&gt;
  
  
  Monitoring CPU Credits
&lt;/h2&gt;

&lt;p&gt;Effectively monitoring CPU credits is vital to ensure optimal performance and cost management. AWS provides CloudWatch metrics specifically designed for burstable performance instances, updated every five minutes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;CPUCreditUsage&lt;/strong&gt;: The number of CPU credits spent during the measurement period.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CPUCreditBalance&lt;/strong&gt;: The number of CPU credits accrued by the instance.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CPUSurplusCreditBalance&lt;/strong&gt;: Surplus credits spent to sustain CPU utilization when &lt;code&gt;CPUCreditBalance&lt;/code&gt; is zero.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CPUSurplusCreditsCharged&lt;/strong&gt;: Surplus credits exceeding the maximum daily limit, incurring additional charges.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Determine CPU credit utilization for Standard instances by assessing the movement in the CPU credit balance. An increase in the CPU credit balance occurs when CPU utilization falls below the baseline, signifying that the credits spent are less than those earned in the preceding five-minute interval.&lt;/p&gt;

&lt;p&gt;Conversely, a decrease in the CPU credit balance is observed when CPU utilization surpasses the baseline, indicating that the credits spent exceed those earned in the prior five-minute interval. Mathematically, this relationship can be expressed through the following equation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CPUCreditBalance= prior CPUCreditBalance+[Credits earned per hour×(60/5)−CPUCreditUsage]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These metrics provide a comprehensive view of an instance's credit utilization, helping users make informed decisions about their workloads.&lt;/p&gt;
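&lt;p&gt;As a quick sanity check, the balance update for one 5-minute interval can be sketched in Python (the t3.nano numbers come from the AWS documentation; the cap models the accrual limit described above):&lt;/p&gt;

```python
def next_credit_balance(prior_balance, credits_per_hour, credit_usage, accrual_limit):
    """One 5-minute CloudWatch interval of the CPU credit balance."""
    earned = credits_per_hour * 5 / 60  # hourly earn rate scaled to 5 minutes
    return min(prior_balance + earned - credit_usage, accrual_limit)

# t3.nano: earns 6 credits/hour; accrual limit of 144 credits (24 hours of earnings)
balance = next_credit_balance(10.0, 6, credit_usage=0.2, accrual_limit=144)
print(balance)
```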

&lt;h2&gt;
  
  
  Instance Type Recommendations
&lt;/h2&gt;

&lt;p&gt;Amazon Web Services (AWS) offers valuable tools to simplify the process of choosing the most suitable instance types for your workloads. With a diverse range of instance options available, finding the right balance between performance and cost can be challenging. AWS provides two key tools for making informed decisions based on your workload characteristics:&lt;/p&gt;

&lt;h2&gt;
  
  
  New Workloads: Amazon Q EC2 Instance Type Selector
&lt;/h2&gt;

&lt;p&gt;For new workloads, the Amazon Q EC2 Instance Type Selector proves invaluable. This tool considers your use case, workload type, CPU manufacturer preference, and your prioritization of price and performance. By leveraging this data, it provides guidance and suggestions for Amazon EC2 instance types that align best with your specific requirements.&lt;/p&gt;

&lt;p&gt;Navigating through the Amazon EC2 console, you can access the Amazon Q EC2 instance type selector to stay updated on the latest instance types and ensure optimal price-performance for your workloads. Whether seeking advice directly from Amazon Q or using the console, this tool streamlines the process of selecting the right instance type. To utilize the Amazon Q EC2 instance type selector:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Follow the procedure to &lt;a href="https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-launch-instance-wizard.html#liw-quickly-launch-instance" rel="noopener noreferrer"&gt;launch an instance&lt;/a&gt; &lt;/li&gt;
&lt;li&gt;&lt;p&gt;Next to the Instance type, click on the &lt;code&gt;Get advice&lt;/code&gt; link.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fae65z8mkzxwumfonnh5x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fae65z8mkzxwumfonnh5x.png" alt="ec2 Instance type Get advice" width="800" height="273"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;In the "Get advice on instance type selection from Amazon Q" window, specify your requirements by choosing options from the drop-down lists, including Use Case and Workload type. Click on &lt;code&gt;Get instance type advice&lt;/code&gt; button.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff49ip08uvfeebs0zmhm6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff49ip08uvfeebs0zmhm6.png" alt="Amazon Q Window" width="800" height="393"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The Amazon Q AI assistant opens with personalized suggestions for instance types based on your specified requirements.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4k6ksvfbjuumyw478a90.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4k6ksvfbjuumyw478a90.png" alt="Amazon Q AI assistant opens with personalized suggestions" width="800" height="699"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Once you've decided on an instance type, proceed to the launch instance wizard or launch template, and select the recommended instance type.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
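
&lt;p&gt;If you prefer the CLI to the console, you can compare candidate instance types with the AWS CLI before launching. A minimal sketch; the instance types listed here are just examples:&lt;/p&gt;

```shell
# Compare vCPU and memory for a few candidate instance types
aws ec2 describe-instance-types \
  --instance-types t3.large m5.large \
  --query "InstanceTypes[].[InstanceType, VCpuInfo.DefaultVCpus, MemoryInfo.SizeInMiB]" \
  --output table
```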

&lt;h2&gt;
  
  
  Existing Workloads: AWS Compute Optimizer
&lt;/h2&gt;

&lt;p&gt;For existing workloads, AWS Compute Optimizer provides recommendations aimed at enhancing performance, reducing costs, or striking a balance between the two. By analyzing your current instance specifications and utilization metrics, Compute Optimizer determines which Amazon EC2 instance types are best suited to your existing workload. The recommendations include per-hour instance pricing to aid decision-making. For a comprehensive guide, refer to the &lt;a href="https://docs.aws.amazon.com/compute-optimizer/latest/ug/viewing-dashboard.html" rel="noopener noreferrer"&gt;AWS Compute Optimizer User Guide&lt;/a&gt;.&lt;/p&gt;
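
&lt;p&gt;The same recommendations are also exposed through the AWS CLI if you want them without the console. A sketch; the instance ARN below is a placeholder:&lt;/p&gt;

```shell
# Fetch Compute Optimizer's EC2 recommendations for one instance
aws compute-optimizer get-ec2-instance-recommendations \
  --instance-arns arn:aws:ec2:eu-central-1:123456789012:instance/i-0abcd1234example
```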

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;In summary, gaining a deep understanding of AWS burstable performance instances empowers users to make informed decisions about their infrastructure. Proficiency in concepts like CPU credits, baseline utilization, and the monitoring of metrics through CloudWatch is crucial. Additionally, leveraging tools such as Amazon Q for selecting the right instance type further enhances users' ability to achieve cost-effective and efficient performance in the cloud.&lt;/p&gt;

&lt;p&gt;Sources:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/burstable-performance-instances.html" rel="noopener noreferrer"&gt;Burstable performance instances&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/amazonq/latest/aws-builder-use-ug/what-is.html" rel="noopener noreferrer"&gt;What is Amazon Q (for AWS builder use)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>amazonq</category>
      <category>ec2</category>
      <category>devops</category>
      <category>aws</category>
    </item>
    <item>
      <title>Incident vs Crisis: Understanding the Critical Distinction in SRE</title>
      <dc:creator>Ahmad Kanj</dc:creator>
      <pubDate>Mon, 08 Jan 2024 11:28:51 +0000</pubDate>
      <link>https://dev.to/ahmadkanj/incident-vs-crisis-understanding-the-critical-distinction-in-sre-1nef</link>
      <guid>https://dev.to/ahmadkanj/incident-vs-crisis-understanding-the-critical-distinction-in-sre-1nef</guid>
      <description>&lt;p&gt;In the world of Site Reliability Engineering (SRE), telling apart an incident from a crisis matters. At first, they might seem similar, but understanding the little details between them is super important. It helps a ton in managing problems, fixing them, and making sure everything stays working smoothly. This article is all about showing the differences between incidents and crises, explaining when, how, and why it's super important to call them out in an SRE setup.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Incident: The Unplanned Disruption&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;An incident, in SRE parlance, denotes an unexpected event that disrupts normal system functionality or performance. It can range from a temporary service degradation to a complete outage. Incidents are typically delineated by their scope, impact, and urgency of remediation, and are characterized by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Localized Impact:&lt;/strong&gt; Incidents tend to affect a specific component, service, or subset of users rather than the entire system.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Measurable Impact:&lt;/strong&gt; These disruptions often come with quantifiable metrics, such as increased error rates, latency spikes, or service unavailability.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Mitigable with Known Procedures:&lt;/strong&gt; Incidents are usually managed using documented runbooks or predefined procedures that SRE teams have developed over time.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Crisis: The Pervasive Threat&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;By contrast, a crisis represents an escalated and pervasive situation that surpasses the severity and scope of an incident. It transcends the boundaries of a single system or service, posing a substantial risk to the entire infrastructure, reputation, or business continuity. Key attributes of a crisis include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Global or Wide-Spread Impact:&lt;/strong&gt; Crises have the potential to affect multiple systems, services, or even an entire organization, causing widespread disruptions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Escalating Severity:&lt;/strong&gt; They often escalate rapidly, demanding immediate attention and response due to their criticality.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Unknown or Evolving Solutions:&lt;/strong&gt; Unlike incidents, crises may lack well-defined mitigation procedures as they might involve unforeseen scenarios or complex interdependencies.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Declaring Incidents and Crises: When, How, and Why?&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;The declaration of an incident or a crisis within an SRE framework is not merely semantic but holds immense operational significance. Clear and accurate identification enables efficient resource allocation, communication, and resolution. The process involves:&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;When to Declare:&lt;/strong&gt;
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Incident:&lt;/strong&gt; Declare an incident when there is a deviation from normal system behavior, impacting a specific service or functionality, and it can be managed within existing procedures.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Crisis:&lt;/strong&gt; Declare a crisis when the disruption escalates, poses a significant risk to the entire system or organization, and demands immediate, dynamic, and possibly novel solutions.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
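
&lt;p&gt;The declaration criteria above can be sketched as a tiny triage helper. This is purely illustrative: the type and field names are invented for this example, and real triage always involves human judgment:&lt;/p&gt;

```typescript
// Illustrative only: a toy triage helper; names are invented for this sketch
type Disruption = {
  scope: "single-service" | "multi-service" | "organization-wide";
  hasKnownRunbook: boolean; // mitigable with documented procedures?
  escalating: boolean;      // is the impact still growing?
};

function classifyDisruption(d: Disruption): "incident" | "crisis" {
  // Wide scope, or escalation without a known mitigation path, points to a crisis
  if (d.scope === "organization-wide") return "crisis";
  if (d.escalating && !d.hasKnownRunbook) return "crisis";
  return "incident";
}

// A contained latency spike with a runbook stays an incident
console.log(classifyDisruption({
  scope: "single-service",
  hasKnownRunbook: true,
  escalating: false,
})); // "incident"
```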

&lt;h4&gt;
  
  
  &lt;strong&gt;How to Declare:&lt;/strong&gt;
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Incident:&lt;/strong&gt; Utilize predefined protocols or runbooks to declare an incident, promptly initiating the established response processes.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Crisis:&lt;/strong&gt; Invoke higher-level escalation channels, engage cross-functional teams, and establish dedicated crisis management protocols to handle the situation.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Why It's Important:&lt;/strong&gt;
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Operational Triage:&lt;/strong&gt; Accurate declaration aids in prioritization and resource allocation, ensuring a focused response aligned with the severity of the situation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Clear Communication:&lt;/strong&gt; It facilitates transparent communication both within the SRE team and with stakeholders, managing expectations and sharing pertinent information.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Learning and Improvement:&lt;/strong&gt; Distinguishing between incidents and crises helps in post-incident analysis, fostering continuous improvement by refining response strategies.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In conclusion, the distinction between an incident and a crisis is pivotal in the SRE landscape. Recognizing and declaring them accurately empowers teams to navigate disruptions effectively, safeguarding the reliability and resilience of systems while fostering a culture of continuous improvement and adaptability.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Load Balancer, Reverse Proxy, and API Gateway: Analogies to Real Life Scenarios</title>
      <dc:creator>Ahmad Kanj</dc:creator>
      <pubDate>Fri, 08 Sep 2023 12:28:28 +0000</pubDate>
      <link>https://dev.to/aws-builders/load-balancer-reverse-proxy-and-api-gateway-analogies-to-real-life-scenarios-54el</link>
      <guid>https://dev.to/aws-builders/load-balancer-reverse-proxy-and-api-gateway-analogies-to-real-life-scenarios-54el</guid>
      <description>&lt;p&gt;In the fast paced of tech world it's easy to get overwhelmed by the jargon and technicalities. However, understanding some fundamental concepts can help you make informed decisions about which Cloud/Infra services to use for your needs. In this article, we'll demystify three essential AWS services Load Balancers, Reverse Proxies, and API Gateways in simple  everyday terms.&lt;/p&gt;

&lt;h3&gt;
  
  
  Load Balancers: The Traffic Directors
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx35zv4leuwkm9o4yippo.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx35zv4leuwkm9o4yippo.gif" alt="Load Balancers" width="305" height="228"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Imagine you run a busy restaurant with multiple chefs in the kitchen. Sometimes, lots of customers walk in, and it can be hard to serve everyone quickly and evenly. That's where Load Balancers come in.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Load Balancers&lt;/strong&gt; distribute incoming customer traffic among the chefs (servers or instances), ensuring that everyone gets their food without waiting too long. If one chef is busy or takes a break, the Load Balancer directs customers to other chefs to keep things moving smoothly. It's like having a friendly host or hostess who ensures everyone in your restaurant gets served efficiently, even during the busiest times.&lt;/p&gt;

&lt;p&gt;In AWS you can choose between different types of Load Balancers, each suited to specific needs. For example, the Application Load Balancer (ALB) works well for web apps and can even send certain dishes to one chef and others to a different chef based on the type of food, much like path-based routing directs requests to different target groups.&lt;/p&gt;
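
&lt;p&gt;The "friendly host" behavior can be sketched as a simple round-robin dispatcher. This is a toy illustration, not how ALB is implemented; real load balancers also account for health checks and current load:&lt;/p&gt;

```typescript
// A toy round-robin dispatcher: the host sending each customer to the next chef
class RoundRobin<T> {
  private next = 0;
  constructor(private readonly targets: T[]) {}
  pick(): T {
    const target = this.targets[this.next];
    this.next = (this.next + 1) % this.targets.length;
    return target;
  }
}

const chefs = new RoundRobin(["chef-a", "chef-b", "chef-c"]);
// The fourth customer wraps around to the first chef again
console.log(chefs.pick(), chefs.pick(), chefs.pick(), chefs.pick());
// chef-a chef-b chef-c chef-a
```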

&lt;h3&gt;
  
  
  Reverse Proxies: The Mailroom Organizers
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffaaq2xzidsvg8ks7fe8x.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffaaq2xzidsvg8ks7fe8x.gif" alt="Reverse Proxies" width="320" height="240"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now picture you work in a big office building with a bustling mailroom that handles packages and letters. Sometimes, you need to do extra things to keep everything organized and secure, and that's where Reverse Proxies come into play.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reverse Proxies&lt;/strong&gt; are like friendly receptionists in your mailroom who take care of packages and letters. They keep a copy of commonly used documents in a special room to save time. When a special package arrives, they check it for security and make sure it goes to the right department. They also handle letters and packages that need extra protection, like opening envelopes to ensure they are safe before delivering them.&lt;/p&gt;

&lt;p&gt;In AWS you can set up a Reverse Proxy to sit in front of your servers and help organize incoming requests, keep things secure, and even handle tasks like terminating encrypted connections (opening those secret letters) to protect your valuable information.&lt;/p&gt;

&lt;h3&gt;
  
  
  API Gateways: The Library Guides
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7ls51bf41193oirjdpcz.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7ls51bf41193oirjdpcz.gif" alt="API Gateways" width="400" height="289"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now think of yourself as the librarian of a big library with tons of books and resources. You want to make it easy for people to access information while keeping everything organized and secure. That's where API Gateways come into play.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;API Gateways&lt;/strong&gt; are like friendly librarians who help people find books and resources. They ask everyone to show their library card before they can borrow books to keep things organized. The librarians make sure no one takes too many books at once to ensure everyone gets a fair chance. When someone asks for information, they check a special guide to ensure they get the right answers. They help people find the information they need, making sure everything is accurate and easy to understand.&lt;/p&gt;

&lt;p&gt;In AWS you can create an API Gateway to help organize and secure access to your app's information and services. It's like having a helpful librarian who ensures that everyone can access the information they want with ease.&lt;/p&gt;
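
&lt;p&gt;The library-card check (authentication) and the borrowing limit (rate limiting) can be sketched in a few lines. All names and values here are invented for illustration:&lt;/p&gt;

```typescript
// A toy gateway check: ask for a library card (API key) and
// limit how many books (requests) each member takes out
const validKeys = new Set(["member-123"]);
const used = new Map<string, number>();
const LIMIT = 3; // requests allowed per key

function admit(apiKey: string): "ok" | "unauthorized" | "throttled" {
  if (!validKeys.has(apiKey)) return "unauthorized";
  const count = used.get(apiKey) ?? 0;
  if (count >= LIMIT) return "throttled";
  used.set(apiKey, count + 1);
  return "ok";
}

console.log(admit("stranger"));   // "unauthorized": no library card
console.log(admit("member-123")); // "ok"
```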

&lt;h3&gt;
  
  
  Conclusion: Picking the Right Tool for the Job
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flhw5k7ct5f2zo2r4f7at.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flhw5k7ct5f2zo2r4f7at.gif" alt="Picking the Right Tool for the Job" width="498" height="280"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the world of AWS, Load Balancers, Reverse Proxies, and API Gateways are essential tools that help your apps run efficiently, securely, and smoothly. Just like in real life, choosing the right tool for your needs is crucial. Load Balancers distribute traffic, Reverse Proxies keep things organized and secure, and API Gateways guide people to the right information and services. By understanding these everyday comparisons, you can make informed decisions about which service best suits your needs, ensuring a successful and hassle-free cloud journey.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>networking</category>
      <category>cloud</category>
      <category>devops</category>
    </item>
    <item>
      <title>Building a Greener Cloud: The Role of an Architect for Sustainability in AWS</title>
      <dc:creator>Ahmad Kanj</dc:creator>
      <pubDate>Sun, 19 Feb 2023 16:17:56 +0000</pubDate>
      <link>https://dev.to/aws-builders/building-a-greener-cloud-the-role-of-an-architect-for-sustainability-in-aws-cge</link>
      <guid>https://dev.to/aws-builders/building-a-greener-cloud-the-role-of-an-architect-for-sustainability-in-aws-cge</guid>
      <description>&lt;p&gt;In recent years, the term '&lt;em&gt;sustainability&lt;/em&gt;' has become increasingly important, and for good reason. Climate change is one of the biggest threats facing our planet, and we need to take immediate action to mitigate its effects. One area where we might not expect to find an impact is in the world of technology, and specifically cloud computing. However, as it turns out, the cloud could be doing more damage to the planet than we realize.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why is this the case?
&lt;/h2&gt;

&lt;p&gt;First, we need to look at how cloud computing works. Essentially, when we use cloud-based services, we outsource the processing and storage of our data to massive data centers. These data centers are run by companies such as Amazon, Microsoft, and Google, and they are among the largest energy consumers in the world. Powering these centers requires vast amounts of electricity, much of which comes from non-renewable sources like coal and natural gas.&lt;/p&gt;

&lt;p&gt;The result of all this energy consumption is a significant carbon footprint. In fact, according to a report by Greenpeace, the internet and technology account for around 4% of global carbon emissions, more than double the emissions produced by the airline industry. By 2030, it's expected that this figure could double again, with technology and cloud computing accounting for over 8% of all carbon emissions.&lt;/p&gt;

&lt;h2&gt;
  
  
  What can we do about this?
&lt;/h2&gt;

&lt;p&gt;The good news is that there are a number of initiatives underway to reduce the impact of cloud computing on the environment. Many of the major cloud providers have committed to using renewable energy sources to power their data centers. For example, Google has been carbon-neutral since 2007, and plans to be powered entirely by renewable energy by 2030. Microsoft has pledged to be carbon negative by 2030, while Amazon has committed to powering its operations with 100% renewable energy by 2025.&lt;/p&gt;

&lt;p&gt;That’s why AWS is taking steps to make their operations more sustainable. They know that in order to keep our planet healthy, we need to work together to reduce our carbon footprint and minimize our impact on the environment.&lt;/p&gt;

&lt;h2&gt;
  
  
  The AWS Sustainability Pillar
&lt;/h2&gt;

&lt;p&gt;At re:Invent 2021, AWS announced that it’s adding a new Sustainability Pillar to the AWS Well-Architected Framework. This new pillar is designed to help developers and organizations incorporate sustainability into their cloud architecture and operations.&lt;/p&gt;

&lt;p&gt;The Sustainability Pillar provides guidance on how to design and operate cloud workloads in a way that minimizes their impact on the environment. It covers a wide range of topics, including energy efficiency, waste reduction, and sustainable sourcing.&lt;/p&gt;

&lt;p&gt;By incorporating the Sustainability Pillar into their architecture, developers can create cloud solutions that are not only more environmentally friendly but also more cost-effective and scalable.&lt;/p&gt;

&lt;h2&gt;
  
  
  Architect for Sustainability
&lt;/h2&gt;

&lt;p&gt;This is where the concept of Architect for Sustainability comes in – a framework for designing cloud solutions that are environmentally responsible. Architect for Sustainability is a set of best practices for cloud architecture that prioritize sustainability in design, development, and deployment. It is a holistic approach that considers the entire lifecycle of cloud solutions, from design to end-of-life disposal. &lt;/p&gt;

&lt;p&gt;To be sustainable on the cloud, businesses must work with cloud providers that are committed to Architect for Sustainability. This means choosing providers that have a clear sustainability strategy and have implemented best practices to reduce their environmental impact. It also means businesses must be conscious of their own carbon footprint and take steps to reduce their energy consumption and carbon emissions. Here are some key principles of Architect for Sustainability on AWS:&lt;/p&gt;

&lt;h2&gt;
  
  
  AWS Customer Carbon Footprint Tool
&lt;/h2&gt;

&lt;p&gt;In addition to the Sustainability Pillar, AWS is also working on a &lt;a href="https://aws.amazon.com/aws-cost-management/aws-customer-carbon-footprint-tool/" rel="noopener noreferrer"&gt;customer carbon footprint tool&lt;/a&gt;. This tool will allow AWS customers to measure and analyze the carbon footprint of their cloud operations.&lt;/p&gt;

&lt;p&gt;By understanding the environmental impact of their cloud operations, customers can take steps to reduce their carbon footprint and make their operations more sustainable. This will not only benefit the environment but also help customers save money on their cloud operations.&lt;/p&gt;

&lt;h2&gt;
  
  
  Use AWS’s carbon-free regions
&lt;/h2&gt;

&lt;p&gt;AWS operates regions whose data centers run largely on renewable energy sources. By the end of 2021, several AWS regions were powered by over 95% renewable energy, including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;US East (Northern Virginia)&lt;/li&gt;
&lt;li&gt;US West (Northern California)&lt;/li&gt;
&lt;li&gt;US East (Ohio)&lt;/li&gt;
&lt;li&gt;US West (Oregon)&lt;/li&gt;
&lt;li&gt;GovCloud (US-East)&lt;/li&gt;
&lt;li&gt;GovCloud (US-West) &lt;/li&gt;
&lt;li&gt;Canada (Central)&lt;/li&gt;
&lt;li&gt;Europe (Ireland)&lt;/li&gt;
&lt;li&gt;Europe (Frankfurt) &lt;/li&gt;
&lt;li&gt;Europe (London)&lt;/li&gt;
&lt;li&gt;Europe (Milan)&lt;/li&gt;
&lt;li&gt;Europe (Paris)&lt;/li&gt;
&lt;li&gt;Europe (Stockholm)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This means that using AWS in these regions can significantly reduce carbon emissions and help companies move towards their sustainability goals.&lt;/p&gt;

&lt;h2&gt;
  
  
  Use serverless computing
&lt;/h2&gt;

&lt;p&gt;According to a &lt;a href="https://www.accenture.com/_acnmedia/PDF-177/Accenture-Tech-Sustainability-uniting-Sustainability-and-Technology.pdf" rel="noopener noreferrer"&gt;study by Accenture&lt;/a&gt;, &lt;a href="https://aws.amazon.com/serverless/" rel="noopener noreferrer"&gt;serverless computing&lt;/a&gt; can reduce carbon emissions by up to 70% compared to traditional server-based computing. This is because serverless computing platforms, such as AWS Lambda, can quickly scale up or down to match the demand for resources. As a result, businesses using AWS Lambda can reduce the number of servers they require, leading to a significant reduction in carbon emissions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Use auto-scaling
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/autoscaling/" rel="noopener noreferrer"&gt;AWS Auto-Scaling&lt;/a&gt; is a service that automatically adjusts the capacity of an application in response to changing demand. It monitors resource utilization and scales resources up or down as necessary. By using AWS Auto Scaling, businesses can ensure that their applications are always running at optimal performance levels, without wasting resources or energy.&lt;/p&gt;

&lt;h2&gt;
  
  
  Choose the most energy-efficient instance type
&lt;/h2&gt;

&lt;p&gt;AWS offers a wide range of instance types to choose from, some of which are more energy-efficient than others. For example, instances built on ARM-based AWS Graviton processors are generally more energy-efficient than comparable x86-based instances.&lt;/p&gt;
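
&lt;p&gt;With an infrastructure-as-code tool such as Pulumi, picking a Graviton type is a one-line choice. A sketch, assuming the &lt;code&gt;@pulumi/aws&lt;/code&gt; package; &lt;code&gt;t4g.micro&lt;/code&gt; is one example of a Graviton instance type, and the AMI filter is one way to get a matching arm64 image:&lt;/p&gt;

```typescript
import * as aws from "@pulumi/aws";

// Look up a current arm64 Amazon Linux 2 AMI to match the Graviton CPU
const ami = aws.ec2.getAmi({
  mostRecent: true,
  owners: ["amazon"],
  filters: [{ name: "name", values: ["amzn2-ami-hvm-*-arm64-gp2"] }],
});

const instance = new aws.ec2.Instance("green-instance", {
  instanceType: "t4g.micro", // Graviton2: ARM-based, generally more energy-efficient
  ami: ami.then(a => a.id),
});
```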

&lt;h2&gt;
  
  
  Use CloudFront
&lt;/h2&gt;

&lt;p&gt;One of the main ways that &lt;a href="https://aws.amazon.com/cloudfront/" rel="noopener noreferrer"&gt;AWS CloudFront&lt;/a&gt; helps to reduce energy consumption and carbon emissions is through the use of edge locations. Edge locations are data centers that are located in different parts of the world and are designed to cache content so that it can be delivered quickly to customers in that region. By using edge locations, AWS CloudFront reduces the need for content to travel long distances, which in turn reduces the amount of energy needed to deliver that content. This helps to lower the carbon footprint of companies that use AWS CloudFront, as less energy is needed to power the delivery of their content.&lt;/p&gt;

&lt;p&gt;Another way that AWS CloudFront helps to reduce energy consumption and carbon emissions is through the use of caching. When content is requested by a customer, AWS CloudFront checks to see if that content is already stored in an edge location. If the content is already cached in an edge location, AWS CloudFront can deliver it quickly without having to retrieve it from the original source. This reduces the amount of energy needed to retrieve and deliver the content, as well as the carbon emissions associated with that process.&lt;/p&gt;

&lt;h2&gt;
  
  
  Use AWS Trusted Advisor
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/premiumsupport/technology/trusted-advisor/" rel="noopener noreferrer"&gt;AWS Trusted Advisor&lt;/a&gt; is a tool that provides recommendations to optimize AWS infrastructure and resources. It analyzes an organization's AWS environment and provides guidance on cost optimization, security, performance, and fault tolerance. One of the lesser-known features of Trusted Advisor is its ability to help reduce energy consumption and carbon emissions. It provides a list of best practices for reducing energy consumption and carbon emissions.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Future of Sustainable Cloud
&lt;/h2&gt;

&lt;p&gt;As cloud technology continues to evolve, it’s important that we prioritize sustainability and work to reduce our impact on the environment. By incorporating sustainability into our cloud operations, we can create a more sustainable future for our planet and ensure that we can continue to enjoy the things that make Earth so unique, like pizza.&lt;/p&gt;

&lt;p&gt;AWS’s new Sustainability Pillar and customer carbon footprint tool are just two examples of how cloud providers are working to make their operations more sustainable. As more organizations prioritize sustainability in their cloud architecture, we can create a more sustainable future for our planet and ensure that we can continue to thrive on Earth for generations to come.&lt;/p&gt;

</description>
      <category>sustainability</category>
      <category>aws</category>
      <category>greenit</category>
      <category>cloud</category>
    </item>
    <item>
      <title>Provisioning Basic AWS Resources with Pulumi and TypeScript: A Step-by-Step Tutorial</title>
      <dc:creator>Ahmad Kanj</dc:creator>
      <pubDate>Fri, 27 Jan 2023 15:34:14 +0000</pubDate>
      <link>https://dev.to/aws-builders/provisioning-basic-aws-resources-with-pulumi-and-typescript-a-step-by-step-tutorial-34j1</link>
      <guid>https://dev.to/aws-builders/provisioning-basic-aws-resources-with-pulumi-and-typescript-a-step-by-step-tutorial-34j1</guid>
      <description>&lt;p&gt;Pulumi is a cloud-native infrastructure as code (IAC) tool that allows developers to provision, manage, and update cloud resources using familiar programming languages such as JavaScript, TypeScript, Python, and Go. It can be used to automate the deployment and management of resources on various cloud platforms, including AWS.&lt;/p&gt;

&lt;p&gt;In this article, we will go over how to use Pulumi to provision basic AWS resources using TypeScript.&lt;/p&gt;

&lt;p&gt;Before we begin, you will need to have an AWS account set up, and have your AWS access key and secret key handy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Setting up and configuring Pulumi to access your AWS account:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;First, install the Pulumi CLI on your machine. You can download it from the Pulumi website at:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;https://www.pulumi.com/docs/get-started/install/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run the command &lt;code&gt;pulumi login&lt;/code&gt; to log in to your Pulumi account.&lt;/p&gt;

&lt;p&gt;Run the command &lt;code&gt;pulumi config set aws:region &amp;lt;region&amp;gt;&lt;/code&gt; to set the desired region for your deployment. Replace &lt;code&gt;&amp;lt;region&amp;gt;&lt;/code&gt; with your preferred region (e.g. &lt;code&gt;eu-central-1&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;Run the commands &lt;code&gt;pulumi config set aws:accessKey &amp;lt;access_key&amp;gt;&lt;/code&gt; and &lt;code&gt;pulumi config set --secret aws:secretKey &amp;lt;secret_key&amp;gt;&lt;/code&gt; to set your AWS access and secret keys. The &lt;code&gt;--secret&lt;/code&gt; flag stores the value encrypted rather than in plaintext.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Creating a new Pulumi project:&lt;/strong&gt; Run the command &lt;code&gt;pulumi new aws-typescript&lt;/code&gt; to create a new Pulumi project using the AWS TypeScript template.&lt;br&gt;
This will create a new directory with a basic Pulumi project structure and some starter code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Create an S3 bucket:&lt;/strong&gt; One of the most basic resources you can create in AWS is an S3 bucket. You can create an S3 bucket using the following TypeScript code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";

// Create an S3 bucket; Pulumi appends a random suffix to the physical name
const bucket = new aws.s3.Bucket("my-bucket");
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
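
&lt;p&gt;To see the bucket's final (suffixed) name after deployment, you can export it as a stack output. A small optional addition to the program above:&lt;/p&gt;

```typescript
// Export the bucket name so it can be read after deployment
export const bucketName = bucket.id;
```

&lt;p&gt;After deploying, &lt;code&gt;pulumi stack output bucketName&lt;/code&gt; prints the value.&lt;/p&gt;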



&lt;p&gt;&lt;strong&gt;Create an EC2 instance:&lt;/strong&gt; Another basic resource you can create in AWS is an EC2 instance. You can create an EC2 instance using the following TypeScript code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";

// Look up a current Amazon Linux 2 AMI instead of hardcoding a region-specific ID
const ami = aws.ec2.getAmi({
  mostRecent: true,
  owners: ["amazon"],
  filters: [{ name: "name", values: ["amzn2-ami-hvm-*-x86_64-gp2"] }],
});

const instance = new aws.ec2.Instance("my-instance", {
  instanceType: "t2.micro",
  ami: ami.then(a =&gt; a.id),
});
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Create a security group:&lt;/strong&gt; To secure your EC2 instance, create a security group and reference its ID in the instance's &lt;code&gt;vpcSecurityGroupIds&lt;/code&gt; property:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";

// Allow inbound SSH. Note that 0.0.0.0/0 opens port 22 to the whole
// internet; restrict cidrBlocks to your own IP range in real deployments
const securityGroup = new aws.ec2.SecurityGroup("my-security-group", {
  ingress: [
    { protocol: "tcp", fromPort: 22, toPort: 22, cidrBlocks: ["0.0.0.0/0"] }
  ]
});
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Deploy the resources:&lt;/strong&gt; After you have created your resources, you can deploy them to AWS by running the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pulumi up
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Destroy resources:&lt;/strong&gt; Pulumi also allows you to update and destroy resources. You can update resources by modifying the code and running the &lt;code&gt;pulumi up&lt;/code&gt; command again. To destroy resources, you can run the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pulumi destroy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By using Pulumi with TypeScript, you can easily provision and manage basic AWS resources such as S3 buckets, EC2 instances, and security groups. Pulumi provides a simple and efficient way to manage cloud infrastructure while also allowing developers to use a familiar programming language.&lt;/p&gt;

&lt;p&gt;In summary, Pulumi is a powerful tool that makes it easy to provision and manage resources on AWS using familiar programming languages. It is a great choice for teams looking to automate their infrastructure and make it more scalable and manageable.&lt;/p&gt;

</description>
      <category>gratitude</category>
    </item>
  </channel>
</rss>
