<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: AWS Community Builders </title>
    <description>The latest articles on DEV Community by AWS Community Builders  (@aws-builders).</description>
    <link>https://dev.to/aws-builders</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Forganization%2Fprofile_image%2F2794%2F88da75b6-aadd-4ea1-8083-ae2dfca8be94.png</url>
      <title>DEV Community: AWS Community Builders </title>
      <link>https://dev.to/aws-builders</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/aws-builders"/>
    <language>en</language>
    <item>
      <title>Event-driven media intelligence with AWS Step Functions and Bedrock</title>
      <dc:creator>Collins Ushi</dc:creator>
      <pubDate>Sat, 18 Apr 2026 19:00:05 +0000</pubDate>
      <link>https://dev.to/aws-builders/event-driven-media-intelligence-with-aws-step-functions-and-bedrock-46lp</link>
      <guid>https://dev.to/aws-builders/event-driven-media-intelligence-with-aws-step-functions-and-bedrock-46lp</guid>
      <description>&lt;p&gt;Every modern product that handles user-generated media; say a podcast platform, a video CMS, a learning product, a content-moderation layer   - eventually runs into the same problem. A file lands in storage, and now the system needs to understand it: extract the speech, identify what's on screen, summarise it, tag it, and make it queryable. Doing that on a single server is expensive, fragile, and impossible to scale predictably.&lt;/p&gt;

&lt;p&gt;This article walks through a serverless design for that problem on AWS. The pipeline ingests audio, video, or images, runs them through managed AI services (Rekognition, Transcribe, Bedrock), and persists the extracted intelligence into DynamoDB for downstream use - all without operating a single EC2 instance. The focus is on why each piece exists and how it fits together, not just naming the services.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why serverless is the right shape for this problem
&lt;/h2&gt;

&lt;p&gt;Media workloads are spiky and long-running in a way that punishes traditional compute. &lt;br&gt;
A single 45-minute video can take several minutes to transcribe; ten of them landing at once shouldn't require keeping ten servers warm all day. The workflow also fans out naturally: transcription, visual analysis, and thumbnail generation are independent steps that can run in parallel.&lt;/p&gt;

&lt;p&gt;Three properties of a serverless, event-driven design map cleanly onto this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Elastic concurrency: Lambda, Rekognition, and Transcribe scale out on demand. You pay per request and per second of processing, not for idle capacity.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Native asynchrony: Step Functions turns a multi-stage AI pipeline into a declarative state machine with retries, timeouts, and parallel branches built in - no message queues or cron jobs to wire by hand.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Composability: Every stage is a managed service with a clear IAM contract. Swapping Transcribe for a different ASR provider, or Bedrock for a self-hosted model, is a local change.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;
  
  
  Architecture at a glance
&lt;/h2&gt;

&lt;p&gt;The pipeline has five logical layers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Ingestion&lt;/strong&gt;: S3 receives the upload.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Event routing&lt;/strong&gt;: EventBridge picks up the &lt;code&gt;Object Created&lt;/code&gt; event and triggers the workflow.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Orchestration&lt;/strong&gt;: Step Functions coordinates the processing stages.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Intelligence&lt;/strong&gt;: Lambda functions call Rekognition, Transcribe, and Bedrock to extract structured data.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Persistence&lt;/strong&gt;: DynamoDB stores the metadata, keyed by media ID, for downstream querying.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3yk07lx3aczm988zhs2c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3yk07lx3aczm988zhs2c.png" alt=" " width="632" height="289"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Walking through each component
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. S3 as the ingestion surface&lt;/strong&gt;&lt;br&gt;
The pipeline starts where most media pipelines start: a single S3 bucket configured for object uploads. A few things are worth doing at this layer that are easy to skip:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Separate buckets (or at least prefixes) for raw input and processed artefacts&lt;/strong&gt;. Keeping &lt;code&gt;incoming/&lt;/code&gt; and &lt;code&gt;processed/&lt;/code&gt; distinct means lifecycle rules, replication, and IAM policies stay clean as the system grows.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multipart upload configuration&lt;/strong&gt;. Large video files need multipart uploads; the default S3 SDKs handle this, but the bucket should have a lifecycle rule to abort incomplete multipart uploads after N days, or costs quietly accumulate.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Server-side encryption and versioning&lt;/strong&gt;. If the system ever handles sensitive content (KYC video, medical, private podcasts), this is non-negotiable.&lt;/li&gt;
&lt;/ul&gt;
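
&lt;p&gt;As a concrete sketch of that abort rule with boto3 (the bucket name, prefix, and seven-day window here are assumptions, not recommendations):&lt;/p&gt;

```python
# Lifecycle rule: abort multipart uploads left incomplete for 7 days,
# so abandoned upload parts stop accruing storage cost.
LIFECYCLE = {
    "Rules": [
        {
            "ID": "abort-stale-multipart",
            "Filter": {"Prefix": "incoming/"},
            "Status": "Enabled",
            "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
        }
    ]
}

def apply_lifecycle(bucket):
    import boto3  # real call; requires AWS credentials for the target account
    boto3.client("s3").put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=LIFECYCLE
    )
```

&lt;p&gt;Scoping the rule to the &lt;code&gt;incoming/&lt;/code&gt; prefix keeps it from touching processed artefacts, which is one reason to keep the prefixes distinct in the first place.&lt;/p&gt;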

&lt;p&gt;&lt;strong&gt;2. EventBridge as the event router&lt;/strong&gt;&lt;br&gt;
You could wire S3 notifications directly to a Lambda; it works, but it's rigid. Putting EventBridge between S3 and the rest of the pipeline gives you something much more valuable: a declarative event bus whose rules pattern-match on the event payload.&lt;br&gt;
The immediate benefit is filtering. A rule can fire the pipeline only for specific prefixes, file extensions, or size ranges:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"aws.s3"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"detail-type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Object Created"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"detail"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"bucket"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"media-ingest-prod"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"object"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"key"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"prefix"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"incoming/"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"size"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"numeric"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&amp;gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The longer-term benefit is that the event bus becomes a seam. When a second consumer appears (say, an analytics team that wants to count uploads per tenant), it attaches its own rule without touching the core pipeline.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Step Functions as the orchestrator&lt;/strong&gt;&lt;br&gt;
This is the heart of the design. Step Functions expresses the pipeline as a JSON (or YAML) state machine, and the payoff is substantial: retries with exponential backoff, per-state timeouts, parallel branches, and error paths are all configuration rather than code.&lt;/p&gt;

&lt;p&gt;A reasonable shape for the state machine looks like this:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ClassifyMedia&lt;/strong&gt;: a Lambda that inspects the MIME type and branches on whether the file is image, audio, or video.&lt;/p&gt;
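
&lt;p&gt;A minimal sketch of that classification step, assuming the EventBridge S3 event shape; the returned labels are illustrative, not a fixed contract:&lt;/p&gt;

```python
import mimetypes

def classify_media(event):
    # Guess the MIME type from the S3 object key and branch on the top-level
    # media family. Anything unrecognised falls through to "unsupported".
    key = event["detail"]["object"]["key"]
    mime, _ = mimetypes.guess_type(key)
    mime = mime or "application/octet-stream"
    if mime.startswith("image/"):
        media_type = "image"
    elif mime.startswith("audio/"):
        media_type = "audio"
    elif mime.startswith("video/"):
        media_type = "video"
    else:
        media_type = "unsupported"
    return {"key": key, "mime": mime, "media_type": media_type}
```
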

&lt;p&gt;&lt;strong&gt;ParallelAnalysis&lt;/strong&gt;: a Parallel state that runs, for a video:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;StartTranscriptionJob&lt;/strong&gt; (Transcribe) on the extracted audio track.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;StartLabelDetection&lt;/strong&gt; and &lt;strong&gt;StartContentModeration&lt;/strong&gt; (Rekognition Video) on the visual track.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;WaitForCompletion&lt;/strong&gt;: asynchronous Rekognition and Transcribe jobs publish completion events; the state machine uses the &lt;code&gt;.waitForTaskToken&lt;/code&gt; integration pattern (or polling) to resume once results are ready.&lt;/p&gt;
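
&lt;p&gt;The callback side of the task-token pattern can be sketched as a small handler: a completion notification arrives, the stored token is looked up, and the paused execution resumes. The message shape and token lookup here are illustrative; in practice the token is persisted (for example in DynamoDB, keyed by job ID) when the job starts:&lt;/p&gt;

```python
import json

def handle_job_completion(event, lookup_token, sfn_client):
    # SNS delivers the job-completion notification; resume the paused
    # Step Functions execution with its stored task token.
    # sfn_client would be boto3.client("stepfunctions") in a real Lambda.
    message = json.loads(event["Records"][0]["Sns"]["Message"])
    job_id = message["JobId"]
    token = lookup_token(job_id)  # hypothetical lookup, e.g. a DynamoDB read
    if message.get("Status") == "SUCCEEDED":
        sfn_client.send_task_success(taskToken=token, output=json.dumps(message))
    else:
        sfn_client.send_task_failure(taskToken=token, error="JobFailed")
    return job_id
```
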

&lt;p&gt;&lt;strong&gt;SummariseWithBedrock&lt;/strong&gt;: a Lambda that sends the transcript and label set to a Bedrock model and receives a structured summary.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;PersistMetadata&lt;/strong&gt;: writes the final object into DynamoDB.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CatchAll&lt;/strong&gt;: a catch block that routes any failure to a dead-letter state, writes the error into a failures table, and optionally notifies via SNS.&lt;/p&gt;

&lt;p&gt;The thing to internalise is that Step Functions is not "just" a workflow visualiser; it's the reason the pipeline is resilient. A transient Bedrock throttling error retries automatically; a malformed file fails into the catch branch instead of silently leaving the system in a half-processed state.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Lambda as the glue&lt;/strong&gt;&lt;br&gt;
Each state in the machine is either a direct service integration (Step Functions can call Rekognition, Transcribe, and DynamoDB without a Lambda in between) or a small Lambda function for the bits that need custom logic: MIME classification, payload shaping before Bedrock, post-processing transcripts.&lt;/p&gt;

&lt;p&gt;A rule of thumb: prefer direct service integrations wherever possible. Every Lambda you add is code to test, package, monitor, and patch. The direct integrations are zero-code and zero-cold-start.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. The AI layer: Rekognition, Transcribe, and Bedrock&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Each of these handles a different modality:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Amazon Rekognition&lt;/strong&gt; handles visual analysis, object and scene labels, celebrity detection, content moderation flags, and text-in-image (OCR) for video and still images. For video, jobs are asynchronous: you start a job and receive an SNS notification when results are ready.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Amazon Transcribe&lt;/strong&gt; turns audio into a structured transcript with timestamps, speaker diarisation, and optional custom vocabularies. For a podcast pipeline, custom vocabularies dramatically improve accuracy on domain-specific terms (product names, acronyms, people).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Amazon Bedrock&lt;/strong&gt; is where the "AI-powered" part lives in the modern sense. Given the raw transcript and Rekognition labels, a Bedrock model (Claude, Nova, Llama, depending on preference) produces the artefacts downstream systems actually want: a one-paragraph summary, chapter markers with timestamps, topic tags, a short SEO description, a list of named entities.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The prompt design for the Bedrock step deserves its own attention. A pattern that works well is to ask the model for a &lt;strong&gt;strict JSON response&lt;/strong&gt; against a schema you define, rather than free-form text, so the persistence step can store it without fragile parsing:&lt;/p&gt;

&lt;p&gt;You will receive a transcript and a list of visual labels from a video.&lt;/p&gt;

&lt;p&gt;Return ONLY a JSON object with this exact shape:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"summary"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;string&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;(max&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;80&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;words)&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"chapters"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"start_seconds"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;number&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"topics"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"entities"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"people"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[],&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"organisations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[],&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"places"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
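
&lt;p&gt;On the consuming side, treat the model output as untrusted and validate it before persisting. A hedged sketch, assuming the response text has already been extracted from the Bedrock API reply; the required keys mirror the schema above, and the fence-stripping heuristic is illustrative:&lt;/p&gt;

```python
import json

REQUIRED_KEYS = ("summary", "chapters", "topics", "entities")

def parse_model_output(text):
    # Models sometimes wrap JSON in prose or code fences, so take the
    # outermost brace-delimited object before parsing.
    start = text.find("{")
    end = text.rfind("}") + 1
    if start == -1 or end == 0:
        raise ValueError("no JSON object in model output")
    doc = json.loads(text[start:end])
    missing = [k for k in REQUIRED_KEYS if k not in doc]
    if missing:
        raise ValueError("missing keys: " + ", ".join(missing))
    return doc
```

&lt;p&gt;Raising on a malformed response is deliberate: it pushes the failure into the state machine's retry and catch paths instead of persisting a partial record.&lt;/p&gt;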



&lt;p&gt;&lt;strong&gt;6. DynamoDB as the metadata store&lt;/strong&gt;&lt;br&gt;
DynamoDB is the right fit here because the access patterns are simple and the shape is predictable. A single table keyed by &lt;code&gt;media_id&lt;/code&gt; (partition key) with &lt;code&gt;created_at&lt;/code&gt; as the sort key handles the primary "fetch processing result for this upload" pattern. A GSI on tenant or status lets you answer "show me all videos processed in the last 24 hours" without scanning.&lt;br&gt;
Keep the Bedrock-generated summary and the raw transcript reference separate: summaries are small and hot, transcripts can be large and belong back in S3 with just a pointer in DynamoDB.&lt;/p&gt;
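
&lt;p&gt;A sketch of that key design in code; the table name, attribute names, and GSI choice are assumptions:&lt;/p&gt;

```python
import datetime

def build_record(media_id, tenant_id, summary, transcript_s3_uri, now=None):
    # Small, hot summary stays inline; the large transcript stays in S3,
    # referenced only by URI.
    now = now or datetime.datetime.now(datetime.timezone.utc)
    return {
        "media_id": media_id,              # partition key
        "created_at": now.isoformat(),     # sort key; ISO 8601 sorts lexically
        "tenant_id": tenant_id,            # candidate GSI partition key
        "status": "PROCESSED",             # candidate GSI filter attribute
        "summary": summary,
        "transcript_s3_uri": transcript_s3_uri,
    }

def persist(record):
    import boto3  # real write; requires AWS credentials
    boto3.resource("dynamodb").Table("media-intelligence").put_item(Item=record)
```
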

&lt;h2&gt;
  
  
  Putting it together: an end-to-end request
&lt;/h2&gt;

&lt;p&gt;Concretely, here's what happens when a user uploads a 20-minute interview video:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;The client performs a multipart upload to &lt;code&gt;s3://media-ingest-prod/incoming/&amp;lt;tenant-id&amp;gt;/&amp;lt;uuid&amp;gt;.mp4&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;S3 emits an Object Created event to the default event bus.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;An EventBridge rule matches on the &lt;code&gt;incoming/&lt;/code&gt; prefix and starts a Step Functions execution, passing the bucket and key as input.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The state machine classifies the file as video, then fans out in parallel: Transcribe starts on the audio track, Rekognition starts label detection and content moderation on the visual track.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Both jobs complete asynchronously; the state machine resumes via task tokens.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A Lambda gathers the transcript and labels, constructs a Bedrock prompt, and receives a structured JSON summary.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A final step writes the full metadata record to DynamoDB and moves the original file from &lt;code&gt;incoming/&lt;/code&gt; to &lt;code&gt;processed/&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Any failure along the way is caught, logged to a failures table, and surfaced via an SNS topic that pages the on-call engineer if it crosses a threshold.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The whole thing runs without a single long-lived server.&lt;/p&gt;

&lt;h2&gt;
  
  
  Things that bite you in production
&lt;/h2&gt;

&lt;p&gt;Diagrams make this look clean. A few practical notes from taking a pipeline like this to production:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;IAM is the hard part&lt;/strong&gt;. Each Lambda and each Step Functions state needs a narrowly scoped role. Do not share a single execution role across the pipeline; one overly broad policy is how &lt;code&gt;s3:GetObject&lt;/code&gt; on &lt;code&gt;*&lt;/code&gt; turns into a security incident.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Bedrock throttling is real&lt;/strong&gt;. Regional model quotas can and will throttle you under load. Wrap the Bedrock call in a retry policy with jitter, and consider provisioned throughput if the pipeline is on a critical path.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Large transcripts exceed context windows&lt;/strong&gt;. A two-hour podcast transcript can be larger than a single model context. Chunk on speaker turns or paragraph boundaries, summarise each chunk, then do a reduce pass - a classic map-reduce over text.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cost lurks in the idle corners&lt;/strong&gt;. The obvious costs (Transcribe per audio-minute, Rekognition per video-minute, Bedrock per token) are easy to forecast. The easily-missed ones are Step Functions state transitions on high-volume pipelines and CloudWatch Logs ingestion if you log every event verbatim. Sample your logs.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Observability needs to span the whole flow&lt;/strong&gt;. A single correlation ID (the S3 object key or a generated media ID) should travel through every Lambda, Step Functions state, and DynamoDB record, so that when something breaks you can reconstruct exactly what happened with a single query.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
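
&lt;p&gt;The chunk-then-reduce idea from the context-window note above can be sketched with a pluggable summarise function standing in for the Bedrock call; the character budget and joining strategy are illustrative:&lt;/p&gt;

```python
def summarise_long_transcript(turns, summarise, max_chars=8000):
    # turns: list of speaker-turn strings; summarise: callable taking a string
    # and returning its summary (a Bedrock invocation in the real pipeline).
    # Map: pack whole speaker turns into chunks under the size budget.
    chunks, current = [], ""
    for turn in turns:
        if current and len(current) + len(turn) + 1 > max_chars:
            chunks.append(current)
            current = turn
        else:
            current = (current + "\n" + turn) if current else turn
    if current:
        chunks.append(current)
    partials = [summarise(c) for c in chunks]
    # Reduce: one final pass over the concatenated partial summaries.
    return summarise("\n".join(partials)) if len(partials) > 1 else partials[0]
```

&lt;p&gt;Splitting on speaker turns rather than raw character offsets keeps each chunk semantically coherent, which noticeably improves the partial summaries.&lt;/p&gt;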

&lt;h2&gt;
  
  
  Where this design shines
&lt;/h2&gt;

&lt;p&gt;The same skeleton supports a surprising range of products with only the Bedrock prompt and the DynamoDB schema changing:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Podcast platforms&lt;/strong&gt; that want automatic show notes, chapter markers, and searchable transcripts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Video intelligence tools&lt;/strong&gt; for media libraries: tagging, searching, and moderating large content archives.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Learning and compliance products&lt;/strong&gt; that need to extract key points and generate quizzes from lecture recordings.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Content moderation systems&lt;/strong&gt; combining Rekognition's moderation labels with LLM-based policy reasoning for edge cases.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Customer support analytics&lt;/strong&gt; processing recorded calls to surface sentiment, topics, and escalation signals.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where to take it next
&lt;/h2&gt;

&lt;p&gt;Once the base pipeline is in place, the interesting extensions are mostly at the edges:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Realtime mode&lt;/strong&gt;. Swap batch Transcribe for Transcribe Streaming and emit partial results over WebSockets for live captioning.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Semantic search&lt;/strong&gt;. Pipe the Bedrock-generated summary and transcript chunks into an embedding model and store the vectors in OpenSearch or a vector store; now the media library is searchable by meaning, not just tags.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Human-in-the-loop review&lt;/strong&gt;. For content moderation or compliance, route low-confidence Bedrock decisions to an SQS queue backed by a reviewer UI, and feed the decisions back as training data.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Multi-tenant isolation&lt;/strong&gt;. Use S3 access points and dynamic Step Functions execution roles to enforce tenant boundaries at the infrastructure layer rather than in application code.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;My closing thought&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The interesting thing about this architecture isn't any individual service; it's that the boundaries between them are event-driven and declarative. You can reason about the system by looking at the state machine definition and the EventBridge rules, not by reading through layers of application code. That's what makes it durable: when the product team asks for a new capability ("can we also detect languages automatically?"), you add a state, not a service.&lt;/p&gt;

&lt;p&gt;Serverless isn't the right answer for every system, but for "something landed in storage, now go understand it," it's hard to beat. The pipeline scales with the work, costs track usage, and the blast radius of any one failure is a single execution, not the whole platform.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>lambda</category>
      <category>aws</category>
      <category>awscommunitybuilder</category>
    </item>
    <item>
      <title>Resurface Claude Code Usage Across Your Team with CloudWatch OTEL (No Lambda)</title>
      <dc:creator>Gabriel Koo</dc:creator>
      <pubDate>Sat, 18 Apr 2026 12:59:36 +0000</pubDate>
      <link>https://dev.to/aws-builders/resurface-claude-code-usage-across-your-team-with-cloudwatch-otel-no-lambda-4p0i</link>
      <guid>https://dev.to/aws-builders/resurface-claude-code-usage-across-your-team-with-cloudwatch-otel-no-lambda-4p0i</guid>
      <description>&lt;p&gt;I've been building AI tooling infrastructure to empower a team of 50+ software engineers to do vibe coding safely. We went from 3 engineers using AI full-time to 50+ in 6 months — including non-engineers. (I co-presented on this journey at &lt;a href="https://the-quantum-nargle.github.io/agentcon-2026-hk-slides/" rel="noopener noreferrer"&gt;AgentCon Hong Kong 2026&lt;/a&gt;.)&lt;/p&gt;

&lt;p&gt;One thing we learned: &lt;strong&gt;you can't improve what you can't measure.&lt;/strong&gt; Once you give a team AI coding tools, you want visibility into how they're being used — not to evaluate individual engineers, but to understand adoption patterns. Which tools are people reaching for? How large are the prompts? What tool calls are being made? Making these metrics visible to everyone helps the team learn from each other and helps champions pull others forward.&lt;/p&gt;

&lt;p&gt;This post is about the plumbing: how to get that telemetry data from coding agents into CloudWatch with minimal infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;"But we already have an LLM gateway."&lt;/strong&gt; If your team routes AI traffic through a gateway like &lt;a href="https://github.com/BerriAI/litellm" rel="noopener noreferrer"&gt;LiteLLM&lt;/a&gt; or &lt;a href="https://aws.amazon.com/bedrock/" rel="noopener noreferrer"&gt;AWS Bedrock&lt;/a&gt;, you already have token-level usage data. But if your engineers are on coding plans — Claude Team/Max, OpenCode Go, GitHub Copilot seats, ChatGPT Codex — the LLM calls bypass your gateway entirely. You lose visibility into the interesting stuff: how many tool calls per session, prompt sizes, which tools are being invoked, who's active at what times. That's where OTEL telemetry fills the gap.&lt;/p&gt;

&lt;p&gt;AI coding tools are shipping with built-in OpenTelemetry support. &lt;a href="https://docs.anthropic.com/en/docs/claude-code/monitoring-usage" rel="noopener noreferrer"&gt;Claude Code&lt;/a&gt;, &lt;a href="https://support.claude.com/en/articles/14477985-monitor-claude-cowork-activity-with-opentelemetry" rel="noopener noreferrer"&gt;Claude CoWork&lt;/a&gt;, &lt;a href="https://docs.github.com/copilot/how-tos/copilot-sdk/observability/opentelemetry" rel="noopener noreferrer"&gt;GitHub Copilot&lt;/a&gt;, &lt;a href="https://geminicli.com/docs/cli/telemetry/" rel="noopener noreferrer"&gt;Gemini CLI&lt;/a&gt;, and &lt;a href="https://github.com/LangGuard-AI/cursor-otel-hook" rel="noopener noreferrer"&gt;Cursor&lt;/a&gt; (via hooks) all export metrics, traces, and log events over OTLP/HTTP — token counts, tool durations, model latency, the works. &lt;a href="https://github.com/kirodotdev/Kiro/issues/6319" rel="noopener noreferrer"&gt;Kiro has an open feature request&lt;/a&gt; for native OTEL support too.&lt;/p&gt;

&lt;p&gt;There's one catch: &lt;strong&gt;CloudWatch's OTLP endpoints require SigV4 signing.&lt;/strong&gt; These tools' OTEL SDKs can't do that. Neither can most OTEL SDKs without an AWS-specific exporter or a collector sidecar.&lt;/p&gt;

&lt;p&gt;The usual fix is a Lambda function that receives OTLP, signs it, and forwards it. That means cold starts, packaging, and another thing to maintain.&lt;/p&gt;

&lt;p&gt;Here's a simpler way: &lt;strong&gt;API Gateway REST API with AWS Service Integration.&lt;/strong&gt; APIGW signs the request with SigV4 using an execution role. No Lambda. No collector. No code.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fef5flvd70ipoiq5n5psj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fef5flvd70ipoiq5n5psj.png" alt="Expanding Brain Meme"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Timeline:&lt;/strong&gt; CloudWatch has supported OTLP ingestion for &lt;a href="https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-OTLPEndpoint.html" rel="noopener noreferrer"&gt;traces and logs&lt;/a&gt; for some time (availability varies by region — check the &lt;a href="https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-OTLPEndpoint.html" rel="noopener noreferrer"&gt;OTLP endpoints doc&lt;/a&gt;). &lt;a href="https://aws.amazon.com/about-aws/whats-new/2026/04/amazon-cloudwatch-opentelemetry-metrics/" rel="noopener noreferrer"&gt;Native OTLP metrics support launched April 2, 2026&lt;/a&gt; in public preview, completing all three pillars of observability via OTLP.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk3fguhkeofv8x6gf2e2w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk3fguhkeofv8x6gf2e2w.png" alt="Architecture"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;AI Coding Tool (OTEL SDK)
  ↓ OTLP/HTTP + x-api-key
API Gateway REST API
  ├→ POST /v1/metrics  → AWS Integration → monitoring (SigV4) → CloudWatch Metrics
  ├→ POST /v1/traces   → AWS Integration → xray (SigV4)      → X-Ray / CloudWatch Logs
  └→ POST /v1/logs     → AWS Integration → logs (SigV4)       → CloudWatch Logs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The client sends standard OTLP/HTTP requests with an API key. APIGW validates the key, assumes an IAM role, signs the request with SigV4, and forwards it to the CloudWatch OTLP endpoint. That's it.&lt;/p&gt;
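
&lt;p&gt;For an end-to-end smoke test, you can hand-post a minimal OTLP metrics payload through the gateway. The endpoint URL and API key below are placeholders; note the tools' SDKs send protobuf, while this sketch uses the OTLP/HTTP JSON encoding (verify against the CloudWatch OTLP docs that JSON is accepted in your region):&lt;/p&gt;

```python
import json
import time
import urllib.request

def build_payload():
    # Minimal OTLP/JSON metrics export request with a single gauge datapoint.
    # Field names follow the proto3 JSON mapping used by OTLP/HTTP.
    return {
        "resourceMetrics": [{
            "resource": {"attributes": [
                {"key": "service.name", "value": {"stringValue": "otlp-smoke-test"}}
            ]},
            "scopeMetrics": [{"metrics": [{
                "name": "smoke.test",
                "gauge": {"dataPoints": [
                    {"asDouble": 1.0, "timeUnixNano": str(time.time_ns())}
                ]},
            }]}],
        }]
    }

def post_test_metric(endpoint, api_key):
    # endpoint and api_key are placeholders for your APIGW stage URL and key.
    req = urllib.request.Request(
        endpoint + "/v1/metrics",
        data=json.dumps(build_payload()).encode(),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )
    return urllib.request.urlopen(req).status
```
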

&lt;h2&gt;
  
  
  Why This Works
&lt;/h2&gt;

&lt;p&gt;API Gateway REST API has an integration type called &lt;strong&gt;&lt;a href="https://docs.aws.amazon.com/apigateway/latest/developerguide/getting-started-aws-proxy.html" rel="noopener noreferrer"&gt;AWS Service Integration&lt;/a&gt;&lt;/strong&gt;. It can call any AWS service API and sign the request with SigV4 using an execution role. The &lt;a href="https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-OTLPEndpoint.html" rel="noopener noreferrer"&gt;CloudWatch OTLP endpoints&lt;/a&gt; are standard AWS service endpoints:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Signal&lt;/th&gt;
&lt;th&gt;Endpoint&lt;/th&gt;
&lt;th&gt;Service&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Metrics&lt;/td&gt;
&lt;td&gt;&lt;code&gt;monitoring.{region}.amazonaws.com/v1/metrics&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;monitoring&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Traces&lt;/td&gt;
&lt;td&gt;&lt;code&gt;xray.{region}.amazonaws.com/v1/traces&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;xray&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Logs&lt;/td&gt;
&lt;td&gt;&lt;code&gt;logs.{region}.amazonaws.com/v1/logs&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;logs&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;APIGW's integration URI format maps directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;arn:aws:apigateway:{region}:monitoring:path/v1/metrics
arn:aws:apigateway:{region}:xray:path/v1/traces
arn:aws:apigateway:{region}:logs:path/v1/logs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
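&lt;p&gt;To see how the endpoint table and the URI format line up, a small helper (a sketch, with the region as a parameter) can derive each integration URI from its signal:&lt;/p&gt;

```python
# Signal -> (service, path), taken from the endpoint table above.
SIGNALS = {
    "metrics": ("monitoring", "/v1/metrics"),
    "traces": ("xray", "/v1/traces"),
    "logs": ("logs", "/v1/logs"),
}

def integration_uri(region: str, signal: str) -> str:
    """Build the APIGW AWS-integration URI for an OTLP signal."""
    service, path = SIGNALS[signal]
    return f"arn:aws:apigateway:{region}:{service}:path{path}"

for signal in SIGNALS:
    print(integration_uri("us-west-2", signal))
```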



&lt;h2&gt;
  
  
  Setup
&lt;/h2&gt;

&lt;p&gt;The full infrastructure is defined in a CloudFormation template (link at the bottom). Here's what it creates:&lt;/p&gt;

&lt;h3&gt;
  
  
  IAM Execution Role
&lt;/h3&gt;

&lt;p&gt;APIGW needs an IAM role to sign requests to CloudWatch. The policy is scoped to only the actions and resources needed for OTLP ingestion:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;Parameters&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;OtlpLogGroupName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;String&lt;/span&gt;
    &lt;span class="na"&gt;Default&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;otlp-logs"&lt;/span&gt;
    &lt;span class="na"&gt;Description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;CloudWatch Logs log group for OTLP log ingestion&lt;/span&gt;

&lt;span class="na"&gt;Resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;OtlpExecutionRole&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::IAM::Role&lt;/span&gt;
    &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;AssumeRolePolicyDocument&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;Statement&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Effect&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Allow&lt;/span&gt;
            &lt;span class="na"&gt;Principal&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;Service&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apigateway.amazonaws.com&lt;/span&gt;
            &lt;span class="na"&gt;Action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;sts:AssumeRole&lt;/span&gt;
      &lt;span class="na"&gt;Policies&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;PolicyName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;otlp-metrics&lt;/span&gt;
          &lt;span class="na"&gt;PolicyDocument&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;Statement&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Effect&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Allow&lt;/span&gt;
                &lt;span class="na"&gt;Action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;cloudwatch:PutMetricData&lt;/span&gt;
                &lt;span class="na"&gt;Resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;*"&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;PolicyName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;otlp-traces&lt;/span&gt;
          &lt;span class="na"&gt;PolicyDocument&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;Statement&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Effect&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Allow&lt;/span&gt;
                &lt;span class="na"&gt;Action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;xray:PutTraceSegments&lt;/span&gt;
                  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;xray:PutTelemetryRecords&lt;/span&gt;
                &lt;span class="na"&gt;Resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;*"&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;PolicyName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;otlp-logs&lt;/span&gt;
          &lt;span class="na"&gt;PolicyDocument&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;Statement&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Effect&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Allow&lt;/span&gt;
                &lt;span class="na"&gt;Action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;logs:PutLogEvents&lt;/span&gt;
                  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;logs:CreateLogStream&lt;/span&gt;
                  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;logs:DescribeLogStreams&lt;/span&gt;
                &lt;span class="na"&gt;Resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="kt"&gt;!Sub&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;arn:aws:logs:${AWS::Region}:${AWS::AccountId}:log-group:${OtlpLogGroupName}:*"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note: &lt;code&gt;cloudwatch:PutMetricData&lt;/code&gt; doesn't support resource-level ARNs. The &lt;code&gt;cloudwatch:namespace&lt;/code&gt; condition key exists but does not apply to the OTLP ingestion path — metrics are accepted regardless of namespace. X-Ray &lt;code&gt;PutTraceSegments&lt;/code&gt; also doesn't support resource-level restrictions. Logs permissions are scoped to a specific log group via the &lt;code&gt;OtlpLogGroupName&lt;/code&gt; parameter.&lt;/p&gt;

&lt;h3&gt;
  
  
  API Gateway with AWS Service Integration
&lt;/h3&gt;

&lt;p&gt;Each OTLP signal gets its own resource with an AWS integration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;MetricsMethod&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::ApiGateway::Method&lt;/span&gt;
  &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;HttpMethod&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;POST&lt;/span&gt;
    &lt;span class="na"&gt;AuthorizationType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;NONE&lt;/span&gt;
    &lt;span class="na"&gt;ApiKeyRequired&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
    &lt;span class="na"&gt;Integration&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS&lt;/span&gt;
      &lt;span class="na"&gt;IntegrationHttpMethod&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;POST&lt;/span&gt;
      &lt;span class="na"&gt;Uri&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Sub&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;arn:aws:apigateway:${AWS::Region}:monitoring:path/v1/metrics"&lt;/span&gt;
      &lt;span class="na"&gt;Credentials&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!GetAtt&lt;/span&gt; &lt;span class="s"&gt;OtlpExecutionRole.Arn&lt;/span&gt;
      &lt;span class="na"&gt;PassthroughBehavior&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;WHEN_NO_MATCH&lt;/span&gt;
      &lt;span class="na"&gt;ContentHandling&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;CONVERT_TO_TEXT&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same pattern for &lt;code&gt;/v1/traces&lt;/code&gt; (service: &lt;code&gt;xray&lt;/code&gt;) and &lt;code&gt;/v1/logs&lt;/code&gt; (service: &lt;code&gt;logs&lt;/code&gt;).&lt;/p&gt;

&lt;h3&gt;
  
  
  API Key Authentication
&lt;/h3&gt;

&lt;p&gt;Protect the endpoint with an API key so only your tools can send telemetry:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;ApiKey&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::ApiGateway::ApiKey&lt;/span&gt;
  &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Enabled&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;

&lt;span class="na"&gt;UsagePlan&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::ApiGateway::UsagePlan&lt;/span&gt;
  &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;ApiStages&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;ApiId&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="s"&gt;Api&lt;/span&gt;
        &lt;span class="na"&gt;Stage&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="s"&gt;Stage&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Configure Your Tools
&lt;/h2&gt;

&lt;p&gt;The proxy works with any tool that supports standard OTEL environment variables. Here's how to configure each:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Disclaimer:&lt;/strong&gt; I personally use Claude Code routed through a custom LLM gateway (not a coding plan), since some coding plans aren't available in the region I live in. The configurations below are based on each tool's official documentation — your mileage may vary.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Claude Code
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://docs.anthropic.com/en/docs/claude-code/monitoring-usage" rel="noopener noreferrer"&gt;Official monitoring docs&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;CLAUDE_CODE_ENABLE_TELEMETRY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;OTEL_METRICS_EXPORTER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;otlp
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;OTEL_LOGS_EXPORTER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;otlp
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;OTEL_EXPORTER_OTLP_PROTOCOL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;http/json
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;OTEL_EXPORTER_OTLP_ENDPOINT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;https://xxx.execute-api.us-west-2.amazonaws.com/prod
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;OTEL_EXPORTER_OTLP_HEADERS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;x-api-key&lt;span class="o"&gt;=&lt;/span&gt;your-api-key
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;OTEL_SERVICE_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;claude-code
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For short-lived tasks, lower the export interval so data flushes before the process exits:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;OTEL_METRIC_EXPORT_INTERVAL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1000
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;OTEL_LOGS_EXPORT_INTERVAL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For traces (beta):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;CLAUDE_CODE_ENHANCED_TELEMETRY_BETA&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;OTEL_TRACES_EXPORTER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;otlp
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;OTEL_TRACES_EXPORT_INTERVAL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Enforcing OTEL across your team:&lt;/strong&gt; Claude Code supports &lt;a href="https://code.claude.com/docs/en/settings#settings-files" rel="noopener noreferrer"&gt;managed settings&lt;/a&gt; via &lt;code&gt;managed-settings.json&lt;/code&gt;, deployable through MDM (Jamf, Intune, etc.). This lets you enforce OTEL configuration org-wide — engineers don't need to set environment variables manually, and they can't opt out.&lt;/p&gt;
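&lt;p&gt;As a sketch, assuming the &lt;code&gt;env&lt;/code&gt; key documented for Claude Code settings files, a managed-settings file enforcing the configuration above might look like this (endpoint and key are placeholders):&lt;/p&gt;

```json
{
  "env": {
    "CLAUDE_CODE_ENABLE_TELEMETRY": "1",
    "OTEL_METRICS_EXPORTER": "otlp",
    "OTEL_LOGS_EXPORTER": "otlp",
    "OTEL_EXPORTER_OTLP_PROTOCOL": "http/json",
    "OTEL_EXPORTER_OTLP_ENDPOINT": "https://xxx.execute-api.us-west-2.amazonaws.com/prod",
    "OTEL_EXPORTER_OTLP_HEADERS": "x-api-key=your-api-key"
  }
}
```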

&lt;h3&gt;
  
  
  Claude CoWork (Team &amp;amp; Enterprise)
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://support.claude.com/en/articles/14477985-monitor-claude-cowork-activity-with-opentelemetry" rel="noopener noreferrer"&gt;CoWork monitoring docs&lt;/a&gt; — configure via Admin Settings → Cowork → Monitoring:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;OTLP endpoint: your APIGW URL&lt;/li&gt;
&lt;li&gt;OTLP protocol: &lt;code&gt;http/json&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;OTLP headers: &lt;code&gt;x-api-key=your-api-key&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;CoWork streams user prompts, tool/MCP invocations, file access, human approval decisions, and API request details. It shares the same OTel event schema as Claude Code via the Claude Agent SDK — you can distinguish them by &lt;code&gt;terminal.type&lt;/code&gt; (&lt;code&gt;cowork&lt;/code&gt; vs &lt;code&gt;cli&lt;/code&gt;).&lt;/p&gt;
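&lt;p&gt;Splitting a mixed event stream on that attribute is a simple transform. A sketch, with event shapes simplified from the samples shown later in this post:&lt;/p&gt;

```python
def split_by_terminal(events):
    """Group OTel log events by their terminal.type attribute."""
    groups = {}
    for event in events:
        terminal = event.get("attributes", {}).get("terminal.type", "unknown")
        groups.setdefault(terminal, []).append(event)
    return groups

# Hypothetical events: one from Claude Code CLI, one from CoWork.
events = [
    {"body": "claude_code.user_prompt", "attributes": {"terminal.type": "cli"}},
    {"body": "claude_code.user_prompt", "attributes": {"terminal.type": "cowork"}},
]
groups = split_by_terminal(events)
```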

&lt;h3&gt;
  
  
  GitHub Copilot CLI
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://docs.github.com/en/copilot/reference/copilot-cli-reference/cli-command-reference#opentelemetry-monitoring" rel="noopener noreferrer"&gt;Copilot CLI OTel reference&lt;/a&gt; — available since Copilot CLI 1.0.4:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;COPILOT_OTEL_ENDPOINT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;https://xxx.execute-api.us-west-2.amazonaws.com/prod
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;COPILOT_OTEL_HEADERS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;x-api-key&lt;span class="o"&gt;=&lt;/span&gt;your-api-key
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Gemini CLI
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://geminicli.com/docs/cli/telemetry/" rel="noopener noreferrer"&gt;Gemini CLI telemetry docs&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;GEMINI_CLI_OTEL_EXPORT_ENDPOINT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;https://xxx.execute-api.us-west-2.amazonaws.com/prod
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Cursor (via Hooks)
&lt;/h3&gt;

&lt;p&gt;Cursor doesn't have native OTEL export yet, but the community &lt;a href="https://github.com/LangGuard-AI/cursor-otel-hook" rel="noopener noreferrer"&gt;cursor-otel-hook&lt;/a&gt; project captures agent activity via Cursor's hook system and exports traces to any OTLP endpoint. Configure via &lt;code&gt;otel_config.json&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"OTEL_EXPORTER_OTLP_ENDPOINT"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://xxx.execute-api.us-west-2.amazonaws.com/prod/v1/traces"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"OTEL_EXPORTER_OTLP_PROTOCOL"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"http/json"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"OTEL_EXPORTER_OTLP_HEADERS"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"x-api-key"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"your-api-key"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What You Get
&lt;/h2&gt;

&lt;p&gt;CloudWatch receives standard OTLP data. For Claude Code specifically:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Metrics&lt;/strong&gt;: &lt;code&gt;claude_code.token.usage&lt;/code&gt; (by &lt;code&gt;token.type&lt;/code&gt;: input/output/cache_read/cache_creation), &lt;code&gt;claude_code.cost.usage&lt;/code&gt; (USD), &lt;code&gt;claude_code.session.count&lt;/code&gt;, &lt;code&gt;claude_code.lines_of_code.count&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Traces&lt;/strong&gt; (beta): Spans linking each user prompt → API requests → tool executions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Log events&lt;/strong&gt;: &lt;code&gt;claude_code.user_prompt&lt;/code&gt;, &lt;code&gt;claude_code.tool_decision&lt;/code&gt;, &lt;code&gt;claude_code.tool_result&lt;/code&gt;, &lt;code&gt;claude_code.api_request&lt;/code&gt; — all tagged with &lt;code&gt;session.id&lt;/code&gt; and &lt;code&gt;service.name=claude-code&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here's what real Claude Code log events look like after flowing through the proxy into CloudWatch Logs. This is actual data from an E2E test — a single prompt that triggered a Bash tool call:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;claude_code.user_prompt&lt;/code&gt;&lt;/strong&gt; — emitted when the user sends a prompt:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"resource"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"attributes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"host.arch"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"arm64"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"os.type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"linux"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"service.name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"claude-code"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"service.version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2.1.114"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"os.version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"6.17.0-1010-aws"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"scope"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"com.anthropic.claude_code.events"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2.1.114"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"body"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"claude_code.user_prompt"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"attributes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"event.sequence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"user.id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1c257d04..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"prompt_length"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"40"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"terminal.type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"non-interactive"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"event.name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"user_prompt"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"event.timestamp"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-04-18T11:23:13.187Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"prompt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"&amp;lt;REDACTED&amp;gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"session.id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"846ab649-8bba-471e-8ec5-8756116d0840"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"prompt.id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"88475ee2-59c2-4137-9201-5540c6a6cad1"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;&lt;code&gt;claude_code.tool_result&lt;/code&gt;&lt;/strong&gt; — emitted after each tool execution:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"body"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"claude_code.tool_result"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"attributes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"tool_name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Bash"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"tool_result_size_bytes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"899"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"tool_input"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"{&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;command&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;:&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;ls&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;,&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;description&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;:&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;List files in current directory&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;}"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"duration_ms"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"95"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"success"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"true"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"session.id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"846ab649-8bba-471e-8ec5-8756116d0840"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"prompt.id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"88475ee2-59c2-4137-9201-5540c6a6cad1"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;&lt;code&gt;claude_code.api_request&lt;/code&gt;&lt;/strong&gt; — emitted after each API call with token counts and cost:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"body"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"claude_code.api_request"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"attributes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"claude-sonnet-4-5-20250929"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"input_tokens"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"142"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"output_tokens"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"61"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"cache_read_tokens"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"cache_creation_tokens"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"25848"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"cost_usd"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"0.098271"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"duration_ms"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"4950"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"speed"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"normal"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"session.id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"846ab649-8bba-471e-8ec5-8756116d0840"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"prompt.id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"88475ee2-59c2-4137-9201-5540c6a6cad1"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;All events share the same &lt;code&gt;prompt.id&lt;/code&gt;, linking them into a single interaction. The &lt;code&gt;event.sequence&lt;/code&gt; field orders events within a prompt. Every record carries &lt;code&gt;service.name=claude-code&lt;/code&gt; in resource attributes, so isolating Claude Code telemetry in a mixed pipeline is trivial — just filter on that in CloudWatch Logs Insights:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;fields&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="nb"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attributes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attributes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cost_usd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;attributes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;duration_ms&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;filter&lt;/span&gt; &lt;span class="n"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;attributes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;`service.name`&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'claude-code'&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;filter&lt;/span&gt; &lt;span class="n"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'claude_code.api_request'&lt;/span&gt;
&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;sort&lt;/span&gt; &lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="nb"&gt;timestamp&lt;/span&gt; &lt;span class="k"&gt;desc&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
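&lt;p&gt;The same fields support aggregation. As a sketch (the attribute names come from the event payload shown earlier; note that &lt;code&gt;cost_usd&lt;/code&gt; is emitted as a string, so verify that Logs Insights coerces it numerically in your account), a per-model daily cost rollup could look like:&lt;/p&gt;

```sql
fields @timestamp
| filter resource.attributes.`service.name` = 'claude-code'
| filter body = 'claude_code.api_request'
| stats sum(attributes.cost_usd) as total_cost_usd, count(*) as requests by attributes.model, bin(1d)
```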



&lt;h2&gt;
  
  
  Region Availability
&lt;/h2&gt;

&lt;p&gt;CloudWatch OTLP endpoints are available in most regions but &lt;strong&gt;not all&lt;/strong&gt;. The &lt;a href="https://aws.amazon.com/about-aws/whats-new/2026/04/amazon-cloudwatch-opentelemetry-metrics/" rel="noopener noreferrer"&gt;OTLP metrics preview&lt;/a&gt; launched in 5 regions:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Signal&lt;/th&gt;
&lt;th&gt;Regions&lt;/th&gt;
&lt;th&gt;Docs&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Metrics (preview)&lt;/td&gt;
&lt;td&gt;us-east-1, us-west-2, ap-southeast-1, ap-southeast-2, eu-west-1&lt;/td&gt;
&lt;td&gt;&lt;a href="https://aws.amazon.com/about-aws/whats-new/2026/04/amazon-cloudwatch-opentelemetry-metrics/" rel="noopener noreferrer"&gt;Announcement&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Traces&lt;/td&gt;
&lt;td&gt;Most commercial regions&lt;/td&gt;
&lt;td&gt;&lt;a href="https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-OTLPEndpoint.html" rel="noopener noreferrer"&gt;OTLP Endpoints&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Logs&lt;/td&gt;
&lt;td&gt;Most commercial regions&lt;/td&gt;
&lt;td&gt;&lt;a href="https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-OTLPEndpoint.html" rel="noopener noreferrer"&gt;OTLP Endpoints&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Tested and confirmed:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Region&lt;/th&gt;
&lt;th&gt;Metrics&lt;/th&gt;
&lt;th&gt;Traces&lt;/th&gt;
&lt;th&gt;Logs&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;us-east-1&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;us-west-2&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ap-southeast-1&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ap-east-1 (Hong Kong)&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;If your primary region doesn't support it, deploy the proxy in a supported region. The APIGW endpoint is accessible from anywhere.&lt;/p&gt;

&lt;p&gt;For the full list of CloudWatch service endpoints by region, see the &lt;a href="https://docs.aws.amazon.com/general/latest/gr/cw_region.html" rel="noopener noreferrer"&gt;AWS General Reference&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Gotchas
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;X-Ray traces require manual setup.&lt;/strong&gt; The CloudFormation template creates the proxy endpoints, but X-Ray traces need two additional steps that aren't in the template:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Set CloudWatch Logs as the trace segment destination:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws xray update-trace-segment-destination &lt;span class="nt"&gt;--destination&lt;/span&gt; CloudWatchLogs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="2"&gt;
&lt;li&gt;Create a CloudWatch Logs resource policy allowing X-Ray to write to the &lt;code&gt;aws/spans&lt;/code&gt; log group:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws logs put-resource-policy &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--policy-name&lt;/span&gt; XRayAccessPolicy &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--policy-document&lt;/span&gt; &lt;span class="s1"&gt;'{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":"xray.amazonaws.com"},"Action":["logs:PutLogEvents","logs:CreateLogGroup","logs:CreateLogStream"],"Resource":"*"}]}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Without these, traces will return &lt;code&gt;AccessDeniedException&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CloudWatch Logs supports bearer token auth.&lt;/strong&gt; The &lt;code&gt;/v1/logs&lt;/code&gt; endpoint supports &lt;a href="https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-OTLPEndpoint.html" rel="noopener noreferrer"&gt;bearer token authentication&lt;/a&gt; without SigV4 — but only for logs. Metrics and traces still require SigV4, which is why the APIGW proxy is needed for a unified endpoint.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use &lt;code&gt;http/json&lt;/code&gt;, not &lt;code&gt;http/protobuf&lt;/code&gt;.&lt;/strong&gt; CloudWatch accepts both formats, but API Gateway's &lt;code&gt;CONVERT_TO_TEXT&lt;/code&gt; content handling can corrupt binary protobuf payloads in transit. Set &lt;code&gt;OTEL_EXPORTER_OTLP_PROTOCOL=http/json&lt;/code&gt; to avoid this. JSON is also easier to debug in APIGW execution logs. Most coding tools default to protobuf, so you'll need to override this explicitly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;API Gateway payload limit.&lt;/strong&gt; REST API has a 10MB payload limit. OTLP batches from coding tools are well under this, but keep it in mind if you're aggregating from multiple sources. CloudWatch's own limits are 1MB for metrics and logs, 5MB for traces (&lt;a href="https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-OTLPEndpoint.html" rel="noopener noreferrer"&gt;full limits&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;REST API, not HTTP API.&lt;/strong&gt; Only REST API supports the &lt;code&gt;AWS&lt;/code&gt; integration type needed for SigV4 service proxying. HTTP API does not.&lt;/p&gt;

&lt;h2&gt;
  
  
  Cost
&lt;/h2&gt;

&lt;p&gt;This is about as cheap as it gets for a telemetry pipeline:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;API Gateway&lt;/td&gt;
&lt;td&gt;~$3.50 / million requests&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CloudWatch Metrics&lt;/td&gt;
&lt;td&gt;
&lt;a href="https://aws.amazon.com/cloudwatch/pricing/" rel="noopener noreferrer"&gt;Standard CW pricing&lt;/a&gt; (free during OTel metrics preview)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CloudWatch Logs&lt;/td&gt;
&lt;td&gt;&lt;a href="https://aws.amazon.com/cloudwatch/pricing/" rel="noopener noreferrer"&gt;Standard CW pricing&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lambda&lt;/td&gt;
&lt;td&gt;$0 (there is none)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;No idle cost. No provisioned capacity. Pure pay-per-request.&lt;/p&gt;

&lt;p&gt;For comparison: a Lambda-based OTLP forwarder would add ~$0.20/million invocations plus compute time, but gives you retry logic and transformation capabilities. At typical coding agent volumes (a few hundred requests/day per developer), the cost difference is negligible — the real win is operational simplicity.&lt;/p&gt;

&lt;h2&gt;
  
  
  When NOT to Use This
&lt;/h2&gt;

&lt;p&gt;This proxy is optimized for simplicity. It's the right choice for low-to-moderate telemetry volumes from coding agents and developer tools. But it has tradeoffs:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Complexity&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;th&gt;Retries&lt;/th&gt;
&lt;th&gt;Multi-destination&lt;/th&gt;
&lt;th&gt;Transformation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;This proxy (APIGW)&lt;/td&gt;
&lt;td&gt;Minimal&lt;/td&gt;
&lt;td&gt;~$3.50/M req&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OTel Collector&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;Compute cost&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lambda forwarder&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;~$0.20/M + compute&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ADOT SDK (in-app)&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Free (SigV4 native)&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SaaS (Datadog, etc.)&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;$$$&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Consider an OTel Collector or Lambda forwarder instead if you need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;High throughput&lt;/strong&gt; — thousands of requests/second from many sources&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Retry and buffering&lt;/strong&gt; — this proxy is fire-and-forget; if CloudWatch returns an error, the data is lost. OTEL SDKs have built-in retry, but only for transient failures&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-destination routing&lt;/strong&gt; — fan out to CloudWatch + Datadog + S3 simultaneously&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Payload transformation&lt;/strong&gt; — filter, enrich, or redact telemetry before ingestion&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compliance requirements&lt;/strong&gt; — audit trails, guaranteed delivery, or data residency controls&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For most coding agent monitoring use cases (a team of 5-50 developers), this proxy handles the volume comfortably.&lt;/p&gt;

&lt;h2&gt;
  
  
  Security Considerations
&lt;/h2&gt;

&lt;p&gt;The proxy uses API key authentication — simple but not the strongest option. Here's how to harden it:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Attach AWS WAF to the REST API.&lt;/strong&gt; Add rate limiting, IP allowlisting, or geo-blocking to prevent abuse. A single WAF WebACL with a rate-based rule (e.g., 1000 req/5min per IP) costs ~$6/month and stops most abuse patterns.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rotate API keys.&lt;/strong&gt; APIGW supports multiple API keys per usage plan. Create a new key, distribute it, then disable the old one — zero downtime rotation.&lt;/p&gt;
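&lt;p&gt;A rotation sketch with the plain CLI, under the assumption of one usage plan. All IDs and key names here are placeholders; confirm the flags against your AWS CLI version:&lt;/p&gt;

```shell
# Sketch: zero-downtime API key rotation. All IDs below are placeholders.
# 1. Create the replacement key
aws apigateway create-api-key --name otlp-proxy-key-2 --enabled
# 2. Attach it to the existing usage plan
aws apigateway create-usage-plan-key \
  --usage-plan-id abc123 --key-id newkey456 --key-type API_KEY
# 3. Once all clients use the new key, disable the old one
aws apigateway update-api-key --api-key oldkey789 \
  --patch-operations op=replace,path=/enabled,value=false
```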

&lt;p&gt;&lt;strong&gt;Consider IAM auth for internal use.&lt;/strong&gt; If your tools run inside AWS (EC2, ECS, Lambda), switch &lt;code&gt;AuthorizationType&lt;/code&gt; from &lt;code&gt;NONE&lt;/code&gt; to &lt;code&gt;AWS_IAM&lt;/code&gt; and drop the API key entirely. The caller signs requests with SigV4 using their IAM role — no shared secrets. This doesn't work for external tools like Claude Code on developer laptops, but it's ideal for CI/CD pipelines or server-side agents.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Egress control.&lt;/strong&gt; If you're running coding agents in a controlled environment, restrict outbound traffic to only your APIGW endpoint. This prevents telemetry from leaking to unauthorized collectors.&lt;/p&gt;

&lt;h2&gt;
  
  
  Beyond Coding Agents
&lt;/h2&gt;

&lt;p&gt;This proxy works with &lt;strong&gt;any OTEL SDK&lt;/strong&gt; that supports OTLP/HTTP. If your tool can set &lt;code&gt;OTEL_EXPORTER_OTLP_ENDPOINT&lt;/code&gt; and &lt;code&gt;OTEL_EXPORTER_OTLP_HEADERS&lt;/code&gt;, it can ship telemetry to CloudWatch through this proxy.&lt;/p&gt;
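&lt;p&gt;As a minimal configuration sketch (the endpoint URL and API key are placeholders for your own deployed stack's outputs), the full environment for a generic OTLP/HTTP tool is three variables:&lt;/p&gt;

```shell
# Placeholders: substitute your own APIGW stage URL and API key.
export OTEL_EXPORTER_OTLP_ENDPOINT="https://abc123.execute-api.us-east-1.amazonaws.com/prod"
export OTEL_EXPORTER_OTLP_HEADERS="x-api-key=YOUR_API_KEY"
# JSON, not protobuf, per the Gotchas section
export OTEL_EXPORTER_OTLP_PROTOCOL="http/json"
```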

&lt;p&gt;Potential use cases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;AI coding agents&lt;/strong&gt; (Claude Code, CoWork, Copilot, Cursor, Gemini CLI) — track token usage, costs, and tool calls across your org&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Internal tools&lt;/strong&gt; — ship metrics without embedding AWS credentials in client apps&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CI/CD pipelines&lt;/strong&gt; — export build/test telemetry to CloudWatch&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;On-premises services&lt;/strong&gt; — send OTLP from outside AWS without running ADOT Collector&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For apps running inside AWS with IAM roles available, consider the &lt;a href="https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-OTLP-UsingADOT.html" rel="noopener noreferrer"&gt;ADOT SDK&lt;/a&gt; for collector-less telemetry with native SigV4 signing — no proxy needed.&lt;/p&gt;

&lt;h2&gt;
  
  
  Source Code &amp;amp; One-Click Deploy
&lt;/h2&gt;

&lt;p&gt;The CloudFormation template and full documentation are on GitHub:&lt;/p&gt;

&lt;p&gt;👉 &lt;a href="https://github.com/gabrielkoo/otlp-cloudwatch-proxy" rel="noopener noreferrer"&gt;gabrielkoo/otlp-cloudwatch-proxy&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;One-click deploy to supported regions:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Region&lt;/th&gt;
&lt;th&gt;Deploy&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;US East (N. Virginia)&lt;/td&gt;
&lt;td&gt;&lt;a href="https://console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/create/review?templateURL=https://raw.githubusercontent.com/gabrielkoo/otlp-cloudwatch-proxy/main/template.yaml&amp;amp;stackName=otlp-cloudwatch-proxy" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr2ghcs7c9kfqqlnbap7j.png" alt="Launch Stack"&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;US West (Oregon)&lt;/td&gt;
&lt;td&gt;&lt;a href="https://console.aws.amazon.com/cloudformation/home?region=us-west-2#/stacks/create/review?templateURL=https://raw.githubusercontent.com/gabrielkoo/otlp-cloudwatch-proxy/main/template.yaml&amp;amp;stackName=otlp-cloudwatch-proxy" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr2ghcs7c9kfqqlnbap7j.png" alt="Launch Stack"&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Asia Pacific (Singapore)&lt;/td&gt;
&lt;td&gt;&lt;a href="https://console.aws.amazon.com/cloudformation/home?region=ap-southeast-1#/stacks/create/review?templateURL=https://raw.githubusercontent.com/gabrielkoo/otlp-cloudwatch-proxy/main/template.yaml&amp;amp;stackName=otlp-cloudwatch-proxy" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr2ghcs7c9kfqqlnbap7j.png" alt="Launch Stack"&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Asia Pacific (Sydney)&lt;/td&gt;
&lt;td&gt;&lt;a href="https://console.aws.amazon.com/cloudformation/home?region=ap-southeast-2#/stacks/create/review?templateURL=https://raw.githubusercontent.com/gabrielkoo/otlp-cloudwatch-proxy/main/template.yaml&amp;amp;stackName=otlp-cloudwatch-proxy" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr2ghcs7c9kfqqlnbap7j.png" alt="Launch Stack"&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Europe (Ireland)&lt;/td&gt;
&lt;td&gt;&lt;a href="https://console.aws.amazon.com/cloudformation/home?region=eu-west-1#/stacks/create/review?templateURL=https://raw.githubusercontent.com/gabrielkoo/otlp-cloudwatch-proxy/main/template.yaml&amp;amp;stackName=otlp-cloudwatch-proxy" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr2ghcs7c9kfqqlnbap7j.png" alt="Launch Stack"&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;p&gt;&lt;em&gt;Built and validated on a Saturday morning with Claude Code + OpenClaw. Zero Lambda functions were harmed in the making of this article.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/aws-solutions-library-samples/guidance-for-claude-code-with-amazon-bedrock/blob/main/assets/docs/MONITORING.md" rel="noopener noreferrer"&gt;AWS Guidance for Claude Code with Amazon Bedrock — Monitoring&lt;/a&gt;&lt;/strong&gt; — A comprehensive (and admittedly overkill) reference implementation using ECS Fargate + ALB + ADOT Collector + Lambda + DynamoDB + Kinesis + Athena. Great if you want to see the full spectrum of what can be measured: per-user token tracking, quota monitoring, cost dashboards, and an analytics data lake. If you need all of that, use it. If you just need telemetry flowing to CloudWatch, the one-template proxy in this post will do.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://code.claude.com/docs/en/monitoring-usage" rel="noopener noreferrer"&gt;Claude Code Monitoring Docs&lt;/a&gt;&lt;/strong&gt; — Official OTEL configuration reference, including all metrics, events, and traces.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://code.claude.com/docs/en/settings#settings-files" rel="noopener noreferrer"&gt;Claude Code Managed Settings&lt;/a&gt;&lt;/strong&gt; — How to deploy &lt;code&gt;managed-settings.json&lt;/code&gt; via MDM for org-wide OTEL enforcement.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-OTLPEndpoint.html" rel="noopener noreferrer"&gt;CloudWatch OTLP Endpoints&lt;/a&gt;&lt;/strong&gt; — AWS docs on native OTLP ingestion for metrics, traces, and logs.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>aws</category>
      <category>opentelemetry</category>
      <category>cloudwatch</category>
      <category>serverless</category>
    </item>
    <item>
      <title>AWS Data &amp; AI Stories #02: Amazon Bedrock Data Automation</title>
      <dc:creator>Sedat SALMAN</dc:creator>
      <pubDate>Sat, 18 Apr 2026 05:01:56 +0000</pubDate>
      <link>https://dev.to/aws-builders/aws-data-ai-stories-02-amazon-bedrock-data-automation-1gg7</link>
      <guid>https://dev.to/aws-builders/aws-data-ai-stories-02-amazon-bedrock-data-automation-1gg7</guid>
      <description>&lt;p&gt;In the first article, I talked about multimodal AI at a high level.&lt;/p&gt;

&lt;p&gt;Now it is time to go one step deeper.&lt;/p&gt;

&lt;p&gt;When we say multimodal AI, one of the first real challenges is not the model itself. The first challenge is the data. In most environments, the input is messy, unstructured, and spread across different formats such as documents, images, audio, and video. Amazon Bedrock Data Automation, or BDA, is designed for exactly this problem: extracting useful insights from unstructured multimodal content and turning it into structured output that applications can use.&lt;/p&gt;

&lt;p&gt;For me, BDA is not the “chat” layer. It is the processing layer.&lt;/p&gt;

&lt;p&gt;That is what makes it important.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Amazon Bedrock Data Automation?
&lt;/h2&gt;

&lt;p&gt;Amazon Bedrock Data Automation is a managed AWS capability that automates insight generation from unstructured content such as documents, images, audio, and video. Instead of building separate extraction pipelines for each format, you can use BDA to generate structured outputs from multimodal input in a more consistent way.&lt;/p&gt;

&lt;p&gt;This is useful because many AI projects fail at the beginning.&lt;/p&gt;

&lt;p&gt;Not because the model is weak, but because the source data is not ready.&lt;/p&gt;

&lt;p&gt;Think about a few common examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;scanned PDFs&lt;/li&gt;
&lt;li&gt;invoices&lt;/li&gt;
&lt;li&gt;screenshots&lt;/li&gt;
&lt;li&gt;call recordings&lt;/li&gt;
&lt;li&gt;inspection videos&lt;/li&gt;
&lt;li&gt;photos from the field&lt;/li&gt;
&lt;li&gt;reports with mixed text and visuals&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are all valuable, but none of them are naturally clean inputs for downstream AI.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why does BDA matter?
&lt;/h2&gt;

&lt;p&gt;Because AI systems need structure.&lt;/p&gt;

&lt;p&gt;Before you build search, RAG, analytics, or assistants, you usually need to answer a simpler question:&lt;/p&gt;

&lt;p&gt;How do I turn raw content into usable information?&lt;/p&gt;

&lt;p&gt;That is where BDA fits.&lt;/p&gt;

&lt;p&gt;AWS describes BDA as a service that can produce both standard output and custom output depending on the use case. Standard output gives predefined insights for a data type, while custom output lets you define tailored extraction logic. This makes BDA useful not only for generic processing, but also for business-specific workflows.&lt;/p&gt;

&lt;p&gt;So in practical terms, BDA can help when you want to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;extract content from complex documents&lt;/li&gt;
&lt;li&gt;summarize audio or video&lt;/li&gt;
&lt;li&gt;generate structured metadata&lt;/li&gt;
&lt;li&gt;prepare content for retrieval&lt;/li&gt;
&lt;li&gt;feed another AI workflow&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  How I would position BDA in an AWS architecture
&lt;/h2&gt;

&lt;p&gt;I would place BDA near the beginning of the workflow.&lt;/p&gt;

&lt;p&gt;A simple view looks like this:&lt;/p&gt;

&lt;p&gt;Input data → BDA processing → structured output → storage/indexing → retrieval/generation&lt;/p&gt;

&lt;p&gt;This is also how AWS examples position it. In AWS guidance and solution examples, BDA is commonly used after content lands in S3, and before services such as Knowledge Bases, vector stores, or agentic applications use the extracted results.&lt;/p&gt;

&lt;p&gt;So if Part 1 was about multimodal AI, Part 2 is about making multimodal content usable.&lt;/p&gt;

&lt;h2&gt;
  
  
  Main concepts to understand
&lt;/h2&gt;

&lt;p&gt;There are two core ideas in BDA that matter most: projects and blueprints.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Projects
&lt;/h3&gt;

&lt;p&gt;A project is the main configuration container in BDA. AWS documentation describes it as the grouping that holds standard and optional custom output settings for processing. When you call the async API with a project ARN, BDA uses that project’s configuration to process the file and produce the defined outputs.&lt;/p&gt;
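&lt;p&gt;As a hedged sketch of what that call looks like (the ARNs, bucket names, and region are placeholders, and the parameter names should be checked against the current CLI reference), invoking a project against a file in S3 goes through the async runtime API:&lt;/p&gt;

```shell
# Sketch only: all ARNs, buckets, and the region below are placeholders.
aws bedrock-data-automation-runtime invoke-data-automation-async \
  --input-configuration '{"s3Uri": "s3://my-input-bucket/invoice.pdf"}' \
  --output-configuration '{"s3Uri": "s3://my-output-bucket/results/"}' \
  --data-automation-configuration '{"dataAutomationProjectArn": "arn:aws:bedrock:us-east-1:123456789012:data-automation-project/my-project", "stage": "LIVE"}' \
  --data-automation-profile-arn "arn:aws:bedrock:us-east-1:123456789012:data-automation-profile/us.data-automation-v1"
```

&lt;p&gt;The call returns an invocation ARN you can poll for status; the structured results land in the output S3 location.&lt;/p&gt;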

&lt;p&gt;In simple terms, a project is where you define how BDA should behave for your use case.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Blueprints
&lt;/h3&gt;

&lt;p&gt;Blueprints are what make custom extraction possible. AWS documentation explains that blueprints define the extraction logic and output format for custom outputs, allowing you to tailor BDA to your own business fields and data structures.&lt;/p&gt;

&lt;p&gt;This is one of the most valuable parts of BDA.&lt;/p&gt;

&lt;p&gt;Because in real projects, we usually do not want only generic output. We want specific fields, specific structure, and specific business meaning.&lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;invoice number&lt;/li&gt;
&lt;li&gt;customer name&lt;/li&gt;
&lt;li&gt;incident category&lt;/li&gt;
&lt;li&gt;equipment ID&lt;/li&gt;
&lt;li&gt;inspection result&lt;/li&gt;
&lt;li&gt;priority level&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is where blueprints become important.&lt;/p&gt;
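&lt;p&gt;To make that concrete, here is an illustrative sketch of a minimal custom-output blueprint for an invoice. The key names (&lt;code&gt;inferenceType&lt;/code&gt;, &lt;code&gt;instruction&lt;/code&gt;) follow the BDA blueprint schema format as I understand it; treat this as a sketch to adapt against the official docs, not a copy-paste artifact:&lt;/p&gt;

```shell
# Illustrative blueprint schema for two of the fields listed above.
BLUEPRINT='{
  "class": "invoice",
  "description": "Extract key fields from supplier invoices",
  "properties": {
    "invoice_number": {
      "type": "string",
      "inferenceType": "explicit",
      "instruction": "The unique invoice identifier printed on the document"
    },
    "customer_name": {
      "type": "string",
      "inferenceType": "explicit",
      "instruction": "The legal name of the billed customer"
    }
  }
}'
# Write it to disk and confirm the JSON is well-formed before registering it
printf '%s\n' "$BLUEPRINT" | tee invoice-blueprint.json | python3 -m json.tool
```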

&lt;h2&gt;
  
  
  Standard output vs custom output
&lt;/h2&gt;

&lt;p&gt;This is one of the most important design choices.&lt;/p&gt;

&lt;h3&gt;
  
  
  Standard output
&lt;/h3&gt;

&lt;p&gt;Standard output is faster to start with. AWS says it provides predefined insights based on the data type being processed, such as document semantics, audio transcripts, or video summaries and chapter summaries.&lt;/p&gt;

&lt;p&gt;This is a good option when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;you want speed&lt;/li&gt;
&lt;li&gt;you are validating a use case&lt;/li&gt;
&lt;li&gt;you do not need very specific extraction fields yet&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Custom output
&lt;/h3&gt;

&lt;p&gt;Custom output is for more targeted use cases. With blueprints, you define the extraction logic and expected structure so the output matches your business need more closely. AWS has also added features such as blueprint instruction optimization to improve custom extraction accuracy using example assets and ground-truth labels.&lt;/p&gt;

&lt;p&gt;This is a better option when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;you need specific fields&lt;/li&gt;
&lt;li&gt;you need consistency&lt;/li&gt;
&lt;li&gt;you are building a production workflow&lt;/li&gt;
&lt;li&gt;your documents or media are domain-specific&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For me, the normal journey is:&lt;br&gt;
start with standard output, then move to custom output when the use case becomes clearer.&lt;/p&gt;

&lt;h2&gt;
  
  
  What types of content can BDA process?
&lt;/h2&gt;

&lt;p&gt;BDA is built for multimodal content. AWS documentation and product pages describe support for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;documents&lt;/li&gt;
&lt;li&gt;images&lt;/li&gt;
&lt;li&gt;audio&lt;/li&gt;
&lt;li&gt;video&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That matters because many organizations have all four.&lt;/p&gt;

&lt;p&gt;And the value is not only in “understanding” each file type independently. The real value is creating a single processing layer for all of them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where BDA fits best
&lt;/h2&gt;

&lt;p&gt;I think BDA is strongest in these scenarios:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Intelligent document processing
&lt;/h3&gt;

&lt;p&gt;AWS has positioned BDA strongly for document-heavy workflows, and AWS blog content around intelligent document processing shows it being used to accelerate extraction and automation for business documents.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Preparing content for RAG
&lt;/h3&gt;

&lt;p&gt;AWS examples show BDA being used before Knowledge Bases and vector indexing so that multimodal content can be turned into cleaner, more useful retrieval input.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Audio and video understanding
&lt;/h3&gt;

&lt;p&gt;BDA can generate outputs such as transcripts and summaries from audio and video, and AWS recently expanded it with custom vocabulary support through the Data Automation Library to improve speech recognition accuracy for domain-specific terms.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Compliance and content review
&lt;/h3&gt;

&lt;p&gt;AWS has also shown BDA in workflows such as extracting attachment content for later PII detection and redaction with Guardrails, which makes it relevant beyond simple summarization.&lt;/p&gt;

&lt;h2&gt;
  
  
  A practical workflow example
&lt;/h2&gt;

&lt;p&gt;Let’s take a simple example.&lt;/p&gt;

&lt;p&gt;Imagine you are building a support or operations workflow.&lt;/p&gt;

&lt;p&gt;The input may include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a maintenance PDF&lt;/li&gt;
&lt;li&gt;a photo from the field&lt;/li&gt;
&lt;li&gt;a voice note from an engineer&lt;/li&gt;
&lt;li&gt;a short inspection video&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Without a processing layer, each file stays isolated.&lt;/p&gt;

&lt;p&gt;With BDA, the system can extract usable outputs from these files, and the rest of the architecture can work with those results more easily. Those extracted outputs can then be stored, indexed, sent to a knowledge base, or used by an agentic workflow. This is consistent with AWS’s documented BDA flow and solution examples that combine S3, BDA, Knowledge Bases, OpenSearch, and AgentCore.&lt;/p&gt;

&lt;p&gt;That is why I see BDA as the bridge between raw content and usable AI workflows.&lt;/p&gt;

&lt;h2&gt;
  
  
  Newer capabilities worth watching
&lt;/h2&gt;

&lt;p&gt;Two additions make BDA more interesting for real projects.&lt;/p&gt;

&lt;p&gt;First, AWS added blueprint instruction optimization, which helps improve custom field extraction accuracy using example documents and labels. This is useful because custom extraction often needs tuning before it becomes reliable.&lt;/p&gt;

&lt;p&gt;Second, AWS added custom vocabulary through the Data Automation Library for audio and video processing. This helps when your environment uses domain-specific terms, product names, or technical language that general transcription may miss.&lt;/p&gt;

&lt;p&gt;These are good signs that BDA is moving from “interesting feature” toward “serious processing layer.”&lt;/p&gt;

&lt;h2&gt;
  
  
  Things to keep in mind
&lt;/h2&gt;

&lt;p&gt;BDA is powerful, but I would still keep a few points in mind.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Start simple
&lt;/h3&gt;

&lt;p&gt;AWS recommends starting with standard output if you are new to the service. That makes sense because it helps validate the value quickly before you invest in custom extraction logic.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Design for the business output
&lt;/h3&gt;

&lt;p&gt;Do not begin with “what model do I want?”&lt;br&gt;
Begin with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;what fields do I need?&lt;/li&gt;
&lt;li&gt;what decision will use this output?&lt;/li&gt;
&lt;li&gt;what system will consume it?&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3. Watch input requirements
&lt;/h3&gt;

&lt;p&gt;AWS documents prerequisites and file requirements for BDA, including file-specific constraints that differ by content type and processing mode.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Treat prompts and blueprints carefully
&lt;/h3&gt;

&lt;p&gt;AWS explicitly notes that blueprint prompt input should come from trusted sources, which is an important reminder for secure enterprise design.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final thoughts
&lt;/h2&gt;

&lt;p&gt;For me, Amazon Bedrock Data Automation is one of the most practical pieces in the current AWS multimodal stack.&lt;/p&gt;

&lt;p&gt;It does not replace retrieval, RAG, or agents.&lt;/p&gt;

&lt;p&gt;It enables them.&lt;/p&gt;

&lt;p&gt;If multimodal AI is the bigger vision, BDA is one of the first services that helps turn that vision into a usable workflow. It helps convert raw documents, images, audio, and video into outputs that the rest of your architecture can actually work with.&lt;/p&gt;

&lt;p&gt;That is why I would not treat BDA as a side feature.&lt;/p&gt;

&lt;p&gt;I would treat it as a foundational building block.&lt;/p&gt;

&lt;p&gt;In the next article, I will move one step further and focus on multimodal knowledge bases and how retrieval fits after the processing layer.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>aws</category>
      <category>datascience</category>
      <category>awsbigdata</category>
    </item>
    <item>
      <title>Supercharge AWS Diagrams in VSCode with Mermaid and Custom Icons</title>
      <dc:creator>Moya Richards</dc:creator>
      <pubDate>Sat, 18 Apr 2026 04:45:00 +0000</pubDate>
      <link>https://dev.to/aws-builders/supercharge-aws-diagrams-in-vscode-with-mermaid-and-custom-icons-d0m</link>
      <guid>https://dev.to/aws-builders/supercharge-aws-diagrams-in-vscode-with-mermaid-and-custom-icons-d0m</guid>
      <description>&lt;p&gt;Want to turn your architecture docs into visual gold? With the new &lt;a href="https://mermaid.js.org/syntax/architecture.html" rel="noopener noreferrer"&gt;Mermaid.js architecture diagram syntax&lt;/a&gt; and custom icon packs, you can create AWS diagrams—all from within VSCode using Markdown.&lt;/p&gt;

&lt;p&gt;This guide walks you through how to:&lt;/p&gt;

&lt;p&gt;✅ Use Mermaid’s new architecture syntax&lt;br&gt;
✅ Set up custom AWS and icon libraries&lt;br&gt;
✅ Preview it all in VSCode using the Markdown Preview Enhanced extension&lt;/p&gt;

&lt;p&gt;Let’s dive in. 👇&lt;/p&gt;




&lt;h2&gt;
  
  
  🔧 Icon Packs You'll Use
&lt;/h2&gt;

&lt;p&gt;Here are the icon libraries we’ll pull in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://icones.js.org/collection/logos" rel="noopener noreferrer"&gt;Iconify Logos&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://icones.js.org/collection/lucide" rel="noopener noreferrer"&gt;Lucide&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://icones.js.org/collection/fa" rel="noopener noreferrer"&gt;Font Awesome&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/awslabs/aws-icons-for-plantuml" rel="noopener noreferrer"&gt;AWS Icons-for-plantuml Pack (GitHub)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These icons can be referenced in your Mermaid diagrams by name (e.g. &lt;code&gt;logos:aws-lambda&lt;/code&gt;, &lt;code&gt;aws:aurora&lt;/code&gt;, &lt;code&gt;fa:user&lt;/code&gt;).&lt;/p&gt;
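
&lt;p&gt;Once the packs are registered (Step 2), a node references any icon through its &lt;code&gt;pack:name&lt;/code&gt; identifier. A minimal sketch using two icons that appear later in this guide:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;architecture-beta
service fn(logos:aws-lambda)[AWS Lambda]
service db(logos:aws-dynamodb)[Amazon DynamoDB]
fn:R --&amp;gt; L:db
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;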




&lt;h2&gt;
  
  
  ✅ Step 1: Install Markdown Preview Enhanced
&lt;/h2&gt;

&lt;p&gt;First, install the &lt;strong&gt;&lt;a href="https://marketplace.visualstudio.com/items?itemName=shd101wyy.markdown-preview-enhanced" rel="noopener noreferrer"&gt;Markdown Preview Enhanced&lt;/a&gt;&lt;/strong&gt; extension in VSCode.&lt;/p&gt;

&lt;p&gt;It enables rich previews for &lt;code&gt;.md&lt;/code&gt; files—supporting Mermaid, LaTeX, charts, diagrams, and more.&lt;/p&gt;




&lt;h2&gt;
  
  
  ⚙️ Step 2: Inject Icon Packs into Mermaid
&lt;/h2&gt;

&lt;p&gt;We’ll now customize the Markdown preview to load external icon packs into Mermaid.&lt;/p&gt;

&lt;h3&gt;
  
  
  How to set it up:
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;In VSCode, press &lt;code&gt;Ctrl + P&lt;/code&gt;, then type:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;   &amp;gt; Markdown Preview Enhanced: Customize Preview Html Head (WORKSPACE)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="2"&gt;
&lt;li&gt;Select the command. It will create a file:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;   .crossnote/head.html
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="3"&gt;
&lt;li&gt;Paste the following code into that file:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="c"&gt;&amp;lt;!-- Load custom icon packs for Mermaid architecture diagrams --&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;script &lt;/span&gt;&lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;"text/javascript"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;configureMermaidIconPacks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nb"&gt;window&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;mermaid&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;registerIconPacks&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;logos&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;loader&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt;
          &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://unpkg.com/@iconify-json/logos/icons.json&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt;
            &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
          &lt;span class="p"&gt;),&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;lucide&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;loader&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt;
          &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://unpkg.com/@iconify-json/lucide/icons.json&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt;
            &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
          &lt;span class="p"&gt;),&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;fa&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;loader&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt;
          &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://unpkg.com/@iconify-json/fa/icons.json&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt;
            &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
          &lt;span class="p"&gt;),&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;aws&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;loader&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt;
          &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://raw.githubusercontent.com/awslabs/aws-icons-for-plantuml/aa30729ab2e125f13526020fa98ed5eb0ed86cc1/dist/aws-icons-mermaid.json&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
          &lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;]);&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;readyState&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;loading&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;configureMermaidIconPacks&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addEventListener&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;DOMContentLoaded&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nf"&gt;configureMermaidIconPacks&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/script&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This ensures Mermaid can load and render your custom icons directly in preview.&lt;/p&gt;




&lt;h2&gt;
  
  
  🖼️ Step 3: Create a Diagram with AWS Icons
&lt;/h2&gt;

&lt;p&gt;Now create a &lt;code&gt;.md&lt;/code&gt; file (e.g. &lt;code&gt;README.md&lt;/code&gt;) and embed a Mermaid diagram using the new architecture syntax and icons.&lt;/p&gt;

&lt;p&gt;Here’s a full example using AWS services:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;---
config:
  theme: base
  themeVariables:
    darkMode: false
    archEdgeColor: "#232F3E"
    archEdgeArrowColor: "#232F3E"
    archGroupBorderColor: "#ff862a"
---
architecture-beta


service user_service(lucide:user)[Users via Web Browser]

group cdk_infra(cloud)[AWS CDK Infrastructure]

service waf(logos:aws-waf)[AWS WAF] in cdk_infra
service cloudfront(logos:aws-cloudfront)[Amazon CloudFront] in cdk_infra
service cognito(logos:aws-cognito)[Amazon Cognito] in cdk_infra
service s3_front(logos:aws-s3)[Amazon S3 Frontend Hosting] in cdk_infra
service apigw(logos:aws-api-gateway)[Amazon API Gateway] in cdk_infra
service appsync(logos:aws-appsync)[AWS AppSync] in cdk_infra
service cdk_deploy(aws:cloudformation)[AWS CDK Deployment] in cdk_infra

group vpc_private_subnet(cloud)[VPC Private Subnet] in cdk_infra

service lambda(logos:aws-lambda)[AWS Lambda] in vpc_private_subnet
service dynamodb(logos:aws-dynamodb)[Amazon DynamoDB] in vpc_private_subnet
service aurora(logos:aws-aurora)[Amazon Aurora PostgreSQL] in vpc_private_subnet
service sagemaker(aws:sagemaker)[Amazon SageMaker] in vpc_private_subnet
service bedrock(aws:bedrock)[Amazon Bedrock] in vpc_private_subnet
service step_functions(logos:aws-step-functions)[AWS Step Functions] in vpc_private_subnet
service s3_ingest(logos:aws-s3)[Amazon S3 Ingestion] in vpc_private_subnet
service lambda_ingest(logos:aws-lambda)[AWS Lambda Ingestion] in vpc_private_subnet


user_service:R --&amp;gt; L:waf
waf:R --&amp;gt; L:cloudfront
cloudfront:R --&amp;gt; L:cognito
cognito:R --&amp;gt; L:s3_front
s3_front:R --&amp;gt; L:apigw
s3_front:B --&amp;gt; L:appsync
apigw:R --&amp;gt; L:lambda
appsync:R --&amp;gt; L:lambda
lambda:R --&amp;gt; L:dynamodb
lambda:B --&amp;gt; T:aurora
lambda:T --&amp;gt; B:bedrock
sagemaker:R --&amp;gt; B:s3_ingest
step_functions:R --&amp;gt; L:s3_ingest
s3_ingest:R --&amp;gt; L:lambda_ingest
lambda_ingest:R --&amp;gt; L:dynamodb
lambda_ingest:R --&amp;gt; L:aurora
cdk_deploy:R --&amp;gt; L:step_functions
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;&lt;a href="https://mermaid.live/edit#pako:eNqNVm1v2jAQ_iuW94VKpGpIC9QfJvFSumqtxEbXSmsq5CQmjUrsyElgtOp_39kmTihiTEjge-7u8RP7csc7DkXEMMGO4_g8FHyRxMTnCBUvLGUEBTRn1nygMqHBkuU6AqGIytc7SCdoQZcmDiEqw5erKGYjsRSSIB9_6XidiXfl413_QEqx_lfQtRRlNhQyYrIOWyz63Q5VYVqwiksKFhalZE7ACupz9cmZXCUhQyUs5lujtSzDBLQq7OTpF3znaJVQ9MgCNAQpADyr3Fhti8LodZ7whaStcCnK6ORp8DhDo_F3dKPAvJCl3vO5uduaLlpLEYuc0HXugGWyHgeTZ5TwmrLO0NwLKXjRSKxByE_pm-BopKCJgg4yiZgnhWjSGKTmMPYhgtybfxaSezZ55iG9PeMR-ibyIuHxISKaJfG6wQK2E9OCrenG0g2mN-jaYIdpsnzDwx0ijZhDHWTZDIyDxwFQxLKl2LQglZgjFTKlRSJ4fZljHZKy_XOt6mCVhfNMJisQO8_LgLOiKoiH6QhNjQfNtGefo5KzpGkQ0cazGMAIudVrnby_W80RbThNRRQ0WCrInutYA-PhUTJaSiGbggxQX5A20RSuOpZs9uP2KGNOY5bSVyb1gVurriBA7hRylClgkRThq-bZri3L0NjH1RQsmy9KHqr7zps1DQ7HOswFzABDkwo7zu3BBccsP_yq3Gg3kB3lMnWwz7dfH_9B6vNmvyM_keN8RbcEGpHP4csCdYOBjm_Xtds0CjUN9MI6qhYB8rcr69IvfQMfWly_s9CoVYANN4-nUO3ew82vhas6tw5Df09M1Vr4XsPDqmpAT1WGW6ohsZcHzp0aaT6ljaiWnwTagB3zoNzP_kp13aTqzXc04TaOZRJhosdrG6cMOpiy8bsakz7WU9nHajJCd1OSYDB-QFpG-W8hUkxgUEEitLL4xdKUWQRlM05oLGlqUVoWQrXUKgcoMHnHfzDxXO_03O15_f6l2z1zL7ptvMGkc9Y77fR6fQBcr9tz-_2PNn7Tm56ddl23e9HtnnsXl5ce_LQxi6CU5J35s6H_c4AsGCdquJe8wOTc63z8BTMIDjo" rel="noopener noreferrer"&gt;view in the Mermaid Live Editor&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🎉 That’s It!
&lt;/h2&gt;

&lt;p&gt;You now have a fully working setup to design AWS diagrams using Mermaid in VSCode—with custom icons and rich architecture syntax. It’s perfect for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Internal docs and wikis&lt;/li&gt;
&lt;li&gt;Cloud architecture planning&lt;/li&gt;
&lt;li&gt;README files that &lt;em&gt;actually&lt;/em&gt; explain things&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  🛠 Bonus Tips
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;If icons don’t appear, double-check you’re in the Markdown &lt;strong&gt;Preview Enhanced&lt;/strong&gt; window (not the default preview).&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>architecture</category>
      <category>aws</category>
      <category>documentation</category>
      <category>vscode</category>
    </item>
    <item>
      <title>Building a Multimodal Agent with the ADK, AWS Fargate, and Gemini Flash Live 3.1</title>
      <dc:creator>xbill</dc:creator>
      <pubDate>Fri, 17 Apr 2026 19:20:10 +0000</pubDate>
      <link>https://dev.to/aws-builders/building-a-multimodal-agent-with-the-adk-aws-fargate-and-gemini-flash-live-31-1613</link>
      <guid>https://dev.to/aws-builders/building-a-multimodal-agent-with-the-adk-aws-fargate-and-gemini-flash-live-31-1613</guid>
      <description>&lt;p&gt;Leveraging the Google Agent Development Kit (ADK) and the underlying Gemini LLM to build Agentic apps using the Gemini Live API with the Python programming language deployed to Amazon Fargate.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftsswy7qrkejwmjphhl7z.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftsswy7qrkejwmjphhl7z.jpeg" width="800" height="451"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Aren’t There a Billion Python ADK Demos?
&lt;/h4&gt;

&lt;p&gt;Yes, there are.&lt;/p&gt;

&lt;p&gt;Python has traditionally been the main coding language for ML and AI tools. The goal of this article is to provide a minimal, working ADK streaming multimodal agent built on the latest Gemini Live models.&lt;/p&gt;

&lt;h4&gt;
  
  
  In the Spirit of Mr. McConaughey’s “alright, alright, alright”
&lt;/h4&gt;

&lt;p&gt;So what is different about this lab compared to all the others out there?&lt;/p&gt;

&lt;p&gt;This is one of the first implementations of the latest Gemini 3.1 Flash Live model with the Agent Development Kit (ADK). The starting point for the demo was an existing Codelab, which was updated and re-engineered with Gemini CLI.&lt;/p&gt;

&lt;p&gt;The original Codelab is here:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://codelabs.developers.google.com/way-back-home-level-3/instructions#0" rel="noopener noreferrer"&gt;Way Back Home - Building an ADK Bi-Directional Streaming Agent | Google Codelabs&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  What Is Python?
&lt;/h4&gt;

&lt;p&gt;Python is an interpreted language that allows for rapid development and testing, and it has a deep ecosystem of libraries for ML and AI:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.python.org/" rel="noopener noreferrer"&gt;Welcome to Python.org&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Python Version Management
&lt;/h4&gt;

&lt;p&gt;One of the downsides of the wide deployment of Python has been managing the language versions across platforms and maintaining a supported version.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;pyenv&lt;/strong&gt; tool enables deploying consistent versions of Python:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/pyenv/pyenv" rel="noopener noreferrer"&gt;GitHub - pyenv/pyenv: Simple Python version management&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As of this writing, the mainstream Python version is 3.13. To validate your current Python:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;python --version
Python 3.13.13
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  AWS Fargate
&lt;/h4&gt;

&lt;p&gt;AWS Fargate is a serverless, pay-as-you-go compute engine for containers that works with &lt;a href="https://aws.amazon.com/documentation-overview/fargate/" rel="noopener noreferrer"&gt;Amazon Elastic Container Service (ECS)&lt;/a&gt; or Elastic Kubernetes Service (EKS). It eliminates the need to manage, patch, or scale underlying &lt;a href="https://www.geeksforgeeks.org/devops/introduction-to-aws-fargate/" rel="noopener noreferrer"&gt;EC2 virtual machines&lt;/a&gt;. Fargate automatically allocates, scales, and manages compute infrastructure, allowing developers to focus solely on designing and operating applications.&lt;/p&gt;

&lt;p&gt;Details are here:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/fargate/" rel="noopener noreferrer"&gt;Serverless Compute - AWS Fargate - AWS&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;More information on Fargate is available here:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://docs.aws.amazon.com/AmazonECS/latest/developerguide/AWS_Fargate.html" rel="noopener noreferrer"&gt;Architect for AWS Fargate for Amazon ECS&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Gemini Live Models
&lt;/h4&gt;

&lt;p&gt;Gemini Live is a conversational AI feature from Google that enables free-flowing, real-time voice, video, and screen-sharing interactions, allowing you to brainstorm, learn, or problem-solve through natural dialogue. Powered by the &lt;strong&gt;Gemini 3.1 Flash Live model&lt;/strong&gt;, it provides low-latency, human-like, and emotionally aware speech in over 200 countries.&lt;/p&gt;

&lt;p&gt;More details are available here:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://ai.google.dev/gemini-api/docs/models/gemini-3.1-flash-live-preview" rel="noopener noreferrer"&gt;Gemini 3.1 Flash Live Preview | Gemini API | Google AI for Developers&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The Gemini Live models bring unique real-time capabilities that can be used directly from an agent. A summary of the model is also available here:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;https://deepmind.google/models/model-cards/gemini-3-1-flash-live/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5cpkde11pc37h41rmfar.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5cpkde11pc37h41rmfar.png" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Gemini CLI
&lt;/h4&gt;

&lt;p&gt;If it is not pre-installed, you can install the Gemini CLI to interact with the source files and provide real-time assistance:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;npm install -g @google/gemini-cli
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Testing the Gemini CLI Environment
&lt;/h4&gt;

&lt;p&gt;Once you have all the tools and the correct Node.js version in place, you can test the startup of Gemini CLI. You will need to authenticate with an API key or your Google account:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;▝▜▄ Gemini CLI v0.33.1
    ▝▜▄
   ▗▟▀ Logged in with Google /auth
  ▝▀ Gemini Code Assist Standard /upgrade no sandbox (see /docs) /model Auto (Gemini 3) | 239.8 MB
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Node Version Management
&lt;/h4&gt;

&lt;p&gt;Gemini CLI needs a consistent, up-to-date version of Node.js. The &lt;strong&gt;nvm&lt;/strong&gt; tool can be used to get a standard Node environment:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/nvm-sh/nvm" rel="noopener noreferrer"&gt;GitHub - nvm-sh/nvm: Node Version Manager - POSIX-compliant bash script to manage multiple active node.js versions&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Agent Development Kit
&lt;/h4&gt;

&lt;p&gt;The &lt;a href="https://google.github.io/adk-docs/" rel="noopener noreferrer"&gt;Google Agent Development Kit&lt;/a&gt; (ADK) is an open-source, Python-based framework designed to streamline the creation, deployment, and orchestration of sophisticated, multi-agent AI systems. It treats agent development like software engineering, offering modularity, state management, and built-in tools (like Google Search) to build autonomous agents.&lt;/p&gt;

&lt;p&gt;The ADK can be installed from here:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://google.github.io/adk-docs/" rel="noopener noreferrer"&gt;Agent Development Kit (ADK)&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Where do I start?
&lt;/h4&gt;

&lt;p&gt;The strategy for starting multimodal real-time agent development is an incremental, step-by-step approach.&lt;/p&gt;

&lt;p&gt;First, the basic development environment is set up with the required system variables and a working Gemini CLI configuration.&lt;/p&gt;

&lt;p&gt;Then, a minimal ADK agent is built and tested locally. Next, the entire solution is deployed to Amazon ECS on Fargate.&lt;/p&gt;

&lt;h4&gt;
  
  
  Setup the Basic Environment
&lt;/h4&gt;

&lt;p&gt;At this point you should have a working Python environment and a working Gemini CLI installation. All of the relevant code examples and documentation are available in GitHub. The repo has a wide variety of samples, but this lab will focus on the ‘gemini31-fargate’ setup.&lt;/p&gt;

&lt;p&gt;The next step is to clone the GitHub repository to your local environment:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;cd ~
git clone https://github.com/xbill9/gemini-cli-aws
cd gemini31-fargate

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then run &lt;strong&gt;init.sh&lt;/strong&gt; from the cloned directory.&lt;/p&gt;

&lt;p&gt;The script will attempt to determine your shell environment and set the correct variables:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;
&lt;/span&gt;&lt;span class="gp"&gt;xbill@penguin:~/gemini-cli-aws/gemini31-fargate$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;source &lt;/span&gt;init.sh
&lt;span class="go"&gt;Environment setup complete.
GOOGLE_GENAI_USE_VERTEXAI=false
GOOGLE_CLOUD_PROJECT=aisprint-491218
GOOGLE_CLOUD_LOCATION=us-central1
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If your session times out or you need to re-authenticate, you can run the &lt;strong&gt;set_env.sh&lt;/strong&gt; script to reset your environment variables:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;source set_env.sh
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Variables like PROJECT_ID need to be set up for use in the various build scripts, so the &lt;strong&gt;set_env&lt;/strong&gt; script can be used to reset the environment if your session times out.&lt;/p&gt;

&lt;h4&gt;
  
  
  Build the User Interface
&lt;/h4&gt;

&lt;p&gt;The front end files provide the user interface:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;xbill@penguin:~/gemini-cli-aws/gemini31-fargate$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;make frontend
&lt;span class="go"&gt;cd frontend &amp;amp;&amp;amp; npm install &amp;amp;&amp;amp; npm run build

up to date, audited 219 packages in 800ms

49 packages are looking for funding
  run `npm fund` for details

1 high severity vulnerability

To address all issues, run:
  npm audit fix

Run `npm audit` for details.

&lt;/span&gt;&lt;span class="gp"&gt;&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;frontend@0.0.0 build
&lt;span class="gp"&gt;&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;vite build
&lt;span class="go"&gt;
vite v7.3.1 building client environment for production...
✓ 33 modules transformed.
dist/index.html 0.46 kB │ gzip: 0.29 kB
dist/assets/index-xOQlTZZB.css 21.60 kB │ gzip: 4.54 kB
dist/assets/index-DZmIx3HW.js 214.58 kB │ gzip: 67.45 kB
✓ built in 1.18s
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Test The User Interface
&lt;/h4&gt;

&lt;p&gt;The mock server test script lets you exercise the interface and the browser's multimedia permission settings without making any external model calls or consuming tokens:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;xbill@penguin:~/gemini-cli-aws/gemini31-fargate$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;make mock
&lt;span class="go"&gt;python mock/mock_server.py
Serving static files from: /home/xbill/gemini-cli-aws/gemini31-fargate/frontend/dist
INFO: Started server process [8689]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8080 (Press CTRL+C to quit)
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The deployed mock front-end will look similar to:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvkwtr15968j3a2quyc3z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvkwtr15968j3a2quyc3z.png" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Verify The ADK Installation
&lt;/h4&gt;

&lt;p&gt;To verify the setup, run the ADK CLI locally with the biometric_agent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;xbill@penguin:~/gemini-cli-aws/gemini31-fargate$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;source &lt;/span&gt;testadk.sh
&lt;span class="go"&gt;connect to local ADK CLI 

/home/xbill/.pyenv/versions/3.13.13/lib/python3.13/site-packages/google/adk/features/_feature_decorator.py:72: UserWarning: [EXPERIMENTAL] feature FeatureName.PLUGGABLE_AUTH is enabled.
  check_feature_enabled()
Log setup complete: /tmp/agents_log/agent.20260415_200105.log
To access latest log: tail -F /tmp/agents_log/agent.latest.log
/home/xbill/.pyenv/versions/3.13.13/lib/python3.13/site-packages/google/adk/cli/cli.py:204: UserWarning: [EXPERIMENTAL] InMemoryCredentialService: This feature is experimental and may change or be removed in future versions without notice. It may introduce breaking changes at any time.
  credential_service = InMemoryCredentialService()
/home/xbill/.pyenv/versions/3.13.13/lib/python3.13/site-packages/google/adk/auth/credential_service/in_memory_credential_service.py:33: UserWarning: [EXPERIMENTAL] BaseCredentialService: This feature is experimental and may change or be removed in future versions without notice. It may introduce breaking changes at any time.
  super(). __init__ ()
Running agent biometric_agent, type exit to exit.

[biometric_agent]: Scanner Online.

[user]: 

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Test The ADK Web Interface
&lt;/h4&gt;

&lt;p&gt;This tests the Audio / Video ADK agent interactions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;xbill@penguin:~/gemini-cli-aws/gemini31-fargate$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;source &lt;/span&gt;runadk.sh 
&lt;span class="go"&gt;connect on http://127.0.0.1:8000/

/home/xbill/.pyenv/versions/3.13.13/lib/python3.13/site-packages/google/adk/features/_feature_decorator.py:72: UserWarning: [EXPERIMENTAL] feature FeatureName.PLUGGABLE_AUTH is enabled.
  check_feature_enabled()
2026-04-15 20:01:46,272 - INFO - service_factory.py:266 - Using in-memory memory service
2026-04-15 20:01:46,272 - INFO - local_storage.py:84 - Using per-agent session storage rooted at /home/xbill/gemini-cli-aws/gemini31-fargate/backend/app
2026-04-15 20:01:46,272 - INFO - local_storage.py:110 - Using file artifact service at /home/xbill/gemini-cli-aws/gemini31-fargate/backend/app/.adk/artifacts
/home/xbill/.pyenv/versions/3.13.13/lib/python3.13/site-packages/google/adk/cli/fast_api.py:198: UserWarning: [EXPERIMENTAL] InMemoryCredentialService: This feature is experimental and may change or be removed in future versions without notice. It may introduce breaking changes at any time.
  credential_service = InMemoryCredentialService()
/home/xbill/.pyenv/versions/3.13.13/lib/python3.13/site-packages/google/adk/auth/credential_service/in_memory_credential_service.py:33: UserWarning: [EXPERIMENTAL] BaseCredentialService: This feature is experimental and may change or be removed in future versions without notice. It may introduce breaking changes at any time.
  super().__init__()
INFO: Started server process [10520]
INFO: Waiting for application startup.

+-----------------------------------------------------+
| ADK Web Server started                              |
|                                                     |
| For local testing, access at http://0.0.0.0:8000.   |
+-----------------------------------------------------+

INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
INFO: 127.0.0.1:41986 - "GET / HTTP/1.1" 307 Temporary Redirect
INFO: 127.0.0.1:41986 - "GET /dev-ui/ HTTP/1.1" 200 OK
INFO: 127.0.0.1:41986 - "GET /dev-ui/styles-YY6V3TJU.css HTTP/1.1" 200 OK
INFO: 127.0.0.1:41990 - "GET /dev-ui/chunk-RGCH6K7F.js HTTP/1.1" 200 OK
INFO: 127.0.0.1:42002 - "GET /dev-ui/chunk-W7GRJBO5.js HTTP/1.1" 200 OK
INFO: 127.0.0.1:42026 - "GET /dev-ui/main-7SJG752M.js HTTP/1.1" 200 OK
INFO: 127.0.0.1:42016 - "GET /dev-ui/polyfills-5CFQRCPP.js HTTP/1.1" 200 OK
INFO: 127.0.0.1:42026 - "GET /dev-ui/assets/config/runtime-config.json HTTP/1.1" 200 OK
INFO: 127.0.0.1:42026 - "GET /list-apps?relative_path=./ HTTP/1.1" 200 OK
INFO: 127.0.0.1:41986 - "GET /dev-ui/assets/ADK-512-color.svg HTTP/1.1" 200 OK
INFO: 127.0.0.1:42026 - "GET /dev-ui/adk_favicon.svg HTTP/1.1" 200 OK
2026-04-15 20:01:49,369 - INFO - local_storage.py:60 - Creating local session service at /home/xbill/gemini-cli-aws/gemini31-fargate/backend/app/biometric_agent/.adk/session.db
INFO: 127.0.0.1:42016 - "GET /builder/app/biometric_agent?ts=1776297709357 HTTP/1.1" 200 OK
2026-04-15 20:01:49,393 - INFO - adk_web_server.py:867 - New session created: b1b2e791-b792-414a-9d46-90a3ddac1e53
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then use the web interface, either on the local interface &lt;strong&gt;127.0.0.1&lt;/strong&gt; or the catch-all interface &lt;strong&gt;0.0.0.0&lt;/strong&gt;, depending on your environment:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0k252zwo6necaoqeydni.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0k252zwo6necaoqeydni.png" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Special note for Google Cloud Shell deployments: add a CORS &lt;strong&gt;allow_origins&lt;/strong&gt; exemption so the ADK agent can run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;adk web --host 0.0.0.0 --allow_origins 'regex:.*'
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Lint and Test the Main Python Code
&lt;/h4&gt;

&lt;p&gt;The final step is to build, lint, and test the main Python code.&lt;/p&gt;

&lt;p&gt;To Lint:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;xbill@penguin:~/gemini-cli-aws/gemini31-fargate$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;make lint
&lt;span class="go"&gt;Linting Python code with Ruff...
ruff check backend
All checks passed!
Linting Frontend code with ESLint...
cd frontend &amp;amp;&amp;amp; npm run lint

&lt;/span&gt;&lt;span class="gp"&gt;&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;frontend@0.0.0 lint
&lt;span class="gp"&gt;&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;eslint &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To Test:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;xbill@penguin:~/gemini-cli-aws/gemini31-fargate$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;make &lt;span class="nb"&gt;test&lt;/span&gt;
&lt;span class="go"&gt;Running backend and connectivity tests...
python3 -m pytest test_live_connection.py test_ws_backend.py test_ws_backend_v2.py backend/app/biometric_agent/test_agent.py
================================================================ test session starts ================================================================
platform linux -- Python 3.13.13, pytest-9.0.3, pluggy-1.6.0
rootdir: /home/xbill/gemini-cli-aws/gemini31-fargate
plugins: anyio-4.13.0, asyncio-1.3.0
asyncio: mode=Mode.STRICT, debug=False, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
collected 8 items                                                                                                                                   

test_live_connection.py . [12%]
test_ws_backend.py . [25%]
test_ws_backend_v2.py . [37%]
backend/app/biometric_agent/test_agent.py ..... [100%]

================================================================= warnings summary ==================================================================
../../.pyenv/versions/3.13.13/lib/python3.13/site-packages/google/adk/features/_feature_decorator.py:72
  /home/xbill/.pyenv/versions/3.13.13/lib/python3.13/site-packages/google/adk/features/_feature_decorator.py:72: UserWarning: [EXPERIMENTAL] feature FeatureName.PLUGGABLE_AUTH is enabled.
    check_feature_enabled()

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================================================== 8 passed, 1 warning in 2.67s ============================================================
&lt;/span&gt;&lt;span class="gp"&gt;xbill@penguin:~/gemini-cli-aws/gemini31-fargate$&lt;/span&gt;&lt;span class="w"&gt; 
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
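
&lt;p&gt;The &lt;code&gt;lint&lt;/code&gt; and &lt;code&gt;test&lt;/code&gt; targets seen above could be sketched roughly as follows. This is an illustrative reconstruction from the console output; the project's actual Makefile may differ (for example, it chains the frontend lint with &lt;code&gt;&amp;amp;&amp;amp;&lt;/code&gt;):&lt;/p&gt;

```makefile
# Illustrative Makefile targets reconstructed from the console output above.
lint:
	@echo "Linting Python code with Ruff..."
	ruff check backend
	@echo "Linting Frontend code with ESLint..."
	cd frontend; npm run lint

test:
	@echo "Running backend and connectivity tests..."
	python3 -m pytest test_live_connection.py test_ws_backend.py \
		test_ws_backend_v2.py backend/app/biometric_agent/test_agent.py
```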



&lt;h4&gt;
  
  
  Running Locally
&lt;/h4&gt;

&lt;p&gt;The main Python code can then be run locally:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;xbill@penguin:~/gemini-cli-aws/gemini31-fargate$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;source &lt;/span&gt;biosync.sh
&lt;span class="go"&gt;Local URL
http://127.0.0.1:8080/
/home/xbill/.pyenv/versions/3.13.13/lib/python3.13/site-packages/google/adk/features/_feature_decorator.py:72: UserWarning: [EXPERIMENTAL] feature FeatureName.PLUGGABLE_AUTH is enabled.
  check_feature_enabled()
2026-04-15 20:06:48,642 - INFO - System Config: 2.0 FPS, 10.0s Heartbeat
Serving static files from: /home/xbill/gemini-cli-aws/gemini31-fargate/frontend/dist
INFO: Started server process [11513]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8080 (Press CTRL+C to quit)
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then connect to the local front end:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxtxz2qtacinjpbb4sgbc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxtxz2qtacinjpbb4sgbc.png" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Deploying to ECS
&lt;/h4&gt;

&lt;p&gt;A utility script runs the deployment to AWS ECS Fargate. First, save your AWS credentials on the local system:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;aws login --remote

&lt;/span&gt;&lt;span class="gp"&gt;xbill@penguin:~/gemini-cli-aws/gemini31-fargate$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;source &lt;/span&gt;save-aws-creds.sh 
&lt;span class="go"&gt;Exporting AWS credentials...
Successfully saved credentials to .aws_creds
The Makefile will now automatically use these for deployments.
&lt;/span&gt;&lt;span class="gp"&gt;xbill@penguin:~/gemini-cli-aws/gemini31-fargate$&lt;/span&gt;&lt;span class="w"&gt; 
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
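
&lt;p&gt;The &lt;code&gt;save-aws-creds.sh&lt;/code&gt; helper likely looks something like the sketch below. This is illustrative only; the variable handling and file format in the real script may differ:&lt;/p&gt;

```shell
#!/usr/bin/env bash
# save-aws-creds.sh (illustrative sketch; the real script may differ).
# Captures the shell's current AWS credentials into .aws_creds so the
# Makefile can reuse them during `make deploy`.
set -euo pipefail

CREDS_FILE=".aws_creds"

echo "Exporting AWS credentials..."
{
  echo "export AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID:-}"
  echo "export AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY:-}"
  echo "export AWS_SESSION_TOKEN=${AWS_SESSION_TOKEN:-}"
} > "${CREDS_FILE}"
chmod 600 "${CREDS_FILE}"
echo "Successfully saved credentials to ${CREDS_FILE}"
```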



&lt;p&gt;The system can now be deployed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;xbill@penguin:~/gemini-cli-aws/gemini31-fargate$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;make deploy
&lt;span class="go"&gt;./save-aws-creds.sh
Exporting AWS credentials...
Successfully saved credentials to .aws_creds
The Makefile will now automatically use these for deployments.
./deploy-fargate.sh
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And status checked:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;xbill@penguin:~/gemini-cli-aws/gemini31-fargate$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;make status
&lt;span class="go"&gt;--- Fargate Cluster Status ---
-------------------------------------------------------------
|                     DescribeClusters                      |
+--------------------------+----------+----------+----------+
|           Name           | Pending  | Running  |  Status  |
+--------------------------+----------+----------+----------+
|  biometric-scout-cluster |  0       |  1       |  ACTIVE  |
+--------------------------+----------+----------+----------+
--- Fargate Service Status ---
-------------------------------------------------------------
|                     DescribeServices                      |
+---------+----------+---------------------------+----------+
| Desired | Running  |          Service          |  Status  |
+---------+----------+---------------------------+----------+
|  1      |  1       |  biometric-scout-service  |  ACTIVE  |
+---------+----------+---------------------------+----------+
&lt;/span&gt;&lt;span class="gp"&gt;xbill@penguin:~/gemini-cli-aws/gemini31-fargate$&lt;/span&gt;&lt;span class="w"&gt; 
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once the container is deployed, you can retrieve the endpoint:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;xbill@penguin:~/gemini-cli-aws/gemini31-fargate$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;make endpoint
&lt;span class="go"&gt;--- Fargate HTTPS Endpoint ---
Application URL: https://biometric-scout-alb-1410555012.us-east-1.elb.amazonaws.com
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The service will be visible in the AWS console:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5s9t47eexdq0mngu4nvz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5s9t47eexdq0mngu4nvz.png" width="800" height="389"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Running the Web Interface
&lt;/h4&gt;

&lt;p&gt;Start a connection to the deployed app:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;https://biometric-scout-alb-1410555012.us-east-1.elb.amazonaws.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then connect to the app:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffsd0jys3igkz3kari9pj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffsd0jys3igkz3kari9pj.png" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Then use the Live model to process audio and video:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw4muhndap65r4ou8fsa5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw4muhndap65r4ou8fsa5.png" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Finally, complete the sequence:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw8xd82i11fookffa2kgh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw8xd82i11fookffa2kgh.png" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Gemini CLI Code Review
&lt;/h4&gt;

&lt;p&gt;As a final step, Gemini CLI was used for a full code review of the project:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;✦ Based on my comprehensive review of the "Alpha Rescue Drone - Biometric Security System," the codebase is exceptionally well-structured,
  follows modern engineering standards, and strictly adheres to the specialized requirements for Gemini 3.1 Flash Live and the Google Agent
  Development Kit (ADK).

Core Architecture &amp;amp; Integration
  The system excels at orchestrating a high-frequency, multimodal data stream. The use of a binary protocol (0x01 for audio, 0x02 for video)
  in useGeminiSocket.js minimizes overhead, while the monkey-patching utility in patch_adk.py demonstrates a sophisticated understanding of
  the evolving Gemini 3.1 Live API. This proactive approach ensures compatibility with features like send_realtime_input that are not yet
  natively handled by the standard ADK.

Backend Analysis
   - Agent Intelligence: The instructions in agent.py are precise and prioritize safety (offensive gesture detection) and hidden overrides
     ("Devil's Horns") effectively.
   - Resilience: The backend implements a heartbeat stimulus (CONTINUE_SURVEILLANCE) and a neural handshake, which are critical for
     maintaining model engagement during visual-only periods.
   - Model Fallback: The get_model_id() logic is clever, automatically switching to gemini-2.5-flash for CLI runs to prevent 404 errors,
     while defaulting to the high-performance gemini-3.1-flash-live-preview for production.

Frontend &amp;amp; UI/UX
   - Aesthetic: The "Neon Cyan" HUD in BiometricLock.jsx provides a high-fidelity, futuristic feel that aligns with the "Mission Alpha"
     theme.
   - Feedback Loop: The synchronization between the random sequence generation, visual matching, and the special "Heavy Metal" audio event
     creates a compelling user experience.
   - Efficiency: Utilizing toBlob with JPEG compression (0.6) for the 2 FPS video stream is an optimal choice for balancing visual clarity
     with network latency.

Testing &amp;amp; Maintainability
   - Validation: The project maintains a clean slate with 100% test and linting compliance. The inclusion of mock.sh is a significant
     developer productivity boost.
   - Standards: The use of Ruff for Python and ESLint for the frontend ensures long-term code quality.

Recommendations
   1. Connection Resilience: While the SYSTEM_ERROR closes the socket correctly, the frontend could benefit from a more graceful
      "Reconnecting..." state if the network fluctuates during a mission.
   2. Patch Management: As the Gemini Live API stabilizes, keep an eye on official ADK updates to deprecate the manual unrolling of
      media_chunks in patch_adk.py.
   3. Security: Verify that the .env variables are correctly injected into the Cloud Run environment via the Makefile or cloudbuild.yaml
      without being exposed in the build logs.

Overall, the system is a robust and creative implementation of real-time multimodal AI. Authentication Protocol: Stable. 🤘
                                                                                                                             ? for shortcuts
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Summary
&lt;/h4&gt;

&lt;p&gt;The Agent Development Kit was used to build a multimodal agent on the Gemini Live model. The agent was tested locally with the CLI and the ADK web interface, then deployed to AWS Fargate. Along the way, several key takeaways and lessons learned from the transition to the new Gemini Live model were captured. Finally, Gemini CLI was used for a complete code review of the project.&lt;/p&gt;

</description>
      <category>gemini</category>
      <category>multimodal</category>
      <category>aws</category>
      <category>awsfargate</category>
    </item>
    <item>
      <title>I Replaced 47 DevOps Scripts With One AI Agent — Here’s What Happened</title>
      <dc:creator>POTHURAJU JAYAKRISHNA YADAV</dc:creator>
      <pubDate>Fri, 17 Apr 2026 10:15:34 +0000</pubDate>
      <link>https://dev.to/aws-builders/i-replaced-47-devops-scripts-with-one-ai-agent-heres-what-happened-4jbo</link>
      <guid>https://dev.to/aws-builders/i-replaced-47-devops-scripts-with-one-ai-agent-heres-what-happened-4jbo</guid>
      <description>&lt;h2&gt;
  
  
  The Hook: I Was Wrong About Automation
&lt;/h2&gt;

&lt;p&gt;I thought I automated DevOps.&lt;/p&gt;

&lt;p&gt;I had 47 deployment scripts.&lt;/p&gt;

&lt;p&gt;Then I started replacing them with an AI agent —&lt;br&gt;
and most scripts became unnecessary.&lt;/p&gt;

&lt;p&gt;Not by following instructions.&lt;br&gt;
By making decisions.&lt;/p&gt;

&lt;p&gt;And the 2 AM debugging stopped.&lt;/p&gt;



&lt;p&gt;Note: The code shown here is simplified for clarity.&lt;br&gt;
The GitHub repo contains a more modular implementation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;That's when I realized:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I wasn't automating.&lt;br&gt;
I was hardcoding decisions.&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;I gave Claude a list of AWS tools and one instruction: "Deploy this app."&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;No hardcoded logic.&lt;br&gt;
No decision trees.&lt;/p&gt;

&lt;p&gt;Just: &lt;strong&gt;describe the goal, Claude figures out how.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Deployment time:&lt;/strong&gt; 3 hours → minutes, for most cases.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;(Built this while deploying real workloads on AWS: Docker, ECS, EC2, IAM.)&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  🎯 Is This For You?
&lt;/h2&gt;

&lt;p&gt;This post is for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;DevOps engineers&lt;/strong&gt; with 10+ deployment scripts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Platform engineers&lt;/strong&gt; building internal developer platforms
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anyone&lt;/strong&gt; exploring AI agents beyond chatbots&lt;/li&gt;
&lt;li&gt;Teams on AWS (Docker, EC2, ECS, IAM)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If manual deployment takes &amp;gt; 30 minutes, read this.&lt;/p&gt;


&lt;h2&gt;
  
  
  ✨ What's Possible
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Old way (manual):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws ecs create-cluster &lt;span class="nt"&gt;--cluster-name&lt;/span&gt; prod
aws ecs register-task-definition &lt;span class="nt"&gt;--family&lt;/span&gt; myapp ...
&lt;span class="c"&gt;# [50+ commands, 3 hours, manual debugging]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;New way (agent):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Deploy FastAPI with PostgreSQL, auto-scale to 20, &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
          &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;minimal IAM perms, CloudWatch monitoring&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# ✅ Done in minutes for most cases, with significantly reduced debugging
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No scripts.&lt;/p&gt;

&lt;p&gt;Just Claude thinking out loud about your infrastructure.&lt;/p&gt;




&lt;h2&gt;
  
  
  🚀 How It Actually Works
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;You describe the goal (natural language):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"Deploy my app on 5 ECS tasks, auto-scale to 20 on high CPU"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Claude gets these tools:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;dispatcher&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ecs__create_cluster&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ecs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;create_cluster&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ecs__register_task&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ecs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;register_task_definition&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ecs__create_service&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ecs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;create_service&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;iam__create_role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;iam&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;create_role&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# For permissions
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Claude reasons autonomously:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"User wants ECS deployment.
 I need to:
 1. Check if cluster exists
 2. Register task definition
 3. Create service with 5 tasks
 4. Setup auto-scaling
 5. Verify it's running"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;It executes:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Call &lt;code&gt;ecs__create_cluster()&lt;/code&gt; → Get cluster ARN&lt;/li&gt;
&lt;li&gt;Call &lt;code&gt;ecs__register_task()&lt;/code&gt; → Get task definition &lt;/li&gt;
&lt;li&gt;Call &lt;code&gt;ecs__create_service()&lt;/code&gt; → Get running tasks&lt;/li&gt;
&lt;li&gt;Call &lt;code&gt;ecs__setup_autoscaling()&lt;/code&gt; → Confirmed&lt;/li&gt;
&lt;li&gt;Return: "All 5 tasks running, auto-scaling 5-20"&lt;/li&gt;
&lt;/ol&gt;
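
&lt;p&gt;Step 4 above ("Setup auto-scaling") maps onto the Application Auto Scaling API. A hedged sketch follows; the function name, cluster/service names, and the 70% CPU target are assumptions for illustration, not taken from the repo:&lt;/p&gt;

```python
# Sketch of the auto-scaling step using Application Auto Scaling
# (names and the 70% CPU target are illustrative assumptions).
def setup_autoscaling(client, cluster, service, min_tasks=5, max_tasks=20):
    """Register the ECS service as a scalable target and attach a
    target-tracking policy on average CPU utilization."""
    resource_id = f"service/{cluster}/{service}"
    client.register_scalable_target(
        ServiceNamespace="ecs",
        ResourceId=resource_id,
        ScalableDimension="ecs:service:DesiredCount",
        MinCapacity=min_tasks,
        MaxCapacity=max_tasks,
    )
    client.put_scaling_policy(
        PolicyName=f"{service}-cpu-target",
        ServiceNamespace="ecs",
        ResourceId=resource_id,
        ScalableDimension="ecs:service:DesiredCount",
        PolicyType="TargetTrackingScaling",
        TargetTrackingScalingPolicyConfiguration={
            "TargetValue": 70.0,  # scale out when average CPU is above 70%
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
            },
        },
    )
    return resource_id
```

&lt;p&gt;Here &lt;code&gt;client&lt;/code&gt; would be a &lt;code&gt;boto3.client("application-autoscaling")&lt;/code&gt; instance, which keeps the function easy to exercise with a stub in tests.&lt;/p&gt;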

&lt;p&gt;&lt;strong&gt;Minimal manual intervention in most cases: just reasoning + execution + feedback loops.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  ⚠️ The Problem I Started With
&lt;/h2&gt;

&lt;p&gt;I had:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;deploy_docker.py&lt;/code&gt; — 150 lines of Docker logic&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;deploy_ec2.py&lt;/code&gt; — 200 lines of EC2 logic&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;deploy_ecs.py&lt;/code&gt; — 300 lines of ECS logic&lt;/li&gt;
&lt;li&gt;3 routers trying to chain them together&lt;/li&gt;
&lt;li&gt;0 ways to handle "deploy to both EC2 AND ECS"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Every new service = new script. Every new workflow = rewrite everything.&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Solution: Reduce rigid scripts — let the agent handle orchestration logic dynamically.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Instead of routing to specific tools, give Claude all available tools and let it decide which ones to use, in which order, adapting as it goes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# All tools in one place
&lt;/span&gt;&lt;span class="n"&gt;tools&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;docker__run&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;run_container&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ec2__create&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;create_instance&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ecs__deploy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;create_service&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;iam__create_role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;create_role&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="c1"&gt;# ... more tools
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# Let Claude orchestrate
&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Deploy with Docker locally, then scale to ECS production&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Claude figures out: Docker first, then ECS, then IAM for permissions
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
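
&lt;p&gt;Under the hood, a call like &lt;code&gt;agent.run(...)&lt;/code&gt; can be a simple reason-act loop. A minimal sketch follows; every name here (&lt;code&gt;dispatcher&lt;/code&gt;, &lt;code&gt;llm_call&lt;/code&gt;, the reply shape) is an assumption for illustration, not the repo's actual API:&lt;/p&gt;

```python
# Minimal reason-act loop behind an `agent.run(goal)` call.
# The dispatcher, llm_call callable, and reply shape are all
# illustrative assumptions, not the repo's actual API.
def run_agent(goal, dispatcher, llm_call):
    """Send the goal to the model, execute each tool it requests,
    and loop until the model produces a final answer."""
    messages = [{"role": "user", "content": goal}]
    while True:
        reply = llm_call(messages, tools=list(dispatcher))
        if reply["type"] == "final":
            return reply["text"]
        # The model requested a tool: run it, feed the result back,
        # and let the model decide the next step.
        result = dispatcher[reply["tool"]](**reply["args"])
        messages.append(
            {"role": "tool", "name": reply["tool"], "content": str(result)}
        )
```

&lt;p&gt;The point of the loop is that the orchestration order lives in the model's replies, not in your code: adding a new AWS capability means adding one dispatcher entry, not a new script.&lt;/p&gt;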






&lt;h2&gt;
  
  
  🏗️ The Architecture (3 Layers)
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌──────────────────────────────┐
│ User Goal (natural language) │  
│ "Deploy scalable production  │
│  stack with auto-scaling"    │
└──────────┬───────────────────┘
           │
           ▼
    ┌─────────────────────────────────────────────┐
    │  Claude (Bedrock)                           │
    │  ← Reads goal + available tools             │
    │  ← Decides sequence of actions              │
    │  ← Adapts when things fail                  │
    └──────────────┬────────────────────────────┘
                   │
    ┌──────────────┴──────────────┐
    │                             │
    ▼                             ▼
┌─────────────┐  ┌──────────────────┐
│ AWS APIs    │  │ Conversation     │
│ (via boto3) │  │ Memory (DynamoDB)│
│ ← Executes  │  │ ← Recalls setup  │
│   decisions │  │   from last week │
└─────────────┘  └──────────────────┘
    │                    │
    └────────┬───────────┘
             │
    ┌────────▼─────────┐
    │ Actual Resources │
    │ EC2, ECS, Docker │
    │      IAM, etc    │
    └──────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;3 moving parts:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Claude&lt;/strong&gt; — Reasons about the task&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tools&lt;/strong&gt; — Execute AWS API calls&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory&lt;/strong&gt; — Remember past deployments for coherence&lt;/li&gt;
&lt;/ol&gt;
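
&lt;p&gt;The DynamoDB-backed memory layer could look like the sketch below. Function names, key names, and the table schema are assumptions for illustration; the table is passed in so the helpers stay testable without AWS:&lt;/p&gt;

```python
# Sketch of DynamoDB-backed conversation memory (names and schema
# are illustrative assumptions; the real module may differ).
import time

def save_message(table, session_id, role, content):
    """Append one message to the session's history.

    Assumes a table with partition key `session_id` and numeric
    sort key `ts` (milliseconds, giving chronological order).
    """
    table.put_item(Item={
        "session_id": session_id,
        "ts": int(time.time() * 1000),
        "role": role,
        "content": content,
    })

def load_history(table, session_id):
    """Return the session's messages in chronological order."""
    resp = table.query(
        KeyConditionExpression="session_id = :sid",
        ExpressionAttributeValues={":sid": session_id},
        ScanIndexForward=True,  # oldest first
    )
    return [(item["role"], item["content"]) for item in resp["Items"]]
```

&lt;p&gt;In production, &lt;code&gt;table&lt;/code&gt; would be &lt;code&gt;boto3.resource("dynamodb").Table(...)&lt;/code&gt;; in tests it can be any object with &lt;code&gt;put_item&lt;/code&gt; and &lt;code&gt;query&lt;/code&gt;.&lt;/p&gt;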




&lt;h2&gt;
  
  
  🔧 The Setup (Code Foundation)
&lt;/h2&gt;

&lt;p&gt;Here's the entire base system in ~100 lines. &lt;a href="https://github.com/jayakrishnayadav24/ai-agents" rel="noopener noreferrer"&gt;Full code on GitHub&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agents.memory&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;load_history&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;save_message&lt;/span&gt;

&lt;span class="n"&gt;MODEL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;apac.amazon.nova-lite-v1:0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;  &lt;span class="c1"&gt;# Fast, cheap 
&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;BaseAgent&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Foundation for all agents (Docker, EC2, ECS, IAM)&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;AGENT_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;  &lt;span class="c1"&gt;# "docker", "ec2", etc.
&lt;/span&gt;    &lt;span class="n"&gt;SYSTEM_PROMPT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;  &lt;span class="c1"&gt;# Agent's personality
&lt;/span&gt;    &lt;span class="n"&gt;CAPABILITIES&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;  &lt;span class="c1"&gt;# What this agent can do
&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;us-east-1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Session&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;region_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_bedrock&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;

    &lt;span class="nd"&gt;@property&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;bedrock&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Lazy load Bedrock connection (only when needed)&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_bedrock&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_bedrock&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bedrock-runtime&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_bedrock&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;The agentic loop: think → decide → act → repeat&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="n"&gt;tools&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_tools&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;dispatcher&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_dispatcher&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;load_history&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;}]}]&lt;/span&gt;

        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;  &lt;span class="c1"&gt;# Max 10 iterations to prevent infinite loops
&lt;/span&gt;            &lt;span class="c1"&gt;# Ask Claude what to do
&lt;/span&gt;            &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bedrock&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;converse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;modelId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;MODEL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;system&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SYSTEM_PROMPT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;toolUseDepth&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="c1"&gt;# Is Claude done?
&lt;/span&gt;            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stopReason&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;endTurn&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SUCCESS&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]}&lt;/span&gt;

            &lt;span class="c1"&gt;# Execute tools Claude wants to call
&lt;/span&gt;            &lt;span class="n"&gt;tool_results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;tool_use&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
                &lt;span class="n"&gt;tool_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tool_use&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;toolUse&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dispatcher&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="n"&gt;href&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool_use[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="n"&gt;toolUse&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;tool_name&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
                &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;failed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

                &lt;span class="n"&gt;tool_results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;toolUseId&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;tool_use&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;toolUse&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;toolUseId&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
                                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;}]})&lt;/span&gt;

            &lt;span class="c1"&gt;# Add Claude's decision + results to conversation
&lt;/span&gt;            &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;assistant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]})&lt;/span&gt;
            &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;toolResult&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;tr&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;tr&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;tool_results&lt;/span&gt;&lt;span class="p"&gt;]})&lt;/span&gt;

            &lt;span class="c1"&gt;# Save for future reference
&lt;/span&gt;            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="nf"&gt;save_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;FAILED&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reason&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Max iterations reached&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What's happening:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Lazy loading&lt;/strong&gt; (&lt;code&gt;@property bedrock&lt;/code&gt;): the Bedrock client is created only on first use&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool loop&lt;/strong&gt;: run the requested tools, capture the results, and feed the output back to the model&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Error handling&lt;/strong&gt;: don't crash; report what failed to the model and let it adapt&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory&lt;/strong&gt;: every turn is saved, so the agent can pick the session back up days later&lt;/li&gt;
&lt;/ul&gt;
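&lt;p&gt;The loop above calls &lt;code&gt;get_tools()&lt;/code&gt; and &lt;code&gt;get_dispatcher()&lt;/code&gt; without showing them. Here is one minimal sketch of what they might look like, assuming each &lt;code&gt;CAPABILITIES&lt;/code&gt; entry becomes a Converse &lt;code&gt;toolSpec&lt;/code&gt; and maps to a plain Python callable; the names and the open-ended input schema are illustrative, not from the project:&lt;/p&gt;

```python
# Sketch only: assumed shapes for the helpers the loop relies on.
# The toolSpec layout follows the Bedrock Converse API's toolConfig format.

def get_tools(capabilities):
    """Turn CAPABILITIES entries into Converse toolSpec dicts."""
    return [
        {
            "toolSpec": {
                "name": cap["name"],
                "description": cap["description"],
                # Open-ended input for brevity; a real agent would declare
                # each tool's parameters in this JSON schema.
                "inputSchema": {"json": {"type": "object"}},
            }
        }
        for cap in capabilities
    ]


def get_dispatcher(handlers):
    """Map tool names to plain Python callables."""
    return dict(handlers)


# Example with a stubbed handler:
capabilities = [{"name": "list_containers", "description": "List running containers"}]
tools = get_tools(capabilities)
dispatcher = get_dispatcher({"list_containers": lambda tool_input: {"containers": []}})
```

&lt;p&gt;Keeping the dispatcher a plain dict means an unknown tool name raises &lt;code&gt;KeyError&lt;/code&gt;, which the loop's &lt;code&gt;try/except&lt;/code&gt; turns into an error result the model can see.&lt;/p&gt;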




&lt;h2&gt;
  
  
  🚀 Agents in Action: 3 Real Examples
&lt;/h2&gt;

&lt;h3&gt;
  
  
  DockerAgent: Deploy Locally
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;DockerAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseAgent&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;AGENT_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;docker&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;SYSTEM_PROMPT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;You manage Docker containers.
Rules: Pull image first, check if container exists, use sensible defaults.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="n"&gt;CAPABILITIES&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;list_containers&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;List running containers&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;run_container&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Pull and run image&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stop_container&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Stop a container&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Usage:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;docker_agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DockerAgent&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;docker_agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Deploy FastAPI on port 8000 with health check&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Claude calls: list_containers → run_container → confirms it's running
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
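&lt;p&gt;Behind each capability sits an ordinary function. As a sketch of what a &lt;code&gt;run_container&lt;/code&gt; handler could look like, assuming it shells out to the Docker CLI (the helper names and defaults below are assumptions, not from the project):&lt;/p&gt;

```python
import subprocess


def build_run_command(image, port, name):
    """Translate tool input into a docker CLI invocation (assumed defaults)."""
    return [
        "docker", "run", "-d",
        "--name", name,
        "-p", f"{port}:{port}",  # publish the container port on the host
        image,
    ]


def run_container(tool_input):
    """Handler for the 'run_container' capability."""
    cmd = build_run_command(
        tool_input["image"],
        tool_input.get("port", 8000),
        tool_input.get("name", "app"),
    )
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode != 0:
        # Surface the failure to the model instead of raising
        return {"status": "failed", "error": proc.stderr.strip()}
    # `docker run -d` prints the new container id on stdout
    return {"status": "running", "container_id": proc.stdout.strip()}
```

&lt;p&gt;Separating the command builder from the execution makes the handler easy to dry-run and unit-test without Docker installed.&lt;/p&gt;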






&lt;h3&gt;
  
  
  EC2Agent: Scale to Cloud
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;EC2Agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseAgent&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;AGENT_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ec2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;SYSTEM_PROMPT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;You manage EC2 instances.
Rules: Tag for organization, verify security groups, create new→test→retire old.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="n"&gt;CAPABILITIES&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;describe_instances&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;List instances&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;create_instance&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Launch new instance&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stop_instance&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Stop instance&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Usage:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;ec2_agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;EC2Agent&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;ec2_agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Create 2 t2.micro instances, tag as app-server&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Claude: checks if instances exist → creates 2 → returns IPs → confirms running
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
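&lt;p&gt;A &lt;code&gt;create_instance&lt;/code&gt; handler would ultimately call boto3's &lt;code&gt;run_instances&lt;/code&gt;. One way to sketch the request builder, with the system prompt's tagging rule baked in (the AMI id is a placeholder you would resolve per region):&lt;/p&gt;

```python
def build_run_instances_request(count, tag, ami_id, instance_type="t2.micro"):
    """Build kwargs for ec2_client.run_instances(**request) -- a sketch."""
    return {
        "ImageId": ami_id,  # placeholder; look up a real AMI for your region
        "InstanceType": instance_type,
        "MinCount": count,
        "MaxCount": count,
        "TagSpecifications": [{
            "ResourceType": "instance",
            # the "tag for organization" rule from the system prompt
            "Tags": [{"Key": "Name", "Value": tag}],
        }],
    }


# The handler would then do:
#   ec2 = self.session.client("ec2")
#   ec2.run_instances(**build_run_instances_request(2, "app-server", AMI_ID))
request = build_run_instances_request(2, "app-server", "ami-12345678")
```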






&lt;h3&gt;
  
  
  ECSAgent: Production Auto-Scaling
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ECSAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseAgent&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;AGENT_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ecs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;SYSTEM_PROMPT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;You manage ECS services at production scale.
Rules: Task definition first, use Fargate, always configure auto-scaling.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="n"&gt;CAPABILITIES&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;create_cluster&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Create ECS cluster&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;register_task_definition&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Register blueprint&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;create_service&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Deploy service&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;setup_autoscaling&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Configure scaling rules&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Usage:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;ecs_agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ECSAgent&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;ecs_agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Deploy with 5 tasks, auto-scale to 20 on high CPU, health monitoring&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Claude orchestrates: cluster → task definition → service → autoscaling → verification
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
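&lt;p&gt;The &lt;code&gt;setup_autoscaling&lt;/code&gt; step maps to two Application Auto Scaling calls: register the service as a scalable target, then attach a target-tracking policy on average CPU. A sketch of the request shapes for the 5-to-20-task scenario (the builder function, policy name, and CPU target are illustrative):&lt;/p&gt;

```python
def build_scaling_requests(cluster, service, min_tasks=5, max_tasks=20,
                           target_cpu=70.0):
    """Build kwargs for application-autoscaling's register_scalable_target
    and put_scaling_policy (target tracking on average ECS service CPU)."""
    resource_id = f"service/{cluster}/{service}"
    target = {
        "ServiceNamespace": "ecs",
        "ResourceId": resource_id,
        "ScalableDimension": "ecs:service:DesiredCount",
        "MinCapacity": min_tasks,
        "MaxCapacity": max_tasks,
    }
    policy = {
        "PolicyName": f"{service}-cpu-target-tracking",  # illustrative name
        "ServiceNamespace": "ecs",
        "ResourceId": resource_id,
        "ScalableDimension": "ecs:service:DesiredCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": target_cpu,  # scale out when avg CPU exceeds this
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
            },
        },
    }
    return target, policy


target, policy = build_scaling_requests("prod-cluster", "api")
```

&lt;p&gt;The handler would pass these to &lt;code&gt;boto3.client("application-autoscaling")&lt;/code&gt;'s &lt;code&gt;register_scalable_target&lt;/code&gt; and &lt;code&gt;put_scaling_policy&lt;/code&gt; respectively.&lt;/p&gt;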






&lt;h2&gt;
  
  
  Real Usage: From Local to Production
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Scenario:&lt;/strong&gt; deploy a FastAPI app from a laptop to production in minutes, with far less manual debugging along the way.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stage 1: Test locally&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;docker_agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Run myapp:latest on port 8000&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;prod_001&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# ✅ Docker container running
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Stage 2: Scale to cloud&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;ec2_agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Create 2 instances for load balancing&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;prod_001&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# ✅ 2 EC2 instances up (same session = Claude remembers port 8000)
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Stage 3: Auto-scaling production&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;ecs_agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Deploy with 5-20 auto-scaling, CloudWatch monitoring&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;prod_001&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# ✅ 5 ECS tasks running, auto-scaling 5-20 based on CPU
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;All with the same &lt;code&gt;session_id&lt;/code&gt;:&lt;/strong&gt; the agent remembers the image name, port, and configuration from Stage 1, so each stage builds on the last.&lt;/p&gt;




&lt;h2&gt;
  
  
  📚 Memory: Agents Remember
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Day 1
&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Deploy on Docker with port 8000&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user_123&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Week later
&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Move this to ECS for production&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user_123&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Claude reads history from DynamoDB:
# "I remember this app used port 8000. I'll keep that for ECS too."
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;DynamoDB stores every conversation turn, so Claude can recall past decisions and keep new ones consistent with them across sessions.&lt;/p&gt;
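&lt;p&gt;The pattern is easy to sketch locally. Here is an in-memory stand-in for the DynamoDB-backed store (the class and field names are illustrative, not the article's exact schema):&lt;/p&gt;

```python
from collections import defaultdict

# In-memory stand-in for the DynamoDB-backed session store.
# Shapes are illustrative, not the article's exact schema.
class SessionMemory:
    def __init__(self):
        self._turns = defaultdict(list)

    def append(self, session_id, role, content):
        # One item per conversation turn, keyed by session_id
        self._turns[session_id].append({"role": role, "content": content})

    def history(self, session_id):
        # Replayed to Claude on every new request in the same session
        return list(self._turns[session_id])

memory = SessionMemory()
memory.append("user_123", "user", "Deploy on Docker with port 8000")
memory.append("user_123", "assistant", "Deployed myapp:latest on port 8000")

# A week later, the same session_id pulls the full history back --
# that replay is how Claude "remembers" port 8000.
print(len(memory.history("user_123")))  # 2
```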




&lt;h2&gt;
  
  
  🛡️ Production-Ready Code Patterns
&lt;/h2&gt;

&lt;p&gt;This is where user feedback pushed hardest: real code needs safety rails.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pattern 1: Error Handling (Don't Crash, Recover)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# ❌ Wrong: crashes on error
&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dispatcher&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tool_name&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="n"&gt;tool_input&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# ✅ Right: tell Claude about the error
&lt;/span&gt;&lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dispatcher&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tool_name&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="n"&gt;tool_input&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;failed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;request_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;request_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# For debugging
&lt;/span&gt;    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="c1"&gt;# Claude sees the error and tries a different approach
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why this matters:&lt;/strong&gt; If Docker pull fails, don't give up. Tell Claude: "Pull failed, but let me check if the image is locally cached." Claude adapts.&lt;/p&gt;
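&lt;p&gt;The same shape as a runnable sketch, with a stubbed tool standing in for the real Docker call:&lt;/p&gt;

```python
# Stub standing in for a real Docker pull; it always fails here
# so the recovery path is visible.
def docker_pull(image):
    raise ConnectionError(f"registry unreachable while pulling {image}")

def run_tool(tool, **kwargs):
    """Never raise into the agent loop: package errors as data."""
    try:
        return {"status": "ok", "result": tool(**kwargs)}
    except Exception as e:
        # This dict goes back to Claude as the tool result,
        # so it can pick an alternative (e.g. a locally cached image).
        return {"status": "failed", "error": str(e)}

result = run_tool(docker_pull, image="myapp:latest")
print(result["status"])  # failed
```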




&lt;h3&gt;
  
  
  Pattern 2: Timeouts (Prevent Hanging)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;functools&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;wraps&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;signal&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;with_timeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;seconds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Prevent tools from running forever&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;decorator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;wrapper&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;signum&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;frame&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;TimeoutError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Tool execution exceeded &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;seconds&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;s&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="n"&gt;signal&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;signal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;signal&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SIGALRM&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;signal&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;alarm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;seconds&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="n"&gt;signal&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;alarm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Cancel timeout
&lt;/span&gt;                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;
            &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;TimeoutError&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;timeout&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;wrapper&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;decorator&lt;/span&gt;

&lt;span class="nd"&gt;@with_timeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;seconds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_instance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;If EC2 creation hangs, timeout after 60 seconds&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;ec2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ec2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;ec2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_instances&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Pattern 3: Input Validation (Prevent Bad Requests)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_instance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Validate before executing&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="c1"&gt;# Check required fields
&lt;/span&gt;    &lt;span class="n"&gt;required&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ImageId&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;InstanceType&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;field&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;required&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;field&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Missing required field: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;field&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;failed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;# Validate instance type
&lt;/span&gt;    &lt;span class="n"&gt;valid_types&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;t2.micro&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;t2.small&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;t3.medium&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;InstanceType&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;valid_types&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;InstanceType must be one of &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;valid_types&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;failed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;# Now safe to execute
&lt;/span&gt;    &lt;span class="n"&gt;ec2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ec2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;ec2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_instances&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
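&lt;p&gt;Because these checks run before any AWS call, they can be unit-tested in isolation. A stripped-down version of the same logic (function name is mine; no boto3 required):&lt;/p&gt;

```python
REQUIRED = ("ImageId", "InstanceType")
VALID_TYPES = ("t2.micro", "t2.small", "t3.medium")

def validate_instance_params(params):
    """Same checks as create_instance above, minus the boto3 call."""
    for field in REQUIRED:
        if field not in params:
            return {"error": f"Missing required field: {field}", "status": "failed"}
    if params["InstanceType"] not in VALID_TYPES:
        return {"error": f"InstanceType must be one of {list(VALID_TYPES)}",
                "status": "failed"}
    return {"status": "ok"}

print(validate_instance_params({"ImageId": "ami-0abc"}))
# {'error': 'Missing required field: InstanceType', 'status': 'failed'}
```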






&lt;h3&gt;
  
  
  Pattern 4: Audit Logging (Proof for Compliance)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;log_action&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Log everything for audits&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;log_entry&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;request_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;request_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;utcnow&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;isoformat&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;params&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;  &lt;span class="c1"&gt;# What was requested
&lt;/span&gt;        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;result_status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;result_code&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;# Save to CloudWatch Logs or DynamoDB
&lt;/span&gt;    &lt;span class="n"&gt;dynamodb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;put_item&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;TableName&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;agent_audit&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Item&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;log_entry&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# In the tool dispatcher:
&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dispatcher&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tool_name&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="n"&gt;tool_input&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;log_action&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tool_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tool_input&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Pattern 5: Cost Guards (Don't Deploy Expensive Mistakes)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;MONTHLY_BUDGET&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;  &lt;span class="c1"&gt;# $1000/month limit
&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;estimate_cost&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Estimate AWS cost before executing&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;action&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;create_instance&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;instance_type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;InstanceType&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;t2.micro&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;hourly_rates&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;t2.micro&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.012&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;t2.small&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.023&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;t3.medium&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.042&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="n"&gt;months_running&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;hourly_rates&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;instance_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;730&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;months_running&lt;/span&gt;

    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;action&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;create_rds&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# RDS: ~$400/month for 20GB
&lt;/span&gt;        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;400&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

&lt;span class="c1"&gt;# In the runnable:
&lt;/span&gt;&lt;span class="n"&gt;estimated&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;estimate_cost&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tool_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tool_input&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;estimated&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;MONTHLY_BUDGET&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Cost $&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;estimated&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; exceeds budget $&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;MONTHLY_BUDGET&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rejected&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="c1"&gt;# Only execute if under budget
&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dispatcher&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tool_name&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="n"&gt;tool_input&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
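&lt;p&gt;A quick sanity check on the arithmetic (the hourly rates above are illustrative, not live AWS pricing): a single &lt;code&gt;t3.medium&lt;/code&gt; comes in around $31/month, comfortably under the budget, so the guard only trips on requests that stack up, like fleets or databases.&lt;/p&gt;

```python
MONTHLY_BUDGET = 1000
# Same illustrative rates as estimate_cost above -- not live AWS pricing
hourly_rates = {"t2.micro": 0.012, "t2.small": 0.023, "t3.medium": 0.042}

# ~730 hours in a month, same formula as estimate_cost
monthly = round(hourly_rates["t3.medium"] * 730, 2)
print(monthly)                   # 30.66
print(monthly > MONTHLY_BUDGET)  # False: a single instance passes the guard
```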






&lt;h3&gt;
  
  
  Pattern 6: Role-Based Access (Security Boundaries)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Define who can do what
&lt;/span&gt;&lt;span class="n"&gt;USER_PERMISSIONS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;admin&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;create_instance&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;delete_instance&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;create_role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;delete_role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;developer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;create_instance&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stop_instance&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;deploy_to_ecs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;readonly&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;describe_instances&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;list_containers&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;get_logs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;check_permission&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Prevent unauthorized actions&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;allowed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;USER_PERMISSIONS&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_role&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;action&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;allowed&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;

&lt;span class="c1"&gt;# Before executing:
&lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nf"&gt;check_permission&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_role&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tool_name&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;User role &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;user_role&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; cannot perform &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;tool_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;permission_denied&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dispatcher&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tool_name&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="n"&gt;tool_input&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
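&lt;p&gt;For context, here is a minimal, self-contained version of the permission gate above. The roles and tool names in &lt;code&gt;USER_PERMISSIONS&lt;/code&gt; are illustrative placeholders, not values from the actual framework:&lt;/p&gt;

```python
# Illustrative role-to-tool allow-list (hypothetical roles and tools).
USER_PERMISSIONS = {
    "admin": ["deploy_service", "delete_service", "read_logs"],
    "developer": ["deploy_service", "read_logs"],
    "viewer": ["read_logs"],
}

def check_permission(user_role, action):
    """Return True only if the role's allow-list contains the action."""
    allowed = USER_PERMISSIONS.get(user_role, [])
    return action in allowed

# Unknown roles get an empty allow-list, so everything is denied by default.
print(check_permission("developer", "deploy_service"))  # True
print(check_permission("viewer", "delete_service"))     # False
print(check_permission("intern", "read_logs"))          # False
```

&lt;p&gt;The deny-by-default behaviour falls out of &lt;code&gt;dict.get(user_role, [])&lt;/code&gt;: a role that was never registered can perform nothing.&lt;/p&gt;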






&lt;h2&gt;
  
  
  ✅ Why Agents &amp;gt; Scripts
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Scripts&lt;/th&gt;
&lt;th&gt;Agents&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;New workflow&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Rewrite code&lt;/td&gt;
&lt;td&gt;Claude adapts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Error recovery&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Crashes&lt;/td&gt;
&lt;td&gt;Tries alternatives&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Reasoning&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;None (hardcoded)&lt;/td&gt;
&lt;td&gt;Full decision log&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Maintenance&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Grows linearly&lt;/td&gt;
&lt;td&gt;One framework&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Learning&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Manual proof&lt;/td&gt;
&lt;td&gt;Automatic audit trail&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  🎯 Getting Started
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Just 3 commands:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/jayakrishnayadav24/ai-agents
&lt;span class="nb"&gt;cd &lt;/span&gt;ai-agents
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Then try:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agents.docker_agent&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DockerAgent&lt;/span&gt;

&lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DockerAgent&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Deploy FastAPI app on port 8000 with health check&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;That's it.&lt;/strong&gt; Claude handles the rest.&lt;/p&gt;




&lt;h2&gt;
  
  
  📚 What's Next (Part 2)
&lt;/h2&gt;

&lt;p&gt;This article covered the concept. Part 2, along with the &lt;a href="https://github.com/jayakrishnayadav24/ai-agents" rel="noopener noreferrer"&gt;GitHub repository&lt;/a&gt;, will cover:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ Full working code (all agents)&lt;/li&gt;
&lt;li&gt;✅ DynamoDB setup for memory&lt;/li&gt;
&lt;li&gt;✅ Deploying agents to Lambda&lt;/li&gt;
&lt;li&gt;✅ Real production patterns (request tracing, cost estimation)&lt;/li&gt;
&lt;li&gt;✅ Demo with actual deployments&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;⚠️ Where This Breaks&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Complex compliance environments (manual approvals still needed)&lt;/li&gt;
&lt;li&gt;Cost estimation is approximate&lt;/li&gt;
&lt;li&gt;Requires well-defined tools (garbage in → garbage out)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The Bottom Line
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Before:&lt;/strong&gt; 47 scripts, 3 hours/deployment, constant debugging&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;After:&lt;/strong&gt; 1 agent framework, deployments done in minutes for most cases, and significantly less time spent debugging.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why?&lt;/strong&gt; Because Claude doesn't follow scripts. Claude plans and executes the steps based on the goal.&lt;/p&gt;

&lt;p&gt;It adapts. It learns. It remembers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;That's not automation anymore. That's the future of infrastructure.&lt;/strong&gt;&lt;/p&gt;




&lt;blockquote&gt;
&lt;p&gt;How many deployment scripts are you still maintaining?&lt;/p&gt;

&lt;p&gt;10+? 20+? 50+?&lt;/p&gt;

&lt;p&gt;I want to see how bad this problem is. Drop a comment 👇&lt;/p&gt;

&lt;p&gt;Your feedback shapes Part 2.&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>agents</category>
      <category>aws</category>
      <category>eks</category>
      <category>automation</category>
    </item>
    <item>
      <title>Processing long running events on AWS API Gateway</title>
      <dc:creator>Evertson Croes</dc:creator>
      <pubDate>Fri, 17 Apr 2026 08:19:51 +0000</pubDate>
      <link>https://dev.to/aws-builders/processing-long-running-events-on-aws-api-gateway-bn4</link>
      <guid>https://dev.to/aws-builders/processing-long-running-events-on-aws-api-gateway-bn4</guid>
      <description>&lt;h1&gt;
  
  
  Processing long running events on AWS API Gateway
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/api-gateway/" rel="noopener noreferrer"&gt;AWS API Gateway&lt;/a&gt; is a managed HTTP/REST service provided by AWS. It provides a relatively simple way to host an API and offers rich functionality when it comes to customizability, security and integration. AWS API Gateway enforces a maximum integration timeout of 29 seconds. For most APIs this is perfectly reasonable.&lt;/p&gt;

&lt;p&gt;However, problems arise when an API must trigger operations that take minutes to complete, such as generating large exports or running complex background jobs. In our case, we needed to generate large database exports that could take several minutes, so a synchronous API request was not an option. This became a challenge with the default setup we had with API Gateway.&lt;/p&gt;

&lt;p&gt;AWS provides some &lt;a href="https://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/process-events-asynchronously-with-amazon-api-gateway-and-aws-lambda.html" rel="noopener noreferrer"&gt;guidance&lt;/a&gt; on this in their documentation. However, in this blog article I want to share how we solved this problem in our project in more detail and also provide a working CDK project as an example.&lt;/p&gt;

&lt;h2&gt;
  
  
  The problem
&lt;/h2&gt;

&lt;p&gt;As mentioned in the intro, API Gateway enforces a time &lt;strong&gt;limit of 29 seconds&lt;/strong&gt; on requests, while the functionality we want to trigger could potentially run for many minutes. In this case, we do not want a &lt;strong&gt;synchronous&lt;/strong&gt; call but an &lt;strong&gt;asynchronous&lt;/strong&gt; one: even if API Gateway allowed us to keep a request open for 30 minutes, it would not be beneficial for a frontend application to keep one blocking request open for that long, since it ties up resources.&lt;/p&gt;

&lt;p&gt;So we also need a mechanism to handle asynchronous requests while still using API Gateway.&lt;/p&gt;

&lt;h2&gt;
  
  
  The TaskManager
&lt;/h2&gt;

&lt;p&gt;We ended up calling the solution to this problem the “TaskManager”. It can be seen as a single microservice with the sole responsibility of keeping track of tasks; it does not actually process them. The following diagram provides a high-level overview:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyduvfgc98uiyz7yahd52.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyduvfgc98uiyz7yahd52.png" alt=" " width="639" height="415"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this overview, there are Task Suppliers, Task Processors and the TaskManager itself. It is important to note that in our use cases, we have not yet found a scenario where there are multiple suppliers of the same task and multiple processors for the same type of task. However, the pattern introduced in this blog could be expanded to include this if necessary.&lt;/p&gt;

&lt;p&gt;If we zoom in to the TaskManager, we have the following components:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8668okbvcxcu6wk7qost.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8668okbvcxcu6wk7qost.png" alt=" " width="800" height="330"&gt;&lt;/a&gt;&lt;br&gt;
(Click &lt;a href="https://evertson-public.s3.eu-central-1.amazonaws.com/task_manager_diagram.png" rel="noopener noreferrer"&gt;here&lt;/a&gt; to see diagram better)&lt;/p&gt;

&lt;p&gt;This diagram depicts the deployment and the flow at the same time. On the left side, we can see a Task Supplier. For this example, it does not matter what this is; it could be any component that can make an HTTP request. In our example, it was a Backend-for-Frontend API Gateway.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd87rwcuqfyoj5f4lo1ev.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd87rwcuqfyoj5f4lo1ev.png" alt=" " width="800" height="464"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This Task Supplier calls one of the two endpoints available in the TaskManager API (itself an API Gateway) that allows the creation of a Task via an HTTP POST request. This triggers a Lambda function that creates the task in the TaskStatus DynamoDB Table.&lt;/p&gt;
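&lt;p&gt;A sketch of what that creation Lambda could look like. The helper name &lt;code&gt;build_task_item&lt;/code&gt; and the exact attribute layout are assumptions for illustration; the DynamoDB table client is passed in so the logic stays testable without AWS:&lt;/p&gt;

```python
import json
import uuid
from datetime import datetime, timezone

def build_task_item(task_type, payload=None):
    """Build the initial TaskStatus record; every new task starts as CREATED."""
    return {
        "taskId": str(uuid.uuid4()),
        "taskType": task_type,
        "status": "CREATED",
        "payload": payload or {},
        "createdAt": datetime.now(timezone.utc).isoformat(),
    }

def create_task_handler(event, table):
    """API Gateway proxy handler: persist the task and echo its id back."""
    body = json.loads(event["body"])
    item = build_task_item(body["taskType"], body.get("payload"))
    table.put_item(Item=item)  # a boto3 DynamoDB Table resource in the real Lambda
    return {"statusCode": 201, "body": json.dumps({"taskId": item["taskId"]})}
```

&lt;p&gt;The Task Supplier keeps the returned &lt;code&gt;taskId&lt;/code&gt; and uses it later to poll for the status.&lt;/p&gt;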

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8e0jvvocgb3d4fb5kgp4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8e0jvvocgb3d4fb5kgp4.png" alt=" " width="800" height="242"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When an entry is created in this Table, a DynamoDB Stream will trigger the TaskStatusPublisher Lambda. This Lambda checks whether the record is a new entry (indicated by the INSERT event type) and, if so, publishes a “TaskCreatedEvent”. It is important to note that this event also contains the Task Type, as the type determines which processor needs to process the task.&lt;/p&gt;
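&lt;p&gt;The filtering logic of that publisher can be sketched as a pure function. The event name matches the blog; the record shape follows the DynamoDB Streams format, but attribute names such as &lt;code&gt;taskType&lt;/code&gt; are assumptions:&lt;/p&gt;

```python
def stream_record_to_event(record):
    """Map a DynamoDB Stream record to an EventBridge entry, or None.

    Only INSERT records produce a TaskCreatedEvent; MODIFY/REMOVE records are
    ignored here (status changes are published separately).
    """
    if record.get("eventName") != "INSERT":
        return None
    new_image = record["dynamodb"]["NewImage"]
    return {
        "Source": "task-manager",
        "DetailType": "TaskCreatedEvent",
        # Detail would be json.dumps-ed before calling put_events for real.
        "Detail": {
            "taskId": new_image["taskId"]["S"],
            # The task type decides which processor's EventBridge rule fires.
            "taskType": new_image["taskType"]["S"],
        },
    }
```

&lt;p&gt;Keeping this mapping pure makes it trivial to unit-test the INSERT-only behaviour without a stream attached.&lt;/p&gt;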

&lt;p&gt;This is essentially where the first flow of the TaskManager ends. It is the responsibility of the Task Processor to create an EventBridge rule to consume this event and process the event.&lt;/p&gt;

&lt;p&gt;The TaskManager expects to be updated regularly by the Task Processor via the TaskUpdatedEvent. The status of the task can be updated to RUNNING, SUCCESSFUL, or FAILED. In the case of SUCCESSFUL, a payload can also be added. This could be the result of the task or, in our case for the large exports, a pre-signed S3 URL to download it.&lt;/p&gt;
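&lt;p&gt;The status names come straight from the article; the transition table below is an assumption added to show how the update handler might guard against out-of-order events:&lt;/p&gt;

```python
# Status values from the article; the allowed-transition table is an assumption.
VALID_TRANSITIONS = {
    "CREATED": {"RUNNING", "FAILED"},
    "RUNNING": {"RUNNING", "SUCCESSFUL", "FAILED"},
}

def apply_task_update(item, new_status, payload=None):
    """Validate and apply a TaskUpdatedEvent to a task record (non-mutating)."""
    allowed = VALID_TRANSITIONS.get(item["status"], set())
    if new_status not in allowed:
        raise ValueError(f"cannot move {item['status']} to {new_status}")
    item = dict(item, status=new_status)
    if new_status == "SUCCESSFUL" and payload is not None:
        # e.g. a pre-signed S3 URL pointing at the finished export
        item["resultPayload"] = payload
    return item
```

&lt;p&gt;Terminal states (SUCCESSFUL, FAILED) have no outgoing transitions, so a late or duplicate update is rejected instead of silently overwriting the result.&lt;/p&gt;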

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbi08kelo9m8lon5aglo4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbi08kelo9m8lon5aglo4.png" alt=" " width="800" height="207"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The Task Supplier can poll for the task regularly and get the status. Based on this status it can decide how to react.&lt;/p&gt;
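&lt;p&gt;A minimal polling loop on the supplier side might look like this. The &lt;code&gt;get_status&lt;/code&gt; callable stands in for a GET against the TaskManager API, and &lt;code&gt;sleep&lt;/code&gt; is injectable so the loop can be exercised without waiting; all names here are illustrative:&lt;/p&gt;

```python
import time

def poll_until_done(get_status, interval=2.0, max_attempts=30, sleep=time.sleep):
    """Poll a task until it reaches a terminal state or we give up."""
    for _ in range(max_attempts):
        status = get_status()
        if status in ("SUCCESSFUL", "FAILED"):
            return status
        sleep(interval)  # fixed interval; exponential backoff is also an option
    raise TimeoutError("task did not finish in time")
```

&lt;p&gt;A frontend would typically do the same thing on a timer rather than a blocking loop, reacting once the terminal status arrives.&lt;/p&gt;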

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd40mqyg8tpgqjohugcop.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd40mqyg8tpgqjohugcop.png" alt=" " width="800" height="122"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Notice that the TaskStatusPublisher also publishes TaskRunningEvent, TaskSuccessfulEvent and TaskFailedEvent. These could be used in combination with a WebSocket mechanism to receive live updates instead of polling. However, this is out of scope for this blog.&lt;/p&gt;

&lt;p&gt;This setup benefits from being completely serverless, meaning we can scale up on higher loads but also scale down to zero if there are no tasks. For this reason, in our case, we have only created one instance of this TaskManager that is shared by all tasks in our system. However, you could create multiple TaskManagers for different bounded contexts or even for each type of task.&lt;/p&gt;

&lt;h2&gt;
  
  
  CDK Project
&lt;/h2&gt;

&lt;p&gt;The example CDK project of this setup can be found in this GitHub &lt;a href="https://github.com/evertson90/aws-task-manager" rel="noopener noreferrer"&gt;repository&lt;/a&gt;. There is a README file which explains how to build and deploy the project to your own AWS environment.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;This pattern is a simple but powerful way to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Work around API Gateway limitations&lt;/li&gt;
&lt;li&gt;Build scalable async workflows&lt;/li&gt;
&lt;li&gt;Keep your frontend responsive&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're dealing with long-running operations in AWS, this approach is definitely worth considering.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>apigateway</category>
      <category>architecture</category>
      <category>eventdriven</category>
    </item>
    <item>
      <title>Multi-Agent A2A with the Agent Development Kit(ADK), Amazon ECS Express, and Gemini CLI</title>
      <dc:creator>xbill</dc:creator>
      <pubDate>Thu, 16 Apr 2026 22:47:15 +0000</pubDate>
      <link>https://dev.to/aws-builders/multi-agent-a2a-with-the-agent-development-kitadk-amazon-ecs-express-and-gemini-cli-5ag7</link>
      <guid>https://dev.to/aws-builders/multi-agent-a2a-with-the-agent-development-kitadk-amazon-ecs-express-and-gemini-cli-5ag7</guid>
      <description>&lt;p&gt;Leveraging the Google Agent Development Kit (ADK) and the underlying Gemini LLM to build Multi-Agent Applications with A2A protocol support using the Python programming language deployed to AWS ECS Express.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpgdtt0hyrg9041coa25d.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpgdtt0hyrg9041coa25d.jpeg" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Aren’t There a Billion Python ADK Demos?
&lt;/h4&gt;

&lt;p&gt;Yes there are.&lt;/p&gt;

&lt;p&gt;Python has traditionally been the main coding language for ML and AI tools. The goal of this article is to provide a multi-agent test bed for building, debugging, and deploying multi-agent applications.&lt;/p&gt;

&lt;h4&gt;
  
  
  Rock and roll ain’t noise pollution
&lt;/h4&gt;

&lt;p&gt;So what is different about this lab compared to all the others out there?&lt;/p&gt;

&lt;p&gt;This is one of the first deep dives into a Multi-Agent application leveraging the advanced tooling of Gemini CLI. The starting point for the demo was an existing Codelab, which was updated and re-engineered with Gemini CLI.&lt;/p&gt;

&lt;p&gt;The original Codelab is here:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://codelabs.developers.google.com/codelabs/production-ready-ai-roadshow/1-building-a-multi-agent-system/building-a-multi-agent-system#0" rel="noopener noreferrer"&gt;Building a Multi-Agent System | Google Codelabs&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Python Version Management
&lt;/h4&gt;

&lt;p&gt;One of the downsides of the wide deployment of Python has been managing the language versions across platforms and maintaining a supported version.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;pyenv&lt;/strong&gt; tool enables deploying consistent versions of Python:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/pyenv/pyenv" rel="noopener noreferrer"&gt;GitHub - pyenv/pyenv: Simple Python version management&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As of writing, the mainstream Python version is 3.13. To validate your current Python version:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;python --version
Python 3.13.13
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Amazon ECS Express
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://www.google.com/search?q=Amazon+ECS+Express+Mode&amp;amp;rlz=1CAIWTJ_enUS1110&amp;amp;oq=what+is+amazon+ecs+express&amp;amp;gs_lcrp=EgZjaHJvbWUyBggAEEUYOTIJCAEQIRgKGKAB0gEIMzI0MWowajeoAgCwAgA&amp;amp;sourceid=chrome&amp;amp;ie=UTF-8&amp;amp;mstk=AUtExfAELWySw4fS4VoaovwdGE8MUNcOltEQ-lyCKwxY4t3OArbcxO8JX30JpX02tjJDKML-JgcQEQDIaZjDgUHMoJTycp046hy8F-_Y_zxJ9Bo0rZyERUQ6geXGT9MPUb02ZLA7LpFjGlcpRgGkURGERCNHTKdtI2kGtm-bh5XT5dS4hpo&amp;amp;csui=3&amp;amp;ved=2ahUKEwiu_YSzptWTAxVPF1kFHY8nLbwQgK4QegQIARAB" rel="noopener noreferrer"&gt;Amazon ECS Express Mode&lt;/a&gt; (announced Nov 2025) is a simplified deployment feature for Amazon Elastic Container Service (ECS) designed to rapidly launch containerized applications, APIs, and web services on AWS Fargate. It automates infrastructure setup — including load balancing, networking, scaling, and HTTPS endpoints — allowing developers to deploy from container image to production in a single step.&lt;/p&gt;

&lt;p&gt;More details are available here:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://docs.aws.amazon.com/AmazonECS/latest/developerguide/express-service-overview.html" rel="noopener noreferrer"&gt;Amazon ECS Express Mode&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The ECS status is visible from the AWS Console:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv9ydwmp94aodxr907d3d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv9ydwmp94aodxr907d3d.png" width="800" height="391"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Gemini CLI
&lt;/h4&gt;

&lt;p&gt;If not pre-installed you can download the Gemini CLI to interact with the source files and provide real-time assistance:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @google/gemini-cli
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Testing the Gemini CLI Environment
&lt;/h4&gt;

&lt;p&gt;Once you have all the tools and the correct Node.js version in place, you can test the startup of Gemini CLI. You will need to authenticate with a key or your Google account:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;▝▜▄ Gemini CLI v0.33.1
    ▝▜▄
   ▗▟▀ Logged in with Google /auth
  ▝▀ Gemini Code Assist Standard /upgrade no sandbox (see /docs) /model Auto (Gemini 3) | 239.8 MB
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Node Version Management
&lt;/h4&gt;

&lt;p&gt;Gemini CLI needs a consistent, up to date version of Node. The &lt;strong&gt;nvm&lt;/strong&gt; command can be used to get a standard Node environment:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/nvm-sh/nvm" rel="noopener noreferrer"&gt;GitHub - nvm-sh/nvm: Node Version Manager - POSIX-compliant bash script to manage multiple active node.js versions&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Agent Development Kit
&lt;/h4&gt;

&lt;p&gt;The &lt;a href="https://www.google.com/search?q=Google+Agent+Development+Kit&amp;amp;rlz=1CAIWTJ_enUS1114&amp;amp;oq=what+is+the+adk+google&amp;amp;gs_lcrp=EgZjaHJvbWUyBggAEEUYOTIICAEQABgWGB4yCAgCEAAYFhgeMggIAxAAGBYYHjIICAQQABgWGB4yCAgFEAAYFhgeMggIBhAAGBYYHjIKCAcQABgKGBYYHjINCAgQABiGAxiABBiKBTIKCAkQABiABBiiBNIBCDMxODlqMGo3qAIAsAIA&amp;amp;sourceid=chrome&amp;amp;ie=UTF-8&amp;amp;mstk=AUtExfB5Oo7ZHHcDEHu7aqZiPBA2l1c-QGh5dB7xkkDPIiYcn8O1Imt2IHNR7bzA6JnyDCSDCUGpGWTeBW14namlN_QqzJLLI5-px1BE9jfSxwli6njPDPERjm5pRqNP3uC6HhUKiRcTJ1T8x5LHQrCkVxylw7QWg0N8B4dQDIcWpnVX9Gc&amp;amp;csui=3&amp;amp;ved=2ahUKEwjYu-G8p-uSAxXrv4kEHUbpLo0QgK4QegQIARAB" rel="noopener noreferrer"&gt;Google Agent Development Kit&lt;/a&gt; (ADK) is an open-source, Python-based framework designed to streamline the creation, deployment, and orchestration of sophisticated, multi-agent AI systems. It treats agent development like software engineering, offering modularity, state management, and built-in tools (like Google Search) to build autonomous agents.&lt;/p&gt;

&lt;p&gt;The ADK can be installed from here:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://google.github.io/adk-docs/" rel="noopener noreferrer"&gt;Agent Development Kit (ADK)&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Agent Skills
&lt;/h4&gt;

&lt;p&gt;Gemini CLI can be customized to work with ADK agents. Both an Agent Development MCP server and specific Agent Skills are available.&lt;/p&gt;

&lt;p&gt;More details are here:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://adk.dev/tutorials/coding-with-ai/" rel="noopener noreferrer"&gt;Agent Development Kit (ADK)&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To get the Agent Skills in Gemini CLI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /skills list
Available Agent Skills:
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;and the ADK documentation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt; /mcp list
Configured MCP servers:
🟢 adk-docs-mcp (from adk-docs-ext) - Ready (2 tools)
  Tools:
  - mcp_adk-docs-mcp_fetch_docs
  - mcp_adk-docs-mcp_list_doc_sources
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Where do I start?
&lt;/h4&gt;

&lt;p&gt;The strategy for starting multi-agent development is an incremental, step-by-step approach.&lt;/p&gt;

&lt;p&gt;First, the basic development environment is set up with the required system variables and a working Gemini CLI configuration.&lt;/p&gt;

&lt;p&gt;Then, the ADK Multi-Agent application is built, debugged, and tested locally. Finally, the entire solution is deployed to Amazon ECS Express.&lt;/p&gt;

&lt;h4&gt;
  
  
  Setup the Basic Environment
&lt;/h4&gt;

&lt;p&gt;At this point you should have a working Python environment and a working Gemini CLI installation. All of the relevant code examples and documentation are available in GitHub.&lt;/p&gt;

&lt;p&gt;The next step is to clone the GitHub repository to your local environment:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; ~
git clone https://github.com/xbill9/gemini-cli-aws
&lt;span class="nb"&gt;cd &lt;/span&gt;multi-ecsexpress
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then source &lt;strong&gt;init2.sh&lt;/strong&gt; from the cloned directory.&lt;/p&gt;

&lt;p&gt;The script will attempt to determine your shell environment and set the correct variables:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;source &lt;/span&gt;init2.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If your session times out or you need to re-authenticate, you can run the &lt;strong&gt;set_env.sh&lt;/strong&gt; script to reset your environment variables:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;source &lt;/span&gt;set_env.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Variables like PROJECT_ID need to be set up for use in the various build scripts, so the &lt;strong&gt;set_env&lt;/strong&gt; script can be used to reset the environment if you time out.&lt;/p&gt;

&lt;p&gt;Login to the AWS console:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws login &lt;span class="nt"&gt;--remote&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally install the packages and dependencies:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight make"&gt;&lt;code&gt;&lt;span class="err"&gt;make&lt;/span&gt; &lt;span class="err"&gt;install&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Verify The ADK Installation
&lt;/h4&gt;

&lt;p&gt;To verify the setup, run the ADK CLI locally with the researcher agent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;xbill@penguin:~/gemini-cli-aws/multi-eks/agents$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;adk run researcher
&lt;span class="go"&gt;/home/xbill/.local/lib/python3.13/site-packages/google/adk/features/_feature_decorator.py:72: UserWarning: [EXPERIMENTAL] feature FeatureName.PLUGGABLE_AUTH is enabled.
  check_feature_enabled()
Log setup complete: /tmp/agents_log/agent.20260412_164250.log
To access latest log: tail -F /tmp/agents_log/agent.latest.log
{"asctime": "2026-04-12 16:42:50,986", "name": "root", "levelname": "INFO", "message": "Logging initialized for researcher", "filename": "logging_config.py", "lineno": 54, "service": "researcher", "log_level": "INFO"}
{"asctime": "2026-04-12 16:42:50,987", "name": "researcher.agent", "levelname": "INFO", "message": "Initialized researcher agent with model: gemini-2.5-flash", "filename": "agent.py", "lineno": 85}
{"asctime": "2026-04-12 16:42:50,988", "name": "google_adk.google.adk.cli.utils.envs", "levelname": "INFO", "message": "Loaded .env file for researcher at /home/xbill/gemini-cli-aws/multi-eks/.env", "filename": "envs.py", "lineno": 83}
{"asctime": "2026-04-12 16:42:50,988", "name": "google_adk.google.adk.cli.utils.local_storage", "levelname": "INFO", "message": "Using per-agent session storage rooted at /home/xbill/gemini-cli-aws/multi-eks/agents", "filename": "local_storage.py", "lineno": 84}
{"asctime": "2026-04-12 16:42:50,988", "name": "google_adk.google.adk.cli.utils.local_storage", "levelname": "INFO", "message": "Using file artifact service at /home/xbill/gemini-cli-aws/multi-eks/agents/researcher/.adk/artifacts", "filename": "local_storage.py", "lineno": 110}
{"asctime": "2026-04-12 16:42:50,988", "name": "google_adk.google.adk.cli.utils.service_factory", "levelname": "INFO", "message": "Using in-memory memory service", "filename": "service_factory.py", "lineno": 266}
{"asctime": "2026-04-12 16:42:50,993", "name": "google_adk.google.adk.cli.utils.local_storage", "levelname": "INFO", "message": "Creating local session service at /home/xbill/gemini-cli-aws/multi-eks/agents/researcher/.adk/session.db", "filename": "local_storage.py", "lineno": 60}
Running agent researcher, type exit to exit.
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h4&gt;
  
  
  Test The ADK Web Interface
&lt;/h4&gt;

&lt;p&gt;This tests the ADK agent interactions with a browser:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;xbill@penguin:~/gemini-cli-aws/multi-eks/agents$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;adk web &lt;span class="nt"&gt;--host&lt;/span&gt; 0.0.0.0
&lt;span class="go"&gt;/home/xbill/.local/lib/python3.13/site-packages/google/adk/features/_feature_decorator.py:72: UserWarning: [EXPERIMENTAL] feature FeatureName.PLUGGABLE_AUTH is enabled.
  check_feature_enabled()
2026-04-12 16:43:14,152 - INFO - service_factory.py:266 - Using in-memory memory service
2026-04-12 16:43:14,153 - INFO - local_storage.py:84 - Using per-agent session storage rooted at /home/xbill/gemini-cli-aws/multi-eks/agents
2026-04-12 16:43:14,153 - INFO - local_storage.py:110 - Using file artifact service at /home/xbill/gemini-cli-aws/multi-eks/agents/.adk/artifacts
/home/xbill/.local/lib/python3.13/site-packages/google/adk/cli/fast_api.py:198: UserWarning: [EXPERIMENTAL] InMemoryCredentialService: This feature is experimental and may change or be removed in future versions without notice. It may introduce breaking changes at any time.
  credential_service = InMemoryCredentialService()
/home/xbill/.local/lib/python3.13/site-packages/google/adk/auth/credential_service/in_memory_credential_service.py:33: UserWarning: [EXPERIMENTAL] BaseCredentialService: This feature is experimental and may change or be removed in future versions without notice. It may introduce breaking changes at any time.
  super(). __init__ ()
INFO: Started server process [32675]
INFO: Waiting for application startup.
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then use the web interface, either on the local interface &lt;strong&gt;127.0.0.1&lt;/strong&gt; or the catch-all interface &lt;strong&gt;0.0.0.0&lt;/strong&gt;, depending on your environment:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhdsixkis3hdhngrjbooa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhdsixkis3hdhngrjbooa.png" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Special note for Google Cloud Shell deployments: add a CORS &lt;strong&gt;allow_origins&lt;/strong&gt; configuration exemption to allow the ADK agent to run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;adk web &lt;span class="nt"&gt;--host&lt;/span&gt; 0.0.0.0 &lt;span class="nt"&gt;--allow_origins&lt;/span&gt; &lt;span class="s1"&gt;'regex:.*'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Multi-Agent Design
&lt;/h4&gt;

&lt;p&gt;The multi-agent deployment consists of 5 agents:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Researcher&lt;/li&gt;
&lt;li&gt;Judge&lt;/li&gt;
&lt;li&gt;Orchestrator&lt;/li&gt;
&lt;li&gt;Content Builder&lt;/li&gt;
&lt;li&gt;Course Builder&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For a detailed analysis of the multi-agent architecture, this article provides the background:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://xbill999.medium.com/multi-agent-a2a-with-the-agent-development-kit-adk-cloud-run-and-gemini-cli-52f8be838ad6" rel="noopener noreferrer"&gt;Multi-Agent A2A with the Agent Development Kit(ADK), Cloud Run, and Gemini CLI&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Running/Testing/Debugging Locally
&lt;/h4&gt;

&lt;p&gt;The main Makefile has been extended with extensive targets for managing the agents in the local development environment.&lt;/p&gt;

&lt;p&gt;The key targets include:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;xbill@penguin:~/multi-agent$ make help
Available commands:
  install - Install all dependencies for root, agents, and app
  start - Start all services locally (alias for start-local)
  stop - Stop all local services (alias for stop-local)
  run - Start all services locally (alias for start-local)
  local - Show local service URLs
  start-local - Start all local services in background
  stop-local - Stop all local processes
  test - Run all tests (pytest)
  test-researcher - Test the Researcher agent directly
  test-judge - Test the Judge agent directly
  test-orchestrator - Test the Orchestrator logic
  lint - Run linting checks (ruff)
  deploy - Deploy all services to Cloud Run
  destroy - Delete all Cloud Run services
  clean - Remove caches and logs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;First, check for locally running agents:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;xbill@penguin:~/gemini-cli-aws$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;make status
&lt;span class="go"&gt;Checking status of locally running agents and servers...
--- Network Status ---
No services listening on expected ports (8000-8004, 5173).
--- Process Status ---
No matching processes found.
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then all the agents can be started together:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;✦ The frontend has been built and all local services (Researcher, Judge, Content Builder, Orchestrator, App Backend, and Frontend dev server) have
  been started in the background.

   - Frontend: http://localhost:5173 (http://localhost:5173)
   - Backend: http://localhost:8000 (http://localhost:8000)

  Logs for each service are available in the project root: researcher.log, judge.log, content_builder.log, orchestrator.log, backend.log, and
  frontend.log.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then connect to the front end locally:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9eui8i77oqdiftnerk4r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9eui8i77oqdiftnerk4r.png" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And the entire agent system will run in the local environment:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fklkygh1ukveelo7tr9uw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fklkygh1ukveelo7tr9uw.png" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Local Logging / Debugging
&lt;/h4&gt;

&lt;p&gt;Gemini CLI has full access to the local agent logs for debugging and troubleshooting:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt; review the local logs for the multi-agent run
✦ I will check run_local.sh and the project structure to determine where logs are stored or how they are generated.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will verify the local operation of the code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;✦ The local logs confirm a successful multi-agent run for the topic "the history of the internet." Here is a summary of the execution flow:

* Orchestrator (orchestrator.log): Successfully managed the pipeline. It captured the topic, coordinated the Research-Judge loop, and escalated to
     content generation after the first iteration.
   * Researcher (researcher.log): Gathered approximately 9.2KB of research data using the google_search tool. It completed its task with a 200 OK
     response.
   * Judge (judge.log): Evaluated the findings and gave a "pass" status. The judge noted that the research was "comprehensive, accurate, and
     well-structured," making it an "excellent foundation for educational content."
   * Content Builder (content_builder.log): Successfully transformed the validated research into a structured Markdown course module and finished
     execution.
   * Backend &amp;amp; Frontend (backend.log, frontend.log): The FastAPI backend (port 8000) and Vite frontend (port 5173) are running and communicating with
     the orchestrator as expected.

Notable Observation:
   * A minor warning appeared in researcher.log: Researcher callback: 'response' not found or has no parts. However, this did not impact the
     successful completion of the research or the overall pipeline.

All services are currently active and logged at the DEBUG level as specified in run_local.sh.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Deploying to Amazon Elastic Container Service (ECS)
&lt;/h4&gt;

&lt;p&gt;The project-level Makefile has targets for managing agent deployment to serverless endpoints. You can submit the build for an ECS Express deployment:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;make deploy
✦ The deployment was successful, and all 5 services are active on AWS ECS Express. You can access the Web App at the following URL:

  Web App (Course Creator): https://ad-65d6861112ff49099782001efe5e2721.ecs.us-east-1.on.aws

  The other microservices are also deployed and integrated:
   - Researcher: https://ad-8779b3dc720e4d9e9ca9b1091499084a.ecs.us-east-1.on.aws
   - Judge: https://ad-ab123be1fad04a9390e1d918f9b8ec04.ecs.us-east-1.on.aws
   - Content Builder: https://ad-622b8527fd1f41668624714f62deee0f.ecs.us-east-1.on.aws
   - Orchestrator: https://ad-c0f70b2d021744ec9761c2e54ca60287.ecs.us-east-1.on.aws
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once the containers are deployed, you can retrieve the endpoint:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;✦ The Web App (Course Creator) endpoint is:

  https://ad-65d6861112ff49099782001efe5e2721.ecs.us-east-1.on.aws
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The service will be visible in the AWS console:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5t022b841ry8jn2ibzw8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5t022b841ry8jn2ibzw8.png" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And the entire system can be tested:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt; make e2e-test-ecsexpress

✦ The end-to-end test of the AI Course Creator on AWS ECS Express was successful. The system, comprising 5 microservices, correctly researched
  "The History of the Internet," evaluated the findings, and generated a structured 4-module course. The public URL for the Web App is:

  https://ad-65d6861112ff49099782001efe5e2721.ecs.us-east-1.on.aws

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Running the Web Interface
&lt;/h4&gt;

&lt;p&gt;Start a connection to the deployed app via its load balancer URL:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;http://a27c61bc6fb3c425ca13d862e0fe4aed-865627292.us-east-1.elb.amazonaws.com
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then connect to the app:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8x3ewby1onl8r03adirm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8x3ewby1onl8r03adirm.png" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Then use the online course generator:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fduba7v2df0ovz2kxqfz7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fduba7v2df0ovz2kxqfz7.png" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Final Gemini CLI Code Review
&lt;/h4&gt;

&lt;p&gt;As a final step, Gemini CLI was used for a full code review of the project:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;✦ This multi-agent project is a well-engineered application of ADK 2.5 and the A2A protocol. The separation of specialized agents (Researcher,
  Judge, Content Builder) coordinated by a central Orchestrator demonstrates a mature microservice-oriented design.

Key Strengths
   * Coordinated Orchestration: The use of SequentialAgent and LoopAgent creates a robust, iterative research-judge cycle.
   * Resilient State Flow: The strategy of passing data through event content and "recovering" it via before_agent_callback heuristics is a clever
     way to handle state across distributed, independent session stores.
   * Polished Streaming: The web app's SSE implementation, specifically the greedy overlap deduplication (merge_strings) and system message
     cleanup, ensures a high-quality user experience despite the inherent noise in multi-agent LLM streams.
   * Cloud-Native Readiness: Using Identity Tokens for authenticated service-to-service communication and middleware for dynamic A2A URL rewriting
     makes the system ready for production deployment on Cloud Run.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Summary
&lt;/h4&gt;

&lt;p&gt;The Agent Development Kit (ADK) was used to build a multi-agent system with A2A support using the Gemini Flash LLM Model. This application was tested locally with Gemini CLI and then deployed to AWS ECS Express. Finally, Gemini CLI was used for a complete project code review.&lt;/p&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

</description>
      <category>googleadk</category>
      <category>a2aprotocol</category>
      <category>gemini</category>
      <category>aws</category>
    </item>
    <item>
      <title>AWS Data &amp; AI Stories #01: Multimodal AI</title>
      <dc:creator>Sedat SALMAN</dc:creator>
      <pubDate>Thu, 16 Apr 2026 18:49:58 +0000</pubDate>
      <link>https://dev.to/aws-builders/aws-data-ai-stories-01-multimodal-ai-2k4k</link>
      <guid>https://dev.to/aws-builders/aws-data-ai-stories-01-multimodal-ai-2k4k</guid>
      <description>&lt;p&gt;In traditional AI systems, text was usually the main input.&lt;/p&gt;

&lt;p&gt;But text alone is not enough to solve real-life problems.&lt;/p&gt;

&lt;p&gt;Today, many workloads include documents, images, audio, and video at the same time. A user may upload a PDF report, attach a photo, send a voice note, or provide a short video clip. If our solution only understands text, we miss a big part of the context.&lt;/p&gt;

&lt;p&gt;This is where multimodal AI becomes important.&lt;/p&gt;

&lt;p&gt;On AWS, multimodal AI is now becoming more practical. Amazon Bedrock Knowledge Bases supports multimodal content such as images, audio, and video, and AWS now provides different processing approaches depending on whether the goal is retrieval or structured extraction.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Multimodal AI?
&lt;/h2&gt;

&lt;p&gt;Multimodal AI means an AI system can work with more than one type of data.&lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;text&lt;/li&gt;
&lt;li&gt;images&lt;/li&gt;
&lt;li&gt;scanned documents&lt;/li&gt;
&lt;li&gt;audio&lt;/li&gt;
&lt;li&gt;video&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Instead of focusing on only one format, the system can process and combine multiple data types to produce better results.&lt;/p&gt;

&lt;p&gt;This is useful because enterprise data is rarely pure text. A lot of business value sits inside screenshots, scanned forms, call recordings, diagrams, inspection videos, and media-rich documents. AWS’s current multimodal stack is built around exactly that problem.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why does it matter?
&lt;/h2&gt;

&lt;p&gt;Because real systems are multimodal by nature.&lt;/p&gt;

&lt;p&gt;Think about a few examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A support team receives images and voice notes from the field&lt;/li&gt;
&lt;li&gt;A finance team works with reports, charts, and scanned documents&lt;/li&gt;
&lt;li&gt;A healthcare team uses forms, reports, and medical images&lt;/li&gt;
&lt;li&gt;An industrial operation stores inspection photos, maintenance PDFs, and recorded operator observations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In all of these cases, text-only AI is limited.&lt;/p&gt;

&lt;p&gt;A multimodal approach helps us move from isolated files to connected understanding.&lt;/p&gt;

&lt;h2&gt;
  
  
  Multimodal AI on AWS
&lt;/h2&gt;

&lt;p&gt;When I look at AWS from a practical point of view, I see multimodal AI as a workflow, not just a model.&lt;/p&gt;

&lt;p&gt;A simple logical flow looks like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Collect multimodal data&lt;/li&gt;
&lt;li&gt;Process and extract useful information&lt;/li&gt;
&lt;li&gt;Store or index the result&lt;/li&gt;
&lt;li&gt;Retrieve relevant context&lt;/li&gt;
&lt;li&gt;Generate answers, summaries, or actions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;AWS already has building blocks for this. Amazon Bedrock Data Automation is designed to extract structured insights from documents, images, audio, and video. Amazon Bedrock Knowledge Bases supports multimodal retrieval. AWS also supports Nova Multimodal Embeddings for visual and cross-modal retrieval scenarios.&lt;/p&gt;

&lt;h2&gt;
  
  
  Main AWS services to know
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Amazon Bedrock Data Automation
&lt;/h3&gt;

&lt;p&gt;This is useful when the main challenge is understanding raw input.&lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;extracting information from documents&lt;/li&gt;
&lt;li&gt;analyzing images&lt;/li&gt;
&lt;li&gt;processing audio&lt;/li&gt;
&lt;li&gt;turning video into structured output&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So if the input is messy and unstructured, this is a strong starting point. AWS positions Bedrock Data Automation specifically for automating insight generation from unstructured multimodal content.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Amazon Bedrock Knowledge Bases
&lt;/h3&gt;

&lt;p&gt;This is useful when the goal is retrieval.&lt;/p&gt;

&lt;p&gt;If you want your AI application to search your content and answer questions based on it, Knowledge Bases becomes important. AWS documentation now states that Bedrock Knowledge Bases supports images, audio, and video in addition to traditional text-based content.&lt;/p&gt;
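
&lt;p&gt;As a concrete sketch, a knowledge base query is a single API request. The helper below only builds the request parameters in the shape that boto3's &lt;code&gt;bedrock-agent-runtime&lt;/code&gt; &lt;code&gt;retrieve&lt;/code&gt; call expects; the knowledge base ID and query string are placeholders:&lt;/p&gt;

```python
# Sketch: building the request for a Bedrock Knowledge Bases query.
# The kwargs match the shape of boto3's bedrock-agent-runtime
# `retrieve` call; the knowledge base ID below is a placeholder.

def build_retrieve_request(kb_id: str, query: str, top_k: int = 5) -> dict:
    """Return the keyword arguments for a Knowledge Bases retrieval."""
    return {
        "knowledgeBaseId": kb_id,
        "retrievalQuery": {"text": query},
        "retrievalConfiguration": {
            "vectorSearchConfiguration": {"numberOfResults": top_k}
        },
    }

# Usage (requires boto3 and AWS credentials):
#   client = boto3.client("bedrock-agent-runtime")
#   response = client.retrieve(
#       **build_retrieve_request("KB123EXAMPLE", "refund policy for damaged items"))

request = build_retrieve_request("KB123EXAMPLE", "refund policy for damaged items")
print(request["retrievalQuery"]["text"])
```

&lt;p&gt;With multimodal content indexed, the same call can surface passages whose source was an image, an audio transcript, or a video segment, not just text chunks.&lt;/p&gt;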

&lt;h3&gt;
  
  
  3. Amazon Nova Multimodal Embeddings
&lt;/h3&gt;

&lt;p&gt;This is useful when the goal is similarity and cross-modal search.&lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;finding images similar to another image&lt;/li&gt;
&lt;li&gt;searching media with text&lt;/li&gt;
&lt;li&gt;creating a shared semantic space across different content types&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Amazon Nova Multimodal Embeddings supports text, documents, images, video, and audio in a single embedding space, which makes cross-modal retrieval possible.&lt;/p&gt;
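
&lt;p&gt;Because every modality lands in one vector space, cross-modal search reduces to nearest-neighbor ranking over embeddings. A minimal stdlib sketch with made-up three-dimensional vectors (real Nova embeddings are far higher-dimensional, and the file names here are invented):&lt;/p&gt;

```python
import math

# Sketch of cross-modal search over a shared embedding space.
# With a multimodal embedding model, text, images, audio, and video
# all map into one vector space; these tiny vectors are made up to
# show how retrieval then reduces to nearest-neighbor search.

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Pretend these came from embedding an image, a video clip, and a PDF page.
index = {
    "inspection_photo.jpg": [0.9, 0.1, 0.0],
    "training_clip.mp4":    [0.1, 0.9, 0.1],
    "report_page_3.pdf":    [0.2, 0.1, 0.9],
}

def search(query_vec, index, top_k=2):
    """Rank indexed items by similarity to the query embedding."""
    ranked = sorted(index.items(),
                    key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [name for name, _ in ranked[:top_k]]

# A text query such as "corroded pipe" would be embedded into the
# same space; here we use a hand-written query vector instead.
print(search([0.85, 0.15, 0.05], index))
```

&lt;p&gt;This is what "searching media with text" means in practice: the modality of the query and the modality of the stored content no longer have to match.&lt;/p&gt;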

&lt;h2&gt;
  
  
  A simple architecture view
&lt;/h2&gt;

&lt;p&gt;At a high level, a multimodal AI architecture on AWS can look like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Data comes from users, applications, devices, or storage&lt;/li&gt;
&lt;li&gt;Files are stored in Amazon S3&lt;/li&gt;
&lt;li&gt;Bedrock Data Automation extracts useful content and structure&lt;/li&gt;
&lt;li&gt;Bedrock Knowledge Bases indexes or connects relevant knowledge&lt;/li&gt;
&lt;li&gt;Nova Multimodal Embeddings can support semantic and visual retrieval&lt;/li&gt;
&lt;li&gt;A Bedrock-based application or assistant generates the final output&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This approach is also reflected in AWS guidance and recent AWS machine learning posts around multimodal retrieval and multimodal assistants.&lt;/p&gt;

&lt;h2&gt;
  
  
  Retrieval or extraction?
&lt;/h2&gt;

&lt;p&gt;This is one of the first design questions I would ask.&lt;/p&gt;

&lt;p&gt;Do I want to extract information from the content?&lt;/p&gt;

&lt;p&gt;Or do I want to retrieve relevant content across multiple modalities?&lt;/p&gt;

&lt;p&gt;These are not exactly the same problem.&lt;/p&gt;

&lt;p&gt;If the main need is converting raw media into structured output, Bedrock Data Automation is usually the right starting point.&lt;/p&gt;

&lt;p&gt;If the main need is visual similarity or cross-modal search, Nova Multimodal Embeddings is often the better fit.&lt;/p&gt;

&lt;p&gt;AWS explicitly separates these two approaches in its multimodal guidance, which is useful because many teams try to solve all multimodal problems in the same way.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where can this be used?
&lt;/h2&gt;

&lt;p&gt;There are many practical scenarios.&lt;/p&gt;

&lt;p&gt;A few examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;intelligent document processing&lt;/li&gt;
&lt;li&gt;visual search&lt;/li&gt;
&lt;li&gt;media search&lt;/li&gt;
&lt;li&gt;support case analysis&lt;/li&gt;
&lt;li&gt;industrial inspection workflows&lt;/li&gt;
&lt;li&gt;knowledge assistants&lt;/li&gt;
&lt;li&gt;report summarization from mixed content&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For me, the important point is this: multimodal AI is not only about chat. It is also about turning different content types into usable business knowledge.&lt;/p&gt;

&lt;h2&gt;
  
  
  Common mistake
&lt;/h2&gt;

&lt;p&gt;A common mistake is to think multimodal AI means only “upload file and ask question.”&lt;/p&gt;

&lt;p&gt;That is too simple.&lt;/p&gt;

&lt;p&gt;A real solution usually needs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;ingestion&lt;/li&gt;
&lt;li&gt;extraction&lt;/li&gt;
&lt;li&gt;indexing&lt;/li&gt;
&lt;li&gt;retrieval&lt;/li&gt;
&lt;li&gt;generation&lt;/li&gt;
&lt;li&gt;governance&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The model is only one part of the story.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final thoughts
&lt;/h2&gt;

&lt;p&gt;Multimodal AI is becoming a practical architecture topic on AWS.&lt;/p&gt;

&lt;p&gt;Instead of treating text, images, audio, and video as separate worlds, we can now build workflows that connect them. AWS already provides managed building blocks for processing, retrieval, and generation across these content types, which makes multimodal design much more realistic than before.&lt;/p&gt;

&lt;p&gt;For me, the first step is not choosing the model.&lt;/p&gt;

&lt;p&gt;The first step is asking:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;What type of data do I have?&lt;/li&gt;
&lt;li&gt;What insight do I need?&lt;/li&gt;
&lt;li&gt;Do I need extraction, retrieval, or both?&lt;/li&gt;
&lt;li&gt;Do I want an assistant, a search engine, or a workflow?&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If these answers are clear, the architecture becomes much easier.&lt;/p&gt;

&lt;p&gt;In the next article, I will focus on Amazon Bedrock Data Automation and how it fits into a real multimodal workflow on AWS.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>ai</category>
      <category>datascience</category>
      <category>awsbigdata</category>
    </item>
    <item>
      <title>Monitorando AWS com Datadog: o que aprendi partindo do zero</title>
      <dc:creator>Diogo Maske</dc:creator>
      <pubDate>Thu, 16 Apr 2026 14:07:47 +0000</pubDate>
      <link>https://dev.to/aws-builders/monitorando-aws-com-datadog-o-que-aprendi-partindo-do-zero-3n24</link>
      <guid>https://dev.to/aws-builders/monitorando-aws-com-datadog-o-que-aprendi-partindo-do-zero-3n24</guid>
      <description>&lt;p&gt;🇧🇷&lt;em&gt;O artigo está em PT-BR, fique à vontade para traduzir.&lt;/em&gt;&lt;br&gt;
🇺🇲&lt;em&gt;The article is in PT-BR, feel free to translate it.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;☁️☁️☁️&lt;/p&gt;

&lt;p&gt;I just finished the &lt;strong&gt;Introduction to Monitoring AWS with Datadog&lt;/strong&gt; lab and want to share what I learned, because it was far more interesting than I expected. And also because I had some questions that I think many beginners would have too.&lt;/p&gt;

&lt;p&gt;Before starting, I'll be honest: I knew what AWS solves and delivers, and I had a vague notion of what Datadog was &lt;em&gt;("that expensive monitoring tool")&lt;/em&gt;. But I didn't understand how the two connect in practice, or why that would matter in the day-to-day of a DevOps team.&lt;/p&gt;

&lt;p&gt;The course uses a fictional app called &lt;strong&gt;TechStories&lt;/strong&gt;, &lt;em&gt;a news and social media platform hosted entirely on AWS&lt;/em&gt;, as the basis for the exercises. That helped a lot, because it gave a real context to everything.&lt;/p&gt;




&lt;h2&gt;
  
  
  The problem the course &lt;strong&gt;solves&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Imagine your team maintains an application running on EC2, with containers on ECS Fargate, a database on RDS, Lambda functions, and DynamoDB tables. How do you know everything is working well? You keep switching between the AWS console, CloudWatch, and scattered logs; it is chaotic.&lt;/p&gt;

&lt;p&gt;Datadog comes in as a &lt;em&gt;single observability layer&lt;/em&gt;: you collect metrics, logs, and traces from all of it in one place. The lab simulated exactly that scenario.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsf8xcmafb95yv64826pi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsf8xcmafb95yv64826pi.png" alt="Architecture" width="800" height="471"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  How the AWS &lt;strong&gt;integration&lt;/strong&gt; works under the hood
&lt;/h2&gt;

&lt;p&gt;The first thing the course explains is that there are basically two ways to collect AWS data in Datadog:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Via CloudWatch polling&lt;/strong&gt;: Datadog queries the AWS API from time to time to fetch the metrics. It is the simplest to configure, but it has a delay of a few minutes.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Via Metric Streams&lt;/strong&gt;: you configure AWS to push metrics to Datadog in near real time. Much lower latency, ideal for critical alerts.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In practice, to get started you install the AWS integration in Datadog, create an IAM role in your account with the correct permissions, and point Datadog at that role. After that, it starts discovering your resources automatically.&lt;/p&gt;




&lt;h2&gt;
  
  
  The &lt;strong&gt;Datadog Forwarder&lt;/strong&gt; and the log story
&lt;/h2&gt;

&lt;p&gt;One of the parts that most made me &lt;em&gt;&lt;strong&gt;stop and reread&lt;/strong&gt;&lt;/em&gt; was the log flow. On AWS, the logs of managed services (Lambda, RDS, etc.) go to CloudWatch Logs. But Datadog does not read CloudWatch Logs directly; &lt;em&gt;you need an intermediary&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;That intermediary is the &lt;strong&gt;Datadog Forwarder&lt;/strong&gt;: a Lambda function that you install via CloudFormation and that "listens" to the CloudWatch Log Groups. When a new log arrives, it forwards it to Datadog. It works nicely once it is in place, but it requires configuring subscriptions for each log group you want to monitor.&lt;/p&gt;

&lt;p&gt;In the lab, after configuring this, I was able to see the logs of the &lt;code&gt;keyword-insights-processor&lt;/code&gt; Lambda directly in the &lt;strong&gt;Datadog Log Explorer&lt;/strong&gt;, with all the structured fields and automatic tags for environment, service, function ARN, and so on. Very different from digging through CloudWatch.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Framnc382tpilx0705ndi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Framnc382tpilx0705ndi.png" alt="keyword-insights-processor" width="800" height="471"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;Datadog Agent:&lt;/strong&gt; when managed services are not enough
&lt;/h2&gt;

&lt;p&gt;For EC2 and containers, Datadog has its own agent, a process that runs inside the machine/container and collects much more granular metrics than CloudWatch offers: per-process memory, disk latency, custom metrics, traces...&lt;/p&gt;

&lt;p&gt;In the case of ECS Fargate, you run the Agent as a sidecar container in the same task definition as your service. When I first saw this it seemed very laborious, but in the lab they show it can be done via CloudFormation, and the tags stay consistent thanks to a &lt;code&gt;key:value&lt;/code&gt; pattern defined on the resources:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;App:TechStories / Env:monitoring-aws-lab / Service:&amp;lt;nome-do-serviço&amp;gt;&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4jqhs1ufod41olx3hj0m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4jqhs1ufod41olx3hj0m.png" alt="ECS Fargate Services - Tags" width="800" height="471"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;This is fundamental.&lt;/strong&gt; Without consistent tags, you cannot filter anything in Datadog afterwards.&lt;/p&gt;
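
&lt;p&gt;For illustration, the Agent sidecar entry in a Fargate task definition looks roughly like this. This is a sketch: the service name, account ID, secret ARN, and tag values are placeholders, while &lt;code&gt;DD_TAGS&lt;/code&gt; and &lt;code&gt;ECS_FARGATE&lt;/code&gt; are the Datadog Agent's actual configuration variables:&lt;/p&gt;

```json
{
  "name": "datadog-agent",
  "image": "public.ecr.aws/datadog/agent:latest",
  "environment": [
    { "name": "DD_TAGS", "value": "App:TechStories Env:monitoring-aws-lab Service:techstories-web" },
    { "name": "ECS_FARGATE", "value": "true" }
  ],
  "secrets": [
    { "name": "DD_API_KEY", "valueFrom": "arn:aws:secretsmanager:us-east-1:123456789012:secret:datadog-api-key" }
  ]
}
```

&lt;p&gt;Setting the tags once in the task definition is what keeps them consistent across every metric, log, and trace the sidecar ships.&lt;/p&gt;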




&lt;h2&gt;
  
  
  What you can see once everything is integrated
&lt;/h2&gt;

&lt;p&gt;The Resource Catalog showed everything discovered in the account: 343 databases, 70 containers, 22 serverless functions, 1 EC2 host... all categorized, with the associated region, environment, and service. You can click on any resource and see its metrics directly.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffbf57w8t1z61b8pca5lz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffbf57w8t1z61b8pca5lz.png" alt="Resource Catalog" width="800" height="471"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the Metrics section, I searched for &lt;code&gt;aws.dynamodb&lt;/code&gt; and 22 different metrics appeared: read/write capacity, item counts, table sizes. Things I would previously have had to build dashboards for manually in CloudWatch.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy5jbp6kvtxrr6mlgtvtl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy5jbp6kvtxrr6mlgtvtl.png" alt="Metrics" width="800" height="471"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;AWS Overview Dashboard&lt;/strong&gt;, which ships out of the box (OOTB) in Datadog, already gave an overall picture: 1 EC2 instance running, 4 status monitors all OK, 491 Lambda invocations in the last hour, a 0% error rate, and an average duration of 988 ms. You can also see traces of HTTP requests directly on the EC2 host: each request with its host, service, resource, and latency.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3oaqp4r9hqeq4pxre7pw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3oaqp4r9hqeq4pxre7pw.png" alt="AWS Overview Dashboard" width="800" height="471"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The lesson that stuck 🧠
&lt;/h2&gt;

&lt;p&gt;More than the tools themselves, the lab reinforced one thing: observability is not about piling up data. It is about having the right data, with enough context &lt;strong&gt;(tags!)&lt;/strong&gt; to answer questions when something goes wrong.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;AWS + Datadog&lt;/strong&gt; integration, when well configured, gives you exactly that: &lt;em&gt;a single place for metrics, logs, and traces, with correlation between them.&lt;/em&gt; You see response time go up, click through to the trace, land on the Lambda log, and understand what happened.&lt;/p&gt;

&lt;p&gt;I still have a lot to learn (APM, alerting, SLOs...), but this was a great foundation. I recommend the &lt;em&gt;Datadog Learning Center&lt;/em&gt; to anyone getting started: the labs are hands-on and the environment comes pre-configured, so you focus on learning instead of fighting IAM.&lt;/p&gt;

&lt;p&gt;If you want to talk about any of this or follow my journey, find me on &lt;a href="https://www.linkedin.com/in/diogomaske/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; ✌️&lt;/p&gt;

</description>
      <category>aws</category>
      <category>datadog</category>
      <category>productivity</category>
      <category>cloud</category>
    </item>
    <item>
      <title>The Setup Is the Strategy: How I Orchestrated a Product Migration with Claude Code</title>
      <dc:creator>Karthik Subramanian</dc:creator>
      <pubDate>Thu, 16 Apr 2026 13:49:55 +0000</pubDate>
      <link>https://dev.to/aws-builders/the-setup-is-the-strategy-how-i-orchestrated-a-product-migration-with-claude-code-b92</link>
      <guid>https://dev.to/aws-builders/the-setup-is-the-strategy-how-i-orchestrated-a-product-migration-with-claude-code-b92</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Most engineers using Claude Code are getting a fraction of its value. Not because the tool isn't capable — but because they're using it out of the box, unconfigured, the way you'd use a new IDE without installing extensions or setting up your build system. The default experience is decent. The configured experience is transformative.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I'm a Senior Software Engineering Manager leading a team focused on leveraging AI to find acceleration opportunities across the software development lifecycle. I've been playing around with AI tooling for a while and could see its potential, so I proposed a proof of concept: take a real product migration — the kind of project that would normally require a team of engineers across multiple sprints — and attempt it solo, using Claude Code as my primary development platform. Not "AI-assisted development," but an AI-first model where every phase of the SDLC runs through Claude Code's feature set.&lt;/p&gt;

&lt;p&gt;The migration itself was substantial: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;4 legacy repositories consolidated into 2&lt;/li&gt;
&lt;li&gt;a database migration from MySQL to PostgreSQL&lt;/li&gt;
&lt;li&gt;framework upgrades across the full stack (Spring Boot 2 to 3, Java 17 to 21, React 17 to 18)&lt;/li&gt;
&lt;li&gt;an authentication model replacement&lt;/li&gt;
&lt;li&gt;complete test suites for everything&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What made this work wasn't Claude Code itself. It was how I set it up.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The skills that made me effective at leading engineering teams — providing clear context, delegating with specificity, reviewing rigorously, building repeatable processes — turned out to be exactly the skills that make Claude Code most effective. Engineers whose day-to-day revolves around writing code by hand sometimes struggle with this shift because the instinct is to do the work yourself, not to set up the system and direct it. I'd spent years not writing the code myself. I was already an orchestrator. The medium changed, the model didn't.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This post walks through how I configured Claude Code for a real migration, organized as building blocks. Each layer depends on the one below it. Skip the foundation and the rest falls apart.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc6s9pycdaj67n89elccc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc6s9pycdaj67n89elccc.png" alt="Building blocks of an AI-first migration" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Building Block 1: Planning — Laying the Foundation
&lt;/h2&gt;

&lt;p&gt;Before any code was written, I built the context and planning infrastructure. This is the layer most people skip, and it's the layer that matters most.&lt;/p&gt;

&lt;h3&gt;
  
  
  CLAUDE.md — The Master Context
&lt;/h3&gt;

&lt;p&gt;Claude Code reads CLAUDE.md files on every session start. They're your project memory — the equivalent of onboarding documentation for a new team member. I built a multi-level hierarchy:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Workspace-level&lt;/strong&gt; (181 lines): overview of all 25+ repositories, tech stack summary, cross-service architecture, shared build commands&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Domain-level&lt;/strong&gt;: 18-service catalog, glossary of domain terms, JWT authentication architecture, service communication patterns&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Per-repo&lt;/strong&gt;: service-specific conventions, module structure, testing standards, local dev setup&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each level inherits from its parent. When Claude Code opens a session in any repo, it automatically has the full context stack — from the broadest architectural view down to the specific service conventions.&lt;/p&gt;

&lt;p&gt;This is the single highest-leverage thing you can configure. A well-written CLAUDE.md prevents Claude from re-exploring your codebase every session, asking questions you've already answered, or making assumptions that contradict your architecture. It's free, it's immediate, and it compounds over time.&lt;/p&gt;
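&lt;p&gt;As a concrete illustration, a minimal per-repo CLAUDE.md might look like the sketch below. The service name, commands, and conventions here are hypothetical placeholders, not taken from the project in this post:&lt;/p&gt;

```markdown
# billing-service

## Stack
Spring Boot 3, Java 21, PostgreSQL 15. Frontend lives in billing-ui (React 18).

## Build and test
- `./mvnw verify`: full build with tests, run it before every commit
- `./mvnw spring-boot:run`: local dev server on port 8080

## Conventions
- Database changes go through Flyway migrations; never edit an applied migration
- Integration tests use Testcontainers; name them *IT, unit tests *Test
- Controllers return DTOs, never JPA entities
```

&lt;p&gt;Even a file this small saves Claude a round of exploration and questions at the start of every session.&lt;/p&gt;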

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fev8c962aj9abj5k571cv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fev8c962aj9abj5k571cv.png" alt="CLAUDE.md hierarchy — workspace, domain, and repo levels" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The Memory System
&lt;/h3&gt;

&lt;p&gt;Claude Code has a persistent memory system — files that survive across sessions. I built 17 memory files organized by type:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;User&lt;/strong&gt;: who I am, my role, what I'm working on&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Feedback&lt;/strong&gt;: corrections that become permanent rules&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Project&lt;/strong&gt;: active work context, ticket maps, architectural decisions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reference&lt;/strong&gt;: pointers to external systems (Confluence pages, Jira boards, SonarQube dashboards)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The feedback memories are the most powerful. Every time I corrected Claude — "don't amend commits, it breaks CI and MR reviews" or "always fix all test failures before committing, even seemingly pre-existing ones" or "copy application-local.yml to worktrees because it's gitignored" — that correction became a permanent rule. One-time mistakes became permanent automation. After a few weeks, the memory system had captured dozens of workflow-specific rules that would take a new team member months to internalize.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Context Funnel
&lt;/h3&gt;

&lt;p&gt;For the planning phase itself, I fed Claude Code everything it would need to design the migration:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The four legacy repositories (full source access)&lt;/li&gt;
&lt;li&gt;Dev database connections (read-only) to both MySQL and PostgreSQL&lt;/li&gt;
&lt;li&gt;A reference implementation — a similar product that had already been migrated to the platform&lt;/li&gt;
&lt;li&gt;Live Swagger documentation from upstream services (router, rostering, authentication APIs)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With this context loaded, I used the &lt;strong&gt;brainstorming skill&lt;/strong&gt; (from the superpowers plugin) to generate the migration design across all eight repositories simultaneously. The skill enforces a structured process: explore context, ask clarifying questions, propose approaches with trade-offs, present the design for approval, then write a spec document.&lt;/p&gt;

&lt;p&gt;I also used &lt;strong&gt;agent teams&lt;/strong&gt; — an experimental Claude Code feature that runs parallel reviewers with independent context windows — to stress-test the design. Three independent agents reviewed the same architecture and caught issues a single pass missed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Missing resume logic for interrupted user flows (a legacy endpoint had been removed without accounting for in-progress sessions)&lt;/li&gt;
&lt;li&gt;Frontend state invalidation gaps in the data fetching layer&lt;/li&gt;
&lt;li&gt;Unnecessary network hops that could be eliminated now that previously separate services lived in the same JVM&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;Giving AI direct database access allowed exact column-by-column mapping between the legacy and target schemas. It caught DDL mismatches — timestamp type differences, nullable column discrepancies, default value conflicts — that ORM annotations hide. Without this, the migration would have hit runtime errors that are painful to debug after the fact.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp3q877d0dn02arswueu4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp3q877d0dn02arswueu4.png" alt="The context funnel — from raw inputs to structured outputs" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Atlassian MCP + Custom Jira Skill — From Design to Tickets
&lt;/h3&gt;

&lt;p&gt;This is where the &lt;strong&gt;Atlassian MCP&lt;/strong&gt; enters the story. It connects Claude Code directly to Jira and Confluence — no browser, no context switching.&lt;/p&gt;

&lt;p&gt;First, the design became documentation: 20+ Confluence pages generated directly from Claude Code via MCP. Design documents, use case specifications, system architecture diagrams — all created and published without leaving the terminal. That said, this is where I hit my first major failure. The Atlassian MCP's &lt;code&gt;updateConfluencePage&lt;/code&gt; tool silently truncates content beyond ~5KB. I asked Claude to update two design documents — 37KB and 46KB — and both were overwritten with partial content. I had to manually restore them from Confluence's page history. The data loss was real. I immediately encoded a memory rule: never update large Confluence pages via MCP, only add comments. Lesson learned the hard way.&lt;/p&gt;

&lt;p&gt;Then came the decomposition. This was one of the most powerful things I did: I tasked Claude with breaking the architecture into Jira tickets scoped for three constraints:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Reviewable code reviews&lt;/strong&gt;: no 2,000-line merge requests that reviewers rubber-stamp. Each ticket's scope had to produce a merge request a human could meaningfully assess.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;QA throughput&lt;/strong&gt;: QA can't test a monolithic "migrate everything" ticket. Each ticket needed to be independently testable with clear acceptance criteria.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parallel development&lt;/strong&gt;: tickets needed clean boundaries so multiple could be in-flight simultaneously without merge conflicts.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The result: &lt;strong&gt;19 Jira tickets created in a single session&lt;/strong&gt; from the design docs. Each with acceptance criteria in Atlassian Document Format, story points, sprint assignment, and a parent epic link. But it couldn't link them — the MCP tool for creating issue links between tickets throws a "not found" error. I had to go into Jira manually and add the "is blocked by" relationships myself. Not everything is automatable yet.&lt;/p&gt;

&lt;p&gt;The Jira API has other quirks that would bite you every session without the right setup. So I built the &lt;strong&gt;&lt;code&gt;my-jira&lt;/code&gt; skill&lt;/strong&gt; — a custom skill file that encodes all the workarounds:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;createJiraIssue&lt;/code&gt; renders newlines as literal &lt;code&gt;\n&lt;/code&gt; text. The skill enforces a follow-up &lt;code&gt;editJiraIssue&lt;/code&gt; call to fix formatting.&lt;/li&gt;
&lt;li&gt;Story points live in &lt;code&gt;customfield_10058&lt;/code&gt;, not the obvious-looking field. The wrong field silently saves to the wrong place — you'd never know until someone checks the sprint board.&lt;/li&gt;
&lt;li&gt;QA testing note templates, project constants, sprint IDs, assignee account IDs — all encoded in one place.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;One skill file eliminated an entire class of silent failures.&lt;/p&gt;
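&lt;p&gt;For readers who haven't written one: a skill is essentially a markdown file with frontmatter that gets loaded on demand. A stripped-down sketch of what a skill like this could look like follows; the structure is typical, but the frontmatter fields and template wording are illustrative rather than copied from the author's file:&lt;/p&gt;

```markdown
---
name: my-jira
description: Create and update Jira tickets with our project's API quirks applied
---

## Rules
- After every createJiraIssue call, follow up with editJiraIssue to re-apply
  description formatting (the create call renders newlines as literal \n text).
- Write story points to customfield_10058, never to the default-looking field;
  the wrong field saves silently and only shows up later on the sprint board.

## Constants
- QA testing-note template: environment, credentials, test steps, caveats
- Keep sprint IDs and assignee account IDs here so they never need re-lookup
```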




&lt;h2&gt;
  
  
  Building Block 2: Execution — The Development Loop
&lt;/h2&gt;

&lt;p&gt;With the plan in place and tickets created, execution begins. Each ticket follows the same cycle: design doc, plan doc, execute, review, iterate, merge, move to QA. The tools enter the story as the workflow demands them.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6zm5g17lasdpj2blujq5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6zm5g17lasdpj2blujq5.png" alt="The per-ticket execution loop" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Per-Ticket Planning
&lt;/h3&gt;

&lt;p&gt;Every ticket — no matter how small — gets two documents before any code is written:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Design doc&lt;/strong&gt; (what + why): the problem being solved, the approach, the constraints&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Plan doc&lt;/strong&gt; (how + steps): every file to change, every migration rule, every commit interval&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I review the plan before execution starts. This is the gate. The AI generates, I validate. I know the domain, the constraints, the edge cases that don't show up in code. This is where the orchestrator model is most visible: I'm not writing plans by hand, but I'm reading every one and catching the things that only domain knowledge reveals.&lt;/p&gt;

&lt;h3&gt;
  
  
  Worktrees — Parallel Execution
&lt;/h3&gt;

&lt;p&gt;Each ticket executes in a dedicated git worktree. Claude Code's &lt;code&gt;/execute-plan&lt;/code&gt; skill runs the plan step-by-step in an isolated working directory.&lt;/p&gt;

&lt;p&gt;At peak, I had &lt;strong&gt;8 active worktrees across 2 repositories&lt;/strong&gt; — 4 tickets developed concurrently. The tickets depended on each other, but the worktree model let me develop them in parallel and rebase with &lt;code&gt;--onto&lt;/code&gt; as dependencies merged upstream. All four hit QA in the same sprint.&lt;/p&gt;
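&lt;p&gt;To make the worktree mechanics concrete, here is a self-contained sketch using a throwaway local repo and hypothetical ticket names. The real flow targets a remote and rebases onto origin/main, but the rebase --onto step works the same way:&lt;/p&gt;

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q main-repo
cd main-repo
git config user.email dev@example.com
git config user.name dev
echo base > app.txt
git add .
git commit -qm "base"
git branch -M main

# One worktree per ticket; ticket-102 is stacked on top of ticket-101.
git worktree add ../wt-101 -b ticket-101
(cd ../wt-101; echo one >> app.txt; git commit -qam "TICKET-101: change")
git worktree add ../wt-102 -b ticket-102 ticket-101
(cd ../wt-102; echo two > extra.txt; git add .; git commit -qm "TICKET-102: change")

# ticket-101 merges upstream (a fast-forward here, for simplicity)...
git merge -q --ff-only ticket-101

# ...then transplant only ticket-102's own commits onto the new main.
(cd ../wt-102; git rebase -q --onto main ticket-101 ticket-102)
git log --format=%s ticket-102
```

&lt;p&gt;The key detail is the three-argument form of rebase: only the commits between ticket-101 and ticket-102 are replayed, so the already-merged work is never duplicated.&lt;/p&gt;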

&lt;p&gt;The practical limit: about 4 active Claude Code sessions at a time, depending on how many contexts you can keep in your head. You're reviewing output from multiple streams, making judgment calls, and keeping the overall architecture coherent. It's project management, not coding.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4bfrwix1mw81qve1t0z0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4bfrwix1mw81qve1t0z0.png" alt="8 worktrees, 2 repos, 4 tickets shipping in parallel" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  GitLab CLI + The &lt;code&gt;manage-mr&lt;/code&gt; Skill
&lt;/h3&gt;

&lt;p&gt;Merge requests aren't just &lt;code&gt;git push&lt;/code&gt;. There's the strategy description, the pipeline to monitor, quality gates to check, and the fix-push-recheck cycle when something fails.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;&lt;code&gt;manage-mr&lt;/code&gt; skill&lt;/strong&gt; wraps the full lifecycle:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create the MR with a description derived from the plan doc&lt;/li&gt;
&lt;li&gt;Monitor the CI pipeline with &lt;code&gt;/loop&lt;/code&gt; on a recurring interval&lt;/li&gt;
&lt;li&gt;Check SonarQube quality gates&lt;/li&gt;
&lt;li&gt;If anything fails: read the failure, trace to source, fix, re-push&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The &lt;strong&gt;&lt;code&gt;/loop&lt;/code&gt; skill&lt;/strong&gt; deserves its own mention. It runs a command on a configurable interval — I used it to poll CI pipelines. Pipeline fails? Claude reads the build log, traces the error to the source file, applies a fix, pushes, and the loop continues. No browser, no manual checking.&lt;/p&gt;

&lt;p&gt;One recurring failure pattern worth mentioning: AI would sometimes aggressively remove "unused" state variables without checking the callbacks that referenced them, breaking CI. It also missed secondary integration tests that asserted on the removed behavior. The fix was straightforward each time, but the pattern recurred enough that I added a memory rule: "verify all references before removing anything." The pipeline loop caught these quickly, but they shouldn't have happened in the first place.&lt;/p&gt;

&lt;h3&gt;
  
  
  SonarQube MCP
&lt;/h3&gt;

&lt;p&gt;Connected via MCP server, Claude can query pull request issues, check quality gates, and fix vulnerabilities directly from the terminal. The migration shipped with &lt;strong&gt;95%+ API coverage, 91% frontend coverage, zero bugs, zero vulnerabilities, and zero security hotspots&lt;/strong&gt; at the time of QA handoff.&lt;/p&gt;

&lt;h3&gt;
  
  
  Chrome DevTools MCP
&lt;/h3&gt;

&lt;p&gt;For frontend work, Claude needs to see the actual rendered application — not just the code. The Chrome DevTools MCP connects Claude to a live browser session. I log in, navigate to the page, and Claude inspects the live DOM, console errors, and network requests.&lt;/p&gt;

&lt;p&gt;This is a game-changer for UI work. It finds CSS/layout bugs, missing state updates, and rendering issues that code-level analysis and screenshots could never surface. Claude can see what the user sees.&lt;/p&gt;

&lt;h3&gt;
  
  
  Figma MCP
&lt;/h3&gt;

&lt;p&gt;During frontend porting, Claude references Figma designs directly via MCP. No screenshotting, no describing layouts in words. It reads the design context — component structure, spacing, colors, typography — and translates to code. This kept the ported UI faithful to the design without the constant back-and-forth of "does this match the mockup?"&lt;/p&gt;

&lt;h3&gt;
  
  
  Postman MCP
&lt;/h3&gt;

&lt;p&gt;The API collection stays in sync with endpoint changes. Test scripts auto-chain with dynamic JWT extraction. This matters because QA depends on Postman to validate the API — if the collection is stale, they're blocked. The MCP integration ensures the collection reflects the latest API state at all times.&lt;/p&gt;

&lt;h3&gt;
  
  
  The &lt;code&gt;move-to-qa&lt;/code&gt; Skill
&lt;/h3&gt;

&lt;p&gt;When a ticket is ready for QA, there's a ritual: add structured testing notes (environment, credentials, test steps, caveats), transition the ticket to the QA column, notify the QA channel. Getting any step wrong means the QA engineer wastes time asking clarifying questions.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;&lt;code&gt;move-to-qa&lt;/code&gt; skill&lt;/strong&gt; encodes the entire handoff as a single invocation. One command handles the comment (in the exact template QA expects), the Jira transition, and the notification. Consistent handoff, every time, no steps skipped.&lt;/p&gt;

&lt;h3&gt;
  
  
  Keeping Everything in Sync
&lt;/h3&gt;

&lt;p&gt;Here's where the orchestration model really pays off. When implementation changes a design decision or uncovers a requirement gap, Claude updates Confluence and Jira in the same session — with the caveat that large page edits go through comments, not full page updates (see the Confluence truncation lesson above). The old friction — finish code, open browser, update Jira, update Confluence — is the kind of manual chore that developers frequently skip. Now it's one prompt: "Update the design doc to reflect that we're using a materialized view instead of a join, and comment the change on the Jira ticket."&lt;/p&gt;

&lt;p&gt;Documentation stays in sync because it's part of the workflow, not an afterthought.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9uwf130ruubiryvq07co.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9uwf130ruubiryvq07co.png" alt="The MCP ecosystem — every system connected to one terminal" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Building Block 3: Review &amp;amp; Iteration
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Code Review Plugins
&lt;/h3&gt;

&lt;p&gt;The &lt;strong&gt;code-review&lt;/strong&gt; and &lt;strong&gt;pr-review-toolkit&lt;/strong&gt; plugins run multi-agent PR reviews with confidence-based scoring. They're effective as a first-pass filter:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Caught syntax and formatting bugs&lt;/li&gt;
&lt;li&gt;Off-by-one errors in date filters&lt;/li&gt;
&lt;li&gt;Missing transactional annotations&lt;/li&gt;
&lt;li&gt;Raw data leakage in error responses&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;AI caught roughly &lt;strong&gt;30-40% of issues&lt;/strong&gt; — the low-to-mid-level stuff that humans miss under time pressure.&lt;/p&gt;

&lt;p&gt;But the high-level stuff still needed a human reviewer: permission architecture that needed refactoring, structural design decisions for domain enums, Java stream filtering optimizations, missing API documentation annotations. These aren't bugs — they're design-level improvements that require understanding the system's intent, not just its syntax. AI review is a floor-raiser, not a ceiling-raiser. It catches what slips through, but it doesn't replace architectural judgment.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0y4d2erkashnrnigrans.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0y4d2erkashnrnigrans.png" alt="AI vs Human review — complementary, not competing" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Responding to Review Comments
&lt;/h3&gt;

&lt;p&gt;When human reviewers leave comments on merge requests, Claude reads and responds via the GitLab integration. Implement the requested changes, push, and the review loop continues — all from the terminal. The reviewer doesn't know or care that the fixes were AI-assisted. They just see responsive, well-reasoned changes.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Cost
&lt;/h2&gt;

&lt;p&gt;Honest accounting matters. Over the course of the migration — roughly 15 work days from planning through QA handoff — I spent close to &lt;strong&gt;$5,000&lt;/strong&gt; in API token costs running Claude Code through AWS Bedrock.&lt;/p&gt;

&lt;p&gt;For a migration of this scope — four repos, database migration, full-stack framework upgrades, auth model replacement, ~50 tickets, ~580 tests — that's a fraction of what the engineering time alone would cost with a traditional team across multiple sprints.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lessons learned on cost management:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Clear your context frequently.&lt;/strong&gt; This is the single biggest cost lever. Claude Code caches your conversation context and re-reads it on every turn. Long marathon sessions accumulate enormous cache read and cache write charges that dwarf the actual input/output token costs. Use &lt;code&gt;/compact&lt;/code&gt; aggressively, and prefer multiple shorter focused sessions over all-day marathons.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use the right model for the job.&lt;/strong&gt; Opus for planning, architecture, and complex reasoning. Sonnet for routine execution, test generation, and boilerplate. Haiku for quick lookups. The cost difference between models is significant and most execution work doesn't need the most powerful model.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The cost is front-loaded.&lt;/strong&gt; The initial setup — CLAUDE.md files, skills, MCPs, memory rules, the design phase — was the most token-intensive period. Once configured, subsequent tickets were dramatically cheaper because the context was already built and the workflows were encoded.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Closing
&lt;/h2&gt;

&lt;p&gt;By the end of the POC, the migration had produced roughly 50 tickets, 580 passing tests, 95%+ API coverage, and zero bugs, vulnerabilities, or security hotspots at QA handoff. One engineer, half a month, one tool — configured deliberately for every phase of the work.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The throughput increase didn't come from AI writing better code. It came from iterating faster, verifying more thoroughly, and managing parallel execution streams. &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The building blocks made this possible:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff92aw1qf96b2uw0k640m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff92aw1qf96b2uw0k640m.png" alt="The building blocks — planning, execution, review" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Planning&lt;/strong&gt;: CLAUDE.md gave the AI context. Memory gave it institutional knowledge. The brainstorming skill gave it structure. The Atlassian MCP and custom Jira skill turned designs into documentation and actionable, well-scoped tickets.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Execution&lt;/strong&gt;: worktrees enabled parallel development. Skills encoded repeatable workflows. MCPs connected Claude Code to every system in the SDLC — version control, CI/CD, code quality, design tools, project management, documentation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Review&lt;/strong&gt;: plugins raised the floor on code review quality. Human reviewers caught the architectural and design-level issues that AI can't.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each layer depends on the one below it. Skip the foundation — the CLAUDE.md files, the memory, the skills — and the execution layer produces mediocre results. That's what most people experience. They skip straight to "write me some code" without investing in the setup, get underwhelming output, and conclude the tool isn't useful.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The setup is the strategy.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you're starting from zero, here's my recommendation: write your CLAUDE.md first. Just the basics — tech stack, project structure, build commands, conventions. Then add one MCP integration for the system you context-switch to most often (probably your issue tracker). Then build one custom skill for your most repeated workflow. Build up from there. Each layer makes the next one more effective.&lt;/p&gt;

</description>
      <category>claudecode</category>
      <category>ai</category>
      <category>softwareengineering</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Control Your Tesla from the Terminal with a Kiro CLI Skill</title>
      <dc:creator>Jérôme GUYON</dc:creator>
      <pubDate>Thu, 16 Apr 2026 08:49:16 +0000</pubDate>
      <link>https://dev.to/aws-builders/control-your-tesla-from-the-terminal-with-a-kiro-cli-skill-472g</link>
      <guid>https://dev.to/aws-builders/control-your-tesla-from-the-terminal-with-a-kiro-cli-skill-472g</guid>
      <description>&lt;p&gt;🔗 &lt;a href="https://github.com/guyon-it-consulting/myteslamate-skills-and-power" rel="noopener noreferrer"&gt;https://github.com/guyon-it-consulting/myteslamate-skills-and-power&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpba05j5jp5o47rim0kbg.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpba05j5jp5o47rim0kbg.gif" alt="Demo" width="708" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Last Tuesday, I was deep into a CDK refactor — the kind where I have 14 files open and I'm scared to blink. Then a thought hit me: did I turn on Sentry mode? My car was parked at the train station. I could grab my phone, open the Tesla app, wait for it to wake the car, scroll to Security, check the toggle… or I could just not break my flow.&lt;/p&gt;

&lt;p&gt;What if I could ask my coding assistant instead?&lt;/p&gt;

&lt;p&gt;Turns out, I can. I built a Kiro CLI skill that lets me control my Tesla straight from the terminal. &lt;strong&gt;It was also the perfect excuse to learn how to create a Kiro CLI Skill 😊&lt;/strong&gt; — and what better way to test a new feature than with something fun? Check the battery, lock the doors, toggle Sentry mode, pull up charging stats — all without leaving my editor. The secret ingredient? An MCP server that wraps the Tesla and TeslaMate APIs so that any AI assistant can call them.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is MyTeslaMate?
&lt;/h2&gt;

&lt;p&gt;Before we get to the skill itself, let me introduce the engine behind it.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://myteslamate.com" rel="noopener noreferrer"&gt;MyTeslaMate&lt;/a&gt; is the hosted version of &lt;a href="https://github.com/teslamate-org/teslamate" rel="noopener noreferrer"&gt;TeslaMate&lt;/a&gt;, the most popular open-source data logger for Tesla vehicles. If you own a Tesla and you haven't heard of TeslaMate, stop reading and go look at it. It continuously records every drive, charge session, sleep cycle, and software update into a PostgreSQL database, and exposes rich Grafana dashboards — battery degradation, charging curves, trip history, lifetime stats, vampire drain, efficiency trends, and more.&lt;/p&gt;

&lt;p&gt;MyTeslaMate takes all of that and hosts it for you. No Docker, no self-hosting, no database maintenance. It adds premium features like supercharger cost import, automations, fleet management, and — this is the part we care about — an MCP server.&lt;/p&gt;

&lt;p&gt;The MCP server at &lt;code&gt;https://mcp.myteslamate.com/mcp&lt;/code&gt; wraps two APIs into a single endpoint:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;API&lt;/th&gt;
&lt;th&gt;Tools&lt;/th&gt;
&lt;th&gt;What it does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Tesla Fleet API&lt;/td&gt;
&lt;td&gt;98&lt;/td&gt;
&lt;td&gt;Vehicle commands, energy control, charging, navigation, security&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TeslaMate API&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;Drive stats, charging analytics, efficiency data, trip history&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;That's &lt;strong&gt;100+ tools&lt;/strong&gt; accessible from any MCP-compatible AI assistant. Authentication is handled via OAuth SSO with Tesla's authorization server.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqwhc5uxby03ify43j94z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqwhc5uxby03ify43j94z.png" alt="MyTeslaMate dashboard" width="800" height="621"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What is a Kiro CLI Skill?
&lt;/h2&gt;

&lt;p&gt;A skill is how you teach Kiro CLI about a specific domain. It's a markdown file with YAML frontmatter that describes what the skill does and when to activate it. Think of it as a cheat sheet that Kiro loads on demand — it doesn't bloat your context until you actually need it.&lt;/p&gt;

&lt;p&gt;A skill has two parts:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The frontmatter&lt;/strong&gt; — metadata that tells Kiro when to load the skill:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;tesla-commands&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Control your Tesla vehicle and energy products via MyTeslaMate MCP server.&lt;/span&gt;
  &lt;span class="s"&gt;Use when the user asks about their car, vehicle status, lock/unlock, climate control, charging, Powerwall, solar production, energy optimization, Sentry mode, trip planning, drive statistics, or any Tesla-related query. Triggers on "tesla", "my car", "vehicle",&lt;/span&gt;
  &lt;span class="s"&gt;"charge", "battery", "climate", "powerwall", "solar", "sentry", "lock", "unlock",&lt;/span&gt;
  &lt;span class="s"&gt;"supercharger", "road trip", "energy", "charging history", "drive stats".&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The body&lt;/strong&gt; — capabilities, workflow instructions, and safety rules that guide the agent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Tesla Commands&lt;/span&gt;

The &lt;span class="sb"&gt;`tesla-mcp`&lt;/span&gt; server exposes &lt;span class="gs"&gt;**100+ tools**&lt;/span&gt;.

&lt;span class="gu"&gt;## Capabilities&lt;/span&gt;
&lt;span class="gu"&gt;### Vehicle Control (64 commands)&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Doors &amp;amp; Access: lock, unlock, open/close trunk, open frunk
&lt;span class="p"&gt;-&lt;/span&gt; Climate: start/stop HVAC, set temps, seat heaters, steering wheel heater
&lt;span class="p"&gt;-&lt;/span&gt; Charging: start/stop charge, set charge limit, schedule charging
...

&lt;span class="gu"&gt;## Workflow&lt;/span&gt;
&lt;span class="p"&gt;1.&lt;/span&gt; Use @tesla-mcp tools directly for all Tesla operations.
&lt;span class="p"&gt;2.&lt;/span&gt; Check current state before making changes.
&lt;span class="p"&gt;3.&lt;/span&gt; Wake the vehicle before sending action commands if the car is asleep.

&lt;span class="gu"&gt;## Safety&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Confirm with the user before executing security-sensitive commands.
&lt;span class="p"&gt;-&lt;/span&gt; Always show current state before making changes.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The skill lives in &lt;code&gt;~/.kiro/skills/tesla-commands/SKILL.md&lt;/code&gt;. But a skill alone isn't enough — you also need an &lt;strong&gt;agent&lt;/strong&gt; that knows how to use it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Tesla agent configuration
&lt;/h2&gt;

&lt;p&gt;I created a dedicated agent, a JSON file that ties everything together: the skill, the MCP server, and the tools.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tesla"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Tesla vehicle and energy control agent via MyTeslaMate MCP server"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"prompt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"You help the user monitor and control their Tesla vehicle and energy products. Use the tesla-mcp tools for all operations. Present data in a human-readable format. Confirm before executing security-sensitive commands."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"tools"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"read"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"shell"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"grep"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"glob"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@tesla-mcp"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"allowedTools"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"read"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@tesla-mcp"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"resources"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"skill://.kiro/skills/tesla-commands/SKILL.md"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"tesla-mcp"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://mcp.myteslamate.com/mcp"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"welcomeMessage"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Tesla control ready. What would you like to do with your car?"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A few things worth noting here:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;resources&lt;/code&gt; uses the &lt;code&gt;skill://&lt;/code&gt; URI scheme. This tells Kiro to load the skill's metadata at startup but defer loading the full content until it's actually needed. No wasted context. (See &lt;a href="https://kiro.dev/docs/cli/custom-agents/configuration-reference" rel="noopener noreferrer"&gt;Agent Configuration Reference&lt;/a&gt; in the Kiro docs: &lt;em&gt;"skill:// — Skills progressively loaded on demand"&lt;/em&gt;.)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;mcpServers&lt;/code&gt; points to the remote MyTeslaMate MCP server; in this case, no local MCP server is needed.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;allowedTools&lt;/code&gt; auto-approves &lt;code&gt;read&lt;/code&gt; and all &lt;code&gt;@tesla-mcp&lt;/code&gt; tools so you don't get prompted for every single API call.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  How to set it up
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Prerequisites:&lt;/strong&gt; You'll need a Tesla vehicle (obviously) and a &lt;a href="https://myteslamate.com" rel="noopener noreferrer"&gt;MyTeslaMate&lt;/a&gt; account. Sign up, link your Tesla account, and pick a subscription plan. This gives you access to the MCP server and the TeslaMate analytics dashboards.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1:&lt;/strong&gt; Clone the repo and copy the files.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/guyon-it-consulting/myteslamate-skills-and-power.git
&lt;span class="nb"&gt;cd &lt;/span&gt;myteslamate-skills-and-power

&lt;span class="c"&gt;# Copy the skill&lt;/span&gt;
&lt;span class="nb"&gt;cp&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; kiro/skills/tesla-commands ~/.kiro/skills/

&lt;span class="c"&gt;# Copy the agent&lt;/span&gt;
&lt;span class="nb"&gt;cp &lt;/span&gt;kiro/agents/tesla.json ~/.kiro/agents/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
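&lt;p&gt;As an alternative to cloning the repo, the skill file can also be scaffolded by hand. Here's a minimal sketch, using a trimmed version of the frontmatter shown earlier (the repo's copy has the full description and body; &lt;code&gt;SKILL_DIR&lt;/code&gt; is just a convenience variable):&lt;/p&gt;

```shell
# Scaffold a minimal SKILL.md by hand. This is a trimmed sketch;
# the full skill from the repo carries the richer description,
# trigger words, and workflow/safety sections.
SKILL_DIR="${SKILL_DIR:-$HOME/.kiro/skills/tesla-commands}"
mkdir -p "$SKILL_DIR"
printf '%s\n' \
  '---' \
  'name: tesla-commands' \
  'description: Control your Tesla via the MyTeslaMate MCP server.' \
  '---' \
  '# Tesla Commands' | tee "$SKILL_DIR/SKILL.md"
```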



&lt;p&gt;&lt;strong&gt;Step 2:&lt;/strong&gt; Start a new Kiro CLI session and switch to the Tesla agent.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;kiro-cli
&lt;span class="go"&gt;
/agent swap tesla

✔ Switched to agent: tesla
Tesla control ready. What would you like to do with your car?
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 3:&lt;/strong&gt; Authenticate with Tesla.&lt;/p&gt;

&lt;p&gt;Run &lt;code&gt;/mcp&lt;/code&gt;. The &lt;code&gt;tesla-mcp&lt;/code&gt; server will show &lt;code&gt;⚠ auth-required&lt;/code&gt;. Follow the instructions to copy the OAuth URL, open it in your browser, and sign in with your Tesla account. This is a one-time step — the token persists across sessions.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiobcml9paboucvq4yr6m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiobcml9paboucvq4yr6m.png" alt="OAuth flow" width="800" height="553"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Using it
&lt;/h2&gt;

&lt;p&gt;Now the fun part. Just talk naturally:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt; Is Sentry mode on?

Nope — Sentry Mode is currently off on your Model Y. 
It is available though, so I can turn it on for you if you'd like.
Want me to enable it?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt; What's my battery at?

Fresh data from TeslaMate:
- Battery level: 80% (hit your charge limit!)
- Estimated range: ~411 km
- Charging state: Complete — plugged in but done charging
- Energy added this session: 34.1 kWh

Looks like it finished charging since we last checked (was at 77% earlier).
You're topped up to your 80% limit.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt; Lock my car

Your car is already locked according to the latest status.
No action needed!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt; How much did I spend on charging this month?

Here's your April charging summary (so far, through April 10):

┌──────────────────┬──────────┬───────────────┬────────────┐
│ Type             │ Sessions │ Energy        │ Cost       │
├──────────────────┼──────────┼───────────────┼────────────┤
│ Supercharger     │ 2        │ 105.0 kWh     │ €37.88     │
│ Home/Destination │ 5        │ 60.98 kWh     │ €0*        │
│ **Total**        │ **7**    │ **166.0 kWh** │ **€37.88** │
└──────────────────┴──────────┴───────────────┴────────────┘

No Data, estimating home/destination charging at ~€0.25/kWh, that's roughly €15 extra,
bringing the estimated total to around €53
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt; What's my battery health status ?

Your battery is in great shape:
- Battery health: 98.6%
- Current capacity: 75.7 kWh (out of 76.8 kWh max)
- Current max range: 506 km (vs 512 km when new)
- Degradation: only ~1.4% 

Very healthy for a car with 435 drives logged.
Nothing to worry about there.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The agent checks the current state before making changes, asks for confirmation on security-sensitive commands, and presents data with proper units. All of that behavior comes from the SKILL.md instructions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Things to know
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;100+ tools, one endpoint.&lt;/strong&gt; The MyTeslaMate MCP server covers vehicle control, vehicle data, energy/Powerwall/solar, charging history, and TeslaMate analytics. You don't need to know which API to call — the agent figures it out.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Wake before you command.&lt;/strong&gt; Tesla vehicles go to sleep to save battery. The skill instructs the agent to wake the car before sending action commands. You'll see a brief delay the first time.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Safety checks.&lt;/strong&gt; The skill explicitly tells the agent to confirm before executing unlock, disable Sentry, remote start, or erase data. You won't accidentally unlock your car because of a typo.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;OAuth, not API keys.&lt;/strong&gt; Authentication goes through Tesla's OAuth flow via MyTeslaMate. The token is scoped and can be revoked from your Tesla account at any time.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
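&lt;p&gt;The wake-then-confirm behavior those rules describe can be sketched as plain shell logic. This is purely illustrative: the functions below are local stubs standing in for the real &lt;code&gt;@tesla-mcp&lt;/code&gt; tool calls, and nothing here touches an actual vehicle.&lt;/p&gt;

```shell
# Hypothetical sketch of the skill's workflow rules; stubs only.
SENSITIVE="unlock disable_sentry remote_start erase_data"
VEHICLE_STATE="asleep"

vehicle_asleep() { [ "$VEHICLE_STATE" = "asleep" ]; }
wake_vehicle()   { VEHICLE_STATE="online"; }  # stub for the real wake tool

run_command() {
  cmd="$1"; confirmed="${2:-no}"
  # Rule: wake the vehicle before sending action commands.
  if vehicle_asleep; then wake_vehicle; fi
  # Rule: security-sensitive commands need explicit confirmation.
  case " $SENSITIVE " in
    *" $cmd "*)
      if [ "$confirmed" != "yes" ]; then
        echo "needs confirmation: $cmd"
        return 1
      fi ;;
  esac
  echo "executed: $cmd"
}

run_command lock              # prints: executed: lock
run_command unlock || true    # prints: needs confirmation: unlock
run_command unlock yes        # prints: executed: unlock
```

&lt;p&gt;In the real setup this gating lives in SKILL.md as natural-language instructions rather than code, which is what makes the skill portable across tools.&lt;/p&gt;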

&lt;p&gt;— Jérôme&lt;/p&gt;

</description>
      <category>kiro</category>
      <category>aws</category>
      <category>tesla</category>
      <category>myteslamate</category>
    </item>
  </channel>
</rss>
