<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Volodymyr Marynychev</title>
    <description>The latest articles on DEV Community by Volodymyr Marynychev (@vol).</description>
    <link>https://dev.to/vol</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2773141%2F5f8dd81e-73bb-47d0-a28f-5d5fbef6e9d6.jpeg</url>
      <title>DEV Community: Volodymyr Marynychev</title>
      <link>https://dev.to/vol</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/vol"/>
    <language>en</language>
    <item>
      <title>Building Persistent Memory for Kiro with Bash Hooks</title>
      <dc:creator>Volodymyr Marynychev</dc:creator>
      <pubDate>Sat, 31 Jan 2026 22:00:01 +0000</pubDate>
      <link>https://dev.to/aws-builders/building-persistent-memory-for-kiro-with-bash-hooks-4gm8</link>
      <guid>https://dev.to/aws-builders/building-persistent-memory-for-kiro-with-bash-hooks-4gm8</guid>
      <description>&lt;p&gt;&lt;em&gt;Making Kiro learn from every session&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;A few months into using Kiro and Kiro CLI daily, I wanted to review my week.&lt;/p&gt;

&lt;p&gt;What problems did I solve? What approaches worked? What did I learn?&lt;/p&gt;

&lt;p&gt;Kiro has conversation persistence: you can resume chats, save and load sessions. That's useful. But it's not what I was looking for.&lt;/p&gt;

&lt;p&gt;I didn't want to scroll through old conversations. I wanted the &lt;em&gt;insights&lt;/em&gt; extracted. The patterns. The solutions that worked and why.&lt;/p&gt;

&lt;p&gt;Tuesday's debugging session taught me something about Terraform state locks. But that lesson was buried in a 200-message conversation. Next time I hit the same issue, would I remember to search for it? Would I even remember it happened?&lt;/p&gt;

&lt;p&gt;Kiro gives you great infrastructure: agents, hooks, steering files. But no system for capturing what you learn and surfacing it when relevant.&lt;/p&gt;

&lt;p&gt;I started asking: What if I built that layer on top of what Kiro already provides?&lt;/p&gt;




&lt;h2&gt;
  
  
  The Idea
&lt;/h2&gt;

&lt;p&gt;I found Daniel Miessler's &lt;a href="https://github.com/danielmiessler/pai" rel="noopener noreferrer"&gt;PAI&lt;/a&gt; project - Personal AI Infrastructure for Claude. It clicked.&lt;/p&gt;

&lt;p&gt;The idea: an AI assistant that learns from your work and applies that knowledge to future tasks.&lt;/p&gt;

&lt;p&gt;What if I could build something similar for Kiro? A layer that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Captures learnings as they happen&lt;/li&gt;
&lt;li&gt;Knows my projects and priorities
&lt;/li&gt;
&lt;li&gt;Follows a consistent problem-solving approach&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Not a smarter model. A system that learns.&lt;/p&gt;

&lt;p&gt;Kiro already has the building blocks: hooks that run at key moments, steering files for persistent context, agent configurations. I just needed to wire them together.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;PILOT is an agent for Kiro that adds three things:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. The Algorithm&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Every task follows seven phases:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;OBSERVE → THINK → PLAN → BUILD → EXECUTE → VERIFY → LEARN
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Not revolutionary. It's how experienced engineers already work. But making it explicit means the AI follows it consistently.&lt;/p&gt;

&lt;p&gt;The key insight: Define success criteria &lt;em&gt;before&lt;/em&gt; executing. Most people skip this. They try something, vaguely check if it worked, move on.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Memory&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;PILOT captures solutions, but only after verification confirms they work.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;~/.pilot/learnings/
├── 2026-01-15_terraform-state-lock-fix.md
├── 2026-01-14_lambda-timeout-optimization.md
└── 2026-01-12_git-merge-conflict-pattern.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next time you hit a similar problem, PILOT surfaces the past solution.&lt;/p&gt;

&lt;p&gt;For semantic search over learnings, PILOT uses Kiro's &lt;code&gt;/knowledge&lt;/code&gt; feature. Without it, PILOT still works but uses keyword matching instead.&lt;/p&gt;

&lt;p&gt;Unverified solutions are guesses. Verified solutions are knowledge.&lt;/p&gt;
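
&lt;p&gt;A minimal bash sketch of that rule, together with the keyword fallback. The function names &lt;code&gt;capture_learning&lt;/code&gt; and &lt;code&gt;search_learnings&lt;/code&gt;, the &lt;code&gt;PILOT_LEARNINGS&lt;/code&gt; variable, and the note text are assumptions for illustration, not PILOT's actual code. Only the dated file naming comes from the listing above.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Illustrative sketch only: every name below is hypothetical, not PILOT's real API.

# Use a scratch directory so the example never touches ~/.pilot.
PILOT_LEARNINGS="$(mktemp -d)"

# capture_learning SLUG NOTE VERIFY_CMD...
# Writes a dated markdown file only if the verification command succeeds.
capture_learning() {
  local slug="$1" note="$2"
  shift 2
  if "$@"; then
    printf '%s\n' "$note" > "$PILOT_LEARNINGS/$(date +%Y-%m-%d)_$slug.md"
  fi
  return 0
}

# search_learnings KEYWORD
# Keyword fallback: list files whose content mentions the keyword,
# ignoring case. No match is not an error.
search_learnings() {
  grep -ril -- "$1" "$PILOT_LEARNINGS" || true
}

# A verified fix is saved; an unverified one is dropped.
capture_learning terraform-state-lock-fix "Terraform state lock: verified fix notes" true
capture_learning lambda-timeout-guess "Lambda timeout: unverified idea" false

search_learnings terraform
```

&lt;p&gt;Here &lt;code&gt;true&lt;/code&gt; and &lt;code&gt;false&lt;/code&gt; stand in for a real verification command, such as a test run or a successful plan.&lt;/p&gt;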

&lt;p&gt;&lt;strong&gt;3. Identity&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Optional markdown files that give PILOT context about you:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;~/.pilot/identity/
├── MISSION.md       # What you're building
├── GOALS.md         # Current objectives
├── PROJECTS.md      # Active work
└── STRATEGIES.md    # Approaches that work for you
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You don't have to fill these in manually. PILOT observes your work and gradually populates them. The more you use it, the more context it captures: projects you work on, challenges you face, strategies that work for you.&lt;/p&gt;

&lt;p&gt;Without identity: "Here's how to fix this Lambda timeout"&lt;/p&gt;

&lt;p&gt;With identity: "Given your focus on cost optimization, consider this approach you used last month..."&lt;/p&gt;




&lt;h2&gt;
  
  
  How It Works
&lt;/h2&gt;

&lt;p&gt;PILOT uses four Kiro features: hooks, resources, agents, and the experimental knowledge base.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Resources&lt;/strong&gt; load context into the agent. Kiro supports three types:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;file://&lt;/code&gt; resources load directly at startup (identity files like MISSION.md, GOALS.md)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;skill://&lt;/code&gt; resources load on-demand when relevant (the algorithm, principles)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;knowledgeBase&lt;/code&gt; resources enable semantic search over indexed content&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;PILOT indexes your learnings folder as a knowledge base. When you ask about something, it can find related past solutions even if the keywords don't match exactly.&lt;/p&gt;

&lt;p&gt;Splitting resources this way keeps context lean. The agent sees your mission and goals immediately, but only loads the full algorithm documentation when it needs to reference it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hooks&lt;/strong&gt; run scripts at key moments. Kiro provides the trigger points, PILOT provides the scripts:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────────────────────────┐
│                      Kiro Session                           │
│                                                             │
│   Kiro Event              PILOT Script                      │
│   ──────────              ────────────                      │
│                                                             │
│   agentSpawn        →     agent-spawn.sh                    │
│                           Load identity, past learnings     │
│                                                             │
│   userPromptSubmit  →     user-prompt-submit.sh             │
│                           Search relevant patterns          │
│                                                             │
│   preToolUse        →     pre-tool-use.sh                   │
│                           Validate before execution         │
│                                                             │
│   [Tool executes]                                           │
│                                                             │
│   postToolUse       →     post-tool-use.sh                  │
│                           Capture results                   │
│                                                             │
│   stop              →     stop.sh                           │
│                           Archive session, save learnings   │
│                                                             │
└─────────────────────────────────────────────────────────────┘

 Kiro provides:          PILOT provides:
 • Hook trigger points   • Scripts that run at each hook
 • Agent system          • Learning capture logic
 • Steering files        • Memory management
                         • Identity context
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
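
&lt;p&gt;To make the first row of the diagram concrete, here is a rough sketch of what an agent-spawn script could look like. It assumes that whatever the hook prints to stdout ends up in the agent's context, and the &lt;code&gt;PILOT_HOME&lt;/code&gt; override and the &lt;code&gt;load_identity&lt;/code&gt; function are hypothetical names, not taken from the PILOT repository.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Sketch of an agentSpawn-style hook. PILOT_HOME and load_identity are
# hypothetical; the identity file names come from the article.
pilot_home="${PILOT_HOME:-$HOME/.pilot}"

load_identity() {
  local name file
  for name in MISSION GOALS PROJECTS STRATEGIES; do
    file="$pilot_home/identity/$name.md"
    # Identity files are optional, so skip any that are missing or empty.
    if [ -s "$file" ]; then
      printf '## %s\n' "$name"
      cat "$file"
      printf '\n'
    fi
  done
  # Always report success so a missing file never breaks the session.
  return 0
}

load_identity
```

&lt;p&gt;With no identity files present the script prints nothing and still succeeds, which matches the idea that identity is optional.&lt;/p&gt;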



&lt;p&gt;&lt;strong&gt;Agents&lt;/strong&gt; tie it together. The &lt;code&gt;pilot.json&lt;/code&gt; configuration defines:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Which hooks to run and when&lt;/li&gt;
&lt;li&gt;Which resources to load&lt;/li&gt;
&lt;li&gt;The system prompt with the algorithm phases&lt;/li&gt;
&lt;li&gt;Tool permissions and safety rules&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Hooks must exit 0. A crashed hook breaks the session.&lt;/p&gt;
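
&lt;p&gt;One way to honor that rule is to run every step through a wrapper that contains failures. The sketch below is an assumption about how such a wrapper might look; &lt;code&gt;safe_run&lt;/code&gt; and &lt;code&gt;failed_steps&lt;/code&gt; are hypothetical names, not part of PILOT.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Sketch only: safe_run is a hypothetical helper, not part of PILOT.

# Run a command in a subshell so a failure or an `exit` inside it
# cannot terminate the hook itself; count how many steps failed.
failed_steps=0
safe_run() {
  if ! ( "$@" ); then
    failed_steps=$((failed_steps + 1))
  fi
  return 0
}

safe_run false             # a failing step is counted, not fatal
safe_run true              # a passing step
safe_run bash -c 'exit 3'  # an explicit non-zero exit is contained too

echo "failed steps: $failed_steps"
# A real hook would finish with `exit 0` here.
```

&lt;p&gt;The subshell costs a fork per step, but it means a stray &lt;code&gt;exit&lt;/code&gt; in one step cannot take the whole hook down.&lt;/p&gt;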

&lt;p&gt;It's all shell scripts. No TypeScript, no build tools, no dependencies.&lt;/p&gt;

&lt;p&gt;Why bash? Three reasons:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No runtime startup cost&lt;/li&gt;
&lt;li&gt;Works on macOS and Linux out of the box&lt;/li&gt;
&lt;li&gt;Plain text scripts - open any file and see exactly what it does&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What Changed
&lt;/h2&gt;

&lt;p&gt;Persistent learning compounds.&lt;/p&gt;

&lt;p&gt;Early on, the benefit is small: a few captured solutions, some context loaded at startup. But each session adds to the knowledge base. Each verified fix becomes available for next time.&lt;/p&gt;

&lt;p&gt;After a few weeks, the difference becomes noticeable. Not because any single feature is transformative, but because the system remembers what you've already figured out.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Tradeoffs
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What works well:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Learning capture is automatic&lt;/li&gt;
&lt;li&gt;Past solutions surface when relevant&lt;/li&gt;
&lt;li&gt;The algorithm keeps work structured&lt;/li&gt;
&lt;li&gt;Bash is fast and debuggable&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What's limited:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Each project has isolated memory (no cross-project patterns yet)&lt;/li&gt;
&lt;li&gt;Learning detection is keyword-based, not ML&lt;/li&gt;
&lt;li&gt;Semantic search requires enabling Kiro's &lt;code&gt;/knowledge&lt;/code&gt; feature&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What's missing:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Team collaboration (your learnings stay yours)&lt;/li&gt;
&lt;li&gt;Visual dashboard&lt;/li&gt;
&lt;li&gt;Cross-project pattern sharing&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is an MVP. Try it and see if it fits your workflow.&lt;/p&gt;




&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/requix/pilot
&lt;span class="nb"&gt;cd &lt;/span&gt;pilot
./install.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For semantic search over learnings (optional but recommended):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kiro-cli settings chat.enableKnowledge &lt;span class="nb"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then: &lt;code&gt;kiro-cli --agent pilot&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;The code is simple. The impact compounds over time.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Questions? &lt;a href="https://github.com/requix/pilot/issues" rel="noopener noreferrer"&gt;Open an issue&lt;/a&gt;. The system improves through use.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>kiro</category>
      <category>agents</category>
      <category>cli</category>
      <category>ai</category>
    </item>
    <item>
      <title>Why an AWS Architect Built Azure Powers for Kiro (And What I Learned)</title>
      <dc:creator>Volodymyr Marynychev</dc:creator>
      <pubDate>Mon, 26 Jan 2026 20:15:38 +0000</pubDate>
      <link>https://dev.to/aws-builders/why-an-aws-architect-built-azure-powers-for-kiro-and-what-i-learned-2dg4</link>
      <guid>https://dev.to/aws-builders/why-an-aws-architect-built-azure-powers-for-kiro-and-what-i-learned-2dg4</guid>
      <description>&lt;p&gt;&lt;em&gt;How I used Kiro powers to bridge my cloud platform knowledge gap&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Quick Start: Just Try It
&lt;/h2&gt;

&lt;p&gt;Want to skip the story and experiment with the powers?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Update:&lt;/strong&gt; All three powers are now available in the &lt;a href="https://www.promptz.dev/powers" rel="noopener noreferrer"&gt;Kiro Community Library&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;You can install them directly from the library, or install them manually from the GitHub repository as follows.&lt;/p&gt;

&lt;p&gt;Open Kiro IDE → Powers panel → &lt;strong&gt;Add power from GitHub&lt;/strong&gt; → Enter the URL for the power you want:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Power&lt;/th&gt;
&lt;th&gt;URL&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;azure-architect&lt;/td&gt;
&lt;td&gt;&lt;code&gt;https://github.com/requix/azure-kiro-powers/tree/main/azure-architect&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;azure-operations&lt;/td&gt;
&lt;td&gt;&lt;code&gt;https://github.com/requix/azure-kiro-powers/tree/main/azure-operations&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;azure-monitoring&lt;/td&gt;
&lt;td&gt;&lt;code&gt;https://github.com/requix/azure-kiro-powers/tree/main/azure-monitoring&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Click &lt;strong&gt;Install&lt;/strong&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;⚠️ &lt;strong&gt;Before installing:&lt;/strong&gt; These are third-party powers, not official Kiro or Microsoft tools. Review the &lt;a href="https://github.com/requix/azure-kiro-powers" rel="noopener noreferrer"&gt;repository&lt;/a&gt; and code before installing.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;No authentication needed for &lt;code&gt;azure-architect&lt;/code&gt;. Start with: &lt;em&gt;"What are the best practices for Azure storage account security?"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Have fun.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you're interested in building Kiro powers yourself, the rest of this post walks through the development process, design decisions, and what I learned.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Context
&lt;/h2&gt;

&lt;p&gt;You never know what the next project will bring.&lt;/p&gt;

&lt;p&gt;I'm an AWS cloud architect. Have been for years. Then I joined a project running entirely on Azure. Different services, different naming conventions, same deadline pressure.&lt;/p&gt;

&lt;p&gt;Here's the irony: I reached for Kiro - an AWS IDE - to help me learn Azure. An AWS architect using an AWS tool to work with Microsoft's cloud. The cloud world is strange sometimes.&lt;/p&gt;

&lt;p&gt;But it made sense. I didn't have months to follow the classic learning path - certifications, documentation deep-dives, sandbox experiments. I needed to ship. I needed Azure knowledge in context, at the moment of need, without constantly switching between documentation tabs and my development environment.&lt;/p&gt;

&lt;p&gt;So I built first, learned along the way. Three Kiro powers for Azure. Each design choice taught me how Azure actually works.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Kiro Powers Are
&lt;/h2&gt;

&lt;p&gt;Powers are Kiro's extension system. They solve two problems that traditional MCP setups create:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Without framework context, agents guess.&lt;/strong&gt; Your agent can call Azure APIs, but does it know the right patterns? Without built-in expertise, you and the agent both end up reading documentation and refining approaches by hand until the output is right.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;With too much context, agents slow down.&lt;/strong&gt; Connect five MCP servers and your agent loads 100+ tool definitions before writing a single line of code. Five servers might consume 50,000+ tokens - 40% of your context window - before your first prompt. More tools should mean better results, but unstructured context overwhelms the agent.&lt;/p&gt;

&lt;p&gt;Powers fix this through dynamic loading. Instead of loading all MCP tools at once, powers activate based on keywords in your conversation. Mention "Azure architecture" and the azure-architect power loads. Switch to deployment topics and azure-operations activates.&lt;/p&gt;

&lt;p&gt;A power consists of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;POWER.md&lt;/code&gt;&lt;/strong&gt; - Required. Contains frontmatter (metadata, keywords for activation) and instructions (onboarding steps, steering guidance)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;mcp.json&lt;/code&gt;&lt;/strong&gt; - Optional. MCP server configuration for tool integrations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;steering/&lt;/code&gt;&lt;/strong&gt; - Optional. Workflow-specific guidance files&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;hooks/&lt;/code&gt;&lt;/strong&gt; - Optional. Automated tasks that run on IDE events or via slash commands&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;My Azure powers use the first three. Hooks are useful for validation workflows or automated setup tasks - something to explore in future iterations.&lt;/p&gt;




&lt;h2&gt;
  
  
  Three Powers, Not One: The Design Decision
&lt;/h2&gt;

&lt;p&gt;The official Azure MCP Server has dozens of namespaces. Loading everything at once would defeat the purpose of powers - you'd be back to context overload.&lt;/p&gt;

&lt;p&gt;I split it into three powers based on workflow phases:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────┐     ┌──────────────────┐     ┌───────────────────┐
│ azure-architect │ ──▶ │ azure-operations │ ──▶ │ azure-monitoring  │
│                 │     │                  │     │                   │
│ "Design it"     │     │ "Build &amp;amp; run it" │     │ "Watch &amp;amp; fix it"  │
│ Design tools    │     │ Resource mgmt    │     │ Observability     │
└─────────────────┘     └──────────────────┘     └───────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;azure-architect&lt;/strong&gt;: Best practices, architecture guidance, documentation search, schema references. Design-time namespaces only.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;azure-operations&lt;/strong&gt;: Storage, databases, RBAC, Key Vault, AKS management. Resource management namespaces.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;azure-monitoring&lt;/strong&gt;: Log Analytics, metrics, alerts, resource health. Observability namespaces.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The primary benefit is focus.&lt;/strong&gt; Each power loads only the tools relevant to that workflow phase. When you're designing infrastructure, you don't need monitoring tools consuming context. When you're debugging production, you don't need architecture best practices.&lt;/p&gt;

&lt;p&gt;When you install all three powers, Kiro automatically selects the right one based on your request. Ask about Azure best practices, it uses architect. Query storage accounts, it switches to operations. Check resource health, it activates monitoring.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A secondary benefit is authentication separation.&lt;/strong&gt; The architect power works without &lt;code&gt;az login&lt;/code&gt; - useful for design work on a fresh machine. The operations and monitoring powers require authentication, with monitoring limited to read-only namespaces.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;⚠️ &lt;strong&gt;Note on permissions:&lt;/strong&gt; Your Azure permissions come from &lt;code&gt;az login&lt;/code&gt;. If you authenticate with write access, that access exists regardless of which power is active. The powers organize workflows; your Azure RBAC controls what's actually permitted.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Building azure-architect: The MCP Configuration
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;mcp.json&lt;/code&gt; file defines which MCP servers and namespaces a power uses:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"microsoft-docs"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"http"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://learn.microsoft.com/api/mcp"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"azure-mcp"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"@azure/mcp@latest"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"server"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"start"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"--namespace"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"documentation"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"--namespace"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"bicepschema"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"--namespace"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"cloudarchitect"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"--namespace"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"bestpractices"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"env"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"AZURE_MCP_COLLECT_TELEMETRY"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"false"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two MCP servers. Microsoft Learn for documentation search. Azure MCP for design tools.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;--namespace&lt;/code&gt; flags are where token efficiency happens. Without them, the Azure MCP server loads &lt;em&gt;all&lt;/em&gt; namespaces. By specifying only design-time namespaces, this power stays focused - loading exactly what's needed for architecture work, nothing more.&lt;/p&gt;
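
&lt;p&gt;The same argument list can be assembled from a plain list of namespaces, which makes the per-power split easy to see. This is only a sketch: &lt;code&gt;build_azure_mcp_args&lt;/code&gt; is a hypothetical helper, and it prints the arguments rather than launching the server. The namespaces are the four from the configuration above.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Sketch only: build_azure_mcp_args is a hypothetical helper.
# It prints the npx arguments for a given set of namespaces and
# does not start the Azure MCP server.
build_azure_mcp_args() {
  local args=(-y @azure/mcp@latest server start)
  local ns
  for ns in "$@"; do
    # One --namespace flag per namespace, as in mcp.json.
    args+=(--namespace "$ns")
  done
  printf '%s\n' "${args[*]}"
}

# The design-time namespaces used by azure-architect.
build_azure_mcp_args documentation bicepschema cloudarchitect bestpractices
```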




&lt;h2&gt;
  
  
  Steering Files: Teaching Kiro How to Think
&lt;/h2&gt;

&lt;p&gt;MCP connections give Kiro access to tools. Steering files teach it &lt;em&gt;when&lt;/em&gt; and &lt;em&gt;how&lt;/em&gt; to use them.&lt;/p&gt;

&lt;p&gt;Here's a section from the naming conventions steering file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Azure Resource Naming Pattern&lt;/span&gt;

{resource-type}-{workload}-{environment}-{region}-{instance}

Examples:
&lt;span class="p"&gt;-&lt;/span&gt; st-payments-prod-westeu-001 (storage account)
&lt;span class="p"&gt;-&lt;/span&gt; kv-payments-prod-westeu-001 (key vault)
&lt;span class="p"&gt;-&lt;/span&gt; aks-payments-prod-westeu-001 (kubernetes cluster)

When user asks to create a resource:
&lt;span class="p"&gt;1.&lt;/span&gt; Ask for workload name if not provided
&lt;span class="p"&gt;2.&lt;/span&gt; Infer environment from context or ask
&lt;span class="p"&gt;3.&lt;/span&gt; Apply naming pattern automatically
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This isn't just reference material. It's encoded behavior. When I ask Kiro to create a storage account, it doesn't just generate code - it asks clarifying questions and applies the naming pattern automatically.&lt;/p&gt;
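
&lt;p&gt;The pattern itself is simple enough to express as a small bash function. The sketch below is an illustration, not code from the power; &lt;code&gt;azure_resource_name&lt;/code&gt; is a hypothetical name. One caveat worth noting: Azure storage account names accept only lowercase letters and digits (3 to 24 characters), so the sketch drops the hyphens for the &lt;code&gt;st&lt;/code&gt; type, which differs from the hyphenated storage example in the steering file.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Sketch only: azure_resource_name is a hypothetical helper.
# Usage: azure_resource_name TYPE WORKLOAD ENVIRONMENT REGION INSTANCE
azure_resource_name() {
  local rtype="$1" workload="$2" environment="$3" region="$4" instance="$5"
  local name
  # 10# forces base 10 so an instance such as 008 is not read as octal.
  name="$(printf '%s-%s-%s-%s-%03d' "$rtype" "$workload" "$environment" "$region" "$((10#$instance))")"
  # Storage account names allow only lowercase letters and digits,
  # so remove the hyphens for that resource type.
  if [ "$rtype" = "st" ]; then
    name="${name//-/}"
  fi
  printf '%s\n' "$name"
}

azure_resource_name kv payments prod westeu 1
azure_resource_name st payments prod westeu 1
```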

&lt;p&gt;The steering files also include ready-to-use patterns. From the KQL patterns file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Error Rate Calculation
AppServiceHTTPLogs
| where TimeGenerated &amp;gt; ago(1h)
| summarize 
    TotalRequests = count(),
    ErrorCount = countif(ScStatus &amp;gt;= 500),
    ErrorRate = round(countif(ScStatus &amp;gt;= 500) * 100.0 / count(), 2)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Response Time Percentiles
AppServiceHTTPLogs
| where TimeGenerated &amp;gt; ago(6h)
| summarize 
    p50 = percentile(TimeTaken, 50),
    p95 = percentile(TimeTaken, 95),
    p99 = percentile(TimeTaken, 99)
  by bin(TimeGenerated, 15m)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These aren't just examples. They're templates Kiro adapts to specific queries. When I ask "show me slow requests from the last hour," Kiro modifies the percentile pattern with my timeframe.&lt;/p&gt;




&lt;h2&gt;
  
  
  Real Usage: What Actually Gets Used
&lt;/h2&gt;

&lt;p&gt;After completing an IaC authoring task with the azure-architect power, I asked Kiro to analyze its own tool usage. Here's the honest breakdown:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Used&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;microsoft_docs_search&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;High&lt;/strong&gt;  -  Found exact configuration patterns and integration details&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;get_azure_bestpractices&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;Medium  -  General Azure coding guidelines&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;azureterraformbestpractices&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;Medium  -  Validation workflow patterns&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;bicepschema&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;Discovered that it exists, but not used for this task&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;The verdict from Kiro:&lt;/strong&gt; &lt;em&gt;"Documentation search was the killer feature. Worth having for the documentation search alone."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;What made the documentation search valuable wasn't generic information - it was concrete implementation details: exact API endpoint formats, available metrics and thresholds, configuration patterns for specific integrations.&lt;/p&gt;

&lt;p&gt;The best practices tools confirmed patterns but didn't provide service-specific guidance. Moderately useful, not transformative.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What this reveals about the three-power design:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Kiro noted that operational tools (subscription queries, live resource inspection) weren't needed for this IaC authoring task. Those capabilities &lt;em&gt;"would shine more in a 'diagnose my existing infrastructure' scenario rather than 'author new IaC.'"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This is exactly why the powers are separated. The architect power handles design-time work with minimal context overhead. The operations and monitoring powers exist for when you need to interact with live resources.&lt;/p&gt;

&lt;p&gt;Different workflows. Different tools. Focused context.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs7yn05dg05s6ed4jxiri.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs7yn05dg05s6ed4jxiri.png" alt="Kiro IDE Interface" width="800" height="752"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Security Steering: Making Best Practices Unavoidable
&lt;/h2&gt;

&lt;p&gt;The azure-operations power includes security guidelines that encode least privilege into every workflow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Least Privilege Patterns&lt;/span&gt;

&lt;span class="gu"&gt;### Pattern 1: Application Access to Storage&lt;/span&gt;

&lt;span class="gs"&gt;**Instead of:**&lt;/span&gt; Contributor role on storage account
&lt;span class="gs"&gt;**Use:**&lt;/span&gt; Storage Blob Data Contributor on specific container

&lt;span class="gu"&gt;### Pattern 2: Application Access to Key Vault&lt;/span&gt;

&lt;span class="gs"&gt;**Instead of:**&lt;/span&gt; Key Vault Contributor
&lt;span class="gs"&gt;**Use:**&lt;/span&gt; Key Vault Secrets User (read-only) or Key Vault Secrets Officer (read/write)

&lt;span class="gu"&gt;### Pattern 3: CI/CD Pipeline Access&lt;/span&gt;

&lt;span class="gs"&gt;**Instead of:**&lt;/span&gt; Contributor on subscription
&lt;span class="gs"&gt;**Use:**&lt;/span&gt; Contributor on specific resource groups + specific data plane roles
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When I ask about access management, Kiro frames its answers in terms of principals, role definitions, and scopes. The steering file makes it harder for the agent to give overly permissive advice.&lt;/p&gt;
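&lt;p&gt;To make the principal / role definition / scope framing concrete, here is a minimal Python sketch of Pattern 1. It is an illustration only, not part of the powers: the &lt;code&gt;RoleAssignment&lt;/code&gt; class, the helper functions, and every name and ID (&lt;code&gt;rg-app&lt;/code&gt;, &lt;code&gt;stappdata&lt;/code&gt;, &lt;code&gt;uploads&lt;/code&gt;, the all-zero subscription ID) are hypothetical placeholders. Only the shape of the Azure resource ID used as the scope is taken from Azure itself.&lt;/p&gt;

```python
from dataclasses import dataclass

# Illustrative placeholder; this is not a real subscription ID.
SUBSCRIPTION_ID = "00000000-0000-0000-0000-000000000000"


@dataclass(frozen=True)
class RoleAssignment:
    principal: str        # who gets access (user, group, or managed identity)
    role_definition: str  # what they may do (a built-in or custom role name)
    scope: str            # where the role applies (an Azure resource ID)


def container_scope(subscription_id, resource_group, account, container):
    """Build the scope string for a single blob container."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{account}"
        f"/blobServices/default/containers/{container}"
    )


def scope_depth(scope):
    """Count path segments; a deeper scope covers fewer resources."""
    return len([part for part in scope.split("/") if part])


# "Instead of": a management role on the whole storage account.
broad = RoleAssignment(
    principal="app-identity",
    role_definition="Contributor",
    scope=f"/subscriptions/{SUBSCRIPTION_ID}/resourceGroups/rg-app"
          "/providers/Microsoft.Storage/storageAccounts/stappdata",
)

# "Use": a data-plane role on one container inside that account.
narrow = RoleAssignment(
    principal="app-identity",
    role_definition="Storage Blob Data Contributor",
    scope=container_scope(SUBSCRIPTION_ID, "rg-app", "stappdata", "uploads"),
)
```

&lt;p&gt;The narrow scope is simply the account scope with the container path appended, which is why the same principal ends up with access to far less.&lt;/p&gt;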




&lt;h2&gt;
  
  
  The Development Process: Using Kiro to Build Kiro Powers
&lt;/h2&gt;

&lt;p&gt;Here's the meta part: I used Kiro to build these powers.&lt;/p&gt;

&lt;p&gt;The Kiro team maintains a &lt;a href="https://github.com/kirodotdev/powers/blob/main/power-builder/POWER.md" rel="noopener noreferrer"&gt;power-builder&lt;/a&gt; power specifically for creating new powers. Install it, and Kiro becomes your power development assistant.&lt;/p&gt;

&lt;h3&gt;
  
  
  Spec Mode for Requirements
&lt;/h3&gt;

&lt;p&gt;Kiro's Spec mode generates structured plans from descriptions. I described what I wanted - three workflow-aligned powers with namespace separation - and Spec mode produced:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Requirements documents for each power&lt;/li&gt;
&lt;li&gt;File structure recommendations&lt;/li&gt;
&lt;li&gt;Task lists for implementation&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  The Iteration Loop
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Define MCP configuration&lt;/strong&gt;  -  Which namespaces to include&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Write steering files&lt;/strong&gt;  -  Patterns, workflows, decision trees&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Install power locally&lt;/strong&gt;  -  Powers panel → Add power from Local Path&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Test with real queries&lt;/strong&gt;  -  "List my storage accounts"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Check tool availability&lt;/strong&gt;  -  Verify expected tools load&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Refine based on gaps&lt;/strong&gt;  -  Fix tool names, add missing patterns&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Step 5 caught several issues. The Azure MCP uses &lt;code&gt;azmcp_&lt;/code&gt; prefixes for tool names. My early documentation referenced incorrect names. Testing revealed the mismatch.&lt;/p&gt;

&lt;h3&gt;
  
  
  Local Installation
&lt;/h3&gt;

&lt;p&gt;Installing during development:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Open Kiro's Powers panel&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Add power from Local Path&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Select the power directory containing &lt;code&gt;POWER.md&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Install&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;No build step. No packaging. Direct folder reference. Change a steering file, reload the power, test immediately.&lt;/p&gt;

&lt;p&gt;Once ready, push to a public GitHub repository and others can install via &lt;strong&gt;Add power from GitHub&lt;/strong&gt; using the URL to the specific power folder.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flljz4zatz6b319wdkd17.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flljz4zatz6b319wdkd17.png" alt="Kiro IDE Interface" width="800" height="889"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  What Actually Matters
&lt;/h2&gt;

&lt;p&gt;Powers aren't just a packaging format. They're a model for how AI agents should acquire expertise.&lt;/p&gt;

&lt;p&gt;The old approach: stuff everything into context upfront. Hope the agent figures out what's relevant. Watch token costs climb while response quality drops.&lt;/p&gt;

&lt;p&gt;The new approach: agents learn what they need, when they need it. Expertise flows in on demand. Context stays focused. The agent expands its capabilities as the tools around it evolve.&lt;/p&gt;

&lt;p&gt;This matters beyond my Azure learning curve. HashiCorp built their Terraform power in days after learning about the format. Stripe, Supabase, Datadog - all shipping domain expertise as installable packages. The pattern scales.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For tool providers&lt;/strong&gt;: Write one &lt;code&gt;POWER.md&lt;/code&gt;, and your expertise reaches every developer using powers. No maintaining separate integrations for each AI tool.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For teams&lt;/strong&gt;: Package internal knowledge - your design system, your deployment patterns, your security policies - as powers. Every developer's agent knows how to use them correctly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For individuals&lt;/strong&gt;: Install the expertise you need today. Uninstall when you're done. Your agent's capabilities match your current project, not some generic average.&lt;/p&gt;

&lt;p&gt;This is what separates useful AI assistance from the "chat with docs" experience. Not just answering questions. Bringing the right context at the right moment, then getting out of the way.&lt;/p&gt;

&lt;p&gt;Documentation teaches concepts. Powers teach workflows. The difference is action.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Future: Cross-Tool Compatibility
&lt;/h2&gt;

&lt;p&gt;Today, powers work in Kiro IDE. The team is building toward a future where powers work across any AI development tool - Kiro CLI, Cursor, Claude Code, and beyond.&lt;/p&gt;

&lt;p&gt;The Model Context Protocol provides a standard for tool communication. Powers extend this with standards for packaging, activation, and knowledge transfer.&lt;/p&gt;

&lt;p&gt;This matters for the ecosystem. Tool providers don't want to maintain separate integrations for each AI tool. Write one &lt;code&gt;POWER.md&lt;/code&gt;, use it anywhere.&lt;/p&gt;

&lt;p&gt;I'm particularly interested in Kiro CLI support. Running these powers from a terminal would match my actual workflow better than the IDE interface.&lt;/p&gt;




&lt;h2&gt;
  
  
  Tradeoffs
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Cognitive Distance from the Platform
&lt;/h3&gt;

&lt;p&gt;Using powers means interacting with Azure through an abstraction layer. For learning fundamentals, this might hide important details.&lt;/p&gt;

&lt;h3&gt;
  
  
  MCP Server Dependency
&lt;/h3&gt;

&lt;p&gt;These powers depend on Microsoft maintaining the Azure MCP Server. Version updates have already changed tool naming conventions, requiring documentation updates across all three powers.&lt;/p&gt;

&lt;h3&gt;
  
  
  Steering File Maintenance
&lt;/h3&gt;

&lt;p&gt;Steering files encode current best practices. Azure evolves. The files need periodic updates to stay relevant.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I'd Change
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Better Tool Discovery
&lt;/h3&gt;

&lt;p&gt;You need to know what tools exist to use them effectively. A more systematic discovery mechanism would help - something that surfaces available capabilities based on what you're trying to accomplish.&lt;/p&gt;

&lt;h3&gt;
  
  
  Add Hooks for Validation
&lt;/h3&gt;

&lt;p&gt;The powers currently don't use hooks. Adding automated validation, like checking Terraform syntax before deployment or verifying RBAC configurations, would make the workflow tighter.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I Learned
&lt;/h2&gt;

&lt;p&gt;Building these powers taught me things that reading documentation wouldn't have:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;About Azure:&lt;/strong&gt; Working with the MCP namespaces forced me to understand how Azure organizes its services. The separation between control plane and data plane operations became obvious when I had to decide which namespaces each power needed. You learn a platform's structure by building tools for it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;About Powers:&lt;/strong&gt; The format is more accessible than I expected. My three Azure powers took a weekend of focused work. The barrier isn't technical complexity - it's knowing what workflows to optimize for.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;About Context Efficiency:&lt;/strong&gt; Before this project, I would have connected every MCP server and hoped for the best. Now I think in terms of focused context. What does this specific task need? What's consuming tokens without adding value?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;About Learning Paths:&lt;/strong&gt; Sometimes building tools &lt;em&gt;is&lt;/em&gt; the learning path. The classic route - docs, tutorials, certifications - works when you have time. When you don't, building forces understanding faster. Every decision about what to include in a power required me to understand what Azure actually offers.&lt;/p&gt;

&lt;p&gt;The unexpected part: an AWS architect, using an AWS IDE, building Azure tooling. But that's the point of powers. They're platform-agnostic expertise packages. The tool doesn't care which cloud you're learning. It just loads the right context when you need it.&lt;/p&gt;




&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Repository&lt;/strong&gt;: &lt;a href="https://github.com/requix/azure-kiro-powers" rel="noopener noreferrer"&gt;github.com/requix/azure-kiro-powers&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Each power installs independently. Start with azure-architect - no authentication required.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Building AI-powered development tools? The interesting work happens at the edges, where people try things.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Your move.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>kiro</category>
      <category>ai</category>
      <category>mcp</category>
      <category>azure</category>
    </item>
    <item>
      <title>Building a Serverless AI Chatbot: Integrating OpenAI with Telegram on AWS</title>
      <dc:creator>Volodymyr Marynychev</dc:creator>
      <pubDate>Thu, 30 Jan 2025 21:29:37 +0000</pubDate>
      <link>https://dev.to/aws-builders/building-a-serverless-ai-chatbot-integrating-openai-with-telegram-on-aws-3fj2</link>
      <guid>https://dev.to/aws-builders/building-a-serverless-ai-chatbot-integrating-openai-with-telegram-on-aws-3fj2</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Let me share how I built an AI chatbot using AWS, OpenAI, and Telegram. The main goal was to create a smart, cost-effective chatbot without dealing with server maintenance. A serverless approach was a perfect fit for this task.&lt;/p&gt;

&lt;p&gt;The project needed to solve these main challenges:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create an intelligent chatbot using OpenAI&lt;/li&gt;
&lt;li&gt;Keep running costs low with serverless architecture&lt;/li&gt;
&lt;li&gt;Ensure secure handling of sensitive data&lt;/li&gt;
&lt;li&gt;Guarantee reliable message delivery&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I chose a serverless architecture because it offers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pay-per-use pricing model&lt;/li&gt;
&lt;li&gt;Automatic scaling capabilities&lt;/li&gt;
&lt;li&gt;Minimal maintenance overhead&lt;/li&gt;
&lt;li&gt;Built-in high availability&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The tech stack includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AWS services (Lambda, API Gateway, SQS, DynamoDB, KMS)&lt;/li&gt;
&lt;li&gt;OpenAI's GPT-4 for message processing&lt;/li&gt;
&lt;li&gt;Telegram as a messaging platform&lt;/li&gt;
&lt;li&gt;Terraform for infrastructure setup&lt;/li&gt;
&lt;li&gt;AWS Lambda Powertools for monitoring&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Architecture Overview
&lt;/h2&gt;

&lt;h3&gt;
  
  
  How It Works
&lt;/h3&gt;

&lt;p&gt;The system processes messages in a simple flow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The user sends a message to the Telegram bot&lt;/li&gt;
&lt;li&gt;Telegram forwards it to AWS API Gateway&lt;/li&gt;
&lt;li&gt;The message goes through the processing pipeline (SQS, then Lambda)&lt;/li&gt;
&lt;li&gt;The user receives the response generated by OpenAI&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Here's the visual representation:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flfh8rcdnmbvmb5p0no5l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flfh8rcdnmbvmb5p0no5l.png" alt="Architecture Diagram" width="800" height="490"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Core Components
&lt;/h3&gt;

&lt;p&gt;Each component has a specific role in the system:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;API Gateway&lt;/strong&gt; serves as the entry point:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"api_gateway"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;          &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"${var.app_name}-webhook"&lt;/span&gt;
  &lt;span class="nx"&gt;protocol_type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"HTTP"&lt;/span&gt;
  &lt;span class="nx"&gt;integrations&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="s2"&gt;"ANY /"&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;integration_type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"AWS_PROXY"&lt;/span&gt;
      &lt;span class="nx"&gt;integration_subtype&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"SQS-SendMessage"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;SQS Queue&lt;/strong&gt; handles message buffering:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_sqs_queue"&lt;/span&gt; &lt;span class="s2"&gt;"inbound"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name_prefix&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"${var.app_name}-inbound-queue"&lt;/span&gt;
  &lt;span class="nx"&gt;visibility_timeout_seconds&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;360&lt;/span&gt;
  &lt;span class="nx"&gt;message_retention_seconds&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;86400&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Lambda Function&lt;/strong&gt; processes messages:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"lambda_function"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;function_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"${var.app_name}-messages-processing"&lt;/span&gt;
  &lt;span class="nx"&gt;handler&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"index.handler"&lt;/span&gt;
  &lt;span class="nx"&gt;runtime&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"python3.12"&lt;/span&gt;
  &lt;span class="nx"&gt;timeout&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;DynamoDB&lt;/strong&gt; stores conversation state:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_dynamodb_table"&lt;/span&gt; &lt;span class="s2"&gt;"threads"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"${var.app_name}-threads"&lt;/span&gt;
  &lt;span class="nx"&gt;hash_key&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"chat_id"&lt;/span&gt;
  &lt;span class="nx"&gt;range_key&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"thread_id"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



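&lt;p&gt;Each component was designed with scalability and reliability in mind. The system can handle multiple conversations simultaneously while preserving conversation context. Note that the queue as configured above is a standard SQS queue, which provides best-effort ordering only; strict per-chat message order would require a FIFO queue.&lt;/p&gt;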

&lt;h2&gt;
  
  
  Deep Dive: Implementation Details
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Message Flow Implementation
&lt;/h3&gt;

&lt;p&gt;Let's break down how messages move through the system. This section covers the actual implementation of each component.&lt;/p&gt;

&lt;h4&gt;
  
  
  Setting Up Telegram Webhook
&lt;/h4&gt;

&lt;p&gt;First, we need to connect Telegram to our AWS endpoint. Here's a simple script that handles this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;TELEGRAM_TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"your-bot-token"&lt;/span&gt;
&lt;span class="nv"&gt;ENDPOINT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"your-api-gateway-url"&lt;/span&gt;

curl &lt;span class="nt"&gt;-X&lt;/span&gt; &lt;span class="s2"&gt;"POST"&lt;/span&gt; &lt;span class="s2"&gt;"https://api.telegram.org/bot&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;TELEGRAM_TOKEN&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/setWebhook"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s2"&gt;"{&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;url&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;ENDPOINT&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;}"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Message Processing Pipeline
&lt;/h4&gt;

&lt;p&gt;The Lambda function processes messages in several steps. Here's the main handler:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Main entry point for processing messages.
    Receives events from SQS, processes them, and sends responses.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Extract message from SQS event
&lt;/span&gt;        &lt;span class="n"&gt;request_body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Records&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="n"&gt;update&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;telebot&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Update&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;de_json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request_body&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Process message only if user is allowed
&lt;/span&gt;        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;update&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;ALLOWED_USERS&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nf"&gt;process_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;update&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;statusCode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error processing message: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;statusCode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
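&lt;p&gt;The handler above reads only the first record of the event and returns a status code when something fails. For an SQS-triggered Lambda, a normal return is treated as success, so a failed message would be removed from the queue rather than retried. One possible alternative, sketched below, is to process every record and return a partial batch response. This assumes &lt;code&gt;ReportBatchItemFailures&lt;/code&gt; is enabled on the event source mapping, which the article does not show; &lt;code&gt;process_update&lt;/code&gt; is a hypothetical stand-in for the real message processing.&lt;/p&gt;

```python
import json
import logging

logger = logging.getLogger(__name__)


def process_update(update):
    """Hypothetical stand-in for the article's process_message step."""
    if "message" not in update:
        raise ValueError("update has no message")


def handler(event, _context):
    """Process every SQS record and report only the ones that failed."""
    failures = []
    for record in event.get("Records", []):
        try:
            process_update(json.loads(record["body"]))
        except Exception:
            logger.exception("Failed to process record %s", record.get("messageId"))
            # Listing the messageId asks Lambda to make this record visible again.
            failures.append({"itemIdentifier": record["messageId"]})
    # An empty list means the whole batch succeeded.
    return {"batchItemFailures": failures}
```

&lt;p&gt;With this shape, one malformed update does not block or discard the rest of the batch, and repeated failures can be routed to a dead-letter queue if one is configured.&lt;/p&gt;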



&lt;h4&gt;
  
  
  OpenAI Integration
&lt;/h4&gt;

&lt;p&gt;The OpenAI integration is handled through a dedicated function:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;ask_openai_threads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chat_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Sends user message to OpenAI and manages conversation threads.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="c1"&gt;# Get or create assistant
&lt;/span&gt;    &lt;span class="n"&gt;assistant_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_stored_assistant_id&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;assistant_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;assistant_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_assistant&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="nf"&gt;save_assistant&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;assistant_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Get or create thread
&lt;/span&gt;    &lt;span class="n"&gt;thread_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_stored_thread_id&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chat_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;thread&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;openai_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;beta&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;threads&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;thread_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;thread&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;
        &lt;span class="nf"&gt;save_thread&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chat_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Add message and run assistant
&lt;/span&gt;    &lt;span class="n"&gt;openai_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;beta&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;threads&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;openai_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;beta&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;threads&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;runs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;assistant_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;assistant_id&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

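    &lt;span class="c1"&gt;# Wait for response; stop on terminal failure states so the loop cannot spin forever
&lt;/span&gt;    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;completed&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;failed&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;cancelled&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;expired&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;incomplete&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;RuntimeError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OpenAI run ended with status: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Pause between polls; needs `import time` at module level
&lt;/span&gt;        &lt;span class="n"&gt;run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;openai_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;beta&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;threads&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;runs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;retrieve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;run_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;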

    &lt;span class="c1"&gt;# Get and return assistant's response
&lt;/span&gt;    &lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;openai_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;beta&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;threads&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
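&lt;p&gt;A polling loop like the one above can also be bounded so it never outlives the Lambda timeout. The sketch below is one way to do that, not the project's actual code: &lt;code&gt;wait_for_run&lt;/code&gt;, its parameters, and the &lt;code&gt;TERMINAL_FAILURES&lt;/code&gt; set are names introduced here for illustration. It takes any zero-argument callable that returns the current run status, so it can be tried without calling OpenAI. With the defaults it sleeps for at most about 51 seconds in total, which, not counting the time spent in the API calls themselves, fits under the 60 second function timeout shown earlier.&lt;/p&gt;

```python
import time

# Run statuses that will never turn into 'completed' on their own.
TERMINAL_FAILURES = {"failed", "cancelled", "expired", "incomplete"}


def wait_for_run(fetch_status, max_attempts=15, first_delay_s=0.5,
                 max_delay_s=4.0, sleep=time.sleep):
    """Poll fetch_status() until it returns 'completed'.

    Returns the number of polls used. Raises RuntimeError on a terminal
    failure status and TimeoutError when max_attempts is exhausted.
    """
    for attempt in range(max_attempts):
        status = fetch_status()
        if status == "completed":
            return attempt + 1
        if status in TERMINAL_FAILURES:
            raise RuntimeError(f"run ended with status {status!r}")
        # Exponential backoff, capped so later polls stay reasonably frequent.
        sleep(min(first_delay_s * 2 ** attempt, max_delay_s))
    raise TimeoutError(f"run not completed after {max_attempts} polls")
```

&lt;p&gt;In the function above it could be called as &lt;code&gt;wait_for_run(lambda: openai_client.beta.threads.runs.retrieve(thread_id=thread_id, run_id=run.id).status)&lt;/code&gt;. Passing &lt;code&gt;sleep&lt;/code&gt; as a parameter keeps the helper easy to test without real delays.&lt;/p&gt;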



&lt;h3&gt;
  
  
  State Management
&lt;/h3&gt;

&lt;p&gt;The project uses DynamoDB to keep track of conversations and assistant configuration.&lt;/p&gt;

&lt;h4&gt;
  
  
  Thread Storage
&lt;/h4&gt;

&lt;p&gt;Here's how we store and retrieve conversation threads:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;save_thread&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chat_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Saves new thread to DynamoDB.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;chat_id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;N&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chat_id&lt;/span&gt;&lt;span class="p"&gt;)},&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;thread_id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;S&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;thread_status&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;S&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ACTIVE&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;created_at&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;S&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;isoformat&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;dynamodb_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;put_item&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;TableName&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;THREADS_TABLE_NAME&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;Item&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_stored_thread_id&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chat_id&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Retrieves active thread for a chat.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;threads_table&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;IndexName&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;UserStatusIndex&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;KeyConditionExpression&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;chat_id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;eq&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chat_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; 
                             &lt;span class="nc"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;thread_status&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;eq&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ACTIVE&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;Limit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Items&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;thread_id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Items&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
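
&lt;p&gt;The two functions above are typically combined into a "get or create" step before each message is forwarded to the assistant. The following is a minimal sketch of that flow, not the repository's code: &lt;code&gt;build_thread_item&lt;/code&gt; and &lt;code&gt;get_or_create_thread&lt;/code&gt; are hypothetical helpers, and the lookup, create and save callables are injected so the logic can run without AWS access.&lt;/p&gt;

```python
from datetime import datetime, timezone


def build_thread_item(chat_id, thread_id, now=None):
    """Build the typed attribute map that the low-level put_item call expects."""
    now = now or datetime.now(timezone.utc)
    return {
        "chat_id": {"N": str(chat_id)},  # DynamoDB numbers travel as strings
        "thread_id": {"S": thread_id},
        "thread_status": {"S": "ACTIVE"},
        "created_at": {"S": now.isoformat()},
    }


def get_or_create_thread(chat_id, lookup, create, save):
    """Return the active thread for a chat, creating and saving one if none exists.

    lookup, create and save are injected callables (hypothetical wiring):
    in the bot they would wrap get_stored_thread_id, the OpenAI thread
    creation call and save_thread respectively.
    """
    thread_id = lookup(chat_id)
    if thread_id is None:
        thread_id = create()
        save(chat_id, thread_id)
    return thread_id


# Local demonstration with a plain dict standing in for the table.
store = {}
first = get_or_create_thread(42, store.get, lambda: "thread_abc", store.__setitem__)
second = get_or_create_thread(42, store.get, lambda: "thread_xyz", store.__setitem__)
```

&lt;p&gt;In the demonstration the second call reuses the thread saved by the first one, so the conversation keeps its context between messages.&lt;/p&gt;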



&lt;h2&gt;
  
  
  Security Implementation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Secret Management
&lt;/h3&gt;

&lt;p&gt;We use AWS Systems Manager Parameter Store to keep API tokens and other secrets safe.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Instead of hardcoding tokens:
&lt;/span&gt;&lt;span class="n"&gt;ssm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ssm&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Get Telegram token
&lt;/span&gt;&lt;span class="n"&gt;TELEGRAM_TOKEN&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ssm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_parameter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;Name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;TELEGRAM_TOKEN_PARAM_NAME&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;WithDecryption&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;
&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Parameter&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Value&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# Get OpenAI token
&lt;/span&gt;&lt;span class="n"&gt;OPENAI_TOKEN&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ssm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_parameter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;Name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;OPENAI_TOKEN_PARAM_NAME&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;WithDecryption&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;
&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Parameter&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Value&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
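
&lt;p&gt;Since both lookups follow the same pattern, they can be wrapped in one small helper. This is a sketch under assumptions: &lt;code&gt;get_secret&lt;/code&gt; and &lt;code&gt;FakeSSM&lt;/code&gt; are hypothetical names, and the fake client exists only so the example runs without AWS credentials. In the bot, the real &lt;code&gt;boto3.client('ssm')&lt;/code&gt; object would be passed in instead.&lt;/p&gt;

```python
def get_secret(ssm_client, name):
    """Return the decrypted value of one Parameter Store entry."""
    response = ssm_client.get_parameter(Name=name, WithDecryption=True)
    return response["Parameter"]["Value"]


class FakeSSM:
    """In-memory stand-in for the boto3 SSM client, for local runs only."""

    def __init__(self, values):
        self._values = values

    def get_parameter(self, Name, WithDecryption=False):
        # Mirrors only the part of the response shape that get_secret reads.
        return {"Parameter": {"Name": Name, "Value": self._values[Name]}}


# "demo-bot-token" is an illustrative parameter name, not one from the project.
fake = FakeSSM({"demo-bot-token": "123:abc"})
token = get_secret(fake, "demo-bot-token")
```

&lt;p&gt;When the helper is called at module level, as in the snippet above, the lookup runs once per Lambda cold start and the value is reused by later invocations of the same container.&lt;/p&gt;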



&lt;p&gt;The parameters are created with Terraform using a placeholder value, and the real token is set manually after deployment:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_ssm_parameter"&lt;/span&gt; &lt;span class="s2"&gt;"bot-token"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"${var.app_name}-bot-token"&lt;/span&gt;
  &lt;span class="nx"&gt;type&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"SecureString"&lt;/span&gt;
  &lt;span class="nx"&gt;key_id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"alias/aws/ssm"&lt;/span&gt;
  &lt;span class="nx"&gt;value&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"CHANGE-ME"&lt;/span&gt;  &lt;span class="c1"&gt;# Changed manually after deployment&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Encryption
&lt;/h3&gt;

&lt;p&gt;We use KMS for encrypting data at rest. Here's how we set it up:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"aws_kms_key"&lt;/span&gt; &lt;span class="s2"&gt;"dynamo-encryption-key"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt;             &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Key for DynamoDB encryption"&lt;/span&gt;
  &lt;span class="nx"&gt;deletion_window_in_days&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;
  &lt;span class="nx"&gt;enable_key_rotation&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;

  &lt;span class="nx"&gt;policy&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;jsonencode&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="nx"&gt;Version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"2012-10-17"&lt;/span&gt;
    &lt;span class="nx"&gt;Statement&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;Sid&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Enable IAM User Permissions"&lt;/span&gt;
        &lt;span class="nx"&gt;Effect&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Allow"&lt;/span&gt;
        &lt;span class="nx"&gt;Principal&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="nx"&gt;AWS&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"arn:aws:iam::${local.account_id}:root"&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="nx"&gt;Action&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"kms:*"&lt;/span&gt;
        &lt;span class="nx"&gt;Resource&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"*"&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Access Control
&lt;/h3&gt;

&lt;p&gt;We limit who can use the bot with a simple check:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;ALLOWED_USERS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;allowed_users_ids&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;

&lt;span class="nd"&gt;@bot.message_handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;func&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;ALLOWED_USERS&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;decline_strangers&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Access denied.&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Your user ID: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;bot&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;reply_to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
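
&lt;p&gt;The one-line parsing of &lt;code&gt;allowed_users_ids&lt;/code&gt; raises &lt;code&gt;ValueError&lt;/code&gt; when the string is empty or ends with a comma. A slightly more forgiving variant could look like the sketch below, where &lt;code&gt;parse_allowed_users&lt;/code&gt; and &lt;code&gt;is_allowed&lt;/code&gt; are hypothetical helper names rather than functions from the project.&lt;/p&gt;

```python
def parse_allowed_users(raw):
    """Parse a comma-separated list of Telegram IDs into a frozenset of ints.

    Empty entries (from an empty string or stray commas) are skipped, and
    int() already tolerates surrounding whitespace.
    """
    return frozenset(int(part) for part in raw.split(",") if part.strip())


def is_allowed(chat_id, allowed_users):
    """Return True when the chat ID is on the allowlist."""
    return chat_id in allowed_users


allowed = parse_allowed_users("123, 456,,")
```

&lt;p&gt;A set also makes the membership test independent of how many IDs are configured, although for a handful of users the difference is negligible.&lt;/p&gt;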



&lt;h3&gt;
  
  
  IAM Roles
&lt;/h3&gt;

&lt;p&gt;Lambda needs specific permissions to access other services. Here's the IAM configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;module&lt;/span&gt; &lt;span class="s2"&gt;"lambda_function"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;# ... other configuration ...&lt;/span&gt;

  &lt;span class="nx"&gt;attach_policy_statements&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="nx"&gt;policy_statements&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;sqs&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;effect&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="nx"&gt;actions&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"sqs:SendMessage"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="nx"&gt;resources&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;aws_sqs_queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;inbound&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="nx"&gt;ssm&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;effect&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="nx"&gt;actions&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"ssm:GetParameter"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="nx"&gt;resources&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="nx"&gt;aws_ssm_parameter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;bot-token&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nx"&gt;aws_ssm_parameter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;openai-token&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="nx"&gt;dynamodb&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;effect&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="nx"&gt;actions&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="s2"&gt;"dynamodb:PutItem"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s2"&gt;"dynamodb:Query"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s2"&gt;"dynamodb:Scan"&lt;/span&gt;
      &lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="nx"&gt;resources&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"*"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Security Best Practices
&lt;/h3&gt;

&lt;p&gt;Some key security measures we implemented:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Network Security:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;API Gateway uses HTTPS only&lt;/li&gt;
&lt;li&gt;Lambda functions can optionally run in a private VPC&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Data Security:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;All sensitive data encrypted at rest&lt;/li&gt;
&lt;li&gt;Secrets stored in Parameter Store&lt;/li&gt;
&lt;li&gt;DynamoDB encryption enabled&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Access Security:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;IAM statements limited to specific actions (the SQS and SSM statements are also scoped to specific resources)&lt;/li&gt;
&lt;li&gt;User allowlist&lt;/li&gt;
&lt;li&gt;KMS key rotation enabled&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Monitoring and Operations
&lt;/h2&gt;

&lt;h3&gt;
  
  
  CloudWatch Integration
&lt;/h3&gt;

&lt;p&gt;We use AWS Lambda Powertools to make monitoring easier. Here's how we set it up:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;aws_lambda_powertools&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Tracer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Metrics&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Logger&lt;/span&gt;

&lt;span class="n"&gt;tracer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Tracer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;metrics&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Metrics&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;logger&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Logger&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="nd"&gt;@tracer.capture_lambda_handler&lt;/span&gt;
&lt;span class="nd"&gt;@metrics.log_metrics&lt;/span&gt;
&lt;span class="nd"&gt;@logger.inject_lambda_context&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Main handler with full observability.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;process_event&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;metrics&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_metric&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SuccessfulProcessing&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;unit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Count&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;metrics&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_metric&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;FailedProcessing&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;unit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Count&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Logging Strategy
&lt;/h3&gt;

&lt;p&gt;We use structured logging to make debugging easier:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;process_event&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Process events with structured logging.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Processing new event&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;extra&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;event_type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;message_received&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;isoformat&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;telegram&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
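
&lt;p&gt;To illustrate what "structured" means here, the sketch below reproduces the idea with the standard library only: each log record becomes one JSON object, and the keys passed via &lt;code&gt;extra=&lt;/code&gt; become top-level fields. &lt;code&gt;JsonFormatter&lt;/code&gt; and the logger name are assumptions made for this example, and the output format is not meant to match the Powertools format.&lt;/p&gt;

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object, including fields passed via extra=."""

    # Attribute names present on every LogRecord; anything else came from extra=.
    _STANDARD = set(vars(logging.LogRecord("", 0, "", 0, "", (), None))) | {
        "message",
        "asctime",
    }

    def format(self, record):
        payload = {"level": record.levelname, "message": record.getMessage()}
        payload.update(
            {k: v for k, v in vars(record).items() if k not in self._STANDARD}
        )
        return json.dumps(payload, default=str)


# Write to an in-memory stream so the result can be inspected locally.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

demo_logger = logging.getLogger("structured-demo")
demo_logger.addHandler(handler)
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False

demo_logger.info(
    "Processing new event",
    extra={"event_type": "message_received", "source": "telegram"},
)
log_line = json.loads(stream.getvalue())
```

&lt;p&gt;Because every entry is a JSON object, fields such as &lt;code&gt;event_type&lt;/code&gt; can be filtered on directly in CloudWatch Logs Insights instead of being extracted from free text.&lt;/p&gt;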



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What We Built
&lt;/h3&gt;

&lt;h4&gt;
  
  
  We created a serverless AI chatbot that combines:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;AWS serverless infrastructure&lt;/li&gt;
&lt;li&gt;OpenAI's powerful language models&lt;/li&gt;
&lt;li&gt;Telegram's messaging platform&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  The system handles:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Secure message processing&lt;/li&gt;
&lt;li&gt;Reliable conversation management&lt;/li&gt;
&lt;li&gt;Cost-effective scaling&lt;/li&gt;
&lt;li&gt;Comprehensive monitoring&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Key Takeaways
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Serverless architecture reduces operational overhead&lt;/li&gt;
&lt;li&gt;Queue-based design ensures message reliability&lt;/li&gt;
&lt;li&gt;DynamoDB provides flexible state management&lt;/li&gt;
&lt;li&gt;KMS encryption protects sensitive data&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Lessons Learned
&lt;/h3&gt;

&lt;h4&gt;
  
  
  What Worked Well
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Serverless architecture scaled smoothly&lt;/li&gt;
&lt;li&gt;SQS prevented message loss&lt;/li&gt;
&lt;li&gt;Lambda Powertools improved observability&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  What Could Be Better
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Cold starts need optimization&lt;/li&gt;
&lt;li&gt;OpenAI API costs need monitoring&lt;/li&gt;
&lt;li&gt;Error handling could be more robust&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Final Thoughts
&lt;/h3&gt;

&lt;p&gt;Building a serverless AI chatbot taught us that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Simple architecture can handle complex tasks&lt;/li&gt;
&lt;li&gt;AWS services work well together&lt;/li&gt;
&lt;li&gt;Proper monitoring is crucial&lt;/li&gt;
&lt;li&gt;Cost management needs constant attention&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Getting Started
&lt;/h3&gt;

&lt;p&gt;Want to try it yourself? Here's a quick start:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Clone the repository&lt;/li&gt;
&lt;li&gt;Set up AWS credentials&lt;/li&gt;
&lt;li&gt;Deploy with Terraform&lt;/li&gt;
&lt;li&gt;Update SSM parameters with your API keys&lt;/li&gt;
&lt;li&gt;Set up the Telegram webhook&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Check the &lt;a href="https://github.com/requix/aws-telegram-ai-module?tab=readme-ov-file#deployment-instructions" rel="noopener noreferrer"&gt;deployment instructions&lt;/a&gt; in the repository.&lt;/p&gt;

&lt;p&gt;The code is open source and available on GitHub: &lt;a href="https://github.com/requix/aws-telegram-ai-module" rel="noopener noreferrer"&gt;https://github.com/requix/aws-telegram-ai-module&lt;/a&gt;&lt;br&gt;
Feel free to contribute or adapt it for your needs.&lt;/p&gt;

&lt;p&gt;This project shows how modern cloud services and AI can work together to create practical, scalable applications. While there's always room for improvement, this architecture provides a solid foundation for building AI-powered chatbots.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>chatgpt</category>
      <category>serverless</category>
      <category>ai</category>
    </item>
    <item>
      <title>Orchestrating AI: Dynamic LLM Routing based on AWS Step Functions</title>
      <dc:creator>Volodymyr Marynychev</dc:creator>
      <pubDate>Wed, 29 Jan 2025 22:11:39 +0000</pubDate>
      <link>https://dev.to/aws-builders/aws-powered-dynamic-llm-routing-reducing-costs-maintaining-quality-52ab</link>
      <guid>https://dev.to/aws-builders/aws-powered-dynamic-llm-routing-reducing-costs-maintaining-quality-52ab</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;By expanding this simple architectural pattern, you can significantly reduce your LLM costs while maintaining high-quality responses across different use cases.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmx54n8kpcb8q5qpnfnff.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmx54n8kpcb8q5qpnfnff.png" alt="Architecture Diagram" width="800" height="890"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  🚨 Important Disclaimer: Proof of Concept 🚨
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;This project is a demonstration of the dynamic AI model routing concept and should NOT be considered a production-ready solution.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Key Limitations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Experimental architecture&lt;/li&gt;
&lt;li&gt;Prototype-level implementation&lt;/li&gt;
&lt;li&gt;Minimal error handling&lt;/li&gt;
&lt;li&gt;Requires significant enhancement for enterprise use&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Use at Your Own Risk&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Not recommended for mission-critical applications&lt;/li&gt;
&lt;li&gt;Potential unexpected behaviors&lt;/li&gt;
&lt;li&gt;May incur unexpected cloud service costs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The goal of this project is to demonstrate a technical concept and provide a starting point for building intelligent, cost-effective AI routing systems. It's an educational resource and a blueprint for building more sophisticated solutions.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Evolution of LLM Usage
&lt;/h2&gt;

&lt;p&gt;The landscape of Large Language Models (LLMs) has evolved dramatically over the past few years. What started with GPT-3 has expanded into a diverse ecosystem of models, each with its own strengths and cost structures. You now have access to various options:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;OpenAI's GPT-4 and GPT-3.5&lt;/li&gt;
&lt;li&gt;Anthropic's Claude series&lt;/li&gt;
&lt;li&gt;Open-source models like Llama 2&lt;/li&gt;
&lt;li&gt;Cloud provider solutions like Amazon Bedrock&lt;/li&gt;
&lt;li&gt;Budget DeepSeek models&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This diversity brings both opportunities and challenges. While having multiple options provides flexibility, it also complicates the decision-making process. How do you choose the right model for each specific use case? How do you balance cost against performance? These questions become increasingly important as you scale your AI implementations.&lt;/p&gt;

&lt;h2&gt;
  
  
  Core Components and Resources
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Complexity Analyzer
&lt;/h3&gt;

&lt;p&gt;The first step in our routing system is analyzing the complexity of incoming queries. For this demonstration, we've implemented a simple classifier that categorizes inputs based on their characteristics. While we're using Claude 3 Sonnet in this example, you could easily swap it for a more cost-effective model like GPT-3.5 or DeepSeek-R1, or even a simpler rule-based system, depending on your specific needs and budget constraints.&lt;/p&gt;

&lt;p&gt;The complexity analyzer categorizes inputs into three basic levels, which helps determine the most appropriate model for handling each request:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;analyze_complexity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_text&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Note: This is a demonstration using Claude 3 Sonnet
&lt;/span&gt;    &lt;span class="c1"&gt;# Consider using more cost-effective alternatives like DeepSeek
&lt;/span&gt;    &lt;span class="c1"&gt;# or implementing a custom rule-based classifier for production
&lt;/span&gt;    &lt;span class="n"&gt;bedrock_client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;bedrock-runtime&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
Analyze the complexity of the following input:
&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;input_text&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;

Classify it into one of these categories:
1. SIMPLE: Basic questions, straightforward tasks
2. CALCULATION: Mathematical operations, data analysis
3. COMPLEX: Multi-step reasoning, creative problem-solving

Return ONLY the classification (SIMPLE/CALCULATION/COMPLEX)
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

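    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bedrock_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;modelId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;anthropic.claude-3-sonnet-20240229-v1:0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
            &lt;span class="c1"&gt;# Claude models on Bedrock require this version field
&lt;/span&gt;            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;anthropic_version&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bedrock-2023-05-31&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;max_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;
        &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# The body is a stream; the label is the text of the first content block
&lt;/span&gt;    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;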
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
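&lt;p&gt;The rule-based alternative mentioned above might look roughly like the sketch below. It is not part of the project: the function name &lt;code&gt;classify_by_rules&lt;/code&gt;, the keyword lists, and the 60-word threshold are assumptions chosen for illustration, and you would need to tune them against real traffic.&lt;/p&gt;

```python
import re

# Hypothetical keyword lists; tune them against your own traffic.
CALCULATION_HINTS = ("calculate", "sum of", "average", "percent", "convert")
COMPLEX_HINTS = ("explain why", "compare", "design", "step by step", "trade-off")


def classify_by_rules(input_text, long_input_words=60):
    """Return SIMPLE, CALCULATION, or COMPLEX without calling a model."""
    text = input_text.lower()
    word_count = len(text.split())

    # Arithmetic such as "12 * 7", or an explicit calculation keyword.
    has_arithmetic = re.search(r"\d\s*[-+*/^%]\s*\d", text) is not None
    if has_arithmetic or any(hint in text for hint in CALCULATION_HINTS):
        return "CALCULATION"

    # Long prompts or reasoning keywords are treated as complex.
    if word_count >= long_input_words or any(hint in text for hint in COMPLEX_HINTS):
        return "COMPLEX"

    return "SIMPLE"
```

&lt;p&gt;A classifier like this costs nothing per request and adds no model latency, at the price of misclassifying inputs that the keyword lists do not anticipate.&lt;/p&gt;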



&lt;h3&gt;
  
  
  AWS Step Functions State Machine
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Orchestrates the entire workflow&lt;/li&gt;
&lt;li&gt;Handles model selection logic based on complexity analysis&lt;/li&gt;
&lt;li&gt;Manages error handling and retries&lt;/li&gt;
&lt;li&gt;Integrates with various AWS services and external APIs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmyvndpnle9cra00r0j7m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmyvndpnle9cra00r0j7m.png" alt="Step Functions Workflow" width="800" height="734"&gt;&lt;/a&gt;&lt;/p&gt;
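&lt;p&gt;In the state machine the model selection is a branching step, but the logic itself is a small lookup. The Python sketch below only illustrates that logic. The article confirms Claude Instant for simple queries and Claude Sonnet for complex ones; routing CALCULATION to the OpenAI model, the &lt;code&gt;select_model&lt;/code&gt; name, and the fallback choice are assumptions.&lt;/p&gt;

```python
# Assumed mapping: Instant for SIMPLE and Sonnet for COMPLEX come from the
# article; sending CALCULATION to the OpenAI model is a guess for illustration.
ROUTES = {
    "SIMPLE": "bedrock-instant",
    "CALCULATION": "gpt-3.5-turbo-1106",
    "COMPLEX": "bedrock-sonnet",
}
DEFAULT_ROUTE = "bedrock-sonnet"  # assumption: fall back to the strongest model


def select_model(classification):
    """Map the analyzer's label to a model key, tolerating noisy output."""
    label = classification.strip().upper().rstrip(".")
    return ROUTES.get(label, DEFAULT_ROUTE)
```

&lt;p&gt;Normalizing the label and falling back to a default keeps the workflow moving when the analyzer returns something other than the three expected words.&lt;/p&gt;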

&lt;h3&gt;
  
  
  Lambda Functions
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Complexity Analyzer Lambda&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Uses Amazon Bedrock (Claude 3 Sonnet) to analyze input complexity&lt;/li&gt;
&lt;li&gt;Classifies inputs into three categories: SIMPLE, CALCULATION, COMPLEX&lt;/li&gt;
&lt;li&gt;Helps in optimal model selection&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Bedrock Lambda (Instant &amp;amp; Sonnet)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Handles requests to Amazon Bedrock models&lt;/li&gt;
&lt;li&gt;Claude Instant for simple queries&lt;/li&gt;
&lt;li&gt;Claude Sonnet for complex analysis&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cost Calculator Lambda&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Triggered by DynamoDB streams&lt;/li&gt;
&lt;li&gt;Calculates the cost of each model invocation from its token count&lt;/li&gt;
&lt;li&gt;Updates cost information in DynamoDB&lt;/li&gt;
&lt;/ul&gt;
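&lt;p&gt;A stream-triggered function receives batches of change records rather than plain items. The sketch below shows one way to read them, assuming the stream is configured to include new images and that items carry the &lt;code&gt;model_used&lt;/code&gt; and &lt;code&gt;tokens_used&lt;/code&gt; attributes used elsewhere in this article. The function name and the rates are illustrative, and writing the result back to the table is left out.&lt;/p&gt;

```python
# Illustrative rates in USD per 1K tokens; replace with current pricing.
RATES_PER_1K = {"bedrock-instant": 0.0003, "bedrock-sonnet": 0.015}


def costs_from_stream(event):
    """Compute a cost for each newly inserted item in a DynamoDB stream event."""
    results = []
    for record in event.get("Records", []):
        if record.get("eventName") != "INSERT":
            continue  # skip MODIFY/REMOVE, including this function's own updates
        image = record["dynamodb"]["NewImage"]
        model = image["model_used"]["S"]
        tokens = int(image["tokens_used"]["N"])  # numbers arrive as strings
        cost = tokens * RATES_PER_1K.get(model, 0.0) / 1000
        results.append({"model_used": model, "tokens_used": tokens, "cost": cost})
    return results
```

&lt;p&gt;Filtering on &lt;code&gt;INSERT&lt;/code&gt; matters when the same function later updates the item with its cost, because that update would otherwise trigger the function again.&lt;/p&gt;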

&lt;h3&gt;
  
  
  Storage and Database
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;DynamoDB Table&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Stores execution results and metadata&lt;/li&gt;
&lt;li&gt;Uses stream processing for cost calculations&lt;/li&gt;
&lt;li&gt;Encrypted at rest using KMS&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Security Components
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;KMS (Key Management Service)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Manages encryption keys for sensitive data&lt;/li&gt;
&lt;li&gt;Used for DynamoDB encryption&lt;/li&gt;
&lt;li&gt;Secures CloudWatch logs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;SSM Parameter Store&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Securely stores API keys&lt;/li&gt;
&lt;li&gt;Manages configuration values&lt;/li&gt;
&lt;li&gt;Encrypted using KMS&lt;/li&gt;
&lt;/ul&gt;
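&lt;p&gt;If a Lambda function needs to read such a key at runtime, the lookup is a single call. This is a minimal sketch, not code from the project: &lt;code&gt;load_api_key&lt;/code&gt; is a hypothetical helper, and the default parameter name is the one used in the setup steps of this article.&lt;/p&gt;

```python
def load_api_key(ssm_client, name="/ai-orchestration/openai-api-key"):
    """Fetch and decrypt a SecureString parameter.

    ssm_client is expected to be boto3.client("ssm"); passing it in keeps
    the function testable without AWS credentials.
    """
    response = ssm_client.get_parameter(Name=name, WithDecryption=True)
    return response["Parameter"]["Value"]
```

&lt;p&gt;The calling role needs permission to read the parameter and to decrypt with the KMS key that protects it.&lt;/p&gt;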

&lt;p&gt;&lt;strong&gt;Access Control&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fine-grained IAM permissions&lt;/li&gt;
&lt;li&gt;Service-to-service authentication&lt;/li&gt;
&lt;li&gt;Secure parameter management&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Integration Points
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;EventBridge API Destination&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Manages OpenAI API integration&lt;/li&gt;
&lt;li&gt;Handles API key authentication&lt;/li&gt;
&lt;li&gt;Provides secure HTTP endpoints&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  The Cost-Effectiveness Dilemma: Is Dynamic Routing Worth It? 🤔
&lt;/h3&gt;

&lt;p&gt;One of the most critical questions when designing any sophisticated system is: "Does the complexity come with a meaningful benefit?" In our dynamic AI model routing approach, we need to carefully analyze whether the overhead of complexity analysis justifies the potential cost savings.&lt;/p&gt;

&lt;h4&gt;
  
  
  The Hidden Cost of Complexity Analysis
&lt;/h4&gt;

&lt;p&gt;Let's break down the economics of our approach:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Illustrative rates in USD per token (rate per 1K tokens / 1000)
&lt;/span&gt;&lt;span class="n"&gt;analyzer_cost_per_token&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.015&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;  &lt;span class="c1"&gt;# Using Claude Sonnet as analyzer
&lt;/span&gt;
&lt;span class="n"&gt;model_costs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-3.5&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.002&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-instant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.0003&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-sonnet&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.015&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;is_routing_cost_effective&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_text&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Complexity check costs ~50-100 tokens
&lt;/span&gt;    &lt;span class="n"&gt;complexity_check_cost&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;analyzer_cost_per_token&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;  &lt;span class="c1"&gt;# ~$0.0015
&lt;/span&gt;
    &lt;span class="c1"&gt;# Potential savings by choosing optimal model
&lt;/span&gt;    &lt;span class="c1"&gt;# (calculate_model_cost_difference is a placeholder, not defined here)
&lt;/span&gt;    &lt;span class="n"&gt;potential_savings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;calculate_model_cost_difference&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;potential_savings&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;complexity_check_cost&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
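&lt;p&gt;To make the trade-off concrete, here is a small worked calculation using the illustrative per-1K-token rates from this article (0.0003 for the cheap model, 0.015 for the strong model and for the analyzer). The function names and the two-model simplification are assumptions: it compares routing against always calling the strong model.&lt;/p&gt;

```python
def net_saving_per_request(
    avg_tokens,
    cheap_share,
    cheap_rate_per_1k=0.0003,
    strong_rate_per_1k=0.015,
    analyzer_tokens=100,
    analyzer_rate_per_1k=0.015,
):
    """Expected saving per request versus always using the strong model.

    cheap_share is the fraction of requests the router sends to the cheap
    model. A negative result means routing costs more than it saves.
    """
    baseline = avg_tokens * strong_rate_per_1k / 1000
    routed = (
        cheap_share * avg_tokens * cheap_rate_per_1k / 1000
        + (1 - cheap_share) * avg_tokens * strong_rate_per_1k / 1000
        + analyzer_tokens * analyzer_rate_per_1k / 1000
    )
    return baseline - routed


def break_even_share(avg_tokens, **rates):
    """Smallest cheap_share at which routing stops losing money."""
    at_zero = net_saving_per_request(avg_tokens, 0.0, **rates)
    at_one = net_saving_per_request(avg_tokens, 1.0, **rates)
    # The saving is linear in cheap_share, so interpolate between the ends.
    return -at_zero / (at_one - at_zero)
```

&lt;p&gt;With these rates and 500-token requests, roughly 20% of traffic has to be routable to the cheap model before the analyzer pays for itself. For 100-token requests the break-even share exceeds 1, meaning a Sonnet-based analyzer never pays off for requests that short.&lt;/p&gt;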



&lt;h3&gt;
  
  
  When Dynamic Routing Makes Sense
&lt;/h3&gt;

&lt;p&gt;Dynamic model routing is most beneficial in scenarios with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;High-volume systems (1000+ daily requests)&lt;/li&gt;
&lt;li&gt;Significant cost variation between models&lt;/li&gt;
&lt;li&gt;Diverse input complexity&lt;/li&gt;
&lt;li&gt;Large token count differences&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  When to Reconsider
&lt;/h3&gt;

&lt;p&gt;You might want to skip complexity analysis if:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Your system has low request volume&lt;/li&gt;
&lt;li&gt;Input complexity is relatively uniform&lt;/li&gt;
&lt;li&gt;Model pricing is similar&lt;/li&gt;
&lt;li&gt;You have strict latency requirements&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cost Comparison
&lt;/h3&gt;

&lt;p&gt;Beyond routing requests, the solution tracks the cost of every model invocation. A dedicated cost calculator Lambda function processes each request's details and stores the resulting cost information in DynamoDB. This approach allows for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Granular cost tracking per request&lt;/li&gt;
&lt;li&gt;Historical cost analysis&lt;/li&gt;
&lt;li&gt;Insights into model usage patterns
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
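&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;uuid&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;

&lt;span class="n"&gt;dynamodb&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;dynamodb&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;calculate_cost&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_used&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Average cost in USD per 1K tokens (illustrative values)
&lt;/span&gt;    &lt;span class="n"&gt;MODEL_COSTS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-3.5-turbo-1106&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.002&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bedrock-instant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.0003&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bedrock-sonnet&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.015&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;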

    &lt;span class="c1"&gt;# Calculate cost based on tokens used
&lt;/span&gt;    &lt;span class="n"&gt;cost&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;MODEL_COSTS&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_used&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;

    &lt;span class="c1"&gt;# Store detailed cost information in DynamoDB
&lt;/span&gt;    &lt;span class="n"&gt;dynamodb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;put_item&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;TableName&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ai-usage-costs&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;Item&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;execution_id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;S&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uuid&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;uuid4&lt;/span&gt;&lt;span class="p"&gt;())},&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;model_used&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;S&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;model_used&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;tokens_used&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;N&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;)},&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;calculated_cost&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;N&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cost&lt;/span&gt;&lt;span class="p"&gt;)},&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;S&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;isoformat&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;cost&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
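&lt;p&gt;Once items like these accumulate, the historical analysis can start as a simple roll-up per model. The sketch below assumes items in the typed attribute format shown above (for example, as returned by a low-level scan or query); &lt;code&gt;summarize_costs&lt;/code&gt; is a hypothetical helper, not part of the project.&lt;/p&gt;

```python
from collections import defaultdict


def summarize_costs(items):
    """Total requests, tokens, and cost per model from typed DynamoDB items."""
    totals = defaultdict(lambda: {"requests": 0, "tokens": 0, "cost": 0.0})
    for item in items:
        bucket = totals[item["model_used"]["S"]]
        bucket["requests"] += 1
        bucket["tokens"] += int(item["tokens_used"]["N"])
        bucket["cost"] += float(item["calculated_cost"]["N"])
    return dict(totals)
```

&lt;p&gt;Comparing the per-model request counts against the per-model cost shows quickly whether the router is sending enough traffic to the cheaper models to justify the analysis step.&lt;/p&gt;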



&lt;h3&gt;
  
  
  Terraform: Infrastructure as Code 🏗️
&lt;/h3&gt;

&lt;p&gt;The entire solution is implemented as a modular Terraform project, making it easy to deploy and customize:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Supports multiple AWS regions&lt;/li&gt;
&lt;li&gt;Easily configurable through variables&lt;/li&gt;
&lt;li&gt;Manages all AWS resources declaratively&lt;/li&gt;
&lt;li&gt;Includes security best practices

&lt;ul&gt;
&lt;li&gt;KMS encryption&lt;/li&gt;
&lt;li&gt;IAM least-privilege roles&lt;/li&gt;
&lt;li&gt;Secure parameter management&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h3&gt;
  
  
  Getting Started 🚀
&lt;/h3&gt;

&lt;p&gt;Want to try it out? Here's how:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Prerequisites:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install Terraform&lt;/span&gt;
brew &lt;span class="nb"&gt;install &lt;/span&gt;terraform  &lt;span class="c"&gt;# macOS&lt;/span&gt;
&lt;span class="c"&gt;# or&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt-get &lt;span class="nb"&gt;install &lt;/span&gt;terraform  &lt;span class="c"&gt;# Debian/Ubuntu, after adding the HashiCorp apt repository&lt;/span&gt;

&lt;span class="c"&gt;# Install AWS CLI&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;awscli

&lt;span class="c"&gt;# Configure AWS credentials&lt;/span&gt;
aws configure
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="2"&gt;
&lt;li&gt;Clone the Repository:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/requix/aws-step-functions-ai-orchestration.git
&lt;span class="nb"&gt;cd &lt;/span&gt;aws-step-functions-ai-orchestration/terraform
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="3"&gt;
&lt;li&gt;Set Up OpenAI API Key:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws ssm put-parameter &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--name&lt;/span&gt; &lt;span class="s2"&gt;"/ai-orchestration/openai-api-key"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--type&lt;/span&gt; &lt;span class="s2"&gt;"SecureString"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--value&lt;/span&gt; &lt;span class="s2"&gt;"your-openai-api-key"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="4"&gt;
&lt;li&gt;Deploy Infrastructure:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;terraform init
terraform plan
terraform apply
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="5"&gt;
&lt;li&gt;Run Your First Execution:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Use the output from terraform apply&lt;/span&gt;
aws stepfunctions start-execution &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--state-machine-arn&lt;/span&gt; YOUR_STATE_MACHINE_ARN &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--input&lt;/span&gt; &lt;span class="s1"&gt;'{"input": "What is the capital of France?"}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
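&lt;p&gt;The same execution can be started from Python. This is a minimal sketch assuming a boto3 Step Functions client and the payload shape from the CLI command above; &lt;code&gt;start_routing_execution&lt;/code&gt; is a hypothetical helper name.&lt;/p&gt;

```python
import json


def start_routing_execution(sfn_client, state_machine_arn, user_input):
    """Start one execution with the same payload shape as the CLI example.

    sfn_client is expected to be boto3.client("stepfunctions").
    """
    response = sfn_client.start_execution(
        stateMachineArn=state_machine_arn,
        input=json.dumps({"input": user_input}),
    )
    return response["executionArn"]
```

&lt;p&gt;The returned execution ARN can then be used to look up the status and output of the run.&lt;/p&gt;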



&lt;h3&gt;
  
  
  Open Source and Community 🌐
&lt;/h3&gt;

&lt;p&gt;The entire project is open-source and available on GitHub:&lt;br&gt;
🔗 &lt;a href="https://github.com/requix/aws-step-functions-ai-orchestration" rel="noopener noreferrer"&gt;https://github.com/requix/aws-step-functions-ai-orchestration&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We welcome contributions, issue reports, and feature suggestions!&lt;/p&gt;

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;By expanding this architectural pattern, you can create an intelligent, cost-effective AI routing system that adapts to different use cases. The key is flexibility, continuous monitoring, and a willingness to iterate.&lt;/p&gt;

&lt;p&gt;Remember:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;This is a proof of concept&lt;/li&gt;
&lt;li&gt;Always test thoroughly&lt;/li&gt;
&lt;li&gt;Monitor and optimize continuously&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Happy routing! 🤖✨&lt;/p&gt;

</description>
      <category>aws</category>
      <category>stepfunctions</category>
      <category>ai</category>
      <category>llm</category>
    </item>
  </channel>
</rss>
