<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Yuuki Yamashita</title>
    <description>The latest articles on DEV Community by Yuuki Yamashita (@_76130e67067eab4c8510).</description>
    <link>https://dev.to/_76130e67067eab4c8510</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3963934%2Fb0f67ac0-2cd8-4455-9e2e-06bd89a61ba2.png</url>
      <title>DEV Community: Yuuki Yamashita</title>
      <link>https://dev.to/_76130e67067eab4c8510</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/_76130e67067eab4c8510"/>
    <language>en</language>
    <item>
      <title>The day I gave an AI a wallet — building an approval-gated shopping agent with Sonnet 4.6, AgentCore Payments, Rakuten and Stripe</title>
      <dc:creator>Yuuki Yamashita</dc:creator>
      <pubDate>Tue, 23 Jun 2026 13:48:22 +0000</pubDate>
      <link>https://dev.to/_76130e67067eab4c8510/the-day-i-gave-an-ai-a-wallet-building-an-approval-gated-shopping-agent-with-sonnet-46-504a</link>
      <guid>https://dev.to/_76130e67067eab4c8510/the-day-i-gave-an-ai-a-wallet-building-an-approval-gated-shopping-agent-with-sonnet-46-504a</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt;&lt;br&gt;
I built a PoC that wires up &lt;strong&gt;Bedrock AgentCore Payments (x402 + USDC)&lt;/strong&gt; &lt;em&gt;and&lt;/em&gt; &lt;strong&gt;Rakuten + Stripe Checkout&lt;/strong&gt; behind a single Strands Agent — and made sure &lt;strong&gt;the agent can't spend a single cent without a human approval card&lt;/strong&gt;. Production-deployed on Vercel, with an 8-language UI.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Production: &lt;a href="https://wallet-agent.vercel.app/" rel="noopener noreferrer"&gt;https://wallet-agent.vercel.app/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Repo: &lt;a href="https://github.com/yama3133/wallet-agent" rel="noopener noreferrer"&gt;https://github.com/yama3133/wallet-agent&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Why I built this
&lt;/h2&gt;

&lt;p&gt;I have &lt;strong&gt;five different talks / CFPs&lt;/strong&gt; lined up — re:Invent 2026 COM Track, Qiita Tech Festa, iOSDC LT, re:Deploy Security, and a Slack Hackathon — all centred on the same idea: &lt;strong&gt;"the day you give an AI a wallet, and what an approval gate looks like."&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Slides alone weren't going to carry the message. I needed a &lt;strong&gt;shared implementation base&lt;/strong&gt; that I could spin up live in front of an audience.&lt;/p&gt;

&lt;p&gt;The shape of the goal was simple:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;User asks something in natural language → the agent picks an option → &lt;strong&gt;an approval card pops up&lt;/strong&gt; → human approves / rejects → on approval, the actual payment runs → result is summarised back to the user.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I wanted to prove this along &lt;strong&gt;two axes&lt;/strong&gt;: pay-per-call paid APIs (microtransactions) and real-world product purchases.&lt;/p&gt;

&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fzvwestslq4vi7rah67uu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fzvwestslq4vi7rah67uu.png" alt=" " width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Frontend&lt;/strong&gt;: Next.js 16 on Vercel (approval list + chat + checkout result)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;State&lt;/strong&gt;: DynamoDB — &lt;code&gt;wallet_agent_tasks&lt;/code&gt; / &lt;code&gt;approvals&lt;/code&gt; / &lt;code&gt;txns&lt;/code&gt;, Streams + PITR&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agent&lt;/strong&gt;: AgentCore Runtime (ARM64 container) + Strands Agent + Claude Sonnet 4.6&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Payments — Phase 1&lt;/strong&gt;: AgentCore Payments → Privy (StripePrivy) → x402 → base-sepolia USDC&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Payments — Phase 2&lt;/strong&gt;: Rakuten Ichiba &lt;code&gt;IchibaItem/Search&lt;/code&gt; → Stripe Checkout (test mode)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Localisation&lt;/strong&gt;: ja / en / zh / ko / fr / it / es / &lt;strong&gt;ar (RTL)&lt;/strong&gt; — 8 languages, &lt;code&gt;localStorage&lt;/code&gt; + &lt;code&gt;navigator.language&lt;/code&gt; auto-detect, &lt;strong&gt;LINE Seed JP Bold&lt;/strong&gt; as the base font&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All of it — code, CloudFormation, demo script — lives in &lt;a href="https://github.com/yama3133/wallet-agent" rel="noopener noreferrer"&gt;yama3133/wallet-agent&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The agent itself
&lt;/h2&gt;

&lt;p&gt;The agent is just six &lt;code&gt;@tool&lt;/code&gt; functions wired into a Strands Agent.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@tool&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;search_paid_resources&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;  &lt;span class="c1"&gt;# x402 catalog
&lt;/span&gt;
&lt;span class="nd"&gt;@tool&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;request_payment_approval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resource_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;amount_usd&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;justification&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Write a pending approval to DynamoDB / local JSON and block until a decision lands.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

&lt;span class="nd"&gt;@tool&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;execute_x402_payment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resource_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;payment_session_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Drive AgentCore Payments&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; generate_payment_header through to a successful x402 settle.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

&lt;span class="nd"&gt;@tool&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;search_rakuten_items&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;keyword&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_jpy&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hits&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt; &lt;span class="bp"&gt;...&lt;/span&gt;

&lt;span class="nd"&gt;@tool&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;request_purchase_approval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;amount_jpy&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;justification&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;...&lt;/span&gt;

&lt;span class="nd"&gt;@tool&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;execute_stripe_checkout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;amount_jpy&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;image_url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Create a Stripe Checkout Session and return the URL.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The trick is that &lt;code&gt;request_*_approval&lt;/code&gt; &lt;strong&gt;writes a row to DynamoDB and waits&lt;/strong&gt;. The tool chain literally can't progress until a human flips the row to &lt;code&gt;APPROVED&lt;/code&gt;. That single primitive keeps the LLM from going off the rails.&lt;/p&gt;
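&lt;p&gt;As a rough illustration of that primitive, here is a minimal sketch of a "write a pending row, then wait" gate. It is not the code from the repo: &lt;code&gt;ApprovalStore&lt;/code&gt;, &lt;code&gt;wait_for_decision&lt;/code&gt; and the &lt;code&gt;TIMED_OUT&lt;/code&gt; status are hypothetical names, and an in-memory dict stands in for the DynamoDB table or the local JSON file.&lt;/p&gt;

```python
import time
import uuid


class ApprovalStore:
    """In-memory stand-in for the approvals table (hypothetical helper)."""

    def __init__(self):
        self.rows = {}

    def put_pending(self, payload):
        # Every request starts life as PENDING under a fresh id.
        approval_id = str(uuid.uuid4())
        self.rows[approval_id] = {"status": "PENDING", **payload}
        return approval_id

    def decide(self, approval_id, decision):
        row = self.rows[approval_id]
        # Mirror a conditional write: a row can only be decided once.
        if row["status"] != "PENDING":
            raise ValueError("approval already decided")
        row["status"] = decision

    def status(self, approval_id):
        return self.rows[approval_id]["status"]


def wait_for_decision(store, approval_id, timeout_s=300.0, poll_s=1.0, sleep=time.sleep):
    """Poll until the row leaves PENDING; report TIMED_OUT if nobody decides."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = store.status(approval_id)
        if status != "PENDING":
            return status
        if time.monotonic() >= deadline:
            return "TIMED_OUT"
        sleep(poll_s)
```

&lt;p&gt;An approval tool built on this would call &lt;code&gt;put_pending(...)&lt;/code&gt;, then &lt;code&gt;wait_for_decision(...)&lt;/code&gt;, and hand the resulting status back to the agent. Treating anything other than &lt;code&gt;APPROVED&lt;/code&gt; as a refusal keeps the default on the safe side.&lt;/p&gt;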

&lt;h2&gt;
  
  
  Phase 1 — the AgentCore Payments signer trap
&lt;/h2&gt;

&lt;p&gt;This is where I lost the most time.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ProcessPayment → AccessDeniedException
"Privy credentials are invalid. Please verify the credential configuration."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I could create &lt;code&gt;PaymentManager&lt;/code&gt;, &lt;code&gt;PaymentConnector&lt;/code&gt;, and &lt;code&gt;PaymentInstrument&lt;/code&gt; (an Embedded Crypto Wallet) from boto3 just fine. &lt;code&gt;ProcessPayment&lt;/code&gt; was the one call that wouldn't go through.&lt;/p&gt;

&lt;p&gt;Looking at the Privy dashboard, the wallet that AWS's &lt;code&gt;CreatePaymentInstrument&lt;/code&gt; had created was &lt;strong&gt;owned by an internally-generated Privy User&lt;/strong&gt;, and my Authorization Key was &lt;strong&gt;not registered as a signer on any wallet at all&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The fix is to run Privy's official template, &lt;a href="https://github.com/privy-io/aws-agentcore-sdk" rel="noopener noreferrer"&gt;privy-io/aws-agentcore-sdk&lt;/a&gt;, locally and click through the &lt;strong&gt;"Connect agent"&lt;/strong&gt; UI in a browser. That UI hits Privy's internal API and adds your Authorization Key to &lt;code&gt;additional_signers&lt;/code&gt; on the wallet.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/privy-io/aws-agentcore-sdk ~/wallet-agent-privy-template
&lt;span class="nb"&gt;cd&lt;/span&gt; ~/wallet-agent-privy-template
&lt;span class="c"&gt;# Drop NEXT_PUBLIC_PRIVY_APP_ID / PRIVY_APP_SECRET / NEXT_PUBLIC_PRIVY_SIGNER_ID into .env.local&lt;/span&gt;
pnpm dev
&lt;span class="c"&gt;# → browse to localhost:3001 → log in → Connect agent&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After that, &lt;code&gt;process_payment&lt;/code&gt; returns &lt;strong&gt;&lt;code&gt;PROOF_GENERATED&lt;/code&gt;&lt;/strong&gt; and the merchant (&lt;code&gt;https://drvd12nxpcyd5.cloudfront.net/market-recap&lt;/code&gt;, a public x402 demo endpoint) accepts the payload.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[bedrock_agentcore.payments.manager] Successfully processed payment for user test-user-yama3133
[bedrock_agentcore.payments.manager] Successfully generated payment header for user test-user-yama3133
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The lesson: &lt;strong&gt;you cannot finish AgentCore Payments setup purely server-side&lt;/strong&gt;. Design your demo flow with the Privy frontend baked in from day one.&lt;/p&gt;

&lt;h2&gt;
  
  
  Phase 2 — Rakuten and the Referer pitfall
&lt;/h2&gt;

&lt;p&gt;Phase 2 was a much more pedestrian WebAPI face-plant.&lt;/p&gt;

&lt;p&gt;I registered a new Rakuten webservice application, took the &lt;code&gt;Application ID&lt;/code&gt; (UUID form), and hit &lt;code&gt;https://openapi.rakuten.co.jp/ichibams/api/IchibaItem/Search/20260401&lt;/code&gt;. Got:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"errors"&lt;/span&gt;&lt;span class="p"&gt;:{&lt;/span&gt;&lt;span class="nl"&gt;"errorCode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;403&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"errorMessage"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"REQUEST_CONTEXT_BODY_HTTP_REFERRER_MISSING"&lt;/span&gt;&lt;span class="p"&gt;}}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Adding &lt;code&gt;Referer: https://wallet-agent.vercel.app/&lt;/code&gt; didn't help. The actual culprit was &lt;strong&gt;User-Agent bot detection&lt;/strong&gt; — &lt;code&gt;User-Agent: wallet-agent/0.1&lt;/code&gt; is rejected. Swapping to a browser-ish &lt;code&gt;Mozilla/5.0 ...&lt;/code&gt; makes the request go through.&lt;/p&gt;

&lt;p&gt;The final &lt;code&gt;tools/rakuten.py&lt;/code&gt; looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;headers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;User-Agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Referer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;referer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;             &lt;span class="c1"&gt;# WALLET_AGENT_PUBLIC_URL
&lt;/span&gt;    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Origin&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;referer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;rstrip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;access_key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;accessKey&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;access_key&lt;/span&gt;
    &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;accessKey&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;access_key&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;From there: a pair of black socks at ¥1,980, a Stripe Checkout Session, test card &lt;code&gt;4242 4242 4242 4242&lt;/code&gt;, and a redirect to &lt;code&gt;https://wallet-agent.vercel.app/checkout/success?session_id=cs_test_...&lt;/code&gt; showing &lt;strong&gt;"Paid / 1,980 JPY."&lt;/strong&gt;&lt;/p&gt;
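&lt;p&gt;For reference, a minimal sketch of how the Checkout Session parameters for a yen purchase could be assembled. The helper name &lt;code&gt;build_checkout_params&lt;/code&gt;, the cancel path and the &lt;code&gt;metadata&lt;/code&gt; field are assumptions for illustration, not the repo's actual code. The one detail worth noting is that JPY is a zero-decimal currency in Stripe, so ¥1,980 is sent as &lt;code&gt;1980&lt;/code&gt;, not &lt;code&gt;198000&lt;/code&gt;.&lt;/p&gt;

```python
def build_checkout_params(item_id, title, amount_jpy, image_url="",
                          base_url="https://wallet-agent.vercel.app"):
    """Build keyword arguments for a one-off Stripe Checkout Session in JPY."""
    if isinstance(amount_jpy, bool) or not isinstance(amount_jpy, int) or 0 >= amount_jpy:
        raise ValueError("amount_jpy must be a positive integer")

    product_data = {"name": title}
    if image_url:
        # Stripe expects a list of image URLs; skip the key when there is none.
        product_data["images"] = [image_url]

    root = base_url.rstrip("/")
    return {
        "mode": "payment",
        "line_items": [{
            "quantity": 1,
            "price_data": {
                "currency": "jpy",
                # Zero-decimal currency: the amount is whole yen.
                "unit_amount": amount_jpy,
                "product_data": product_data,
            },
        }],
        # Hypothetical: keep the marketplace item id for later reconciliation.
        "metadata": {"item_id": item_id},
        # Stripe substitutes the real session id for this placeholder.
        "success_url": root + "/checkout/success?session_id={CHECKOUT_SESSION_ID}",
        # Assumed path; the post only shows the success page.
        "cancel_url": root + "/checkout/cancel",
    }
```

&lt;p&gt;With the &lt;code&gt;stripe&lt;/code&gt; package installed and a test-mode key configured, the result could be passed as &lt;code&gt;stripe.checkout.Session.create(**params)&lt;/code&gt;, and the returned session's &lt;code&gt;url&lt;/code&gt; is what the tool would hand back.&lt;/p&gt;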

&lt;h2&gt;
  
  
  DynamoDB and the Vercel frontend
&lt;/h2&gt;

&lt;p&gt;The approval state lives in &lt;strong&gt;&lt;code&gt;wallet_agent_approvals&lt;/code&gt;&lt;/strong&gt; on DynamoDB. Local dev falls back to &lt;code&gt;agent/.approvals.json&lt;/code&gt; — flipped by &lt;code&gt;WALLET_AGENT_STORAGE=local|dynamo&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The Next.js 16 App Router side is a small handful of route handlers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// /api/approvals (GET) — list PENDING&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;ddb&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ScanCommand&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;TableName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;TABLES&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;approvals&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;FilterExpression&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;#s = :p&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;ExpressionAttributeNames&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;#s&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;status&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;ExpressionAttributeValues&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;:p&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;PENDING&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;}));&lt;/span&gt;

&lt;span class="c1"&gt;// /api/approvals/[id] (POST) — decide&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;ddb&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;UpdateCommand&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;TableName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;TABLES&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;approvals&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;approval_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;UpdateExpression&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;SET #s = :d, decision = :d, #r = :r, decided_at = :t&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;ExpressionAttributeNames&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;#s&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;status&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;#r&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;reason&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;ExpressionAttributeValues&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;:d&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;decision&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;:r&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;reason&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="dl"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;:t&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;:pending&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;PENDING&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;ConditionExpression&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;attribute_exists(approval_id) AND #s = :pending&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I fell into the &lt;strong&gt;&lt;code&gt;error&lt;/code&gt; is a DynamoDB reserved word&lt;/strong&gt; trap once: to update that attribute you need an alias such as &lt;code&gt;ExpressionAttributeNames: { "#e": "error" }&lt;/code&gt;, otherwise the call fails with &lt;code&gt;ValidationException: Invalid UpdateExpression&lt;/code&gt;.&lt;/p&gt;
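&lt;p&gt;One way to sidestep reserved words entirely is to alias every attribute name, reserved or not. The sketch below shows that idea on the Python side; &lt;code&gt;build_set_update&lt;/code&gt; is a hypothetical helper, not something from the repo or from boto3.&lt;/p&gt;

```python
def build_set_update(attrs):
    """Build SET-only update arguments with every attribute name aliased."""
    if not attrs:
        raise ValueError("attrs must not be empty")

    names, values, clauses = {}, {}, []
    for i, (attr, value) in enumerate(attrs.items()):
        # "#n0" / ":v0" style placeholders never collide with reserved words.
        name_key, value_key = f"#n{i}", f":v{i}"
        names[name_key] = attr
        values[value_key] = value
        clauses.append(f"{name_key} = {value_key}")

    return {
        "UpdateExpression": "SET " + ", ".join(clauses),
        "ExpressionAttributeNames": names,
        "ExpressionAttributeValues": values,
    }
```

&lt;p&gt;The returned dict is shaped so it could be spread into a boto3 &lt;code&gt;Table.update_item(Key=..., **kwargs)&lt;/code&gt; call. The same aliasing idea applies to the TypeScript &lt;code&gt;UpdateCommand&lt;/code&gt; shown above.&lt;/p&gt;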

&lt;h2&gt;
  
  
  8-language UI and LINE Seed JP
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;apps/web/src/lib/i18n.ts&lt;/code&gt; is just a flat dictionary of 8 languages × 31 keys, wired into a tiny &lt;code&gt;useI18n()&lt;/code&gt; Context. The Arabic locale flips &lt;code&gt;document.documentElement.dir&lt;/code&gt; to &lt;code&gt;"rtl"&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="nf"&gt;useEffect&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;documentElement&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lang&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;locale&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;documentElement&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;dir&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;getDir&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;locale&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// "ltr" | "rtl"&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;locale&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The font is &lt;strong&gt;LINE Seed JP Bold&lt;/strong&gt; via &lt;code&gt;next/font/google&lt;/code&gt;, exposed as a CSS variable &lt;code&gt;--font-line-seed&lt;/code&gt; and dropped into Tailwind's &lt;code&gt;font-sans&lt;/code&gt;. It gives Japanese text a friendly rounded-bold feel that reads well alongside the LINE-app universe.&lt;/p&gt;

&lt;h2&gt;
  
  
  Things I learned
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;The Privy signer wall is not solvable server-side.&lt;/strong&gt; Build the "Connect agent" frontend step into the demo on day one.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;agentcore configure&lt;/code&gt; is interactive by default.&lt;/strong&gt; With &lt;code&gt;-ni&lt;/code&gt;, a hand-rolled ECR repo, and a Dockerfile in the bundle, you can drive it from CI just fine.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vercel's 60-second Hobby timeout&lt;/strong&gt; does not play nicely with synchronously invoking a long-running agent. Plan for &lt;code&gt;waitUntil&lt;/code&gt; or a polling pattern.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;One "human approval card" tool is enough&lt;/strong&gt; to make the LLM safe-by-construction. The same primitive solves Phase 1 and Phase 2 without modification.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;🐙 GitHub: &lt;a href="https://github.com/yama3133/wallet-agent" rel="noopener noreferrer"&gt;yama3133/wallet-agent&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;🚀 Production: &lt;a href="https://wallet-agent.vercel.app/" rel="noopener noreferrer"&gt;https://wallet-agent.vercel.app/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;📐 Architecture diagram (PNG): &lt;a href="https://github.com/yama3133/wallet-agent/blob/main/docs/images/wallet-agent-architecture-en.png" rel="noopener noreferrer"&gt;docs/images/wallet-agent-architecture-en.png&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;🎬 Demo script: &lt;a href="https://github.com/yama3133/wallet-agent/blob/main/docs/demo-script.md" rel="noopener noreferrer"&gt;docs/demo-script.md&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This PoC is meant to power &lt;strong&gt;five different talks&lt;/strong&gt; with one shared implementation. I'll be improving it as those events get closer. Feedback welcome.&lt;/p&gt;

&lt;p&gt;— &lt;a href="https://github.com/yama3133" rel="noopener noreferrer"&gt;@yama3133&lt;/a&gt; (AWS Community Builder, AI Engineering / 2026)&lt;/p&gt;

</description>
      <category>bedrock</category>
      <category>agentcorepayments</category>
      <category>nextjs</category>
      <category>ai</category>
    </item>
    <item>
      <title>Porting Apex Voice to Windows broke at every single layer (faster-whisper + pystray + pywin32)</title>
      <dc:creator>Yuuki Yamashita</dc:creator>
      <pubDate>Mon, 22 Jun 2026 00:27:06 +0000</pubDate>
      <link>https://dev.to/_76130e67067eab4c8510/porting-apex-voice-to-windows-broke-at-every-single-layer-faster-whisper-pystray-pywin32-4h7e</link>
      <guid>https://dev.to/_76130e67067eab4c8510/porting-apex-voice-to-windows-broke-at-every-single-layer-faster-whisper-pystray-pywin32-4h7e</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;I ported my macOS voice-typing app &lt;strong&gt;Apex Voice&lt;/strong&gt; to Windows,&lt;br&gt;
aiming for feature parity. Every layer of the stack tripped me up.&lt;br&gt;
Here's the honest log.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Where I started
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/yama3133/apex-voice" rel="noopener noreferrer"&gt;Apex Voice&lt;/a&gt; is the macOS&lt;br&gt;
menu-bar voice-typing app I built on mlx-whisper + rumps + LaunchAgent.&lt;br&gt;
Apple Silicon-only by design, so it doesn't run on Windows as-is.&lt;/p&gt;

&lt;p&gt;The plan was a port for parity. The only test environment I had on&lt;br&gt;
hand was &lt;strong&gt;Windows 11 ARM64 (evaluation) on UTM&lt;/strong&gt;. I did not expect&lt;br&gt;
this to be a months-of-effort kind of project. It wasn't months, but&lt;br&gt;
it was every layer.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Repo: &lt;a href="https://github.com/yama3133/apex-voice" rel="noopener noreferrer"&gt;github.com/yama3133/apex-voice&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Windows README: &lt;a href="https://github.com/yama3133/apex-voice/blob/main/README_win.md" rel="noopener noreferrer"&gt;README_win.md&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fflvvg43wlbl6umbn8eqm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fflvvg43wlbl6umbn8eqm.png" alt=" " width="800" height="498"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  The stack I picked
&lt;/h2&gt;

&lt;p&gt;The substitutions I started with:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;macOS&lt;/th&gt;
&lt;th&gt;Windows&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Speech recognition&lt;/td&gt;
&lt;td&gt;mlx-whisper&lt;/td&gt;
&lt;td&gt;faster-whisper (CPU/int8)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tray&lt;/td&gt;
&lt;td&gt;rumps&lt;/td&gt;
&lt;td&gt;pystray&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Text insertion&lt;/td&gt;
&lt;td&gt;NSPasteboard + osascript Cmd+V&lt;/td&gt;
&lt;td&gt;pyperclip + pynput Ctrl+V&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Auto-launch&lt;/td&gt;
&lt;td&gt;LaunchAgent&lt;/td&gt;
&lt;td&gt;Task Scheduler&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Caps Lock one-key&lt;/td&gt;
&lt;td&gt;Karabiner-Elements&lt;/td&gt;
&lt;td&gt;AutoHotkey v2&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;In my head this was "swap the libraries." In reality, &lt;strong&gt;every swapped&lt;br&gt;
layer&lt;/strong&gt; broke in its own way.&lt;/p&gt;
&lt;h2&gt;
  
  
  Pain #1: sounddevice doesn't load on ARM64
&lt;/h2&gt;

&lt;p&gt;First blocker. I wanted to use sounddevice for mic input, but on&lt;br&gt;
ARM64 Windows it fails with &lt;code&gt;libportaudioarm64.dll: error 0x7e&lt;/code&gt;&lt;br&gt;
(0x7e is Windows' "module not found"). As far as I can tell, the sounddevice&lt;br&gt;
package doesn't bundle a PortAudio DLL for ARM64.&lt;/p&gt;

&lt;p&gt;→ Fall back to pyaudio. pyaudio has an ARM64 wheel and just works.&lt;br&gt;
Rewrote the Recorder class around it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Recorder&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;sounddevice doesn&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;t work on ARM64 Windows; use pyaudio.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_start_pyaudio&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pyaudio&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_pa&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pyaudio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;PyAudio&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_stream&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_pa&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;rate&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;SAMPLE_RATE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;channels&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;format&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;pyaudio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;paFloat32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;frames_per_buffer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;BLOCK&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
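
&lt;p&gt;A minimal sketch of how that fallback could be chosen at startup. This is not the repo's code: &lt;code&gt;pick_audio_backend&lt;/code&gt; and its &lt;code&gt;importer&lt;/code&gt; parameter are hypothetical names. It catches &lt;code&gt;OSError&lt;/code&gt; as well as &lt;code&gt;ImportError&lt;/code&gt;, on the assumption that a missing native DLL shows up when the library is loaded rather than as a missing Python module.&lt;/p&gt;

```python
import importlib


def pick_audio_backend(candidates=("sounddevice", "pyaudio"),
                       importer=importlib.import_module):
    """Return (name, module) for the first audio backend that imports cleanly."""
    errors = {}
    for name in candidates:
        try:
            return name, importer(name)
        except (ImportError, OSError) as exc:
            # ImportError: the package is not installed.
            # OSError: the package is installed but its native library failed to load.
            errors[name] = exc
    raise RuntimeError(f"no usable audio backend: {errors}")
```

&lt;p&gt;The &lt;code&gt;importer&lt;/code&gt; argument only exists so the selection logic can be exercised without either library installed.&lt;/p&gt;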



&lt;h2&gt;
  
  
  Pain #2: zero output on startup
&lt;/h2&gt;

&lt;p&gt;Ran &lt;code&gt;python voicetype_win.py&lt;/code&gt; and the console showed nothing. The&lt;br&gt;
process was running but looked frozen.&lt;/p&gt;

&lt;p&gt;My first guess: pystray is blocking the main thread. Added print&lt;br&gt;
statements everywhere. Still nothing.&lt;/p&gt;

&lt;p&gt;The actual cause: &lt;strong&gt;the Windows console was choking on bytes written&lt;br&gt;
to stderr&lt;/strong&gt;. pyaudio and pystray write something to stderr at startup&lt;br&gt;
that the console's code page can't render, and the output buffer&lt;br&gt;
stalls.&lt;/p&gt;

&lt;p&gt;Workaround:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python -u voicetype_win.py 2&amp;gt;err.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;-u&lt;/code&gt; forces stdout and stderr to be unbuffered, &lt;code&gt;2&amp;gt;err.txt&lt;/code&gt; redirects stderr to a file. After&lt;br&gt;
that I could finally see my &lt;code&gt;STEP1&lt;/code&gt; &lt;code&gt;STEP2&lt;/code&gt; ... diagnostic prints.&lt;/p&gt;

&lt;p&gt;I spent over an hour debugging "why aren't my &lt;code&gt;flush=True&lt;/code&gt; prints&lt;br&gt;
appearing."&lt;/p&gt;
&lt;h2&gt;
  
  
  Pain #3: pynput can't catch F19
&lt;/h2&gt;

&lt;p&gt;On macOS I use Karabiner to remap Caps Lock → F19, and pynput&lt;br&gt;
listens for F19. I tried to replicate this with PowerToys on Windows.&lt;/p&gt;

&lt;p&gt;The PowerToys remap itself worked (the Caps Lock LED no longer&lt;br&gt;
toggled, meaning the original Caps Lock behavior was suppressed).&lt;br&gt;
But pynput didn't catch F19. The log said &lt;code&gt;hotkey registered: &amp;lt;f19&amp;gt;&lt;/code&gt;,&lt;br&gt;
but pressing the key did nothing.&lt;/p&gt;

&lt;p&gt;I later found that pynput on Windows has known issues with F13 and&lt;br&gt;
above. Switched the hotkey to &lt;code&gt;&amp;lt;ctrl&amp;gt;+&amp;lt;alt&amp;gt;+r&lt;/code&gt; and it worked&lt;br&gt;
immediately.&lt;/p&gt;
&lt;h2&gt;
  
  
  Pain #4: PowerToys ate Ctrl+V
&lt;/h2&gt;

&lt;p&gt;Mid-debug, PowerToys Keyboard Manager somehow &lt;strong&gt;hijacked Ctrl+V&lt;br&gt;
itself&lt;/strong&gt;. Even after stopping Apex Voice, Ctrl+V no longer pasted.&lt;br&gt;
My best guess is that PowerToys' keyboard hook was left holding stale state.&lt;/p&gt;

&lt;p&gt;Eventually I lost &lt;strong&gt;all&lt;/strong&gt; keyboard input to Notepad and had to&lt;br&gt;
reboot the VM. After that I dropped PowerToys (it felt brittle) and&lt;br&gt;
switched to &lt;strong&gt;AutoHotkey v2&lt;/strong&gt; for the remap.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#Requires AutoHotkey v2.0
SetCapsLockState("AlwaysOff")
CapsLock::Send("^!r")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;SetCapsLockState("AlwaysOff")&lt;/code&gt; suppresses the Caps Lock toggle&lt;br&gt;
behavior (the uppercase mode).&lt;/p&gt;
&lt;h2&gt;
  
  
  Pain #5: Tray clicks steal focus
&lt;/h2&gt;

&lt;p&gt;Click tray → stop recording → Whisper transcribes → paste into&lt;br&gt;
Notepad. The "inserted" log appeared, but &lt;strong&gt;nothing showed up in&lt;br&gt;
Notepad&lt;/strong&gt;. The moment I clicked the tray, focus had moved to the&lt;br&gt;
Command Prompt, and Ctrl+V was pasting there.&lt;/p&gt;

&lt;p&gt;Fix: use pywin32 to save the foreground window handle when recording&lt;br&gt;
starts, and restore it right before the paste:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_toggle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;recording&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;win32gui&lt;/span&gt;
            &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;inserter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_prev_hwnd&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;win32gui&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;GetForegroundWindow&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;pass&lt;/span&gt;
        &lt;span class="bp"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;win32gui&lt;/span&gt;
    &lt;span class="n"&gt;prev_hwnd&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;getattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;_prev_hwnd&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;prev_hwnd&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;win32gui&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;SetForegroundWindow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prev_hwnd&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.15&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;pass&lt;/span&gt;
    &lt;span class="c1"&gt;# then pyperclip + Ctrl+V
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On macOS, osascript handled active-app restoration implicitly, so&lt;br&gt;
this code didn't exist. OS-layer differences exposed.&lt;/p&gt;
&lt;h2&gt;
  
  
  Pain #6: Bedrock — MissingDependencyException
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;aws login&lt;/code&gt; works, &lt;code&gt;aws sts get-caller-identity&lt;/code&gt; returns my user, but&lt;br&gt;
calling Bedrock via boto3 from Python:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;botocore.exceptions.MissingDependencyException: Missing Dependency:
Using the login credentials provider requires an additional dependency.
You will need to pip install "botocore[crt]"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Credentials produced by &lt;code&gt;aws login&lt;/code&gt; (AWS CLI 2.32.0+) need botocore's&lt;br&gt;
optional &lt;code&gt;crt&lt;/code&gt; extra:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip install "botocore[crt]"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After that, polish/formal/translate/bullets all worked.&lt;/p&gt;
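
&lt;p&gt;A small preflight check can turn that mid-call exception into a clear startup message. This is a sketch: &lt;code&gt;has_crt&lt;/code&gt; and &lt;code&gt;require_crt&lt;/code&gt; are hypothetical helpers, and it relies on the &lt;code&gt;crt&lt;/code&gt; extra installing the &lt;code&gt;awscrt&lt;/code&gt; package.&lt;/p&gt;

```python
import importlib.util


def has_crt(find_spec=importlib.util.find_spec) -> bool:
    """True if awscrt, the package pulled in by botocore[crt], is importable."""
    return find_spec("awscrt") is not None


def require_crt(find_spec=importlib.util.find_spec) -> None:
    """Fail fast at startup instead of on the first Bedrock call."""
    if not has_crt(find_spec):
        raise SystemExit('aws login credentials need: pip install "botocore[crt]"')
```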

&lt;h2&gt;
  
  
  Pain #7: Caps Lock doesn't reach the VM in UTM
&lt;/h2&gt;

&lt;p&gt;Running the AutoHotkey script as admin, pressing Caps Lock did&lt;br&gt;
nothing. AHK was never triggered.&lt;/p&gt;

&lt;p&gt;Cut to the chase: &lt;strong&gt;the Mac → UTM → Windows key path drops Caps&lt;br&gt;
Lock somewhere&lt;/strong&gt;. Either UTM's keyboard pass-through doesn't relay&lt;br&gt;
it, or macOS absorbs it at the OS level before UTM sees it. Hard to&lt;br&gt;
pin down.&lt;/p&gt;

&lt;p&gt;This wouldn't happen on bare-metal Windows. I gave up on Caps Lock&lt;br&gt;
1-key and shipped with &lt;strong&gt;&lt;code&gt;&amp;lt;ctrl&amp;gt;+&amp;lt;alt&amp;gt;+r&lt;/code&gt; pressed directly&lt;/strong&gt; as the&lt;br&gt;
hotkey. Control+Option+R on the Mac keyboard arrives in Windows as&lt;br&gt;
Ctrl+Alt+R through UTM, so functionally it works.&lt;/p&gt;
&lt;h2&gt;
  
  
  Auto-launch on login
&lt;/h2&gt;

&lt;p&gt;Task Scheduler with &lt;code&gt;/sc onlogon /rl highest&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;schtasks /create /tn "ApexVoice" /tr "%USERPROFILE%\apex_voice_start.bat" /sc onlogon /rl highest /f
schtasks /create /tn "ApexCapsAHK" /tr "%USERPROFILE%\apex_caps.ahk" /sc onlogon /rl highest /f
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Important: if Apex Voice runs as administrator, AutoHotkey must also&lt;br&gt;
run as administrator. Windows' User Interface Privilege Isolation (UIPI)&lt;br&gt;
blocks key injection from a lower-privilege process into a&lt;br&gt;
higher-privilege one.&lt;/p&gt;

&lt;h2&gt;
  
  
  End state and performance
&lt;/h2&gt;

&lt;p&gt;Measured on UTM (no bare metal at hand):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;UTM (ARM64)&lt;/th&gt;
&lt;th&gt;Estimated (bare-metal Windows)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;4s audio recognition&lt;/td&gt;
&lt;td&gt;~37s (CPU/int8, large-v3-turbo)&lt;/td&gt;
&lt;td&gt;a few seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Insertion latency&lt;/td&gt;
&lt;td&gt;~250ms&lt;/td&gt;
&lt;td&gt;similar&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bedrock post-processing&lt;/td&gt;
&lt;td&gt;works&lt;/td&gt;
&lt;td&gt;works&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Caps Lock 1-key&lt;/td&gt;
&lt;td&gt;impossible (UTM limitation)&lt;/td&gt;
&lt;td&gt;should work&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;UTM's virtual CPU is too slow for Whisper at this model size. On&lt;br&gt;
real Windows hardware &lt;code&gt;large-v3-turbo&lt;/code&gt; should be practical. Need&lt;br&gt;
bare-metal testers for that.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I learned
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;"Just swap the libraries" is a lie.&lt;/strong&gt; OS permission models, key
codes, console code pages, focus management — even Python apps
hit all of these at the lower layers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;When console output disappears, suspect stderr.&lt;/strong&gt; Splitting it
off with &lt;code&gt;2&amp;gt;err.txt&lt;/code&gt; made everything visible.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Key remappers that grab modifiers are dangerous.&lt;/strong&gt; PowerToys
Keyboard Manager left stale state behind that killed Ctrl+V. AHK v2
has been more stable.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The real cost of cross-platform isn't code volume&lt;/strong&gt;, it's the
number of OS-specific traps you have to step into.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What's next
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Verification on bare-metal Windows.&lt;/strong&gt; UTM is enough for validation
but I want to confirm performance and the Caps Lock path on real
hardware.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GPU inference.&lt;/strong&gt; faster-whisper supports CUDA, so an NVIDIA-GPU
Windows machine should be dramatically faster.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AgentCore Memory / Browser / Payments on Windows.&lt;/strong&gt; The code path
is shared with macOS but I haven't verified it on Windows yet.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;The Windows port is functional on UTM, published, and documented in&lt;br&gt;
README_win.md. If you try it on bare metal, issues/PRs welcome:&lt;br&gt;
&lt;a href="https://github.com/yama3133/apex-voice" rel="noopener noreferrer"&gt;github.com/yama3133/apex-voice&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>bedrock</category>
      <category>python</category>
      <category>windows</category>
    </item>
    <item>
      <title>Apex Voice v2 — Killing Hallucinations, Cutting Latency, and Landing on Caps Lock</title>
      <dc:creator>Yuuki Yamashita</dc:creator>
      <pubDate>Sun, 21 Jun 2026 14:06:10 +0000</pubDate>
      <link>https://dev.to/_76130e67067eab4c8510/apex-voice-v2-killing-hallucinations-cutting-latency-and-landing-on-caps-lock-aei</link>
      <guid>https://dev.to/_76130e67067eab4c8510/apex-voice-v2-killing-hallucinations-cutting-latency-and-landing-on-caps-lock-aei</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Follow-up to my first post on &lt;strong&gt;Apex Voice&lt;/strong&gt; — the macOS menu-bar&lt;br&gt;
voice-typing app I built with mlx-whisper and Amazon Bedrock.&lt;br&gt;
This is the "make it actually pleasant to use every day" pass.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Where I Left Off
&lt;/h2&gt;

&lt;p&gt;A few weeks ago I shipped v0.2 — local Whisper, Bedrock post-processing,&lt;br&gt;
Strands Agents for multi-step tool calls, AgentCore Memory for&lt;br&gt;
vocabulary learning. It worked. But day-to-day it had three rough edges:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;YouTube-style hallucinations&lt;/strong&gt; — "Thanks for watching, see you in
the next video!" appearing mid-sentence when I hadn't said anything.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Latency&lt;/strong&gt; — noticeable lag between speaking and seeing text.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hotkey ergonomics&lt;/strong&gt; — &lt;code&gt;⌃⌥V&lt;/code&gt; worked, but it's a three-finger
contortion I never got comfortable with.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;v2 is about fixing all three plus adding two real Bedrock AgentCore&lt;br&gt;
features I'd been deferring.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Repo: &lt;a href="https://github.com/yama3133/apex-voice" rel="noopener noreferrer"&gt;github.com/yama3133/apex-voice&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Companion web: &lt;a href="https://apex-voice-web.vercel.app" rel="noopener noreferrer"&gt;apex-voice-web.vercel.app&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  1. Killing the Hallucinations (the layered way)
&lt;/h2&gt;

&lt;p&gt;Whisper hallucinations aren't a bug so much as a side effect of how&lt;br&gt;
the model was trained. The Japanese training data appears to lean heavily&lt;br&gt;
on YouTube captions, so when Whisper sees silence or noise, the&lt;br&gt;
highest-probability output is a polite YouTube sign-off. There's no single fix.&lt;br&gt;
You have to layer defenses.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer 1: push-to-talk, not VAD-driven segmentation.&lt;/strong&gt;&lt;br&gt;
The original recorder used RMS-based VAD to detect the start and end&lt;br&gt;
of each utterance. That meant any time my mic picked up keyboard&lt;br&gt;
noise or breath, a "phantom utterance" got passed to Whisper —&lt;br&gt;
and Whisper helpfully invented words. I tore it out:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Listening mode = take everything, no judgment
&lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_speaking&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_speaking&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;
    &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_buf&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_pre&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_buf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The user explicitly toggles recording on/off (now via Caps Lock,&lt;br&gt;
see §3), so VAD inside the recording window is redundant. The&lt;br&gt;
gate is now at the boundary, not inside.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer 2: Silero VAD as the final gate before Whisper.&lt;/strong&gt;&lt;br&gt;
Once a recording ends, I run it through &lt;a href="https://github.com/snakers4/silero-vad" rel="noopener noreferrer"&gt;Silero VAD&lt;/a&gt;. If Silero says&lt;br&gt;
"no speech here," the audio never touches Whisper.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nf"&gt;_silero_has_speech&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;audio&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Silero VAD: no speech → drop&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I also preload Silero at app startup on a background thread, so&lt;br&gt;
the first utterance doesn't pay the model-load cost.&lt;/p&gt;
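
&lt;p&gt;The preload pattern, sketched under assumptions: &lt;code&gt;LazyModel&lt;/code&gt; is a hypothetical wrapper, and the loader is passed in as a plain callable because how Silero gets loaded depends on how it's installed. The first caller only blocks if the background load hasn't finished yet.&lt;/p&gt;

```python
import threading


class LazyModel:
    """Load a model on a background thread; block only if it is needed early."""

    def __init__(self, loader):
        self._loader = loader            # hypothetical: any zero-argument callable
        self._ready = threading.Event()
        self._model = None
        self._error = None
        threading.Thread(target=self._load, daemon=True).start()

    def _load(self):
        try:
            self._model = self._loader()
        except Exception as exc:         # keep the failure for whoever calls get()
            self._error = exc
        finally:
            self._ready.set()

    def get(self, timeout=None):
        """Return the loaded model, waiting up to `timeout` seconds for it."""
        if not self._ready.wait(timeout):
            raise TimeoutError("model still loading")
        if self._error is not None:
            raise self._error
        return self._model
```

&lt;p&gt;Create the wrapper once at app startup and call &lt;code&gt;get()&lt;/code&gt; inside the speech check; a load failure is raised there instead of dying silently on the background thread.&lt;/p&gt;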

&lt;p&gt;&lt;strong&gt;Layer 3: tighter mlx-whisper decoding.&lt;/strong&gt;&lt;br&gt;
Three knobs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;no_speech_threshold&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;         &lt;span class="c1"&gt;# was 0.5 — bias toward silence
&lt;/span&gt;    &lt;span class="n"&gt;logprob_threshold&lt;/span&gt;&lt;span class="o"&gt;=-&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;          &lt;span class="c1"&gt;# drop low-confidence outputs
&lt;/span&gt;    &lt;span class="n"&gt;compression_ratio_threshold&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;2.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# drop repetition spirals
&lt;/span&gt;    &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                 &lt;span class="c1"&gt;# no fallback ladder
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



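&lt;p&gt;Temperature fallback was a hidden latency cost too: each fallback step&lt;br&gt;
re-decodes the segment at a higher temperature, so one bad segment can be&lt;br&gt;
decoded several times.&lt;/p&gt;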

&lt;p&gt;&lt;strong&gt;Layer 4: STRONG vs WEAK phrase filters.&lt;/strong&gt;&lt;br&gt;
My old filter rejected outputs that were &lt;em&gt;mostly&lt;/em&gt; a known YouTube&lt;br&gt;
phrase. But "次の動画でお会いしましょう" ("see you in the next video")&lt;br&gt;
has the trigger phrase "次の動画" at the front, followed by 9&lt;br&gt;
characters of plausible-looking text. The "mostly" check let it&lt;br&gt;
through.&lt;/p&gt;

&lt;p&gt;The fix was simple. Split the phrase list into two:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;STRONG&lt;/strong&gt;: if this phrase appears &lt;em&gt;anywhere&lt;/em&gt;, drop. ("see you in
the next video", "channel registration", "thank you for watching")&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;WEAK&lt;/strong&gt;: only drop if the phrase is the dominant content.
("good night" — could be a real utterance)
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;STRONG_HALLUCINATION_PHRASES&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;  &lt;span class="c1"&gt;# immediate kill
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
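
&lt;p&gt;Putting both tiers together, a sketch of what the full check could look like. The phrase lists are illustrative (only "次の動画" comes from the example above), and &lt;code&gt;WEAK_DOMINANCE&lt;/code&gt; is an assumed cutoff for "dominant content", not the value the app uses.&lt;/p&gt;

```python
STRONG_HALLUCINATION_PHRASES = ("次の動画", "チャンネル登録", "ご視聴ありがとうございました")
WEAK_HALLUCINATION_PHRASES = ("おやすみなさい",)
WEAK_DOMINANCE = 0.6  # assumed cutoff: the phrase covers at least 60% of the text


def is_hallucination(text: str) -> bool:
    """Return True if the transcript should be dropped as a likely hallucination."""
    text = text.strip()
    if not text:
        return False
    # STRONG: a single occurrence anywhere is enough to drop the output.
    if any(p in text for p in STRONG_HALLUCINATION_PHRASES):
        return True
    # WEAK: drop only when the phrase makes up most of the output.
    for p in WEAK_HALLUCINATION_PHRASES:
        if p in text and len(p) / len(text) >= WEAK_DOMINANCE:
            return True
    return False
```

&lt;p&gt;With this split, "次の動画でお会いしましょう" is dropped by the STRONG tier, while a longer sentence that merely contains "おやすみなさい" survives the WEAK tier.&lt;/p&gt;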

&lt;h2&gt;
  
  
  2. Cutting the Latency
&lt;/h2&gt;

&lt;p&gt;I added timing logs to every stage to find the real bottleneck:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;audio=4.8s vad=56ms whisper=1499ms → 19 chars
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
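
&lt;p&gt;A log line in that shape can come from a tiny per-stage timer. A sketch: &lt;code&gt;StageTimer&lt;/code&gt; is a hypothetical name, and the format just mirrors the line above.&lt;/p&gt;

```python
import time
from contextlib import contextmanager


class StageTimer:
    """Collect wall-clock durations per pipeline stage, in milliseconds."""

    def __init__(self):
        self.ms = {}

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Recorded even if the stage raises, so failures still show up in the log.
            self.ms[name] = (time.perf_counter() - start) * 1000

    def summary(self, audio_seconds, text):
        stages = " ".join(f"{name}={ms:.0f}ms" for name, ms in self.ms.items())
        return f"audio={audio_seconds:.1f}s {stages} → {len(text)} chars"
```

&lt;p&gt;Wrap each step in &lt;code&gt;with timer.stage("vad"):&lt;/code&gt; and log &lt;code&gt;timer.summary(...)&lt;/code&gt; once per utterance.&lt;/p&gt;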



&lt;p&gt;Whisper is the bottleneck. Always. So I went after latency three ways.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Async clipboard restore.&lt;/strong&gt; The clipboard dance was blocking:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;_pb_set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;ok&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;_paste&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# this is what the user feels
# THEN sleep 150ms, THEN restore...
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Moved restore to a background thread. The user-perceived insertion&lt;br&gt;
time dropped from ~450ms (first time) to ~150ms steady-state.&lt;/p&gt;
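
&lt;p&gt;Roughly what that change looks like, as a sketch. &lt;code&gt;paste_then_restore&lt;/code&gt; and its three callables are hypothetical stand-ins for the app's clipboard and paste helpers, and the 150ms settle time is taken from the comment above.&lt;/p&gt;

```python
import threading


def paste_then_restore(text, get_clipboard, set_clipboard, send_paste, settle_s=0.15):
    """Paste `text`, then put the previous clipboard back off the hot path."""
    previous = get_clipboard()
    set_clipboard(text)
    ok = send_paste()  # the only step the user actually waits on

    # Give the target app time to read the clipboard, then restore in the background.
    timer = threading.Timer(settle_s, set_clipboard, args=(previous,))
    timer.daemon = True
    timer.start()
    return ok, timer
```

&lt;p&gt;One trade-off to be aware of: if the user copies something during the settle window, the delayed restore overwrites it.&lt;/p&gt;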

&lt;p&gt;&lt;strong&gt;Quantized Whisper model.&lt;/strong&gt; Switched the default from&lt;br&gt;
&lt;code&gt;mlx-community/whisper-large-v3-turbo&lt;/code&gt; to its 4-bit quantized&lt;br&gt;
sibling &lt;code&gt;whisper-large-v3-turbo-q4&lt;/code&gt;. Honest result:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;turbo (default)&lt;/th&gt;
&lt;th&gt;turbo-q4&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Model size&lt;/td&gt;
&lt;td&gt;~800MB&lt;/td&gt;
&lt;td&gt;~350MB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Whisper time (4s audio)&lt;/td&gt;
&lt;td&gt;~1500ms&lt;/td&gt;
&lt;td&gt;~1500ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Accuracy (subjective)&lt;/td&gt;
&lt;td&gt;high&lt;/td&gt;
&lt;td&gt;same or slightly better&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;MLX is already heavily optimized, so the speed delta is ~0%.&lt;br&gt;
But memory and disk use roughly halve, accuracy didn't degrade in my&lt;br&gt;
testing, and there's no reason not to take the win.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The failed experiment.&lt;/strong&gt; I also tried&lt;br&gt;
&lt;code&gt;mlx-community/distil-whisper-large-v3&lt;/code&gt; — theoretically 2× faster.&lt;br&gt;
Japanese accuracy collapsed. Distil-Whisper is English-tuned and&lt;br&gt;
it shows. Reverted in 30 seconds. Worth saying out loud:&lt;br&gt;
&lt;strong&gt;always measure end-to-end, not just throughput.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;End state: ~1.5s for a 4-second utterance, on M-series Mac.&lt;br&gt;
That's the floor with this model.&lt;/p&gt;
&lt;h2&gt;
  
  
  3. Caps Lock as the One-Key Hotkey
&lt;/h2&gt;

&lt;p&gt;The original hotkey was &lt;code&gt;⌃⌥V&lt;/code&gt; via &lt;code&gt;pynput.GlobalHotKeys&lt;/code&gt;. Functional,&lt;br&gt;
but I never warmed to the three-finger combo. I wanted &lt;strong&gt;one key&lt;/strong&gt;&lt;br&gt;
that I could mash without thinking — but every reasonable single-key&lt;br&gt;
candidate has a problem:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Letter keys (single &lt;code&gt;v&lt;/code&gt;, &lt;code&gt;a&lt;/code&gt;, etc.) — kills normal typing&lt;/li&gt;
&lt;li&gt;Right Cmd / Right Option — pynput can't distinguish left/right&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;fn&lt;/code&gt; (globe key) — macOS hides this from userspace&lt;/li&gt;
&lt;li&gt;Caps Lock — taken by macOS for its toggle behavior&lt;/li&gt;
&lt;li&gt;F13–F19 — perfect, but most keyboards don't have them physically&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The fix: rebind Caps Lock to F19 at the OS layer with&lt;br&gt;
&lt;strong&gt;Karabiner-Elements&lt;/strong&gt;, then make Apex Voice listen for F19. Karabiner&lt;br&gt;
is a battle-tested OSS keyboard remapper for macOS. The config is&lt;br&gt;
a one-rule JSON file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Caps Lock -&amp;gt; F19 (for Apex Voice)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"manipulators"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"basic"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"from"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"key_code"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"caps_lock"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"modifiers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"optional"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"any"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"to"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"key_code"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"f19"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Apex Voice ships this rule and a &lt;code&gt;karabiner://import?url=…&lt;/code&gt; link&lt;br&gt;
that adds it with one click. The user-facing result: &lt;strong&gt;tap Caps&lt;br&gt;
Lock once to start recording, tap again to stop&lt;/strong&gt;. Best UX upgrade&lt;br&gt;
in the whole project.&lt;/p&gt;

&lt;p&gt;One detail matters for cross-platform support. The&lt;br&gt;
same &lt;code&gt;hotkey&lt;/code&gt; config value (&lt;code&gt;&amp;lt;f19&amp;gt;&lt;/code&gt;) works on Windows too, because&lt;br&gt;
&lt;code&gt;pynput.GlobalHotKeys&lt;/code&gt; parses it the same way on both platforms. On Windows the user&lt;br&gt;
would remap Caps Lock → F19 with &lt;strong&gt;PowerToys Keyboard Manager&lt;/strong&gt;.&lt;br&gt;
Different tool, same shape, no code branches.&lt;/p&gt;
&lt;h2&gt;
  
  
  4. AgentCore Browser for Real Web Search
&lt;/h2&gt;

&lt;p&gt;The agent's "web search and summarize" tool was originally&lt;br&gt;
&lt;code&gt;requests&lt;/code&gt; + BeautifulSoup against Google's HTML results. It works&lt;br&gt;
right up until Google's bot detection notices and blocks the&lt;br&gt;
serverless IP. Predictable.&lt;/p&gt;

&lt;p&gt;The fix: a two-stage fetch.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1) requests + BeautifulSoup (fast, free)
   ↓ fail / too short / Google blocked
2) AgentCore Browser (managed Chromium + Playwright over CDP)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;AgentCore Browser is a managed headless Chromium environment on AWS,&lt;br&gt;
addressed via boto3 + an authenticated WebSocket. You connect&lt;br&gt;
Playwright with &lt;code&gt;connect_over_cdp&lt;/code&gt; and drive it like any browser&lt;br&gt;
session. Sites with bot detection, JS-required pages, and Google's&lt;br&gt;
search results all work, because it &lt;em&gt;is&lt;/em&gt; a real browser.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;BrowserClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;BEDROCK_REGION&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;start&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;ws_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate_ws_headers&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;sync_playwright&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;browser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chromium&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect_over_cdp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ws_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;page&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;browser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;contexts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;new_page&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;wait_until&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;domcontentloaded&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;30000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;() =&amp;gt; document.body.innerText&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Sessions are started per request and torn down afterward. That trades&lt;br&gt;
startup latency for lower cost, which is acceptable because web summarization is a low-frequency action.&lt;/p&gt;
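&lt;p&gt;The fallback decision itself is small enough to sketch in isolation. The version below is a minimal illustration, not the code Apex Voice ships: &lt;code&gt;fetch_with_fallback&lt;/code&gt;, &lt;code&gt;looks_usable&lt;/code&gt;, &lt;code&gt;MIN_CHARS&lt;/code&gt;, and the block-marker strings are hypothetical names and values, and the two fetchers are passed in as plain callables so the logic can be followed without network access.&lt;/p&gt;

```python
from typing import Callable, Tuple

# Hypothetical thresholds and markers, chosen only for illustration.
MIN_CHARS = 200
BLOCK_MARKERS = ("unusual traffic", "captcha")


def looks_usable(text: str, min_chars: int = MIN_CHARS) -> bool:
    """Return True when stage-1 text is long enough and not a block page."""
    if min_chars > len(text.strip()):
        return False
    lowered = text.lower()
    return not any(marker in lowered for marker in BLOCK_MARKERS)


def fetch_with_fallback(
    url: str,
    fast_fetch: Callable[[str], str],
    browser_fetch: Callable[[str], str],
) -> Tuple[str, str]:
    """Try the cheap fetcher first; fall back to the managed browser.

    Returns (text, stage) where stage is "requests" or "browser".
    """
    try:
        text = fast_fetch(url)
        if looks_usable(text):
            return text, "requests"
    except Exception:
        # Network errors and HTTP failures both mean "try stage 2".
        pass
    return browser_fetch(url), "browser"
```

&lt;p&gt;In a real tool, &lt;code&gt;fast_fetch&lt;/code&gt; would wrap &lt;code&gt;requests&lt;/code&gt; + BeautifulSoup and &lt;code&gt;browser_fetch&lt;/code&gt; would wrap the Playwright session shown above.&lt;/p&gt;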
&lt;h2&gt;
  
  
  5. AgentCore Payments (x402) — Letting the Agent Pay
&lt;/h2&gt;

&lt;p&gt;This is the headline new capability. &lt;strong&gt;The agent can pay for things.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/bedrock/agentcore/" rel="noopener noreferrer"&gt;AgentCore Payments&lt;/a&gt; is&lt;br&gt;
the Bedrock service for AI-agent-initiated payments via embedded&lt;br&gt;
crypto wallets (Coinbase CDP or Stripe Privy). It speaks&lt;br&gt;
&lt;strong&gt;&lt;a href="https://www.x402.org/" rel="noopener noreferrer"&gt;x402&lt;/a&gt;&lt;/strong&gt;, the "HTTP 402 Payment Required"&lt;br&gt;
protocol revived for the agent era. A paid API responds with &lt;code&gt;402&lt;/code&gt;&lt;br&gt;
and a payment manifest; the agent then generates a payment header and&lt;br&gt;
retries the request.&lt;/p&gt;

&lt;p&gt;The Apex Voice tool wires it into the existing approval flow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@tool&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;pay_for_paid_resource&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_amount_usd&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Pay for a paid HTTP resource via x402.

    1) GET the URL → expect HTTP 402
    2) Create a PaymentSession with a spend cap
    3) generate_payment_header for the 402 body
    4) Re-GET with the payment header
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
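&lt;p&gt;The four steps in the docstring can be sketched as a plain retry loop. This is a minimal sketch under stated assumptions, not the AgentCore Payments API: &lt;code&gt;fetch_paid_resource&lt;/code&gt;, &lt;code&gt;http_get&lt;/code&gt;, and &lt;code&gt;make_payment_header&lt;/code&gt; are hypothetical names, the HTTP client and the signer are injected as callables, and the header name &lt;code&gt;X-PAYMENT&lt;/code&gt; is assumed from the x402 convention and should be checked against the spec.&lt;/p&gt;

```python
from typing import Callable, Mapping, Tuple

# A response is modelled as (status_code, body). Both the HTTP client and the
# signer are injected so the control flow needs no wallet or network.
Response = Tuple[int, str]
HttpGet = Callable[[str, Mapping[str, str]], Response]


def fetch_paid_resource(
    url: str,
    http_get: HttpGet,
    make_payment_header: Callable[[str], str],
    header_name: str = "X-PAYMENT",  # assumed header name, see the note above
) -> str:
    """GET a URL and, on HTTP 402, retry once with a payment header."""
    status, body = http_get(url, {})
    if status == 200:
        return body  # resource was free, nothing to pay
    if status != 402:
        raise RuntimeError(f"unexpected status {status}")

    # The 402 body carries the payment requirements; the signer turns
    # them into an opaque header value.
    header_value = make_payment_header(body)
    status, body = http_get(url, {header_name: header_value})
    if status != 200:
        raise RuntimeError(f"payment retry failed with status {status}")
    return body
```

&lt;p&gt;Retrying exactly once keeps the worst case bounded: a server that keeps answering &lt;code&gt;402&lt;/code&gt; produces an error instead of a second payment attempt.&lt;/p&gt;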



&lt;p&gt;The flow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;voice: "pay this API up to 5 cents"
  → Whisper transcript
  → Strands Agent picks pay_for_paid_resource
  → Guardrail check (per-request / per-day caps)
  → Human approval dialog (or Slack approval via Aegis)
  → PaymentManager.create_payment_session
  → generate_payment_header
  → HTTP retry with header
  → result inserted at cursor
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
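&lt;p&gt;The ordering in that flow (automatic caps first, human approval second, payment last) is the part worth getting right. Below is a minimal sketch of it, assuming amounts are tracked in integer cents; &lt;code&gt;SpendGuard&lt;/code&gt;, &lt;code&gt;pay_with_approval&lt;/code&gt;, and the &lt;code&gt;approve&lt;/code&gt; / &lt;code&gt;pay&lt;/code&gt; callables are hypothetical names standing in for the guardrail, the approval dialog, and the payment call.&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SpendGuard:
    """Per-request and per-day caps, tracked in integer cents."""

    per_request_cents: int
    per_day_cents: int
    spent_today_cents: int = 0

    def check(self, amount_cents: int) -> None:
        """Raise ValueError if the amount would break either cap."""
        if amount_cents > self.per_request_cents:
            raise ValueError("over per-request cap")
        if self.spent_today_cents + amount_cents > self.per_day_cents:
            raise ValueError("over per-day cap")


def pay_with_approval(
    guard: SpendGuard,
    amount_cents: int,
    approve: Callable[[int], bool],
    pay: Callable[[int], str],
) -> str:
    """Run the guardrail, then human approval, then the payment itself."""
    guard.check(amount_cents)  # cheap automatic check first
    if not approve(amount_cents):  # human says no: nothing is spent
        return "declined"
    receipt = pay(amount_cents)
    guard.spent_today_cents += amount_cents  # count only completed payments
    return receipt
```

&lt;p&gt;Because the guardrail runs before the dialog, the human is never asked to approve a request that policy would reject anyway.&lt;/p&gt;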



&lt;p&gt;Setting it up isn't trivial — you need a PaymentManager + Connector&lt;br&gt;
configured with Coinbase CDP or Stripe Privy credentials in the AWS&lt;br&gt;
console, and an embedded wallet funded for the user. Not a quick demo,&lt;br&gt;
but the &lt;strong&gt;code path is real and the approval flow is structurally what&lt;br&gt;
"giving an AI agent a wallet" should look like.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The New Architecture
&lt;/h2&gt;

&lt;p&gt;Everything composes into something that genuinely runs all day.&lt;/p&gt;

&lt;p&gt;(see architecture diagram below)&lt;/p&gt;

&lt;p&gt;The pipeline is local-first: mic → recording → Silero VAD → mlx-whisper&lt;br&gt;
→ insertion. Bedrock features layer on top per-utterance only when the&lt;br&gt;
mode demands them.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fx0gtvdj98h3h268o1asv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fx0gtvdj98h3h268o1asv.png" alt=" " width="800" height="512"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Learned (v2 Edition)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;For hallucinations, layered defense beats any single fix.&lt;/strong&gt; Pre-filter
(Silero VAD), in-flight params (logprob, no_speech), post-filter
(phrase lists). No single layer is enough.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Measure before optimizing, even when "obvious."&lt;/strong&gt; I was sure
the quantized model would be faster. It wasn't. MLX already had
the win baked in.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The right primitive beats a clever hack.&lt;/strong&gt; Three-key combo vs.
Caps Lock + Karabiner — same goal, but one is invisible to the
user.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AgentCore is more than Memory.&lt;/strong&gt; Browser solves a real
bot-detection problem; Payments turns "give the agent a wallet"
from a thought experiment into a real code path with human
approval baked in.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;A real x402 endpoint to demo against.&lt;/strong&gt; Right now the Payments
code path is functional but I'm exercising it with a synthetic
402 responder. I want to wire it to a real paid API.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Windows port.&lt;/strong&gt; mlx-whisper → faster-whisper, rumps → pystray.
The hotkey config (&lt;code&gt;&amp;lt;f19&amp;gt;&lt;/code&gt;) is already portable; Windows users
rebind via PowerToys.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vocabulary UI.&lt;/strong&gt; AgentCore Memory has been quietly learning
proper nouns for weeks. I have no UI to inspect or curate it.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;If you tried v0.2 and got bitten by the hallucinations or hotkey,&lt;br&gt;
v2 is the version to come back to. Issues and PRs:&lt;br&gt;
&lt;a href="https://github.com/yama3133/apex-voice" rel="noopener noreferrer"&gt;github.com/yama3133/apex-voice&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>bedrock</category>
      <category>bedrockagentcore</category>
      <category>strandsagents</category>
    </item>
    <item>
      <title>Building a chat-based Marp slide generator with Next.js, Amazon Bedrock, and Lambda</title>
      <dc:creator>Yuuki Yamashita</dc:creator>
      <pubDate>Fri, 19 Jun 2026 17:03:10 +0000</pubDate>
      <link>https://dev.to/_76130e67067eab4c8510/building-a-chat-based-marp-slide-generator-with-nextjs-amazon-bedrock-and-lambda-1hbj</link>
      <guid>https://dev.to/_76130e67067eab4c8510/building-a-chat-based-marp-slide-generator-with-nextjs-amazon-bedrock-and-lambda-1hbj</guid>
      <description>&lt;h1&gt;
  
  
  Overview
&lt;/h1&gt;

&lt;p&gt;I built an app that turns natural language into Marp slides through a chat interface.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Live: &lt;a href="https://marp-ai-app.vercel.app" rel="noopener noreferrer"&gt;marp-ai-app.vercel.app&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Repo: &lt;a href="https://github.com/yama3133/marp-ai-app" rel="noopener noreferrer"&gt;yama3133/marp-ai-app&lt;/a&gt; (Public)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can ask &lt;code&gt;"Make 8 slides on AI agent use cases"&lt;/code&gt; and the app generates the deck. Then keep chatting: &lt;code&gt;"Make slide 3 more concise"&lt;/code&gt; and it edits in place. Exports to &lt;strong&gt;PDF&lt;/strong&gt; and &lt;strong&gt;editable PPTX&lt;/strong&gt;. There is also a "My Themes" feature that lets you paste any Marp CSS, save it to localStorage, and switch on the fly.&lt;/p&gt;

&lt;p&gt;This post is a tour of the architecture and the rough edges I hit.&lt;/p&gt;

&lt;h1&gt;
  
  
  Architecture
&lt;/h1&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Browser
   ↓
Next.js 16 (Vercel) ─ /api/generate → Amazon Bedrock (Sonnet 4.6, us-east-1)
                    └ /api/export   → Lambda (arm64, ap-northeast-1)
                                       ├ marp-cli + Chromium     → PDF
                                       ├ marp-cli + LibreOffice  → editable PPTX
                                       └ S3 (1-day lifecycle)     → presigned URL
Auth: Vercel OIDC Federation → IAM Role (keyless, no access keys)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;th&gt;Tech&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Frontend / API&lt;/td&gt;
&lt;td&gt;Next.js 16 (App Router, Route Handlers)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hosting&lt;/td&gt;
&lt;td&gt;Vercel&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LLM&lt;/td&gt;
&lt;td&gt;Amazon Bedrock Converse API (&lt;code&gt;us.anthropic.claude-sonnet-4-6&lt;/code&gt;, us-east-1)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Export&lt;/td&gt;
&lt;td&gt;AWS Lambda (container, arm64, ap-northeast-1)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;File delivery&lt;/td&gt;
&lt;td&gt;Amazon S3 + presigned URLs (1-day lifecycle)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Auth&lt;/td&gt;
&lt;td&gt;Vercel OIDC Federation → IAM Role (keyless)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Preview&lt;/td&gt;
&lt;td&gt;marp-core (rendered in the browser)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fbn40i4xut5cqlezpojbj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fbn40i4xut5cqlezpojbj.png" alt=" " width="800" height="576"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  1. Multi-turn chat with Bedrock Converse
&lt;/h1&gt;

&lt;p&gt;I wanted a plain chat experience — user message, assistant reply, follow-up edits — so I pass the conversation directly to the Bedrock Converse API's &lt;code&gt;messages[]&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Message shape
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;ChatMessage&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;assistant&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;markdown&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each assistant message holds both &lt;code&gt;content&lt;/code&gt; (the short conversational reply) and &lt;code&gt;markdown&lt;/code&gt; (the Marp deck). When sending to the API, I concatenate them into a single message:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;buildApiMessages&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;ChatMessage&lt;/span&gt;&lt;span class="p"&gt;[]):&lt;/span&gt; &lt;span class="nx"&gt;ApiMessage&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;m&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;role&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
      &lt;span class="nx"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;role&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;assistant&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;markdown&lt;/span&gt;
        &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;\n\n&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;markdown&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;trim&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;}));&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  System prompt: reply first, then Markdown
&lt;/h2&gt;

&lt;p&gt;I want both a chat reply and a Marp deck back, so the system prompt enforces the format:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="p"&gt;-&lt;/span&gt; Start with a 1-2 sentence natural reply.
&lt;span class="p"&gt;-&lt;/span&gt; If generating or editing slides, follow the reply (no blank line) with Marp Markdown.
&lt;span class="p"&gt;-&lt;/span&gt; Never wrap the Markdown in code fences (&lt;span class="sb"&gt;```&lt;/span&gt;

).
&lt;span class="p"&gt;-&lt;/span&gt; On edit requests, re-output the entire deck (no diffs, no patches).


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

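&lt;p&gt;The server splits on the first &lt;code&gt;---\nmarp:&lt;/code&gt; occurrence, falling back to the first bare &lt;code&gt;---&lt;/code&gt; line if the &lt;code&gt;marp:&lt;/code&gt; key is missing:&lt;/p&gt;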

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
ts
function parseResponse(text: string): { message: string; markdown: string } {
  const trimmed = text.trim();
  let idx = trimmed.search(/(?:^|\n)---\s*\n\s*marp:/);
  if (idx === -1) idx = trimmed.search(/(?:^|\n)---\s*\n/);
  if (idx === -1) return { message: trimmed, markdown: "" };
  const splitAt = idx === 0 ? 0 : idx + 1;
  return {
    message: trimmed.slice(0, splitAt).trim(),
    markdown: trimmed.slice(splitAt).trim(),
  };
}


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Full-deck re-outputs are easier to keep coherent than diffs (frontmatter, page breaks, closing slide all stay consistent). Since the chat history is already in Bedrock's context, vague follow-ups like &lt;code&gt;"shorter please"&lt;/code&gt; still work.&lt;/p&gt;

&lt;h1&gt;
  
  
  2. Making preview and export look the same
&lt;/h1&gt;

&lt;p&gt;Marp themes are CSS. The challenge:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Preview: &lt;code&gt;marp-core&lt;/code&gt; running in the browser&lt;/li&gt;
&lt;li&gt;Export: &lt;code&gt;marp-cli&lt;/code&gt; running in Lambda&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If the theme CSS or font override drifts between the two, you get the classic "preview looks fine but the exported file looks different" bug.&lt;/p&gt;

&lt;h2&gt;
  
  
  One source of truth for theme CSS
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;src/lib/themes.ts&lt;/code&gt; defines six custom theme CSS strings. Both sides consume them:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
ts
// Preview side (marp-core)
for (const t of CUSTOM_THEMES) {
  if (t.css) marp.themeSet.add(t.css);
}

// Export side (Lambda: marp-cli)
if (themeCss) {
  const themePath = path.join(work, "theme.css");
  await writeFile(themePath, themeCss, "utf8");
  args.push("--theme-set", themePath);
}


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h2&gt;
  
  
  Font override via a &lt;code&gt;&amp;lt;style&amp;gt;&lt;/code&gt; block in the Markdown
&lt;/h2&gt;

&lt;p&gt;Marp treats &lt;code&gt;&amp;lt;style&amp;gt;&lt;/code&gt; tags inside the Markdown as deck-wide CSS. Inject the same override into both &lt;code&gt;marp-core&lt;/code&gt; and &lt;code&gt;marp-cli&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
ts
export function fontStyleBlock(font: FontId): string {
  const stack = getFontStack(font);
  if (!stack) return "";
  return `&amp;lt;style&amp;gt;
section,
section :is(h1, h2, h3, h4, h5, h6, p, li, blockquote, th, td, a, strong, em) {
  font-family: ${stack} !important;
}
&amp;lt;/style&amp;gt;`;
}


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h1&gt;
  
  
  3. User-defined themes (localStorage)
&lt;/h1&gt;

&lt;p&gt;I wanted to let users paste any Marp CSS and use it as a theme. The trick is making sure the &lt;code&gt;/* @theme name */&lt;/code&gt; header is always present — if the user forgot, generate one from the label and prepend it:&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
ts
export function ensureThemeHeader(
  css: string,
  fallbackLabel: string,
): { name: string; css: string } {
  const existing = parseThemeName(css);
  if (existing) return { name: existing, css };
  const name = slugifyThemeName(fallbackLabel);
  return { name, css: `/* @theme ${name} */\n${css}` };
}


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The flow: check the name against the built-in theme names → save to localStorage → list under a "My Themes" optgroup → on export, send the CSS via &lt;code&gt;userThemeCss&lt;/code&gt; so Lambda's existing &lt;code&gt;--theme-set&lt;/code&gt; path picks it up. No Lambda changes required.&lt;/p&gt;
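&lt;p&gt;The two helpers that &lt;code&gt;ensureThemeHeader&lt;/code&gt; relies on are not shown in this post. The sketch below guesses at how they could behave, written in Python only so it can be run on its own; the regular expression, the slug rules, and the &lt;code&gt;custom-theme&lt;/code&gt; fallback are all assumptions, not the app's actual implementation.&lt;/p&gt;

```python
import re

# Matches "@theme name" inside a single-line CSS comment,
# e.g. /* @theme my-theme */
THEME_RE = re.compile(r"/\*[^*]*@theme\s+([\w-]+)")


def parse_theme_name(css: str):
    """Return the declared theme name, or None if no header is found."""
    match = THEME_RE.search(css)
    return match.group(1) if match else None


def slugify_theme_name(label: str) -> str:
    """Lowercase the label and collapse other characters into hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", label.lower()).strip("-")
    return slug or "custom-theme"  # hypothetical fallback for empty slugs


def ensure_theme_header(css: str, fallback_label: str):
    """Return (name, css), prepending a header when the CSS lacks one."""
    existing = parse_theme_name(css)
    if existing:
        return existing, css
    name = slugify_theme_name(fallback_label)
    return name, f"/* @theme {name} */\n{css}"
```

&lt;p&gt;With these rules, a label such as &lt;code&gt;My Fancy Theme!&lt;/code&gt; becomes the theme name &lt;code&gt;my-fancy-theme&lt;/code&gt;, and CSS that already declares a theme passes through unchanged.&lt;/p&gt;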

&lt;h1&gt;
  
  
  4. Lambda container: Chromium + LibreOffice in one box
&lt;/h1&gt;

&lt;p&gt;One Lambda handles both formats:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;marp-cli&lt;/code&gt; + Chromium → PDF (Marp's default path)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;marp-cli&lt;/code&gt; + LibreOffice → editable PPTX (&lt;code&gt;--pptx --pptx-editable&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Three Lambda-specific gotchas worth flagging.&lt;/p&gt;

&lt;h2&gt;
  
  
  Chromium dies with &lt;code&gt;Connection closed&lt;/code&gt;
&lt;/h2&gt;

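&lt;p&gt;Lambda has a read-only filesystem (outside &lt;code&gt;/tmp&lt;/code&gt;), a tiny &lt;code&gt;/dev/shm&lt;/code&gt;, and no support for Chromium's sandbox, so Chromium's zygote/multi-process model falls over. Marp doesn't expose low-level browser args, so point it at a wrapper script instead, either with &lt;code&gt;--browser-path&lt;/code&gt; or, as here, the &lt;code&gt;CHROME_PATH&lt;/code&gt; environment variable:&lt;/p&gt;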

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
dockerfile
RUN printf '#!/bin/sh\nexec /usr/bin/chromium \
  --no-sandbox --disable-dev-shm-usage --disable-gpu \
  --disable-software-rasterizer --single-process --no-zygote "$@"\n' \
  &amp;gt; /usr/local/bin/chromium-marp \
  &amp;amp;&amp;amp; chmod +x /usr/local/bin/chromium-marp
ENV CHROME_PATH=/usr/local/bin/chromium-marp


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h2&gt;
  
  
  marp-cli hangs waiting on stdin
&lt;/h2&gt;

&lt;p&gt;When marp-cli is launched with Node's &lt;code&gt;child_process.spawn&lt;/code&gt; in Lambda, stdin is an open pipe by default, so &lt;code&gt;marp&lt;/code&gt; blocks waiting for input. Always pass &lt;code&gt;--no-stdin&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
js
const args = ["--no-stdin", "--allow-local-files"];


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h2&gt;
  
  
  M PLUS Rounded 1c font family name mismatch
&lt;/h2&gt;

&lt;p&gt;The TTF's internal family name is &lt;code&gt;Rounded Mplus 1c&lt;/code&gt;, so listing only &lt;code&gt;'M PLUS Rounded 1c'&lt;/code&gt; in CSS won't match. List both:&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
css
font-family: 'M PLUS Rounded 1c','Rounded Mplus 1c',sans-serif;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h1&gt;
  
  
  5. Multi-stage build: 3.18 GB → 2.39 GB
&lt;/h1&gt;

&lt;p&gt;My first Dockerfile was a single stage at 3.18 GB. The culprit: the toolchain needed for &lt;code&gt;aws-lambda-ric&lt;/code&gt;'s native build (&lt;code&gt;g++ / cmake / automake / python3 / libcurl4-openssl-dev&lt;/code&gt;) was still in the runtime image.&lt;/p&gt;

&lt;p&gt;Split into three stages — runtime dropped to 2.39 GB (-790 MB, -25%):&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
dockerfile
# 1. Font fetcher
FROM debian:bookworm-slim AS fonts
RUN apt-get install -y --no-install-recommends curl ca-certificates
COPY install-fonts.sh /tmp/
RUN sh /tmp/install-fonts.sh

# 2. node_modules builder (aws-lambda-ric native build)
FROM node:22-bookworm-slim AS builder
RUN apt-get install -y --no-install-recommends \
      g++ make cmake autoconf automake libtool pkg-config python3 \
      libcurl4-openssl-dev ca-certificates
COPY package.json ./
RUN npm install --omit=dev &amp;amp;&amp;amp; npm cache clean --force

# 3. Runtime
FROM node:22-bookworm-slim AS runtime
RUN apt-get install -y --no-install-recommends \
      chromium libreoffice-impress fonts-noto-cjk fonts-noto-color-emoji fontconfig
COPY --from=fonts /usr/share/fonts/truetype/marpfonts /usr/share/fonts/truetype/marpfonts
RUN fc-cache -f /usr/share/fonts/truetype/marpfonts
COPY --from=builder /build/node_modules ./node_modules


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Key points:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;fonts&lt;/code&gt; stage uses &lt;code&gt;curl&lt;/code&gt; to grab 7 font families — runtime never sees &lt;code&gt;curl&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;builder&lt;/code&gt; stage owns the toolchain; only &lt;code&gt;node_modules&lt;/code&gt; is copied to runtime&lt;/li&gt;
&lt;li&gt;Runtime only carries &lt;code&gt;chromium + libreoffice-impress + fonts-noto-cjk + fonts-noto-color-emoji + fontconfig&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  6. Keyless AWS from Vercel via OIDC Federation
&lt;/h1&gt;

&lt;p&gt;I didn't want to store AWS access keys in Vercel env vars. Vercel's OIDC Federation lets you exchange a short-lived token for AWS credentials.&lt;/p&gt;

&lt;p&gt;The IAM trust policy pins the role to one Vercel team, project, and environment:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
json
{
  "Effect": "Allow",
  "Principal": {
    "Federated": "arn:aws:iam::761018866498:oidc-provider/oidc.vercel.com/yuuki-yamashitas-projects"
  },
  "Action": "sts:AssumeRoleWithWebIdentity",
  "Condition": {
    "StringEquals": {
      "oidc.vercel.com/yuuki-yamashitas-projects:aud":
        "https://vercel.com/yuuki-yamashitas-projects",
      "oidc.vercel.com/yuuki-yamashitas-projects:sub":
        "owner:yuuki-yamashitas-projects:project:marp-ai-app:environment:production"
    }
  }
}


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;On the Next.js side, &lt;code&gt;@vercel/functions/oidc&lt;/code&gt; does the heavy lifting:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
ts
import { awsCredentialsProvider } from "@vercel/functions/oidc";

export function awsCredentials() {
  if (process.env.VERCEL) {
    return awsCredentialsProvider({
      roleArn: process.env.AWS_ROLE_ARN!,
    });
  }
  return undefined; // local: default AWS credential chain
}


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Locally I fall back to &lt;code&gt;~/.aws/credentials&lt;/code&gt; via the default chain.&lt;/p&gt;

&lt;h1&gt;
  
  
  Wrap-up
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;Multi-turn chat editing falls out of Bedrock Converse &lt;code&gt;messages[]&lt;/code&gt; if you let the model re-output the full deck on every turn&lt;/li&gt;
&lt;li&gt;Match preview and export by sharing theme CSS and injecting font overrides through a &lt;code&gt;&amp;lt;style&amp;gt;&lt;/code&gt; block in the Markdown&lt;/li&gt;
&lt;li&gt;For Chromium + LibreOffice in one Lambda: &lt;code&gt;--single-process --no-zygote --no-sandbox&lt;/code&gt; and &lt;code&gt;--no-stdin&lt;/code&gt; are non-negotiable&lt;/li&gt;
&lt;li&gt;Multi-stage build saved 790 MB&lt;/li&gt;
&lt;li&gt;Vercel → AWS goes through OIDC Federation, no keys involved&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Repo: &lt;a href="https://github.com/yama3133/marp-ai-app" rel="noopener noreferrer"&gt;yama3133/marp-ai-app&lt;/a&gt;&lt;br&gt;
Live: &lt;a href="https://marp-ai-app.vercel.app" rel="noopener noreferrer"&gt;marp-ai-app.vercel.app&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Feedback welcome.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>bedrock</category>
      <category>nextjs</category>
      <category>marp</category>
    </item>
    <item>
      <title>Voice Typing Anywhere on macOS — I Built Apex Voice with mlx-whisper, Amazon Bedrock, and Strands Agents</title>
      <dc:creator>Yuuki Yamashita</dc:creator>
      <pubDate>Fri, 19 Jun 2026 14:22:06 +0000</pubDate>
      <link>https://dev.to/_76130e67067eab4c8510/voice-typing-anywhere-on-macos-i-built-apex-voice-with-mlx-whisper-amazon-bedrock-and-strands-18a8</link>
      <guid>https://dev.to/_76130e67067eab4c8510/voice-typing-anywhere-on-macos-i-built-apex-voice-with-mlx-whisper-amazon-bedrock-and-strands-18a8</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;A weekend project that turned into a daily-driver: a macOS menu-bar app that lets you talk into &lt;strong&gt;any&lt;/strong&gt; input field — Slack, browser, Notes, anywhere — powered by local Whisper and Amazon Bedrock.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;p&gt;I built &lt;strong&gt;Apex Voice&lt;/strong&gt;, an open-source macOS voice typing tool. It listens through your microphone, transcribes speech offline with &lt;a href="https://github.com/ml-explore/mlx-examples/tree/main/whisper" rel="noopener noreferrer"&gt;mlx-whisper&lt;/a&gt;, and inserts the result wherever your cursor is. With Amazon Bedrock layered on top, it can also polish the text, translate it, or execute agent actions like "add a reminder" or "summarize this page for me."&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Repo: &lt;a href="https://github.com/yama3133/apex-voice" rel="noopener noreferrer"&gt;github.com/yama3133/apex-voice&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Companion web app: &lt;a href="https://apex-voice-web.vercel.app" rel="noopener noreferrer"&gt;apex-voice-web.vercel.app&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why I Built It
&lt;/h2&gt;

&lt;p&gt;macOS already has built-in dictation, and there are great commercial tools like Aqua Voice. So why bother?&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Built-in dictation&lt;/strong&gt; doesn't reliably work in every app, and Japanese accuracy is uneven.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Aqua Voice&lt;/strong&gt; is polished but closed and paid.&lt;/li&gt;
&lt;li&gt;I wanted to &lt;strong&gt;own the stack&lt;/strong&gt;: pick my model, my post-processing, my agent tools. And to learn by building.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  What It Does
&lt;/h2&gt;

&lt;p&gt;The core loop:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Mic → VAD → mlx-whisper → (optional Bedrock post-process) → Clipboard → ⌘V into any app
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On top of that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Post-process modes&lt;/strong&gt; powered by Claude Haiku 4.5 on Bedrock: &lt;code&gt;polish&lt;/code&gt;, &lt;code&gt;formal&lt;/code&gt;, &lt;code&gt;translate&lt;/code&gt;, &lt;code&gt;bullets&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agent mode&lt;/strong&gt; via &lt;a href="https://github.com/strands-agents/sdk-python" rel="noopener noreferrer"&gt;Strands Agents&lt;/a&gt;: one utterance can trigger multiple tool calls — add a reminder, open a calendar event, fetch a webpage and summarize it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vocabulary learning&lt;/strong&gt; with &lt;a href="https://aws.amazon.com/bedrock/agentcore/" rel="noopener noreferrer"&gt;Amazon Bedrock AgentCore Memory&lt;/a&gt;: proper nouns and domain terms accumulate over time and get injected as Whisper's &lt;code&gt;initial_prompt&lt;/code&gt;, so accuracy improves with use.&lt;/li&gt;
&lt;/ul&gt;
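&lt;p&gt;As a rough illustration of the vocabulary injection, here is a minimal sketch of turning learned terms into a prompt string. The function name &lt;code&gt;build_initial_prompt&lt;/code&gt;, the comma-separated format, and the character budget are my assumptions for this example, not the app's actual code; the result would then be passed as the &lt;code&gt;initial_prompt&lt;/code&gt; argument when transcribing.&lt;/p&gt;

```python
def build_initial_prompt(terms, max_chars=200):
    """Join learned vocabulary into a hint string for Whisper's initial_prompt.

    `terms` is assumed to be ordered from most to least important. Duplicates
    are dropped and the result never exceeds `max_chars` characters.
    """
    seen = set()
    kept = []
    for term in terms:
        term = term.strip()
        if not term or term in seen:
            continue
        candidate = ", ".join(kept + [term])
        # Stop once adding the next term would exceed the character budget.
        if candidate != candidate[:max_chars]:
            break
        seen.add(term)
        kept.append(term)
    return ", ".join(kept)


# Hypothetical vocabulary, as it might come back from a memory store.
vocabulary = ["Bedrock", "AgentCore", "Strands Agents", "Bedrock", "mlx-whisper"]
prompt = build_initial_prompt(vocabulary, max_chars=30)
print(prompt)
```

&lt;p&gt;Keeping the prompt short matters because Whisper only conditions on a limited amount of prompt text, so the most useful terms should come first.&lt;/p&gt;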

&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;

&lt;p&gt;The whole thing runs as a Python process managed by &lt;code&gt;launchd&lt;/code&gt;. Local-only by default; Bedrock features kick in when you enable them.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fdlbx4ecm7siaflkbguoc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fdlbx4ecm7siaflkbguoc.png" alt="Apex Voice architecture" width="800" height="498"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The main pipeline (mic → whisper → insertion) is fully offline. Bedrock and AgentCore Memory are auxiliary — they make the experience richer but the app works without them.&lt;/p&gt;

&lt;h2&gt;
  
  
  The AWS Side
&lt;/h2&gt;

&lt;p&gt;Three Bedrock-shaped pieces:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Piece&lt;/th&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Amazon Bedrock (Claude Haiku 4.5)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Post-processing (rewrite/translate/bullets), agent classification, page summarization&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Strands Agents&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Defines tool schemas and orchestrates multi-step calls to Claude&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Amazon Bedrock AgentCore Memory&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Persists extracted vocabulary across sessions; injected as Whisper prompt hints&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;I picked &lt;strong&gt;Haiku 4.5&lt;/strong&gt; because post-processing runs on &lt;em&gt;every utterance&lt;/em&gt;. Sub-second latency matters more than top-tier reasoning here. For agent mode, Haiku is still strong enough to reliably pick the right tool from ~10 options.&lt;/p&gt;

&lt;p&gt;The companion web app (&lt;code&gt;apex-voice-web&lt;/code&gt;) runs on &lt;strong&gt;Vercel&lt;/strong&gt; with a Python serverless function that calls Bedrock for URL classification and summarization. It uses &lt;strong&gt;Upstash Redis&lt;/strong&gt; for a live history feed. The macOS app fires history entries to it via a non-blocking POST.&lt;/p&gt;
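&lt;p&gt;The non-blocking POST can be sketched with nothing but the standard library. This is an assumption about the general shape, not the app's real code: the helper names &lt;code&gt;post_json&lt;/code&gt; and &lt;code&gt;fire_and_forget&lt;/code&gt;, the timeout, and the payload are hypothetical.&lt;/p&gt;

```python
import json
import threading
import urllib.request


def post_json(url, payload, timeout=3.0):
    """Send one JSON POST; swallow network errors so history never breaks typing."""
    data = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}, method="POST"
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except OSError:
        # URLError and socket timeouts are OSError subclasses; history is best-effort.
        return None


def fire_and_forget(url, payload, send=post_json):
    """Run the POST on a daemon thread and return immediately."""
    worker = threading.Thread(target=send, args=(url, payload), daemon=True)
    worker.start()
    return worker
```

&lt;p&gt;Because the thread is a daemon, a slow or unreachable history endpoint cannot delay text insertion or keep the process alive on quit.&lt;/p&gt;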

&lt;h2&gt;
  
  
  The Hard Part Wasn't AI. It Was Packaging.
&lt;/h2&gt;

&lt;p&gt;I lost almost a full day to &lt;code&gt;py2app&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The plan was obvious: &lt;code&gt;setup.py py2app&lt;/code&gt; → &lt;code&gt;dist/Apex Voice.app&lt;/code&gt; → drop it in &lt;code&gt;/Applications&lt;/code&gt; → done. Reality:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;[12:12:22] Recognition error: bad local file header:
  '/Users/.../Apex Voice.app/Contents/Resources/lib/python312.zip'
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;py2app&lt;/code&gt; bundles your dependencies into a &lt;code&gt;python312.zip&lt;/code&gt;, and &lt;code&gt;mlx-whisper&lt;/code&gt;'s native extensions don't survive being zipped. I tried &lt;code&gt;zip_include_packages: []&lt;/code&gt; — not a valid py2app option. I tried a shell-script launcher inside the &lt;code&gt;.app&lt;/code&gt; — macOS warned the user about needing Rosetta. I tried a hand-compiled arm64 launcher binary — that worked, but every iteration meant re-granting Accessibility permission because the bundle signature changed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The fix: stop building a &lt;code&gt;.app&lt;/code&gt; entirely.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I switched to a &lt;strong&gt;LaunchAgent&lt;/strong&gt; plist (&lt;code&gt;com.yamashita.apexvoice.plist&lt;/code&gt;) that points directly at the venv's Python and the script. Drop it in &lt;code&gt;~/Library/LaunchAgents/&lt;/code&gt;, &lt;code&gt;launchctl load&lt;/code&gt;, done. With &lt;code&gt;KeepAlive: true&lt;/code&gt;, it auto-restarts on crash. The "restart" menu item just calls &lt;code&gt;rumps.quit_application()&lt;/code&gt; and lets launchd bring it back.&lt;/p&gt;
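&lt;p&gt;For readers who have not written a LaunchAgent before, here is a minimal sketch that generates such a plist with Python's &lt;code&gt;plistlib&lt;/code&gt;. The paths are placeholders, and &lt;code&gt;RunAtLoad&lt;/code&gt; plus the log path keys are my assumptions; the repo's actual plist may differ.&lt;/p&gt;

```python
import plistlib


def build_launch_agent(label, python_path, script_path, log_path):
    """Return the bytes of a LaunchAgent plist that keeps the script running."""
    agent = {
        "Label": label,
        # Point launchd straight at the venv interpreter and the script.
        "ProgramArguments": [python_path, script_path],
        "RunAtLoad": True,   # start as soon as the agent is loaded
        "KeepAlive": True,   # launchd restarts the process if it exits
        "StandardOutPath": log_path,
        "StandardErrorPath": log_path,
    }
    return plistlib.dumps(agent)


# Placeholder paths; replace them with the location of your own clone.
plist_bytes = build_launch_agent(
    "com.yamashita.apexvoice",
    "/path/to/apex-voice/.venv/bin/python",
    "/path/to/apex-voice/voicetype.py",
    "/tmp/apexvoice.log",
)
print(plist_bytes.decode("utf-8"))
```

&lt;p&gt;Since launchd runs the same interpreter path on every start, nothing is re-bundled or re-signed between iterations.&lt;/p&gt;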

&lt;p&gt;Two small touches kept it feeling like a proper app:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;setproctitle&lt;/span&gt;
&lt;span class="n"&gt;setproctitle&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setproctitle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Apex Voice&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So Activity Monitor shows "Apex Voice" instead of "python3.12". And for the menu-bar icon, I rendered SF Symbols (&lt;code&gt;waveform.and.mic&lt;/code&gt; / &lt;code&gt;mic.fill&lt;/code&gt;) to PNG once and loaded them as a template image — they auto-adapt to light/dark mode.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Learned
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Don't fight the OS packaging story when you have a better path.&lt;/strong&gt; A LaunchAgent + venv beat py2app on every axis: simpler, more stable, easier to update.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pick the model for the latency profile, not the leaderboard.&lt;/strong&gt; Haiku 4.5 wins here because users feel every 500 ms in a voice typing loop.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bluetooth mic quality is an OS problem, not an app problem.&lt;/strong&gt; When a headset's mic engages, the link switches to HFP and the sample rate drops to 8 kHz (16 kHz at best with wideband speech). No app-level fix exists on macOS. (Windows handles this slightly better in some driver combos, but the underlying HFP/A2DP tradeoff is universal.)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Accessibility-driven keystroke injection (Cmd-V sent via &lt;code&gt;osascript&lt;/code&gt;) is the right primitive.&lt;/strong&gt; It works in every app I've tried: Slack, browsers, native editors. No per-app integration needed.&lt;/li&gt;
&lt;/ul&gt;
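&lt;p&gt;The clipboard-then-paste primitive from the last point can be sketched as follows. This is a simplified assumption of how it might look, not the app's real code: &lt;code&gt;insert_text&lt;/code&gt; is a hypothetical name, and the &lt;code&gt;run&lt;/code&gt; parameter exists only so the function can be exercised without touching the real clipboard.&lt;/p&gt;

```python
import subprocess

# AppleScript that presses Cmd-V in whichever app currently has focus.
PASTE_SCRIPT = 'tell application "System Events" to keystroke "v" using command down'


def insert_text(text, run=subprocess.run):
    """Copy text to the clipboard, then send Cmd-V to the frontmost app."""
    # pbcopy reads the new clipboard contents from stdin.
    run(["pbcopy"], input=text.encode("utf-8"), check=True)
    # System Events needs Accessibility permission to synthesize keystrokes.
    run(["osascript", "-e", PASTE_SCRIPT], check=True)
```

&lt;p&gt;On a Mac, calling &lt;code&gt;insert_text("hello")&lt;/code&gt; pastes into the focused field. One side effect of this design is that the user's previous clipboard contents are overwritten.&lt;/p&gt;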

&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Vocabulary learning UI&lt;/strong&gt; — show the user what AgentCore Memory has learned.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Windows port&lt;/strong&gt; — swap &lt;code&gt;mlx-whisper&lt;/code&gt; for &lt;code&gt;faster-whisper&lt;/code&gt;, &lt;code&gt;rumps&lt;/code&gt; for &lt;code&gt;pystray&lt;/code&gt;. The core loop is OS-agnostic.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Approval-gated agent actions&lt;/strong&gt; — wire it into &lt;a href="https://github.com/yama3133/aegis-slack-app" rel="noopener noreferrer"&gt;Aegis&lt;/a&gt;, a Slack-based approval plane I'm building for AI agents.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/yama3133/apex-voice.git
&lt;span class="nb"&gt;cd &lt;/span&gt;apex-voice
/opt/homebrew/bin/python3.12 &lt;span class="nt"&gt;-m&lt;/span&gt; venv .venv
.venv/bin/pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt
&lt;span class="c"&gt;# Run directly:&lt;/span&gt;
.venv/bin/python voicetype.py
&lt;span class="c"&gt;# Or install as a LaunchAgent (auto-start, auto-restart):&lt;/span&gt;
&lt;span class="c"&gt;# Edit com.yamashita.apexvoice.plist to point at your clone path&lt;/span&gt;
&lt;span class="nb"&gt;cp &lt;/span&gt;com.yamashita.apexvoice.plist ~/Library/LaunchAgents/
launchctl load ~/Library/LaunchAgents/com.yamashita.apexvoice.plist
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You'll need an Apple Silicon Mac (mlx is Apple Silicon only) and AWS credentials if you want post-processing and agent features.&lt;/p&gt;




&lt;p&gt;If you build something similar, or hit the same py2app wall I did, I'd love to hear about it. Code, issues, and PRs welcome on &lt;a href="https://github.com/yama3133/apex-voice" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>bedrock</category>
      <category>macos</category>
      <category>strandsagents</category>
    </item>
    <item>
      <title>I gave my AI agent a boss: a human-approval gate in Slack, over MCP</title>
      <dc:creator>Yuuki Yamashita</dc:creator>
      <pubDate>Thu, 11 Jun 2026 14:49:11 +0000</pubDate>
      <link>https://dev.to/_76130e67067eab4c8510/i-gave-my-ai-agent-a-boss-a-human-approval-gate-in-slack-over-mcp-5age</link>
      <guid>https://dev.to/_76130e67067eab4c8510/i-gave-my-ai-agent-a-boss-a-human-approval-gate-in-slack-over-mcp-5age</guid>
      <description>&lt;p&gt;AI agents can now &lt;em&gt;act&lt;/em&gt;, not just suggest. They issue refunds, run migrations, message customers. That's powerful — and a little terrifying. "Autonomous" should not mean "unsupervised." The moment an agent can spend money or drop a production table, someone needs to be able to say &lt;strong&gt;"wait — not like that."&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;So I built &lt;strong&gt;Aegis&lt;/strong&gt;: a human-approval control plane for AI agents. Before an agent does anything high-risk, it asks a human for approval &lt;strong&gt;in Slack&lt;/strong&gt;, and waits.&lt;/p&gt;

&lt;p&gt;▶️ &lt;strong&gt;90-second demo:&lt;/strong&gt; &lt;a href="https://youtu.be/c1jqPDPo6AU" rel="noopener noreferrer"&gt;https://youtu.be/c1jqPDPo6AU&lt;/a&gt;&lt;br&gt;
💻 &lt;strong&gt;Code:&lt;/strong&gt; &lt;a href="https://github.com/yama3133/aegis-slack-app" rel="noopener noreferrer"&gt;https://github.com/yama3133/aegis-slack-app&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  The shape of the idea
&lt;/h2&gt;

&lt;p&gt;The gate has to be &lt;strong&gt;decoupled&lt;/strong&gt; from the agent: I didn't want to wire approval logic into every agent or framework. The Model Context Protocol (MCP) is the perfect seam. Aegis is just an MCP server with three tools, and &lt;em&gt;any&lt;/em&gt; agent can adopt it without changing its reasoning.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/mcp-server.ts&lt;/span&gt;
&lt;span class="nx"&gt;server&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;registerTool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;request_approval&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Request human approval in Slack before executing a high-risk action. &lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ALWAYS call this before destructive, financial, or externally visible actions.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;inputSchema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="na"&gt;action&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;record&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;unknown&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt;
    &lt;span class="na"&gt;risk&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;enum&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;low&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;medium&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;high&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;critical&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
    &lt;span class="na"&gt;reason&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;optional&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;postApprovalRequest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;slack&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;DEFAULT_CHANNEL&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;request_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;auto_approved&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;autoApproved&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three tools in total: &lt;code&gt;request_approval&lt;/code&gt;, &lt;code&gt;check_approval&lt;/code&gt;, and &lt;code&gt;wait_for_approval&lt;/code&gt;. The agent calls &lt;code&gt;request_approval&lt;/code&gt; before a risky action, then blocks on &lt;code&gt;wait_for_approval&lt;/code&gt; until a human decides.&lt;/p&gt;
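&lt;p&gt;To make the request-then-wait contract concrete, here is a small sketch of the blocking step written as a polling loop. It is only an illustration: &lt;code&gt;wait_until_decided&lt;/code&gt;, the status strings, and the polling limits are assumptions for this example, and the real &lt;code&gt;wait_for_approval&lt;/code&gt; tool may work differently.&lt;/p&gt;

```python
import time

# Assumed status names; once one of these is returned, the decision is final.
TERMINAL_STATUSES = {"approved", "denied", "expired"}


def wait_until_decided(check_status, request_id, poll_seconds=2.0,
                       max_polls=150, sleep=time.sleep):
    """Poll check_status(request_id) until it returns a terminal status.

    Falls back to "expired" if no decision arrives within max_polls attempts,
    so an unanswered request never results in the action running.
    """
    for _ in range(max_polls):
        status = check_status(request_id)
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
    return "expired"


# Demo with a fake status source instead of a real approval backend.
fake_statuses = iter(["pending", "pending", "approved"])
decision = wait_until_decided(
    lambda _request_id: next(fake_statuses), "req-1", sleep=lambda _seconds: None
)
print(decision)
```

&lt;p&gt;Treating a timeout the same as an expiry keeps the loop fail-safe: the caller only proceeds on an explicit approval.&lt;/p&gt;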

&lt;h2&gt;
  
  
  The agent side
&lt;/h2&gt;

&lt;p&gt;In the demo, the agent is an Amazon Bedrock model (Claude Sonnet 4.6) running a Converse API tool-use loop. The only thing that makes it "safe" is a system prompt and the MCP tools:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Before ANY high-risk action (refunds, deletions, external messages, anything
financial or destructive), you MUST call request_approval ... then call
wait_for_approval until you get a terminal status.
If approved with edited arguments, you MUST use the returned arguments.
If denied or expired, do NOT perform the action.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The agent pauses on its own. No orchestration framework required — just a tool call.&lt;/p&gt;

&lt;h2&gt;
  
  
  The approval card
&lt;/h2&gt;

&lt;p&gt;When a request comes in, Aegis posts a &lt;strong&gt;Block Kit&lt;/strong&gt; card to Slack. The reviewer sees the agent, the action, the risk level, the exact arguments, and four buttons:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ &lt;strong&gt;Approve&lt;/strong&gt; — the agent proceeds&lt;/li&gt;
&lt;li&gt;❌ &lt;strong&gt;Deny&lt;/strong&gt; — the agent safely aborts&lt;/li&gt;
&lt;li&gt;✏️ &lt;strong&gt;Edit &amp;amp; Approve&lt;/strong&gt; — the part I'm proudest of&lt;/li&gt;
&lt;li&gt;❓ &lt;strong&gt;Request Info&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The feature I didn't expect to love: Edit &amp;amp; Approve
&lt;/h2&gt;

&lt;p&gt;A plain yes/no felt too blunt. Real reviewers don't just &lt;em&gt;stop&lt;/em&gt; an agent; they &lt;em&gt;correct&lt;/em&gt; it. So Edit &amp;amp; Approve opens a modal with the agent's arguments as editable JSON:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/blocks.ts&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;editModal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;modal&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;callback_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;aegis_edit_modal&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;plain_text&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`Edit #&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;submit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;plain_text&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Approve edited&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;blocks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;input&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;element&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;plain_text_input&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;multiline&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;initial_value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;}],&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the demo, a reviewer lowers a refund from &lt;strong&gt;$1,200 to $800&lt;/strong&gt; and approves. The agent then executes the &lt;em&gt;corrected&lt;/em&gt; amount and reports the change. Control, not just a veto.&lt;/p&gt;
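&lt;p&gt;Free-text JSON in a modal is easy to mistype, so the submitted value should be validated before it counts as an approval. Below is a minimal sketch of that check; &lt;code&gt;parse_edited_args&lt;/code&gt; and the field names are hypothetical and not taken from the Aegis source.&lt;/p&gt;

```python
import json


def parse_edited_args(raw_text, original_args):
    """Validate the JSON a reviewer submitted in the edit modal.

    Returns (args, changed_keys). Raises ValueError when the text is not a
    JSON object, so a typo in the modal cannot silently approve bad input.
    """
    try:
        edited = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Edited arguments are not valid JSON: {exc}") from exc
    if not isinstance(edited, dict):
        raise ValueError("Edited arguments must be a JSON object")
    # Report which keys differ so the agent can tell the user what changed.
    keys = set(original_args) | set(edited)
    changed = sorted(k for k in keys if original_args.get(k) != edited.get(k))
    return edited, changed


# Hypothetical refund request, edited from 1200 down to 800.
original = {"order_id": "A-1", "amount_usd": 1200}
submitted = '{"order_id": "A-1", "amount_usd": 800}'
args, changed = parse_edited_args(submitted, original)
print(args, changed)
```

&lt;p&gt;Returning the list of changed keys alongside the arguments gives the agent what it needs to report the correction, as it does in the demo.&lt;/p&gt;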

&lt;h2&gt;
  
  
  Not everything needs a human
&lt;/h2&gt;

&lt;p&gt;If every $45 goodwill refund paged a person, nobody would use this. So Aegis ships a small policy engine:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/policy.ts&lt;/span&gt;
&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;multi&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;risks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;risk&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;amount&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nx"&gt;multi&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;minAmountUsd&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;decision&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;human&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;approvalsRequired&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;   &lt;span class="c1"&gt;// e.g. drop a prod table&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;auto&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;enabled&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;auto&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;risks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;risk&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;amount&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="nx"&gt;auto&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;maxAmountUsd&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;decision&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;auto_approve&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;                  &lt;span class="c1"&gt;// e.g. a $45 refund&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;decision&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;human&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;approvalsRequired&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Auto-approve&lt;/strong&gt; low-risk actions (🤖 instantly).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;N-of-M&lt;/strong&gt; for critical ones — dropping a production table needs &lt;em&gt;two&lt;/em&gt; approvers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TTL&lt;/strong&gt; — anything left pending simply expires. Fail-safe by default.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Context, so it's a decision and not a rubber stamp
&lt;/h2&gt;

&lt;p&gt;The thing I learned: a good approval is less about a button and more about &lt;em&gt;context&lt;/em&gt;. Two touches:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Plain-language summary&lt;/strong&gt; of every action via Amazon Bedrock (Claude Haiku 4.5), so reviewers don't read raw JSON.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Related Slack messages&lt;/strong&gt; pulled onto the card with Slack's &lt;strong&gt;Real-Time Search API&lt;/strong&gt; (&lt;code&gt;assistant.search.context&lt;/code&gt;) — the relevant conversation is right there, with permalinks.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The fiddly bit: &lt;code&gt;assistant.search.context&lt;/code&gt; needs a fresh &lt;code&gt;action_token&lt;/code&gt;, which only arrives on a Slack assistant/mention event and expires within minutes, but approval requests originate &lt;em&gt;outside&lt;/em&gt; any Slack event. I cache the most recent token so out-of-band cards can still search.&lt;/p&gt;

&lt;h2&gt;
  
  
  How it all fits together
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6gm2iovmnj65xac9j1r8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6gm2iovmnj65xac9j1r8.png" alt="Aegis architecture" width="800" height="404"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Any agent → &lt;code&gt;request_approval&lt;/code&gt; over MCP → Aegis (policy + context + Block Kit) → Slack → human → the decision returns to the agent via &lt;code&gt;wait_for_approval&lt;/code&gt;. Every step is written to an audit log.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'd tell past me
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;MCP is the right abstraction for guardrails.&lt;/strong&gt; Because the gate is decoupled, it works with any agent.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;"Human in the loop" is a UX problem, not just a yes/no.&lt;/strong&gt; Summary + context + the ability to &lt;em&gt;edit&lt;/em&gt; are what make it usable.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Put the control surface where people already are.&lt;/strong&gt; Slack means zero new tooling for the approver.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Built for the &lt;strong&gt;Slack Agent Builder Challenge&lt;/strong&gt;. Stack: TypeScript · Slack Bolt (Socket Mode) · MCP · Amazon Bedrock · Slack Real-Time Search API.&lt;/p&gt;

&lt;p&gt;⭐ Code &amp;amp; setup: &lt;a href="https://github.com/yama3133/aegis-slack-app" rel="noopener noreferrer"&gt;https://github.com/yama3133/aegis-slack-app&lt;/a&gt;&lt;br&gt;
▶️ Demo: &lt;a href="https://youtu.be/c1jqPDPo6AU" rel="noopener noreferrer"&gt;https://youtu.be/c1jqPDPo6AU&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>slack</category>
      <category>mcp</category>
      <category>aws</category>
    </item>
    <item>
      <title>I built a web app that overlays Niconico-style scrolling comments onto your slides (PowerPoint-ready, keyless OIDC)</title>
      <dc:creator>Yuuki Yamashita</dc:creator>
      <pubDate>Mon, 08 Jun 2026 00:13:34 +0000</pubDate>
      <link>https://dev.to/_76130e67067eab4c8510/i-built-a-web-app-that-overlays-niconico-style-scrolling-comments-onto-your-slides-407d</link>
      <guid>https://dev.to/_76130e67067eab4c8510/i-built-a-web-app-that-overlays-niconico-style-scrolling-comments-onto-your-slides-407d</guid>
      <description>&lt;p&gt;What if comments scrolled across your talk slides like on Niconico Douga (a Japanese video site famous for comments flying over the video)? That idea turned into a web app that overlays scrolling comments onto an existing &lt;code&gt;.pptx&lt;/code&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Live: &lt;a href="https://nico-comment-app.vercel.app" rel="noopener noreferrer"&gt;https://nico-comment-app.vercel.app&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Source: &lt;a href="https://github.com/yama3133/nico-comment-app" rel="noopener noreferrer"&gt;https://github.com/yama3133/nico-comment-app&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faikq9srx6a4wignfq3v0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faikq9srx6a4wignfq3v0.png" alt="Main screen" width="800" height="538"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What it does
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Upload a &lt;code&gt;.pptx&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Pick which slides get comments (&lt;code&gt;all&lt;/code&gt; / &lt;code&gt;1,3,5-7&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Type comments with colors, tune the scroll speed, etc.&lt;/li&gt;
&lt;li&gt;Hit "Generate" and download a &lt;code&gt;.pptx&lt;/code&gt; with the comments baked in&lt;/li&gt;
&lt;/ol&gt;
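
&lt;p&gt;The slide selector in step 2 could be parsed along these lines. This is a minimal sketch, not the app's actual code: the function name &lt;code&gt;parse_slide_spec&lt;/code&gt; and the rule that out-of-range numbers are silently dropped are assumptions made for illustration.&lt;/p&gt;

```python
def parse_slide_spec(spec: str, n_slides: int) -> list[int]:
    """Expand a spec such as "all" or "1,3,5-7" into sorted 1-based slide numbers.

    Hypothetical helper: numbers outside 1..n_slides are dropped (an assumption).
    """
    spec = spec.strip().lower()
    if spec == "all":
        return list(range(1, n_slides + 1))

    selected: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas such as "1,,3"
        if "-" in part:
            first, last = (int(p) for p in part.split("-", 1))
        else:
            first = last = int(part)
        if first > last:
            first, last = last, first  # accept reversed ranges like "7-5"
        selected.update(range(first, last + 1))

    # Keep only slide numbers that exist in the deck.
    return sorted(n for n in selected if n_slides >= n >= 1)
```

&lt;p&gt;With a 10-slide deck, &lt;code&gt;parse_slide_spec("1,3,5-7", 10)&lt;/code&gt; yields slides 1, 3, 5, 6 and 7. Returning a sorted, de-duplicated list keeps the later overlay step simple.&lt;/p&gt;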

&lt;p&gt;Run the slideshow in PowerPoint / Keynote and the comments scroll right-to-left.&lt;/p&gt;

&lt;h2&gt;
  
  
  First hurdle: how to animate inside PowerPoint
&lt;/h2&gt;

&lt;p&gt;My first attempt used &lt;strong&gt;PowerPoint's native motion-path animation&lt;/strong&gt; — writing the XML that moves a text box from right to left directly via &lt;code&gt;python-pptx&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;It failed &lt;strong&gt;three times in a row&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;1st: all comments stacked in the center (wrong initial position)&lt;/li&gt;
&lt;li&gt;2nd: nothing moved in the slideshow — it just froze&lt;/li&gt;
&lt;li&gt;3rd: PowerPoint reported "there is a problem with the content" and &lt;strong&gt;"repaired" it by deleting the animation&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The root cause was hand-written animation XML that didn't match PowerPoint's schema — and, more importantly, I was shipping it &lt;strong&gt;without being able to verify playback in my own environment&lt;/strong&gt;. Lesson: don't ship "it should work" for something you can't verify.&lt;/p&gt;

&lt;h2&gt;
  
  
  The fix: overlay a transparent animated GIF
&lt;/h2&gt;

&lt;p&gt;I switched approaches: &lt;strong&gt;generate a transparent animated GIF of scrolling comments and overlay it full-bleed on the slide.&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;PowerPoint / Keynote &lt;strong&gt;auto-play animated GIFs&lt;/strong&gt; in slideshow mode (built-in)&lt;/li&gt;
&lt;li&gt;A GIF lets me &lt;strong&gt;extract frames and verify it actually moves&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;No more animation-XML schema headaches&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Generating the GIF with Pillow is just drawing comments frame by frame:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
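&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;PIL&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ImageDraw&lt;/span&gt;

&lt;span class="c1"&gt;# width, height, fps, n_frames, speed, stroke, font and items are defined elsewhere
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_frames&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;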
    &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;fps&lt;/span&gt;
    &lt;span class="n"&gt;img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;RGBA&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;width&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;height&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ImageDraw&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Draw&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;it&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;width&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;speed&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;it&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;start&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;  &lt;span class="c1"&gt;# right -&amp;gt; left
&lt;/span&gt;        &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;it&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;y&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt; &lt;span class="n"&gt;it&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;font&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;font&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
               &lt;span class="n"&gt;fill&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;it&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;color&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;stroke_width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;stroke&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stroke_fill&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;255&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="c1"&gt;# quantize to a palette with transparency, then append the frame
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
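
&lt;p&gt;The position formula in that loop also determines how long the GIF has to run. The sketch below works that out in plain Python. The &lt;code&gt;text_width&lt;/code&gt; field and both function names are assumptions for illustration, since the excerpt does not show how text width is measured.&lt;/p&gt;

```python
import math


def comment_x(width: int, speed: float, t: float, start: float) -> float:
    """Left edge of a comment at time t, using the same formula as the loop above."""
    return width - speed * (t - start)


def frames_needed(width: int, speed: float, fps: int, items: list[dict]) -> int:
    """Frames until every comment has scrolled fully off the left edge.

    Each item carries a start time in seconds and a hypothetical "text_width"
    in pixels (an assumption, not a field shown in the original code).
    """
    if not items:
        return 0
    # A comment is gone once x + text_width reaches 0, i.e. after it has
    # travelled width + text_width pixels from its start time.
    end_times = [it["start"] + (width + it["text_width"]) / speed for it in items]
    return math.ceil(max(end_times) * fps)
```

&lt;p&gt;For a 1280 px wide canvas scrolling at 320 px/s, a 320 px wide comment that starts at 0 s needs 5 seconds to clear the screen, which is 50 frames at 10 fps.&lt;/p&gt;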



&lt;p&gt;Then &lt;code&gt;python-pptx&lt;/code&gt;'s &lt;code&gt;add_picture&lt;/code&gt; overlays the GIF on each selected slide. The resulting change to the &lt;code&gt;.pptx&lt;/code&gt; package comes down to just four things (register the GIF type in &lt;code&gt;[Content_Types].xml&lt;/code&gt;, add the GIF to the media folder, add the relationship, and add a &lt;code&gt;&amp;lt;p:pic&amp;gt;&lt;/code&gt; element to the slide XML), so the same edit can be reproduced either in the browser or on the server.&lt;/p&gt;

&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwl3us7smy23vo596wds7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwl3us7smy23vo596wds7.png" alt="Architecture" width="800" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Frontend (Next.js / Vercel)&lt;/strong&gt;: pptx parsing (JSZip) and live preview (Canvas) happen entirely in the browser — you can preview without sending the deck anywhere&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Backend (AWS Lambda, Tokyo)&lt;/strong&gt;: the heavy GIF generation runs in a container Lambda with Pillow + python-pptx, bundling Noto Sans JP for Japanese text&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Storage (S3)&lt;/strong&gt;: uploads go straight to S3 via a presigned PUT, dodging Vercel/Lambda payload limits and handling larger decks. Inputs/outputs auto-delete after 24h&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Keyless with Vercel OIDC Federation
&lt;/h2&gt;

&lt;p&gt;I initially planned to expose a Lambda Function URL and call it directly, but &lt;strong&gt;public access was blocked by the environment's guardrails&lt;/strong&gt; (&lt;code&gt;403 Forbidden&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;So I moved to &lt;strong&gt;Vercel OIDC Federation&lt;/strong&gt;. The OIDC token issued to the Vercel function is exchanged via AWS &lt;code&gt;AssumeRoleWithWebIdentity&lt;/code&gt; for short-lived credentials to invoke Lambda. The win: &lt;strong&gt;no long-lived access keys stored anywhere.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
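&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;awsCredentialsProvider&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@vercel/oidc-aws-credentials-provider&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;LambdaClient&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@aws-sdk/client-lambda&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;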

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;lambda&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;LambdaClient&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;region&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ap-northeast-1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;credentials&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;awsCredentialsProvider&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;roleArn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;AWS_ROLE_ARN&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On the AWS side, register Vercel's issuer (&lt;code&gt;https://oidc.vercel.com/&amp;lt;team&amp;gt;&lt;/code&gt;) as an OIDC provider, and scope the IAM role's trust policy to a specific project's production environment.&lt;/p&gt;
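
&lt;p&gt;A scoped trust policy might look roughly like the one generated below. Treat it as a sketch under assumptions: the &lt;code&gt;aud&lt;/code&gt; and &lt;code&gt;sub&lt;/code&gt; claim layout is taken from Vercel's OIDC documentation as I understand it, and the function name, account ID, team slug and project name are all placeholders. Decode a real token and compare before relying on it.&lt;/p&gt;

```python
import json


def vercel_trust_policy(account_id: str, team: str, project: str,
                        environment: str = "production") -> dict:
    """Build an IAM trust policy that a single Vercel project and environment can assume.

    Assumption: the token's aud is https://vercel.com/TEAM and its sub is
    owner:TEAM:project:PROJECT:environment:ENV.
    """
    issuer = f"oidc.vercel.com/{team}"
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {
                # The OIDC provider registered in the AWS account for this team.
                "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{issuer}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    f"{issuer}:aud": f"https://vercel.com/{team}",
                    f"{issuer}:sub": (
                        f"owner:{team}:project:{project}:environment:{environment}"
                    ),
                }
            },
        }],
    }


def to_json(policy: dict) -> str:
    """Serialize the policy for pasting into the IAM console or a CLI call."""
    return json.dumps(policy, indent=2)
```

&lt;p&gt;Pinning the &lt;code&gt;sub&lt;/code&gt; claim to one project and one environment means preview deployments and other projects in the same team cannot assume the role.&lt;/p&gt;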

&lt;h2&gt;
  
  
  A note on rights
&lt;/h2&gt;

&lt;p&gt;Niconico's feature of comments scrolling over a video (the "comment delivery system") is reportedly patented by its operator.&lt;/p&gt;

&lt;p&gt;This tool overlays &lt;strong&gt;pre-written comments as a visual effect&lt;/strong&gt;: the comments are embedded as a transparent GIF and play &lt;strong&gt;locally, without any network connection&lt;/strong&gt;. That differs from a system in which an unspecified number of users post comments over the internet and a server synchronizes them in real time. Still, the right call depends on usage: for commercial or streaming use you should verify the rights yourself, and &lt;strong&gt;displaying real-time incoming comments during a live stream is out of scope&lt;/strong&gt; for this tool. The app includes a notice page about this.&lt;/p&gt;

&lt;h2&gt;
  
  
  Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;To make PowerPoint animate &lt;strong&gt;reliably&lt;/strong&gt;, overlaying an &lt;strong&gt;animated GIF&lt;/strong&gt; beat native animation&lt;/li&gt;
&lt;li&gt;Choosing a method you can &lt;strong&gt;verify yourself&lt;/strong&gt; is the fastest path in the end&lt;/li&gt;
&lt;li&gt;Connecting to AWS &lt;strong&gt;keylessly&lt;/strong&gt; via Vercel OIDC is great even for hobby projects&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Give it a try → &lt;a href="https://nico-comment-app.vercel.app" rel="noopener noreferrer"&gt;https://nico-comment-app.vercel.app&lt;/a&gt;&lt;/p&gt;

</description>
      <category>nextjs</category>
      <category>aws</category>
      <category>lambda</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Building Fridge AI Health: An AI-Native Full-Stack App with Vercel and Amazon Aurora</title>
      <dc:creator>Yuuki Yamashita</dc:creator>
      <pubDate>Sat, 06 Jun 2026 18:37:34 +0000</pubDate>
      <link>https://dev.to/_76130e67067eab4c8510/building-fridge-ai-health-an-ai-native-full-stack-app-with-vercel-and-amazon-aurora-2kg6</link>
      <guid>https://dev.to/_76130e67067eab4c8510/building-fridge-ai-health-an-ai-native-full-stack-app-with-vercel-and-amazon-aurora-2kg6</guid>
      <description>&lt;p&gt;&lt;em&gt;Snap a photo of your fridge, and an AI plans meals around your health goals, tracks your nutrition, and coaches you every week — in 7 languages.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;🔗 &lt;strong&gt;Live demo:&lt;/strong&gt; &lt;a href="https://fridge-ai-app.vercel.app" rel="noopener noreferrer"&gt;https://fridge-ai-app.vercel.app&lt;/a&gt;&lt;br&gt;
💻 &lt;strong&gt;Source:&lt;/strong&gt; &lt;a href="https://github.com/yama3133/fridge-ai-menu" rel="noopener noreferrer"&gt;https://github.com/yama3133/fridge-ai-menu&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  What I built
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Fridge AI Health&lt;/strong&gt; turns a single photo of your fridge into a personalized, health-aware meal plan.&lt;/p&gt;

&lt;p&gt;The flow is simple:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Take a photo&lt;/strong&gt; of the inside of your fridge.&lt;/li&gt;
&lt;li&gt;The app &lt;strong&gt;recognizes the ingredients&lt;/strong&gt; (OCR + vision).&lt;/li&gt;
&lt;li&gt;An AI proposes &lt;strong&gt;menus tailored to your health goals&lt;/strong&gt; (calories, protein, sodium limit, fiber), each with a nutrition breakdown.&lt;/li&gt;
&lt;li&gt;You tap &lt;strong&gt;"Ate this"&lt;/strong&gt; to log a meal.&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;nutrition dashboard&lt;/strong&gt; visualizes your daily progress and a 7-day trend.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;AI Health Coach&lt;/strong&gt; analyzes your accumulated logs and gives concrete weekly advice.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;It started as a simple "what can I cook with this?" demo, and grew into a continuous health-management tool — which is exactly where a real database earns its keep.&lt;/p&gt;
&lt;h2&gt;
  
  
  The stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Frontend / hosting:&lt;/strong&gt; Next.js (Pages Router) on &lt;strong&gt;Vercel&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Database:&lt;/strong&gt; &lt;strong&gt;Amazon Aurora PostgreSQL&lt;/strong&gt;, provisioned through the &lt;strong&gt;Vercel Marketplace&lt;/strong&gt; AWS integration&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI:&lt;/strong&gt; &lt;strong&gt;Amazon Bedrock&lt;/strong&gt; — Claude Sonnet 4.5 (&lt;code&gt;us-east-1&lt;/code&gt;) for both menu generation and the weekly coach&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vision:&lt;/strong&gt; Google Cloud Vision (TEXT_DETECTION) for reading product labels&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ORM / auth:&lt;/strong&gt; Drizzle ORM + NextAuth (Google OAuth) with the Drizzle adapter&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Charts / i18n:&lt;/strong&gt; Recharts, plus a small homemade i18n layer (7 languages)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The data model is intentionally relational: &lt;code&gt;users&lt;/code&gt;, &lt;code&gt;accounts&lt;/code&gt;, &lt;code&gt;sessions&lt;/code&gt;, &lt;code&gt;health_profiles&lt;/code&gt;, &lt;code&gt;meal_logs&lt;/code&gt;, and &lt;code&gt;coach_advices&lt;/code&gt;. The coach feature is the whole reason a database matters — it reads back what you've eaten over the last 7 days, aggregates it, and feeds that to the model. &lt;strong&gt;The value isn't in storing data; it's in what the AI does with the accumulated data.&lt;/strong&gt;&lt;/p&gt;
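
&lt;p&gt;The read-back-and-aggregate step could be sketched as follows. This is not the app's real query code: the nutrient field names, the &lt;code&gt;eaten_on&lt;/code&gt; key and the choice to average only over days that have at least one log are assumptions for illustration.&lt;/p&gt;

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical column names for the nutrients tracked per meal.
NUTRIENTS = ("calories", "protein_g", "sodium_mg", "fiber_g")


def weekly_summary(meal_logs: list[dict], targets: dict, today: date) -> dict:
    """Average each nutrient per logged day over the last 7 days and compare to targets."""
    since = today - timedelta(days=6)  # 7-day window that includes today
    per_day: dict = defaultdict(lambda: dict.fromkeys(NUTRIENTS, 0.0))
    for log in meal_logs:
        if today >= log["eaten_on"] >= since:
            for key in NUTRIENTS:
                per_day[log["eaten_on"]][key] += log.get(key, 0.0)

    days = len(per_day)
    summary = {}
    for key in NUTRIENTS:
        average = sum(day[key] for day in per_day.values()) / days if days else 0.0
        summary[key] = {
            "daily_average": round(average, 1),
            "target": targets[key],
            "gap": round(average - targets[key], 1),  # negative means short of target
        }
    return summary
```

&lt;p&gt;The resulting dictionary is small enough to serialize straight into the prompt, so the model reasons over a handful of numbers instead of raw rows.&lt;/p&gt;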
&lt;h2&gt;
  
  
  Why this fits "frontend in minutes, scalable backend"
&lt;/h2&gt;

&lt;p&gt;The hackathon's pitch is "build a frontend in minutes and a scalable backend." That mapped almost perfectly to how this came together:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The UI (camera capture, menu cards, dashboard, coach) is plain Next.js + Recharts.&lt;/li&gt;
&lt;li&gt;The backend scalability comes for free from &lt;strong&gt;Aurora PostgreSQL&lt;/strong&gt; behind Vercel serverless functions.&lt;/li&gt;
&lt;li&gt;The "AI-native" part is Bedrock doing the heavy lifting on both the menu and the coaching side.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  The hard parts (where I actually spent my time)
&lt;/h2&gt;

&lt;p&gt;The happy path is boring to read about. Here's what actually cost me hours.&lt;/p&gt;
&lt;h3&gt;
  
  
  1. Aurora DSQL looked perfect — until foreign keys
&lt;/h3&gt;

&lt;p&gt;My first instinct was &lt;strong&gt;Aurora DSQL&lt;/strong&gt;: serverless, PostgreSQL-compatible, scales to zero. Great fit for a hackathon.&lt;/p&gt;

&lt;p&gt;Then I checked the compatibility docs. &lt;strong&gt;DSQL does not support foreign key constraints&lt;/strong&gt; (along with triggers, views, and, at the time, sequences, among others). My schema leans heavily on &lt;code&gt;ON DELETE CASCADE&lt;/code&gt; foreign keys, since every app table references &lt;code&gt;users&lt;/code&gt;. Migrating would have meant ripping out referential integrity and re-implementing it in the application layer.&lt;/p&gt;

&lt;p&gt;For a one-week build, that was too much risk. I switched to &lt;strong&gt;Aurora PostgreSQL&lt;/strong&gt;, kept my schema as-is, and moved on. Lesson: "PostgreSQL-compatible" is a spectrum — always check the unsupported-features list before you commit your schema.&lt;/p&gt;
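
&lt;p&gt;To make that trade-off concrete, here is roughly what re-implementing &lt;code&gt;ON DELETE CASCADE&lt;/code&gt; in the application layer involves. It is a sketch only: it uses Python's built-in &lt;code&gt;sqlite3&lt;/code&gt; as a stand-in for the real database, and the &lt;code&gt;user_id&lt;/code&gt; / &lt;code&gt;id&lt;/code&gt; column names are assumptions.&lt;/p&gt;

```python
import sqlite3

# Tables that reference users; names follow the schema described in this post.
CHILD_TABLES = ("meal_logs", "health_profiles", "coach_advices")


def delete_user_cascade(conn: sqlite3.Connection, user_id: int) -> None:
    """Emulate ON DELETE CASCADE by hand: remove child rows, then the user, atomically."""
    with conn:  # commits on success, rolls back if any statement raises
        for table in CHILD_TABLES:
            # Table names come from the fixed tuple above, never from user input.
            conn.execute(f"DELETE FROM {table} WHERE user_id = ?", (user_id,))
        conn.execute("DELETE FROM users WHERE id = ?", (user_id,))
```

&lt;p&gt;Every new table that references &lt;code&gt;users&lt;/code&gt; has to be added to that list by hand, and nothing stops orphaned rows if someone forgets. That is the bookkeeping a foreign key constraint does for you.&lt;/p&gt;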
&lt;h3&gt;
  
  
  2. Passwordless DB access with Vercel OIDC + RDS IAM
&lt;/h3&gt;

&lt;p&gt;The Vercel Marketplace Aurora integration doesn't hand you a &lt;code&gt;DATABASE_URL&lt;/code&gt; with a password. Instead it sets up &lt;strong&gt;RDS IAM authentication via Vercel's OIDC federation&lt;/strong&gt; — your serverless function assumes an AWS role and generates a short-lived token that acts as the DB password. No secrets hardcoded anywhere. Very nice... once it works.&lt;/p&gt;

&lt;p&gt;My &lt;code&gt;lib/db&lt;/code&gt; ended up supporting both modes — a connection string for local Docker, and IAM tokens for production:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Signer&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@aws-sdk/rds-signer&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
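&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;awsCredentialsProvider&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@vercel/functions/oidc&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Pool&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;pg&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;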

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;signer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Signer&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;hostname&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;DATABASE_PGHOST&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;Number&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;DATABASE_PGPORT&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;username&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;DATABASE_PGUSER&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;region&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;DATABASE_AWS_REGION&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;credentials&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;awsCredentialsProvider&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;roleArn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;DATABASE_AWS_ROLE_ARN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;clientConfig&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;region&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;DATABASE_AWS_REGION&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;}),&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;pool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Pool&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;host&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;DATABASE_PGHOST&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;user&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;DATABASE_PGUSER&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;database&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;DATABASE_PGDATABASE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;Number&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;DATABASE_PGPORT&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;password&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;signer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getAuthToken&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="c1"&gt;// fresh token per connection&lt;/span&gt;
  &lt;span class="na"&gt;ssl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;rejectUnauthorized&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="c1"&gt;// encrypted, but the server certificate is not verified&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. "No OpenIDConnect provider found"
&lt;/h3&gt;

&lt;p&gt;First production connection attempt:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;No OpenIDConnect provider found in your account for
https://oidc.vercel.com/&amp;lt;team&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The IAM &lt;strong&gt;role&lt;/strong&gt; existed (the integration created it), but the &lt;strong&gt;OIDC identity provider&lt;/strong&gt; it trusts did not. The role's trust policy pointed at &lt;code&gt;oidc.vercel.com/&amp;lt;team&amp;gt;&lt;/code&gt; as a federated principal — and that provider simply wasn't registered in my AWS account.&lt;/p&gt;

&lt;p&gt;The fix was one command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws iam create-open-id-connect-provider &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--url&lt;/span&gt; &lt;span class="s2"&gt;"https://oidc.vercel.com/&amp;lt;team&amp;gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--client-id-list&lt;/span&gt; &lt;span class="s2"&gt;"https://vercel.com/&amp;lt;team&amp;gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Right after that, my migration endpoint reported &lt;code&gt;current_user: postgres&lt;/code&gt;, &lt;code&gt;can_create: true&lt;/code&gt;, and the tables went in. There's also a subtle gotcha worth flagging: the Aurora instance is &lt;strong&gt;not publicly accessible&lt;/strong&gt;, so you can't just &lt;code&gt;psql&lt;/code&gt; into it from your laptop — the Vercel-function-over-OIDC path is the way in. I ran migrations through a tiny, secret-protected &lt;code&gt;/api/admin/migrate&lt;/code&gt; endpoint (and deleted it once the schema was applied).&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Bigger model ≠ better product
&lt;/h3&gt;

&lt;p&gt;For the AI, I assumed "use the most powerful model available." So I tried swapping Claude Sonnet 4.5 for an Opus model.&lt;/p&gt;

&lt;p&gt;The result was instructive: &lt;strong&gt;Opus over-engineered the recipes.&lt;/strong&gt; It happily designed elaborate dishes that required konjac, burdock root, daikon — ingredients that weren't in the fridge — which blew up the shopping list and quietly broke the app's core premise of &lt;em&gt;cook with what you already have&lt;/em&gt;. It was also slower.&lt;/p&gt;

&lt;p&gt;Sonnet 4.5 stayed closer to the detected ingredients, ran faster, and matched the product's intent better. I reverted. The "best" model is the one that fits the job, not the one with the highest benchmark score.&lt;/p&gt;

&lt;h2&gt;
  
  
  Features I'm happy with
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Health-aware generation.&lt;/strong&gt; The user's profile (target calories, protein, sodium cap, fiber, goal) is injected straight into the prompt, so the menus actually respect "I'm cutting" vs "I'm bulking."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Structured shopping list.&lt;/strong&gt; Each menu splits ingredients into &lt;em&gt;in your fridge&lt;/em&gt; (blue) vs &lt;em&gt;to buy&lt;/em&gt; (orange), and the page aggregates every "to buy" item into one shopping list. This also fixed an earlier inconsistency where the model would sometimes tag missing items and sometimes not — structuring the output removed the ambiguity.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The AI Health Coach.&lt;/strong&gt; This is the centerpiece. It pulls the last 7 days from &lt;code&gt;meal_logs&lt;/code&gt;, computes daily averages vs targets, and asks Claude for a summary, what's going well, what to improve, and concrete suggestions for tomorrow — returned as JSON and stored in &lt;code&gt;coach_advices&lt;/code&gt;. Watching it say "your protein is great but you're ~340 kcal short and low on fiber — add whole grains tomorrow," grounded in real logged data, is the moment the whole app clicks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;7 languages.&lt;/strong&gt; UI strings &lt;em&gt;and&lt;/em&gt; the AI-generated menu text both switch between Japanese, English, Chinese, Korean, French, Spanish, and Portuguese — the language code is passed into the Bedrock prompt so the recipe names and descriptions come back localized.&lt;/li&gt;
&lt;/ul&gt;
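
&lt;p&gt;The page-level shopping list described above can be sketched in a few lines. The menu shape used here (an &lt;code&gt;ingredients&lt;/code&gt; list with &lt;code&gt;name&lt;/code&gt; and &lt;code&gt;in_fridge&lt;/code&gt; fields) and the function name are hypothetical, not the app's real output schema.&lt;/p&gt;

```python
def build_shopping_list(menus: list[dict]) -> list[str]:
    """Collect every "to buy" ingredient across menus, de-duplicated, in first-seen order.

    Assumes each menu looks like
    {"name": ..., "ingredients": [{"name": ..., "in_fridge": bool}]},
    which is a made-up shape for this example.
    """
    seen: set[str] = set()
    shopping: list[str] = []
    for menu in menus:
        for ingredient in menu["ingredients"]:
            key = ingredient["name"].strip().lower()  # case-insensitive de-duplication
            if not ingredient["in_fridge"] and key not in seen:
                seen.add(key)
                shopping.append(ingredient["name"].strip())
    return shopping
```

&lt;p&gt;Because each ingredient carries an explicit boolean instead of a free-text tag, the aggregation never has to guess whether an item is missing.&lt;/p&gt;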

&lt;h2&gt;
  
  
  What I'd do next
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Move nutrition numbers from model estimates to a real nutrition API for accuracy.&lt;/li&gt;
&lt;li&gt;Persist and chart the coach's weekly advice history as a trend.&lt;/li&gt;
&lt;li&gt;Add household/shared fridges — a natural multi-user extension now that the relational schema and auth are in place.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Takeaways
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Check the database's unsupported features before designing your schema.&lt;/strong&gt; It saved me from a painful DSQL migration mid-build.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Passwordless IAM + OIDC is worth the setup.&lt;/strong&gt; Once the OIDC provider is registered, you get short-lived credentials with zero secrets in your env.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Let the product pick the model.&lt;/strong&gt; A more powerful model made the experience worse here.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A database becomes interesting when AI reads it back.&lt;/strong&gt; Logging meals is mundane; an AI coach reasoning over a week of logs is the feature people remember.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Vercel handled the frontend and serverless glue, Aurora PostgreSQL handled the durable, relational core, and Bedrock turned stored data into advice. That combination is exactly the "AI-native full-stack" loop this hackathon is about.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Built for the H0 Hackathon (Vercel × AWS Databases).&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>aws</category>
      <category>showdev</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Detect AI writing, then rewrite it human — Bedrock + Strands Agents + AgentCore Gateway</title>
      <dc:creator>Yuuki Yamashita</dc:creator>
      <pubDate>Sat, 06 Jun 2026 14:03:29 +0000</pubDate>
      <link>https://dev.to/_76130e67067eab4c8510/detect-ai-writing-then-rewrite-it-human-bedrock-strands-agents-agentcore-gateway-1c3p</link>
      <guid>https://dev.to/_76130e67067eab4c8510/detect-ai-writing-then-rewrite-it-human-bedrock-strands-agents-agentcore-gateway-1c3p</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Live demo: &lt;a href="https://ai-text-checker-pi.vercel.app" rel="noopener noreferrer"&gt;https://ai-text-checker-pi.vercel.app&lt;/a&gt;&lt;br&gt;
Code: &lt;a href="https://github.com/yama3133/ai-text-checker" rel="noopener noreferrer"&gt;https://github.com/yama3133/ai-text-checker&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I built &lt;strong&gt;AI Text Checker&lt;/strong&gt; (Japanese name: 「AI文章判定くん」): paste some text and it scores how "AI-like" the writing is &lt;strong&gt;with reasons&lt;/strong&gt;, then rewrites it to read as if a human wrote it. It works in &lt;strong&gt;Japanese and English&lt;/strong&gt;, accepts paste / &lt;code&gt;.txt&lt;/code&gt; / &lt;code&gt;.md&lt;/code&gt; / &lt;code&gt;.pdf&lt;/code&gt;, and is also exposed as an &lt;strong&gt;MCP tool&lt;/strong&gt; so you can call it from Claude Code and other agents.&lt;/p&gt;

&lt;p&gt;This post is less a feature tour and more a write-up of the decisions and gotchas — including one honest design choice that shaped the whole thing.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq6cg53lq2uy6le9lkfq2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq6cg53lq2uy6le9lkfq2.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The honest premise: you cannot "detect AI" reliably
&lt;/h2&gt;

&lt;p&gt;There is no technology today that &lt;em&gt;definitively&lt;/em&gt; decides whether a text was written by a human or an AI. Dedicated detectors have high error rates, especially on Japanese text and on AI text that a human has lightly edited. Asking an LLM to output a binary "AI or human" verdict is even shakier.&lt;/p&gt;

&lt;p&gt;So I deliberately did &lt;strong&gt;not&lt;/strong&gt; build an "AI detector." I built an &lt;strong&gt;AI-likeness editor&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;❌ No binary "AI/human" verdict.&lt;/li&gt;
&lt;li&gt;✅ An &lt;strong&gt;AI-likeness score (an estimate)&lt;/strong&gt; plus the &lt;strong&gt;specific phrases&lt;/strong&gt; that read AI-ish (mechanical connectives, safe generalities, formulaic closers, uniform rhythm, lack of specifics).&lt;/li&gt;
&lt;li&gt;✅ A &lt;strong&gt;rewrite&lt;/strong&gt; that keeps the meaning but sounds human.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The UI says this in plain text: it is an estimate, it can be wrong, don't use it as proof of authorship. That framing plays to what an LLM is actually good at — editorial feedback and rewriting — instead of pretending it can do something it can't.&lt;/p&gt;

&lt;h2&gt;
  
  
  Stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Next.js 16&lt;/strong&gt; (App Router) + Tailwind, deployed on &lt;strong&gt;Vercel&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Amazon Bedrock&lt;/strong&gt;, Claude &lt;strong&gt;Sonnet 4.6&lt;/strong&gt; via the &lt;strong&gt;Converse API&lt;/strong&gt; (&lt;code&gt;us-east-1&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Strands Agents&lt;/strong&gt; (TypeScript SDK) for the "rewrite until it passes" loop&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Amazon Bedrock AgentCore Gateway&lt;/strong&gt; to expose the tool over &lt;strong&gt;MCP&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjc1pg5luosv5ibgk43qr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjc1pg5luosv5ibgk43qr.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Pick a model you can actually invoke
&lt;/h3&gt;

&lt;p&gt;I wanted Opus 4.8, but on this account it returned &lt;code&gt;AccessDenied (403)&lt;/code&gt;. A model showing up in &lt;code&gt;list-foundation-models&lt;/code&gt; / &lt;code&gt;list-inference-profiles&lt;/code&gt; does &lt;strong&gt;not&lt;/strong&gt; guarantee you can invoke it. I checked candidates with a one-line Converse "ping" and landed on Sonnet 4.6:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws bedrock-runtime converse &lt;span class="nt"&gt;--region&lt;/span&gt; us-east-1 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--model-id&lt;/span&gt; us.anthropic.claude-sonnet-4-6 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--messages&lt;/span&gt; &lt;span class="s1"&gt;'[{"role":"user","content":[{"text":"ping"}]}]'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--inference-config&lt;/span&gt; &lt;span class="s1"&gt;'{"maxTokens":5}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The core analysis is a single Converse call with a system prompt that returns strict JSON (&lt;code&gt;aiLikenessScore&lt;/code&gt;, &lt;code&gt;label&lt;/code&gt;, &lt;code&gt;summary&lt;/code&gt;, &lt;code&gt;findings[]&lt;/code&gt;, &lt;code&gt;humanized&lt;/code&gt;). One detail that mattered for bilingual support: &lt;strong&gt;the commentary follows the UI language, but the rewrite must stay in the source language.&lt;/strong&gt; The prompt says, in effect, &lt;em&gt;"write the labels and reasons in English, but keep the rewrite in the same language as the input — never translate it."&lt;/em&gt; Without that line, English input sometimes came back rewritten into Japanese.&lt;/p&gt;
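
&lt;p&gt;"Strict JSON" is still only a request to the model, so it helps to validate the reply before rendering it. A minimal sketch, assuming the five fields listed above and a 0-100 score range; the &lt;code&gt;parseAnalysis&lt;/code&gt; helper and the choice to treat &lt;code&gt;findings&lt;/code&gt; items as opaque are illustrative assumptions, not the app's actual code:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Hypothetical shape, built from the field names mentioned in the post.
interface Analysis {
  aiLikenessScore: number;
  label: string;
  summary: string;
  findings: unknown[]; // item shape is not specified in the post
  humanized: string;
}

// Parse the model's reply and reject anything that is not the expected JSON.
function parseAnalysis(raw: string): Analysis {
  const data = JSON.parse(raw); // throws SyntaxError on non-JSON replies
  if (typeof data !== "object" || data === null) {
    throw new Error("analysis must be a JSON object");
  }
  const { aiLikenessScore, label, summary, findings, humanized } = data;
  if (typeof aiLikenessScore !== "number" || aiLikenessScore &amp;lt; 0 || aiLikenessScore &amp;gt; 100) {
    throw new Error("aiLikenessScore must be a number between 0 and 100");
  }
  if (typeof label !== "string" || typeof summary !== "string" || typeof humanized !== "string") {
    throw new Error("label, summary and humanized must be strings");
  }
  if (!Array.isArray(findings)) {
    throw new Error("findings must be an array");
  }
  return { aiLikenessScore, label, summary, findings, humanized };
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Failing loudly here keeps a malformed reply from reaching the UI as a half-filled result card.&lt;/p&gt;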

&lt;h2&gt;
  
  
  "Rewrite until it passes" with Strands Agents
&lt;/h2&gt;

&lt;p&gt;A single rewrite might still read AI-ish. So there's a second mode: an agent that &lt;strong&gt;scores, rewrites, re-scores, and repeats&lt;/strong&gt; until the AI-likeness drops below a threshold (or it hits a max number of rewrites). That is a genuine agent loop — a perfect fit for Strands.&lt;/p&gt;
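
&lt;p&gt;The control flow the agent is asked to follow can be sketched without any SDK. This is a plain, deterministic version of the loop, not how Strands runs it (there the model decides when to call the scoring tool); &lt;code&gt;rewriteUntilPasses&lt;/code&gt;, &lt;code&gt;Scorer&lt;/code&gt; and &lt;code&gt;Rewriter&lt;/code&gt; are hypothetical names for illustration:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;type Scorer = (text: string) =&amp;gt; Promise&amp;lt;number&amp;gt;;
type Rewriter = (text: string) =&amp;gt; Promise&amp;lt;string&amp;gt;;

interface LoopResult {
  finalScore: number;
  iterations: number; // number of rewrites performed
  humanized: string;
}

// Score, rewrite, re-score, and stop once the score is below the threshold
// or the rewrite budget is spent.
async function rewriteUntilPasses(
  text: string,
  score: Scorer,
  rewrite: Rewriter,
  threshold: number,
  maxRewrites: number,
): Promise&amp;lt;LoopResult&amp;gt; {
  let current = text;
  let currentScore = await score(current);
  let iterations = 0;
  while (currentScore &amp;gt;= threshold &amp;amp;&amp;amp; iterations &amp;lt; maxRewrites) {
    current = await rewrite(current);
    currentScore = await score(current);
    iterations += 1;
  }
  return { finalScore: currentScore, iterations, humanized: current };
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Text that already scores below the threshold is returned untouched with zero iterations, which is the cheap path you want for human-written input.&lt;/p&gt;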

&lt;p&gt;The TypeScript SDK (&lt;code&gt;@strands-agents/sdk&lt;/code&gt;) gives you an &lt;code&gt;Agent&lt;/code&gt;, a &lt;code&gt;tool()&lt;/code&gt; helper (Zod schemas), a &lt;code&gt;BedrockModel&lt;/code&gt;, and &lt;code&gt;structuredOutputSchema&lt;/code&gt; for typed results:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;BedrockModel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;tool&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@strands-agents/sdk&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;zod&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;scoreTool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;score_ai_likeness&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Score how AI-like a text is (0-100). Call after every rewrite.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;inputSchema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;object&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
  &lt;span class="na"&gt;callback&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;scoreOnce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;BedrockModel&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;region&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;us-east-1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;modelId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;us.anthropic.claude-sonnet-4-6&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;maxTokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;4096&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;}),&lt;/span&gt;
  &lt;span class="na"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;scoreTool&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;structuredOutputSchema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;object&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;finalScore&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;number&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="na"&gt;iterations&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;number&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="na"&gt;humanized&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="na"&gt;notes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
  &lt;span class="p"&gt;}),&lt;/span&gt;
  &lt;span class="na"&gt;systemPrompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`Rewrite to read human. Score with score_ai_likeness,
    rewrite, re-score, repeat until below &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;threshold&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; or &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;maxRewrites&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; times.`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userPrompt&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;structuredOutput&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// typed&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In practice the model often nails it in one rewrite — a 92/100 sample dropped to 12/100, and the structured output told me exactly what changed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Gotcha — bundling.&lt;/strong&gt; Strands pulls in many optional peer deps. In Next.js, mark it external so the bundler leaves it alone at runtime:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// next.config.ts&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;nextConfig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;serverExternalPackages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@strands-agents/sdk&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;pdf-parse-fork&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Gotcha — Vercel timeouts.&lt;/strong&gt; The loop makes several Bedrock calls and can take 20–45s. Vercel Hobby caps functions at 60s, so a long auto-rewrite can time out. Fine locally; budget for it in production.&lt;/p&gt;
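
&lt;p&gt;One way to budget for it is to cap the whole loop below the platform limit and fail with a clear error instead of a platform timeout. A minimal sketch; &lt;code&gt;withTimeout&lt;/code&gt; is a hypothetical helper, and it only stops waiting, it does not cancel the Bedrock call already in flight:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Reject if the wrapped promise has not settled within ms milliseconds.
function withTimeout&amp;lt;T&amp;gt;(work: Promise&amp;lt;T&amp;gt;, ms: number): Promise&amp;lt;T&amp;gt; {
  let timer: ReturnType&amp;lt;typeof setTimeout&amp;gt; | undefined;
  const timeout = new Promise&amp;lt;never&amp;gt;((_, reject) =&amp;gt; {
    timer = setTimeout(() =&amp;gt; reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  // Whichever settles first wins; always clear the timer afterwards.
  return Promise.race([work, timeout]).finally(() =&amp;gt; clearTimeout(timer));
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Usage would look like &lt;code&gt;await withTimeout(agent.invoke(userPrompt), 50_000)&lt;/code&gt;, where the 50-second budget is an assumed value chosen to stay under the 60s cap.&lt;/p&gt;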

&lt;h2&gt;
  
  
  Exposing it over MCP with AgentCore Gateway
&lt;/h2&gt;

&lt;p&gt;I wanted the tool callable from Claude Code, so I put it behind &lt;strong&gt;AgentCore Gateway&lt;/strong&gt; as an MCP server. The shape:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;MCP client (Claude Code) → AgentCore Gateway (MCP / JWT)
   → Lambda (detect_ai_style) → Bedrock (Sonnet 4.6)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Lambda is a single Node file (the Node 22 runtime already bundles the AWS SDK v3, so no dependencies to package). The Gateway target points at the Lambda and carries the tool schema.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The big surprise:&lt;/strong&gt; AgentCore Gateway's &lt;code&gt;authorizerType&lt;/code&gt; is &lt;strong&gt;&lt;code&gt;CUSTOM_JWT&lt;/code&gt; — and that's the only option.&lt;/strong&gt; There is no anonymous gateway. So an MCP client must present a &lt;strong&gt;bearer JWT&lt;/strong&gt;. I set up a Cognito user pool with a &lt;code&gt;client_credentials&lt;/code&gt; (machine-to-machine) app client, and pointed the gateway's authorizer at Cognito's discovery URL:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws bedrock-agentcore-control create-gateway &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; ai-text-checker-gateway &lt;span class="nt"&gt;--protocol-type&lt;/span&gt; MCP &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--authorizer-type&lt;/span&gt; CUSTOM_JWT &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--authorizer-configuration&lt;/span&gt; &lt;span class="s1"&gt;'{"customJWTAuthorizer":{
     "discoveryUrl":"https://cognito-idp.us-east-1.amazonaws.com/&amp;lt;POOL_ID&amp;gt;/.well-known/openid-configuration",
     "allowedClients":["&amp;lt;CLIENT_ID&amp;gt;"]}}'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--role-arn&lt;/span&gt; &amp;lt;GATEWAY_ROLE_ARN&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two more sharp edges:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The Gateway invokes the Lambda with the &lt;strong&gt;tool arguments as the &lt;code&gt;event&lt;/code&gt;&lt;/strong&gt;, and passes the tool name via &lt;code&gt;context&lt;/code&gt; as &lt;code&gt;${targetName}___${toolName}&lt;/code&gt;. With a single tool you can ignore the routing entirely.&lt;/li&gt;
&lt;li&gt;The target &lt;strong&gt;&lt;code&gt;toolSchema&lt;/code&gt; does not accept &lt;code&gt;enum&lt;/code&gt;&lt;/strong&gt; in property definitions — only &lt;code&gt;type&lt;/code&gt;, &lt;code&gt;properties&lt;/code&gt;, &lt;code&gt;required&lt;/code&gt;, &lt;code&gt;items&lt;/code&gt;, &lt;code&gt;description&lt;/code&gt;. I moved the allowed values into the &lt;code&gt;description&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
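
&lt;p&gt;Both edges can be sketched in a few lines. The helper below takes the prefixed name as a plain string, because how you read it from the Lambda &lt;code&gt;context&lt;/code&gt; is not shown here; &lt;code&gt;stripTargetPrefix&lt;/code&gt;, the &lt;code&gt;lang&lt;/code&gt; property and its allowed values are hypothetical, for illustration only:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Tool names arrive as `${targetName}___${toolName}`; keep only the tool part.
function stripTargetPrefix(fullName: string): string {
  const delimiter = "___";
  const index = fullName.indexOf(delimiter);
  return index === -1 ? fullName : fullName.slice(index + delimiter.length);
}

// Allowed values live in the description because the schema rejected `enum`.
// Property names and values here are assumptions, not the real schema.
const toolSchema = {
  name: "detect_ai_style",
  description: "Score how AI-like a text is and rewrite it.",
  inputSchema: {
    type: "object",
    properties: {
      text: { type: "string", description: "Text to analyze" },
      lang: { type: "string", description: "UI language. Allowed values: ja, en" },
    },
    required: ["text"],
  },
};
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Since the schema no longer enforces the allowed values, the handler should check them itself before calling Bedrock.&lt;/p&gt;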

&lt;p&gt;Once it's up, the flow from a client is plain MCP JSON-RPC (&lt;code&gt;initialize&lt;/code&gt; → &lt;code&gt;tools/list&lt;/code&gt; → &lt;code&gt;tools/call&lt;/code&gt;), with &lt;code&gt;Authorization: Bearer &amp;lt;token&amp;gt;&lt;/code&gt;. The tool appears as &lt;code&gt;detect___detect_ai_style&lt;/code&gt;, that is, the target name joined to the tool name with three underscores. Tokens are short-lived, so I fetch a fresh JWT at runtime via the AWS CLI, and the client secret stays &lt;strong&gt;out of the repo&lt;/strong&gt;.&lt;/p&gt;
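
&lt;p&gt;For reference, the &lt;code&gt;tools/call&lt;/code&gt; step can be sketched as a JSON-RPC 2.0 body plus headers. The &lt;code&gt;text&lt;/code&gt; argument name and both helper names are assumptions; the gateway URL is omitted because it is specific to each deployment:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record&amp;lt;string, unknown&amp;gt;;
}

// Build an MCP tools/call request body for the gateway-prefixed tool name.
function buildToolsCall(id: number, text: string): JsonRpcRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "detect___detect_ai_style", arguments: { text } },
  };
}

// Headers for every call; the token is the short-lived JWT from Cognito.
function buildHeaders(token: string): Record&amp;lt;string, string&amp;gt; {
  return {
    "Content-Type": "application/json",
    Accept: "application/json, text/event-stream",
    Authorization: `Bearer ${token}`,
  };
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Sending it is then a single POST of &lt;code&gt;JSON.stringify(buildToolsCall(1, "some text"))&lt;/code&gt; with &lt;code&gt;buildHeaders(token)&lt;/code&gt; to your gateway's MCP endpoint.&lt;/p&gt;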

&lt;h2&gt;
  
  
  Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;For "is this AI?", be honest: ship an &lt;strong&gt;estimate + editorial feedback&lt;/strong&gt;, not a verdict.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Verify model access with a real Converse call&lt;/strong&gt; before you build on a model id.&lt;/li&gt;
&lt;li&gt;Strands makes an &lt;strong&gt;iterative, tool-using rewrite&lt;/strong&gt; loop a few lines of typed code — but mark it external in Next.js.&lt;/li&gt;
&lt;li&gt;AgentCore Gateway is a clean way to turn a Lambda into an MCP tool, but &lt;strong&gt;plan for mandatory JWT auth&lt;/strong&gt; (Cognito), not an anonymous endpoint.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Try it: &lt;a href="https://ai-text-checker-pi.vercel.app" rel="noopener noreferrer"&gt;https://ai-text-checker-pi.vercel.app&lt;/a&gt; — and tell me where the score is wrong. It will be, sometimes. That's the point.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>bedrock</category>
      <category>ai</category>
      <category>nextjs</category>
    </item>
    <item>
      <title>Never oversell, anywhere on Earth — building a global drop platform on Amazon Aurora DSQL</title>
      <dc:creator>Yuuki Yamashita</dc:creator>
      <pubDate>Fri, 05 Jun 2026 17:01:22 +0000</pubDate>
      <link>https://dev.to/_76130e67067eab4c8510/never-oversell-anywhere-on-earth-building-a-global-drop-platform-on-amazon-aurora-dsql-4mcb</link>
      <guid>https://dev.to/_76130e67067eab4c8510/never-oversell-anywhere-on-earth-building-a-global-drop-platform-on-amazon-aurora-dsql-4mcb</guid>
      <description>&lt;p&gt;&lt;em&gt;Built for the H0 hackathon (Hack the Zero Stack with Vercel v0 and AWS Databases). #H0Hackathon&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;When a limited drop goes live, thousands of people press &lt;strong&gt;Buy&lt;/strong&gt; in the same second. The classic failure modes — overselling, double-booking, the site falling over — are really one failure: &lt;strong&gt;losing track of a single number under global concurrency.&lt;/strong&gt; Sell 101 units of a 100-unit drop and you've broken a promise to a real customer.&lt;/p&gt;

&lt;p&gt;I kept watching limited drops oversell, double-book, and crash — and I wanted to know whether a database could make that &lt;em&gt;impossible&lt;/em&gt; by design, not just patched over with queues bolted on top. So I built &lt;strong&gt;DROPZERO&lt;/strong&gt;, a drop platform that sells &lt;em&gt;exactly&lt;/em&gt; its stock and not one unit more, no matter how many buyers hit it at once or which region they come from. The whole thing runs on &lt;strong&gt;Amazon Aurora DSQL&lt;/strong&gt; with a Next.js frontend on Vercel. Live demo: &lt;a href="https://dsql-drop-app.vercel.app" rel="noopener noreferrer"&gt;https://dsql-drop-app.vercel.app&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Aurora DSQL
&lt;/h2&gt;

&lt;p&gt;A single-region database can keep a stock counter honest with a transaction. The hard part is doing it &lt;strong&gt;across regions&lt;/strong&gt; without giving up correctness. Most distributed setups force a choice: strong consistency &lt;em&gt;or&lt;/em&gt; low latency &lt;em&gt;or&lt;/em&gt; multi-region writes — pick two.&lt;/p&gt;

&lt;p&gt;Aurora DSQL is the reason DROPZERO can refuse that trade-off. It's a serverless, distributed SQL database with &lt;strong&gt;multi-Region, active-active clusters that are strongly consistent.&lt;/strong&gt; Two regional endpoints, one logical database, synchronous commit quorum. That means a buyer in Tokyo and a buyer in Seoul racing for the last unit are resolved against &lt;em&gt;the same truth&lt;/em&gt; — exactly one wins, and the counter never goes negative. This is the property the entire product is built on.&lt;/p&gt;

&lt;h2&gt;
  
  
  The stack (and zero hardcoded secrets)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Frontend:&lt;/strong&gt; Next.js (scaffolded with v0), deployed on Vercel.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auth:&lt;/strong&gt; Vercel &lt;strong&gt;OIDC federation&lt;/strong&gt; → assume an AWS IAM role → mint a DSQL auth token &lt;em&gt;per connection&lt;/em&gt;. No long-lived credentials anywhere in the app.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DB:&lt;/strong&gt; Amazon Aurora DSQL.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The connection is refreshingly boring — which is the point:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;AuroraDSQLPool&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@aws/aurora-dsql-node-postgres-connector&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;awsCredentialsProvider&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@vercel/oidc-aws-credentials-provider&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;pool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;AuroraDSQLPool&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;host&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;PGHOST&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;region&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;AWS_REGION&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;user&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;admin&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;database&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;postgres&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5432&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;customCredentialsProvider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;awsCredentialsProvider&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;roleArn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;AWS_ROLE_ARN&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;        &lt;span class="c1"&gt;// injected by Vercel&lt;/span&gt;
    &lt;span class="na"&gt;clientConfig&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;region&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;AWS_REGION&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;}),&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The first time I tried DSQL I burned an evening on "auth problems" — which turned out to be unrelated credentials. Lesson: let Vercel's OIDC integration own the token lifecycle and &lt;em&gt;don't&lt;/em&gt; hand-roll DSQL tokens.&lt;/p&gt;

&lt;h2&gt;
  
  
  A deliberate, DSQL-native data model
&lt;/h2&gt;

&lt;p&gt;Aurora DSQL is PostgreSQL-compatible, but it is &lt;strong&gt;not&lt;/strong&gt; vanilla Postgres. Three constraints shaped the schema:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No foreign keys&lt;/strong&gt; — integrity is enforced in the app.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No sequences / SERIAL&lt;/strong&gt; — primary keys are UUIDs (&lt;code&gt;gen_random_uuid()&lt;/code&gt; works as a server-side default).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;One DDL statement per transaction&lt;/strong&gt; — migrations run statement-by-statement.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The schema is tiny on purpose:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;products(id uuid PK, name, drop_name)&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;inventory(id uuid PK, product_id, stock)&lt;/code&gt; ← the entire drop hinges on this &lt;strong&gt;one row&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;&lt;code&gt;orders(id uuid PK, product_id, user_ref, status, region)&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
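
&lt;p&gt;Put together, the constraints and the three tables might translate into a migration like the following. This is a sketch under assumptions: every column type other than the &lt;code&gt;uuid&lt;/code&gt; keys is my guess, and &lt;code&gt;migrate&lt;/code&gt; and &lt;code&gt;Queryable&lt;/code&gt; are hypothetical names, not the project's actual migration code:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Column types other than the uuid keys are assumptions for illustration.
const migrations: string[] = [
  `CREATE TABLE IF NOT EXISTS products (
     id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
     name text NOT NULL,
     drop_name text NOT NULL
   )`,
  `CREATE TABLE IF NOT EXISTS inventory (
     id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
     product_id uuid NOT NULL, -- no FOREIGN KEY: integrity is enforced in the app
     stock integer NOT NULL
   )`,
  `CREATE TABLE IF NOT EXISTS orders (
     id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
     product_id uuid NOT NULL,
     user_ref text NOT NULL,
     status text NOT NULL,
     region text NOT NULL
   )`,
];

// Anything with a pg-style query method works, including a pool or client.
interface Queryable {
  query(sql: string): Promise&amp;lt;unknown&amp;gt;;
}

// Send each DDL statement on its own so none of them share a transaction.
async function migrate(client: Queryable, statements: string[]): Promise&amp;lt;number&amp;gt; {
  let applied = 0;
  for (const sql of statements) {
    await client.query(sql); // no explicit BEGIN: one statement, one transaction
    applied += 1;
  }
  return applied;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Using &lt;code&gt;IF NOT EXISTS&lt;/code&gt; keeps the script safe to re-run if it stops partway through, which matters when the statements cannot be wrapped in one transaction.&lt;/p&gt;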

&lt;p&gt;Every buyer on Earth is racing for the same &lt;code&gt;stock&lt;/code&gt; cell. That's the whole game.&lt;/p&gt;

&lt;h2&gt;
  
  
  The core: never oversell
&lt;/h2&gt;

&lt;p&gt;A purchase is a single transaction — check and decrement — and the only interesting part is what happens when two of them collide. Aurora DSQL uses &lt;strong&gt;optimistic concurrency control&lt;/strong&gt;: it detects the conflict at &lt;strong&gt;commit&lt;/strong&gt; time and rejects the loser with &lt;code&gt;SQLSTATE 40001&lt;/code&gt; (or DSQL's &lt;code&gt;OC000&lt;/code&gt; / &lt;code&gt;OC001&lt;/code&gt;). So the app must retry.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;OCC&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Set&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;40001&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;OC000&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;OC001&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;purchase&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;productId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;userRef&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;region&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;maxAttempts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;40&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;attempt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;attempt&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;pool&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;BEGIN&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;upd&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;UPDATE inventory SET stock = stock - 1 WHERE product_id = $1 AND stock &amp;gt; 0&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;productId&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;upd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rowCount&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ROLLBACK&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;sold_out&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;INSERT INTO orders (product_id, user_ref, status, region) VALUES ($1,$2,'confirmed',$3)&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;productId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;userRef&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;region&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;COMMIT&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;           &lt;span class="c1"&gt;// a conflict surfaces here&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;confirmed&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ROLLBACK&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="k"&gt;catch&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{});&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;OCC&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;has&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;code&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;attempt&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;maxAttempts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;backoffWithJitter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;attempt&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;     &lt;span class="c1"&gt;// exponential + jitter&lt;/span&gt;
        &lt;span class="k"&gt;continue&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;                             &lt;span class="c1"&gt;// retry re-reads the latest stock&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;finally&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;release&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The elegant part: a retry re-reads the &lt;em&gt;latest&lt;/em&gt; snapshot. Once stock hits 0, the &lt;code&gt;WHERE stock &amp;gt; 0&lt;/code&gt; matches no rows and the request cleanly returns &lt;strong&gt;sold out&lt;/strong&gt;. The conflict becomes a correct answer, not an error.&lt;/p&gt;
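
&lt;p&gt;For readers who want to see the "exponential + jitter" idea in isolation, here is a minimal sketch of a full-jitter delay, written in Python rather than the app's JavaScript. The function name, the 20 ms base, and the 1 s cap are illustrative assumptions, not the values the app uses:&lt;/p&gt;

```python
import random

BASE_MS = 20      # assumed base delay, for illustration only
CAP_MS = 1000     # assumed upper bound on any single wait

def backoff_delay_ms(attempt, rng=random):
    """Full-jitter backoff: a random wait in [0, min(cap, base * 2**attempt)]."""
    ceiling = min(CAP_MS, BASE_MS * 2 ** attempt)
    return rng.uniform(0, ceiling)

# The ceiling doubles with each attempt until it reaches the cap.
ceilings = [min(CAP_MS, BASE_MS * 2 ** a) for a in range(1, 9)]
print(ceilings)
```

&lt;p&gt;Randomising the whole wait, rather than adding a small random offset, keeps conflicting buyers from retrying in lockstep against the same hot row.&lt;/p&gt;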

&lt;h2&gt;
  
  
  Proving multi-region consistency
&lt;/h2&gt;

&lt;p&gt;I created a multi-Region peered cluster: &lt;strong&gt;Tokyo (ap-northeast-1)&lt;/strong&gt; and &lt;strong&gt;Seoul (ap-northeast-2)&lt;/strong&gt; as active peers, with &lt;strong&gt;Osaka (ap-northeast-3)&lt;/strong&gt; as the witness.&lt;/p&gt;

&lt;p&gt;A gotcha worth saving you the hour I lost: I assumed the witness had to be a US region (a claim you'll find repeated online). &lt;code&gt;us-east-1&lt;/code&gt; and &lt;code&gt;us-west-2&lt;/code&gt; were both &lt;strong&gt;rejected&lt;/strong&gt;. The witness must live in the &lt;strong&gt;same region set&lt;/strong&gt; as the peers — for an APAC cluster, that's Osaka:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws dsql create-cluster &lt;span class="nt"&gt;--region&lt;/span&gt; ap-northeast-1 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--multi-region-properties&lt;/span&gt; &lt;span class="s1"&gt;'{"witnessRegion":"ap-northeast-3"}'&lt;/span&gt;
&lt;span class="c"&gt;# then create the Seoul peer and link the two ARNs&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The demo writes to &lt;strong&gt;both&lt;/strong&gt; regional endpoints at the same instant. Stock = 1, one purchase to Tokyo, one to Seoul: &lt;strong&gt;exactly one confirms, the other gets sold-out, and both endpoints report the same final stock.&lt;/strong&gt; Strong consistency, across regions, with zero application-side coordination.&lt;/p&gt;

&lt;h2&gt;
  
  
  Does it hold at scale? (k6)
&lt;/h2&gt;

&lt;p&gt;I pointed k6 at both regional endpoints: &lt;strong&gt;100 units in stock, 3,000 purchase attempts&lt;/strong&gt; driven by 50 concurrent VUs.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Retry budget&lt;/th&gt;
&lt;th&gt;confirmed&lt;/th&gt;
&lt;th&gt;sold_out&lt;/th&gt;
&lt;th&gt;errors&lt;/th&gt;
&lt;th&gt;final stock&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;8 attempts&lt;/td&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;td&gt;2,810&lt;/td&gt;
&lt;td&gt;90&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;40 attempts&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;100&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;2,900&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Two takeaways:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Oversell never happened&lt;/strong&gt; — &lt;code&gt;confirmed&lt;/code&gt; is exactly 100 and final stock is exactly 0 in both runs. The 90 "errors" in the first run were &lt;em&gt;failed&lt;/em&gt; purchases (conflicts that exhausted retries), never extra sales.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hot-row contention needs a real retry budget.&lt;/strong&gt; A single stock row under 50 concurrent VUs generates a lot of OCC conflicts; bumping the retry ceiling turned those 90 errors into clean sold-outs. Idempotent, retryable transactions are not optional here — they're the design.&lt;/li&gt;
&lt;/ol&gt;
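
&lt;p&gt;Both takeaways come down to two invariants that can be checked against the table: every request ends in exactly one outcome, and units sold plus units left equal the starting stock. A small Python sketch of that arithmetic, using only the numbers reported above (the helper name is hypothetical):&lt;/p&gt;

```python
STOCK = 100
REQUESTS = 3000

runs = {
    "8 attempts": {"confirmed": 100, "sold_out": 2810, "errors": 90, "final_stock": 0},
    "40 attempts": {"confirmed": 100, "sold_out": 2900, "errors": 0, "final_stock": 0},
}

def check(run):
    # Every request ends in exactly one outcome.
    accounted = run["confirmed"] + run["sold_out"] + run["errors"] == REQUESTS
    # Units sold plus units left must equal the starting stock (no oversell).
    conserved = run["confirmed"] + run["final_stock"] == STOCK
    return accounted and conserved

results = {name: check(run) for name, run in runs.items()}
print(results)
```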

&lt;h2&gt;
  
  
  Global from the first commit
&lt;/h2&gt;

&lt;p&gt;A worldwide drop should read like home wherever you are. The UI ships in &lt;strong&gt;8 languages&lt;/strong&gt; — English, Japanese, Chinese, Korean, Spanish, French, Portuguese, and Arabic — including full &lt;strong&gt;right-to-left&lt;/strong&gt; layout for Arabic (the whole interface mirrors; numbers and IDs stay LTR).&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'd tell the next builder
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Lean on the managed auth path.&lt;/strong&gt; Vercel OIDC → IAM → DSQL token means no secrets in the app and no token plumbing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Respect DSQL's dialect.&lt;/strong&gt; No FKs, no sequences, one DDL per transaction, fixed Repeatable Read isolation. Design &lt;em&gt;with&lt;/em&gt; it, not around it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Treat OCC retries as a feature, not error handling.&lt;/strong&gt; They're how "never oversell" actually works.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-region witness = same region set.&lt;/strong&gt; Don't trust the "US-only" folklore; trust the API error.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;DROPZERO sells exactly what's in stock — to one buyer or to a million, in Tokyo or São Paulo — because Aurora DSQL gives you a single, strongly consistent source of truth across regions, and the rest is a small, careful transaction.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Live:&lt;/strong&gt; &lt;a href="https://dsql-drop-app.vercel.app" rel="noopener noreferrer"&gt;https://dsql-drop-app.vercel.app&lt;/a&gt; · &lt;strong&gt;Architecture:&lt;/strong&gt; &lt;a href="https://dsql-drop-app.vercel.app/architecture" rel="noopener noreferrer"&gt;https://dsql-drop-app.vercel.app/architecture&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;#H0Hackathon — built with Amazon Aurora DSQL and Vercel.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>database</category>
      <category>distributedsystems</category>
      <category>showdev</category>
    </item>
    <item>
      <title>From static slides to a postable video: finishing my LINE-style chat generator with GitHub Copilot</title>
      <dc:creator>Yuuki Yamashita</dc:creator>
      <pubDate>Wed, 03 Jun 2026 11:42:35 +0000</pubDate>
      <link>https://dev.to/_76130e67067eab4c8510/from-static-slides-to-a-postable-video-finishing-my-line-style-chat-generator-with-github-copilot-2p59</link>
      <guid>https://dev.to/_76130e67067eab4c8510/from-static-slides-to-a-postable-video-finishing-my-line-style-chat-generator-with-github-copilot-2p59</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the GitHub Finish-Up-A-Thon Challenge.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;A while back I built a small web tool that turns a block of conversation text into a vertical (9:16), iPhone-style &lt;strong&gt;LINE chat&lt;/strong&gt; slide — with the message bubbles animating in one by one. It was made for talks: paste a conversation, download an animated &lt;code&gt;.pptx&lt;/code&gt;, present it. I even wrote it up here on DEV.&lt;/p&gt;

&lt;p&gt;But it always had a missing half. A &lt;code&gt;.pptx&lt;/code&gt; is great on a projector and awkward everywhere else — you can't really post it. What I actually wanted, and never finished, was to turn the same conversation into a &lt;strong&gt;video&lt;/strong&gt; I could drop into a social post, where the bubbles pop in like a chat happening live.&lt;/p&gt;

&lt;p&gt;This challenge was the push to finish that. The app now exports the conversation as a vertical &lt;strong&gt;mp4&lt;/strong&gt; (and webm), with bubbles fading in one at a time and a "typing…" indicator before each incoming message — all in the browser, no server involved.&lt;/p&gt;

&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;App:&lt;/strong&gt; &lt;a href="https://line-chat-app-nine.vercel.app" rel="noopener noreferrer"&gt;https://line-chat-app-nine.vercel.app&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Repo:&lt;/strong&gt; &lt;a href="https://github.com/yama3133/line-chat-app" rel="noopener noreferrer"&gt;https://github.com/yama3133/line-chat-app&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbp6rhag25i1wf6stb8sb.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbp6rhag25i1wf6stb8sb.gif" alt=" " width="360" height="640"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Comeback Story
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Where it was
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;pptx generation only (python-pptx + hand-written animation XML), already shipped and documented.&lt;/li&gt;
&lt;li&gt;There was an animated &lt;em&gt;preview&lt;/em&gt; in the browser, but the only thing you could &lt;strong&gt;save&lt;/strong&gt; was a pptx.&lt;/li&gt;
&lt;li&gt;The "export it as a video" idea sat untouched for months.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  What I added
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Rebuilt the LINE / iPhone UI on a &lt;code&gt;&amp;lt;canvas&amp;gt;&lt;/code&gt;.&lt;/strong&gt; The pptx version draws with python-pptx shapes and the old browser preview was CSS — neither can be recorded as a video, so the UI had to be redrawn on canvas.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Accurate bubble wrapping&lt;/strong&gt; with &lt;code&gt;ctx.measureText&lt;/code&gt;, instead of the character-width guess the Python version had to rely on.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The one-by-one fade-in animation&lt;/strong&gt;, driven by a single &lt;code&gt;renderFrame(ctx, opts, t)&lt;/code&gt; that draws the exact state at time &lt;code&gt;t&lt;/code&gt; (the same function is reused for recording).&lt;/li&gt;
&lt;li&gt;
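&lt;strong&gt;Recording with &lt;code&gt;MediaRecorder&lt;/code&gt;.&lt;/strong&gt; I expected to need ffmpeg.wasm for mp4, but MediaRecorder in recent versions of Chrome supports &lt;code&gt;video/mp4;codecs=avc1.42E01E&lt;/code&gt; (H.264) directly, so mp4 comes out with no extra dependency. Browsers without that support can be detected with &lt;code&gt;MediaRecorder.isTypeSupported()&lt;/code&gt; and given webm instead.&lt;/li&gt;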
&lt;li&gt;
&lt;strong&gt;A reliability fix that surprised me:&lt;/strong&gt; a naive &lt;code&gt;requestAnimationFrame&lt;/code&gt; record loop silently stalls when the tab is in the background, because browsers pause &lt;code&gt;requestAnimationFrame&lt;/code&gt; callbacks for hidden tabs. I switched to &lt;code&gt;setTimeout&lt;/code&gt; + &lt;code&gt;canvas.captureStream(0)&lt;/code&gt; + &lt;code&gt;track.requestFrame()&lt;/code&gt;, so frames are pushed explicitly and recording keeps going even in a background tab.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A "typing…" indicator&lt;/strong&gt; before each incoming bubble, for a more chat-like feel.&lt;/li&gt;
&lt;/ol&gt;
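
&lt;p&gt;The point of a single &lt;code&gt;renderFrame(ctx, opts, t)&lt;/code&gt; is that each frame depends only on the timestamp, so preview and recording cannot drift apart. A minimal sketch of that timing logic, written in Python for compactness rather than the app's JavaScript; the interval, fade duration, and function names are assumptions for illustration:&lt;/p&gt;

```python
INTERVAL_S = 0.8   # assumed gap between bubbles
FADE_S = 0.3       # assumed fade-in duration

def bubble_alpha(index, t):
    """Opacity of bubble `index` at time `t` seconds: 0 before its start, 1 after the fade."""
    start = index * INTERVAL_S
    progress = (t - start) / FADE_S
    return max(0.0, min(1.0, progress))

def frame_state(count, t):
    """Everything a renderer needs for time t, with no hidden state between frames."""
    return [bubble_alpha(i, t) for i in range(count)]

print(frame_state(3, 0.95))
```

&lt;p&gt;Because nothing is carried over between calls, the recorder can ask for any timestamp in any order and get the same picture the preview showed.&lt;/p&gt;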

&lt;h3&gt;
  
  
  Still on the list
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;GIF export (handy for Slack / LINE) via ffmpeg.wasm or gif.js.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  My Experience with GitHub Copilot
&lt;/h2&gt;

&lt;p&gt;I leaned on Copilot for the part that's tedious but well-defined: &lt;strong&gt;drawing the LINE / iPhone UI on canvas&lt;/strong&gt;. The approach was comment-driven — I wrote each function's spec as a comment first, then let Copilot fill in the body:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Draw the phone frame (black, rounded) filling the logical area, then the&lt;/span&gt;
&lt;span class="c1"&gt;// inner screen (chat background) inset by L.phone.pad. Return {x, y, w, h}.&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;drawPhoneFrame&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="cm"&gt;/* Copilot */&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Where it shone:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Repetitive shape code — the 4 signal bars, the 3-dot menu, the battery with its little nub — came out correct from a one-line comment.&lt;/li&gt;
&lt;li&gt;It picked up my existing constants (&lt;code&gt;V&lt;/code&gt; for colors, &lt;code&gt;L&lt;/code&gt; for layout) and reused them instead of hardcoding values.&lt;/li&gt;
&lt;li&gt;Asking it to implement all the &lt;code&gt;TODO(Copilot)&lt;/code&gt; functions at once filled the whole UI layer in a single pass.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Where I took over: the bubble-wrapping math, the animation timing, and the background-recording fix were things I designed and wrote by hand.&lt;/p&gt;

&lt;p&gt;The split that worked for me: &lt;strong&gt;Copilot builds the UI scaffolding fast; I own the logic.&lt;/strong&gt; My one practical tip is to treat comments as the spec — the more precise the comment, the better the completion.&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>githubchallenge</category>
    </item>
    <item>
      <title>Generating animated LINE-style chat slides with python-pptx + raw XML (and shipping it on Vercel)</title>
      <dc:creator>Yuuki Yamashita</dc:creator>
      <pubDate>Tue, 02 Jun 2026 15:20:32 +0000</pubDate>
      <link>https://dev.to/_76130e67067eab4c8510/generating-animated-line-style-chat-slides-with-python-pptx-raw-xml-and-shipping-it-on-vercel-3g32</link>
      <guid>https://dev.to/_76130e67067eab4c8510/generating-animated-line-style-chat-slides-with-python-pptx-raw-xml-and-shipping-it-on-vercel-3g32</guid>
      <description>&lt;p&gt;I was putting together a talk and had one of those half-baked ideas that you can't shake off: what if I showed an iPhone with a LINE chat screen, and the messages popped in one by one, like a real conversation happening live? The problem is, building that by hand in PowerPoint sounds miserable — laying out every bubble, then setting up an entrance animation for each one, one at a time. So instead I built a little tool that takes a chunk of conversation text and spits out a &lt;code&gt;.pptx&lt;/code&gt;, and then I turned it into a web app and put it on Vercel.&lt;/p&gt;

&lt;p&gt;This post focuses on the three things that actually gave me trouble:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Building an iPhone / LINE-style UI out of nothing but shapes in python-pptx&lt;/li&gt;
&lt;li&gt;Working around the fact that python-pptx has no animation support — by hand-writing the &lt;code&gt;timing&lt;/code&gt; XML and injecting it into the slide&lt;/li&gt;
&lt;li&gt;Wiring up "conversation in, pptx out" with a static HTML front end and a Vercel serverless function (including the deploy that bit me)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here's what came out of it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;App: &lt;a href="https://line-chat-app-nine.vercel.app" rel="noopener noreferrer"&gt;https://line-chat-app-nine.vercel.app&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Repo: &lt;a href="https://github.com/yama3133/line-chat-app" rel="noopener noreferrer"&gt;https://github.com/yama3133/line-chat-app&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And the slides it generates look like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7zpkv5o94yrgewow7jel.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7zpkv5o94yrgewow7jel.jpg" alt="Example slides" width="800" height="471"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The overall shape of it
&lt;/h2&gt;

&lt;p&gt;It's nothing fancy. This is the whole file list:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;.
├─ index.html        front end (input form + live preview)
├─ api/generate.py   conversation JSON → pptx (build_pptx) + handler
├─ requirements.txt  python-pptx
├─ vercel.json       builds (static + @vercel/python)
└─ dev_server.py     for local testing
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;All the real generation logic lives in Python (python-pptx), and the front end is plain HTML/CSS/JS. On Vercel, &lt;code&gt;api/generate.py&lt;/code&gt; runs as a serverless function and &lt;code&gt;index.html&lt;/code&gt; is served as a static file.&lt;/p&gt;

&lt;h2&gt;
  
  
  Building the iPhone / LINE UI out of shapes
&lt;/h2&gt;

&lt;p&gt;Set the slide to a tall 9:16 aspect ratio, and from there it's just a matter of stacking rounded rectangles, ellipses, and text boxes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;SLIDE_W&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Inches&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;7.5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;SLIDE_H&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Inches&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;13.333&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="c1"&gt;# 7.5 : 13.333 = 9 : 16
&lt;/span&gt;&lt;span class="n"&gt;prs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;slide_width&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SLIDE_W&lt;/span&gt;
&lt;span class="n"&gt;prs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;slide_height&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SLIDE_H&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The stacking order goes: phone frame (black rounded rect) → screen (chat background) → green header → Dynamic Island → status bar (clock, signal, battery) → input bar. For something like the Dynamic Island, a black rectangle with its corner roundness (adjustment) cranked up to 0.5 already reads as the real thing.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;isl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;slide&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shapes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_shape&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;MSO_SHAPE&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ROUNDED_RECTANGLE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...)&lt;/span&gt;
&lt;span class="n"&gt;isl&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;adjustments&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;   &lt;span class="c1"&gt;# fully rounded into a "pill" shape
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For the header and the input bar, I used &lt;code&gt;ROUND_2_SAME_RECTANGLE&lt;/code&gt; ("rounded on the top two corners only" / "bottom two only") so they don't fight with the rounded corners of the screen. The input bar is just rotated 180° so its rounded side faces down.&lt;/p&gt;

&lt;p&gt;One thing that quietly mattered: Japanese fonts. If you only set &lt;code&gt;a:latin&lt;/code&gt;, Japanese text can fall back to some other font, so I push the same typeface into &lt;code&gt;a:ea&lt;/code&gt; (East Asian) as well. That kept things consistent.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;rPr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_or_add_rPr&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;tag&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;a:latin&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;a:ea&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;a:cs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
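    &lt;span class="n"&gt;el&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rPr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;qn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tag&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;  &lt;span class="c1"&gt;# qn comes from pptx.oxml.ns&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;el&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="c1"&gt;# childless lxml elements are falsy, so compare with None&lt;/span&gt;
        &lt;span class="n"&gt;el&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rPr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;makeelement&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;qn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tag&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;{})&lt;/span&gt;
        &lt;span class="n"&gt;rPr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;el&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# attach it, or the typeface lands on a detached node&lt;/span&gt;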
    &lt;span class="n"&gt;el&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;typeface&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Hiragino Sans&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Sizing the bubbles by guesswork
&lt;/h3&gt;

&lt;p&gt;Each bubble gets a width and height estimated from its text, then placed. python-pptx can't measure rendered text, so I just approximate the display width as full-width characters = 1.0 and half-width = 0.5: short lines get a one-line width, longer ones wrap at a maximum width. Pretty crude, but it works.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;measure_bubble&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;char_w&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;BUBBLE_FONT_PT&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mf"&gt;72.0&lt;/span&gt;           &lt;span class="c1"&gt;# one full-width char ≈ 0.208in
&lt;/span&gt;    &lt;span class="n"&gt;cap&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;MAX_BUBBLE_W&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;PAD&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;char_w&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;1.06&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;  &lt;span class="c1"&gt;# full-width chars per line
&lt;/span&gt;    &lt;span class="bp"&gt;...&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;longest&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;cap&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;width&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;longest&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;char_w&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;PAD&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mf"&gt;0.18&lt;/span&gt;   &lt;span class="c1"&gt;# slack so it doesn't wrap
&lt;/span&gt;        &lt;span class="n"&gt;lines&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;width&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MAX_BUBBLE_W&lt;/span&gt;
        &lt;span class="n"&gt;lines&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;ceil&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;longest&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;cap&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;height&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;lines&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;LINE_H&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;PAD_Y&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mf"&gt;0.06&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;width&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;height&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lines&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At first I didn't add the slack (that trailing &lt;code&gt;+ 0.18&lt;/code&gt;) and used the exact width. The result: a short phrase like &lt;code&gt;おつかれさま！&lt;/code&gt; ("hey, nice work today!") would wrap onto two lines and look weirdly stretched. The actual rendered character width comes out a bit wider than my estimate, so giving it a little breathing room settled things down.&lt;/p&gt;
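
&lt;p&gt;The "full-width = 1.0, half-width = 0.5" rule can be sketched with the standard library's &lt;code&gt;unicodedata.east_asian_width&lt;/code&gt;. This is one assumed way to classify characters, not necessarily what the elided part of &lt;code&gt;measure_bubble&lt;/code&gt; does, and &lt;code&gt;display_units&lt;/code&gt; is a hypothetical name:&lt;/p&gt;

```python
import unicodedata

def display_units(text):
    """Approximate width in full-width character units (wide = 1.0, everything else = 0.5)."""
    units = 0.0
    for ch in text:
        # "W" (wide) and "F" (full-width) cover kana, kanji and full-width punctuation.
        wide = unicodedata.east_asian_width(ch) in ("W", "F")
        units += 1.0 if wide else 0.5
    return units

print(display_units("おつかれさま！"), display_units("OK!"))
```

&lt;p&gt;Multiplying the result by &lt;code&gt;char_w&lt;/code&gt; gives the estimated text width in inches, which is the quantity the slack is then added to.&lt;/p&gt;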

&lt;p&gt;For the arrangement I copied real LINE and made "bottom-aligned" the default (newest message sits right above the input bar). I just sum up the height of the whole conversation first and set the start position to &lt;code&gt;CHAT_BOTTOM - block_h&lt;/code&gt;.&lt;/p&gt;
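
&lt;p&gt;A minimal sketch of that bottom-aligned placement, in inches. The value of &lt;code&gt;CHAT_BOTTOM&lt;/code&gt;, the gap between bubbles, and the function name are placeholders chosen for illustration:&lt;/p&gt;

```python
CHAT_BOTTOM = 11.6   # assumed y of the input bar's top edge, in inches
GAP = 0.12           # assumed vertical gap between bubbles

def layout_tops(heights):
    """Return the top y of each bubble so the last one ends at CHAT_BOTTOM."""
    block_h = sum(heights) + GAP * max(len(heights) - 1, 0)
    y = CHAT_BOTTOM - block_h
    tops = []
    for h in heights:
        tops.append(y)
        y += h + GAP
    return tops

tops = layout_tops([0.5, 0.8, 0.5])
print(tops)
```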

&lt;h2&gt;
  
  
  The main event: no animation API, so write the XML yourself
&lt;/h2&gt;

&lt;p&gt;This is where I got stuck the longest. python-pptx can create shapes and text just fine, but it gives you no way to touch animations (entrance effects and the like) through its API. So what do you do? Animations are stored as an OOXML element called &lt;code&gt;&amp;lt;p:timing&amp;gt;&lt;/code&gt;, so I build that element myself and append it to the end of an already-generated slide.&lt;/p&gt;

&lt;p&gt;PowerPoint animations are a tree of time nodes. Under &lt;code&gt;mainSeq&lt;/code&gt; (the sequence that advances on click), you hang a &lt;code&gt;&amp;lt;p:par&amp;gt;&lt;/code&gt; for each effect. What I wanted was "on click, bubbles appear one after another at 0.8s intervals," so:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The first effect is &lt;code&gt;nodeType="clickEffect"&lt;/code&gt; (fires on click)&lt;/li&gt;
&lt;li&gt;Every effect after that is &lt;code&gt;nodeType="withEffect"&lt;/code&gt; with a &lt;code&gt;delay&lt;/code&gt;, staggered in time&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There's also an "After Previous" option, but it works by waiting for the previous effect to finish, and the behavior felt flaky. So I leaned on absolute offsets of &lt;code&gt;i × 800ms&lt;/code&gt; instead. That turned out to be both more predictable and, honestly, less work.&lt;/p&gt;

&lt;p&gt;The effect on each bubble is "Float In" — it fades in while drifting up slightly from below. I use &lt;code&gt;&amp;lt;p:set&amp;gt;&lt;/code&gt; to make it visible, &lt;code&gt;&amp;lt;p:anim&amp;gt;&lt;/code&gt; to move &lt;code&gt;ppt_y&lt;/code&gt; (vertical position) from a touch below up to its real spot, and &lt;code&gt;&amp;lt;p:animEffect filter="fade"&amp;gt;&lt;/code&gt; to fade it in, all running together.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_effect_par&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;eid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;spid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;delay&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;node_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;grp&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'''&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;p:par ...&amp;gt;
  &amp;lt;p:cTn id=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;eid&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; presetID=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;42&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; presetClass=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;entr&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; presetSubtype=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;8&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;
         fill=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hold&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; grpId=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;grp&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; nodeType=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;node_type&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;gt;
    &amp;lt;p:stCondLst&amp;gt;&amp;lt;p:cond delay=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;delay&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/&amp;gt;&amp;lt;/p:stCondLst&amp;gt;
    &amp;lt;p:childTnLst&amp;gt;
      &amp;lt;p:set&amp;gt; ... style.visibility=visible ... &amp;lt;/p:set&amp;gt;
      &amp;lt;p:anim calcmode=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;lin&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; valueType=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;num&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;gt;
        &amp;lt;p:cBhvr additive=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;base&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;gt;
          &amp;lt;p:cTn id=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;eid&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; dur=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;EFFECT_MS&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; fill=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hold&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/&amp;gt;
          &amp;lt;p:tgtEl&amp;gt;&amp;lt;p:spTgt spid=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;spid&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/&amp;gt;&amp;lt;/p:tgtEl&amp;gt;
          &amp;lt;p:attrNameLst&amp;gt;&amp;lt;p:attrName&amp;gt;ppt_y&amp;lt;/p:attrName&amp;gt;&amp;lt;/p:attrNameLst&amp;gt;
        &amp;lt;/p:cBhvr&amp;gt;
        &amp;lt;p:tavLst&amp;gt;
          &amp;lt;p:tav tm=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;gt;&amp;lt;p:val&amp;gt;&amp;lt;p:strVal val=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ppt_y+0.04&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/&amp;gt;&amp;lt;/p:val&amp;gt;&amp;lt;/p:tav&amp;gt;
          &amp;lt;p:tav tm=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;100000&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;gt;&amp;lt;p:val&amp;gt;&amp;lt;p:strVal val=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ppt_y&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/&amp;gt;&amp;lt;/p:val&amp;gt;&amp;lt;/p:tav&amp;gt;
        &amp;lt;/p:tavLst&amp;gt;
      &amp;lt;/p:anim&amp;gt;
      &amp;lt;p:animEffect transition=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;in&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; filter=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fade&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;gt; ... &amp;lt;/p:animEffect&amp;gt;
    &amp;lt;/p:childTnLst&amp;gt;
  &amp;lt;/p:cTn&amp;gt;
&amp;lt;/p:par&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;'''&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;spid&lt;/code&gt; is &lt;code&gt;shape.shape_id&lt;/code&gt;. For an incoming message I wanted the bubble and the avatar to appear together, so I give them the same &lt;code&gt;grpId&lt;/code&gt; and the same &lt;code&gt;delay&lt;/code&gt;. Once the &lt;code&gt;&amp;lt;p:timing&amp;gt;&lt;/code&gt; string is assembled, I parse it with lxml and just append it to the slide.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;timing&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;etree&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fromstring&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timing_xml&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;slide&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_element&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timing&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
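
&lt;p&gt;One caveat worth sketching: in the PresentationML schema, &lt;code&gt;p:timing&lt;/code&gt; comes before &lt;code&gt;p:extLst&lt;/code&gt; inside &lt;code&gt;p:sld&lt;/code&gt;. A plain append is fine for a freshly generated slide with no &lt;code&gt;p:extLst&lt;/code&gt;, but a slide cloned from a template might carry one. The helper below (&lt;code&gt;attach_timing&lt;/code&gt; is a hypothetical name) is demonstrated against a stand-in slide built with the standard library; the &lt;code&gt;find&lt;/code&gt;, &lt;code&gt;insert&lt;/code&gt; and &lt;code&gt;append&lt;/code&gt; calls it relies on also exist on lxml elements, so I'd expect it to carry over to &lt;code&gt;slide._element&lt;/code&gt;.&lt;/p&gt;

```python
import xml.etree.ElementTree as ET

P_NS = "http://schemas.openxmlformats.org/presentationml/2006/main"


def attach_timing(sld, timing):
    """Insert timing before p:extLst if the slide has one, else append."""
    ext_lst = sld.find("{%s}extLst" % P_NS)
    if ext_lst is None:
        sld.append(timing)
    else:
        sld.insert(list(sld).index(ext_lst), timing)


# Stand-in slide built with the standard library so the sketch runs alone.
sld = ET.Element("{%s}sld" % P_NS)
for tag in ("cSld", "clrMapOvr", "extLst"):
    ET.SubElement(sld, "{%s}%s" % (P_NS, tag))
attach_timing(sld, ET.Element("{%s}timing" % P_NS))
order = [child.tag.split("}")[1] for child in sld]
```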



&lt;p&gt;A nice bonus: once a shape has an entrance effect, PowerPoint handles "keep it hidden until it fires" on its own. So I didn't have to write any code to hide the initial state myself.&lt;/p&gt;

&lt;p&gt;One more note on checking my work. I was eyeballing layout by exporting to PDF with LibreOffice — but LibreOffice's PDF export only renders the &lt;em&gt;final&lt;/em&gt; state of an animation. It's perfectly fine for catching layout problems, but to verify the motion itself I had to fall back to running an actual slideshow in PowerPoint/Keynote.&lt;/p&gt;

&lt;h2&gt;
  
  
  Turning it into a web app
&lt;/h2&gt;

&lt;p&gt;Because I'd already factored the generation into a single &lt;code&gt;build_pptx(data) -&amp;gt; bytes&lt;/code&gt; function, the web side only needed to POST the conversation as JSON. I wrote the serverless function with the standard-library &lt;code&gt;BaseHTTPRequestHandler&lt;/code&gt; and return the bytes from &lt;code&gt;build_pptx&lt;/code&gt; directly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseHTTPRequestHandler&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;do_POST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rfile&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;length&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="sa"&gt;b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;{}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;pptx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;build_pptx&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send_response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send_header&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Content-Type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;application/vnd.openxmlformats-officedocument.presentationml.presentation&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send_header&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Content-Disposition&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;attachment; filename=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;line_chat.pptx&lt;/span&gt;&lt;span class="sh"&gt;"'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;end_headers&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;wfile&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pptx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Input is just one text area
&lt;/h3&gt;

&lt;p&gt;I didn't get clever with the input UI — it's a single text area. The only rules are "&lt;code&gt;L:&lt;/code&gt; is the other person, &lt;code&gt;R:&lt;/code&gt; is you, a blank line starts a new slide." Keeping it this loose is actually what makes it pleasant: you just dump in whatever conversation pops into your head.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;L: Hey, nice work today!
L: There's something I want to run by you
R: What's up?

L: This thing's been eating up my time lately
R: Oh, I totally get that
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Parsing is a regex applied line by line. Any line without an &lt;code&gt;L:&lt;/code&gt;/&lt;code&gt;R:&lt;/code&gt; prefix gets joined onto the previous bubble with a newline (i.e. multi-line bubbles).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;m&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sr"&gt;/^&lt;/span&gt;&lt;span class="se"&gt;([&lt;/span&gt;&lt;span class="sr"&gt;LRlr&lt;/span&gt;&lt;span class="se"&gt;])[&lt;/span&gt;&lt;span class="sr"&gt;:：&lt;/span&gt;&lt;span class="se"&gt;]\s?(&lt;/span&gt;&lt;span class="sr"&gt;.*&lt;/span&gt;&lt;span class="se"&gt;)&lt;/span&gt;&lt;span class="sr"&gt;$/&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;m&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;last&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="na"&gt;side&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;toUpperCase&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]};&lt;/span&gt; &lt;span class="nx"&gt;slide&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;last&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;last&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;last&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;trim&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;   &lt;span class="c1"&gt;// multi-line bubble&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
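
&lt;p&gt;The snippet above only shows the per-line match. For the whole loop, including the "blank line starts a new slide" rule, a rough equivalent in Python could look like this. The &lt;code&gt;parse_script&lt;/code&gt; name and the exact handling of blank lines are my assumptions for the sketch, not a port of the real front-end code.&lt;/p&gt;

```python
import re

LINE_RE = re.compile(r"^([LRlr])[:：]\s?(.*)$")


def parse_script(text):
    """Split the text area contents into slides of {side, text} bubbles."""
    slides, slide, last = [], [], None
    for raw in text.splitlines():
        if not raw.strip():
            # Blank line: close the current slide, if it has any bubbles.
            if slide:
                slides.append(slide)
            slide, last = [], None
            continue
        m = LINE_RE.match(raw)
        if m:
            last = {"side": m.group(1).upper(), "text": m.group(2)}
            slide.append(last)
        elif last is not None:
            last["text"] += "\n" + raw.strip()   # multi-line bubble
    if slide:
        slides.append(slide)
    return slides


slides = parse_script("L: Hi\nstill me\nR: Hello\n\nL: Next slide")
```

&lt;p&gt;With that input you get two slides: the first holds an &lt;code&gt;L&lt;/code&gt; bubble spanning two lines plus an &lt;code&gt;R&lt;/code&gt; bubble, and the second holds a single &lt;code&gt;L&lt;/code&gt; bubble.&lt;/p&gt;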



&lt;h3&gt;
  
  
  A live preview in the browser
&lt;/h3&gt;

&lt;p&gt;Having to download the file just to see the result is a miserable loop, so I rebuilt the same look in HTML/CSS and added a preview-and-play feature right in the browser. The colors and the bottom-alignment match the generated pptx, and for playback I just add CSS transitions (opacity and translateY) one after another with &lt;code&gt;setTimeout&lt;/code&gt;. It ends up feeling about the same as the animation inside the pptx.&lt;/p&gt;

&lt;h2&gt;
  
  
  Two things that bit me on deploy
&lt;/h2&gt;

&lt;h3&gt;
  
  
  A requirements.txt makes Vercel think it's a "Python app"
&lt;/h3&gt;

&lt;p&gt;My very first &lt;code&gt;vercel --prod&lt;/code&gt; died immediately with this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;The pattern "api/generate.py" defined in `functions`
doesn't match any Serverless Functions inside the `api` directory.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The cause was the &lt;code&gt;requirements.txt&lt;/code&gt; sitting at the project root. With that present, Vercel decides the whole project is a "Python app" and stops picking up &lt;code&gt;api/generate.py&lt;/code&gt; as an individual function. I fixed it by spelling out &lt;code&gt;builds&lt;/code&gt; in &lt;code&gt;vercel.json&lt;/code&gt;, telling Vercel to build the static HTML and the Python function separately.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"builds"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"src"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"index.html"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nl"&gt;"use"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@vercel/static"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"src"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"api/generate.py"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"use"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@vercel/python"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"routes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"src"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"/api/generate"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"dest"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"/api/generate.py"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"src"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"/"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;             &lt;/span&gt;&lt;span class="nl"&gt;"dest"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"/index.html"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After that, the build installed python-pptx from &lt;code&gt;requirements.txt&lt;/code&gt;, and both the function and the static page were served exactly as I'd hoped.&lt;/p&gt;

&lt;h3&gt;
  
  
  You can't protect production on the free plan
&lt;/h3&gt;

&lt;p&gt;Once it was live, I wanted to put it behind some access control, so I tried enabling &lt;code&gt;ssoProtection&lt;/code&gt; (Vercel Authentication) via Vercel's REST API. This came back:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"code"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"invalid_sso_protection"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"message"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Vercel Authentication is not available on your plan for production deployments"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Turns out the Hobby (free) plan doesn't let you put Vercel Authentication / Password Protection on production deployments (you need Pro). Preview deployments can be protected on the free plan. If you want to keep production hidden while staying free, you're looking at rolling your own gate — Edge Middleware, or Basic auth inside the Python function. For now I left it open.&lt;/p&gt;
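
&lt;p&gt;If I do add a gate later, the Basic-auth route inside the Python function could start from something like this. It's an untested-in-production sketch: &lt;code&gt;is_authorized&lt;/code&gt; is a hypothetical helper, and where the credentials come from (environment variables, most likely) is left open.&lt;/p&gt;

```python
import base64
import hmac


def is_authorized(auth_header, user, password):
    """Check an HTTP Basic Authorization header against one credential."""
    expected = "Basic " + base64.b64encode(
        ("%s:%s" % (user, password)).encode("utf-8")
    ).decode("ascii")
    # compare_digest avoids leaking how many leading bytes matched.
    return hmac.compare_digest(
        (auth_header or "").encode("utf-8"), expected.encode("utf-8")
    )


ok = is_authorized("Basic ZGVtbzpzZWNyZXQ=", "demo", "secret")
```

&lt;p&gt;In &lt;code&gt;do_POST&lt;/code&gt;, the idea would be to pass in &lt;code&gt;self.headers.get("Authorization")&lt;/code&gt; and, when the check fails, answer with a 401 and a &lt;code&gt;WWW-Authenticate: Basic&lt;/code&gt; header instead of building the pptx.&lt;/p&gt;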

&lt;h2&gt;
  
  
  Wrapping up
&lt;/h2&gt;

&lt;p&gt;After doing all this, the part I'm most glad I pushed through was writing the animation XML by hand. I'd nearly written off animations as impossible with python-pptx, but it turns out that hand-assembling a &lt;code&gt;&amp;lt;p:timing&amp;gt;&lt;/code&gt; element and appending it to the slide is all it takes to get bubbles floating in, one by one, in a real PowerPoint deck. And the dead-simple &lt;code&gt;withEffect + absolute delay&lt;/code&gt; approach was more than enough.&lt;/p&gt;

&lt;p&gt;Factoring the generation into a single function also paid off more than I expected — moving from a CLI to a web app was basically copy-paste.&lt;/p&gt;

&lt;p&gt;Being able to type out a conversation and get an animated chat slide back made prepping for talks noticeably less of a chore. The code is up on &lt;a href="https://github.com/yama3133/line-chat-app" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; if you want to poke around — happy to hear what you'd build with it.&lt;/p&gt;

</description>
      <category>automation</category>
      <category>productivity</category>
      <category>python</category>
      <category>showdev</category>
    </item>
  </channel>
</rss>
