<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: naoki_JPN</title>
    <description>The latest articles on DEV Community by naoki_JPN (@bokuno_log).</description>
    <link>https://dev.to/bokuno_log</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3849123%2Fb9f651cc-b700-4b86-abdc-5048c5a502d7.png</url>
      <title>DEV Community: naoki_JPN</title>
      <link>https://dev.to/bokuno_log</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/bokuno_log"/>
    <language>en</language>
    <item>
      <title>I shipped a paid web app in half a day using Claude Code + Codex (Garmin AI CSV converter)</title>
      <dc:creator>naoki_JPN</dc:creator>
      <pubDate>Thu, 07 May 2026 12:20:22 +0000</pubDate>
      <link>https://dev.to/bokuno_log/i-shipped-a-paid-web-app-in-half-a-day-using-claude-code-codex-garmin-ai-csv-converter-5b63</link>
      <guid>https://dev.to/bokuno_log/i-shipped-a-paid-web-app-in-half-a-day-using-claude-code-codex-garmin-ai-csv-converter-5b63</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; This article reflects information as of May 2026.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What I built
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Garmin AI Export&lt;/strong&gt; — a web app that converts the activity, sleep, and health data ZIP exported from Garmin Connect into clean CSVs that ChatGPT, Gemini, and Claude can readily analyze.&lt;/p&gt;

&lt;p&gt;🔗 &lt;a href="https://garmin-ai-export.vercel.app" rel="noopener noreferrer"&gt;https://garmin-ai-export.vercel.app&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you've worn a Garmin watch for years, you've accumulated a massive trail of data: runs, sleep records, heart rate, stress, Body Battery — all of it. Garmin Connect lets you export everything, but what you get is a tangled bundle of nested JSON and binary FIT files. Hand that directly to ChatGPT and tell it to "analyze this," and it won't know where to start.&lt;/p&gt;

&lt;p&gt;This app:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Takes the raw ZIP you exported from Garmin Connect&lt;/li&gt;
&lt;li&gt;Unpacks and parses it &lt;strong&gt;entirely in your browser&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Returns a ZIP containing four files — &lt;code&gt;activities.csv&lt;/code&gt;, &lt;code&gt;sleep.csv&lt;/code&gt;, &lt;code&gt;daily_health.csv&lt;/code&gt;, &lt;code&gt;laps.csv&lt;/code&gt; — plus an AI prompt template&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Feed that into the Code Interpreter on ChatGPT, Gemini, or Claude and you can immediately ask things like "show me how my pace has trended" or "find correlations between sleep and next-day performance."&lt;/p&gt;

&lt;p&gt;And here's the punchline: I built and shipped this — including the payment integration — &lt;strong&gt;in about half a day, by pairing Claude Code (Opus 4.7) with Codex (GPT-5 Codex)&lt;/strong&gt;. I'll get into the role-split below, but the short version is: Claude Code played PM and reviewer; Codex did the implementation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tech stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Next.js (App Router)&lt;/strong&gt; + Tailwind CSS&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;JSZip&lt;/strong&gt; — ZIP unpacking and re-compression&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;@streamparser/json&lt;/strong&gt; — streaming parser for huge JSON&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;fit-file-parser&lt;/strong&gt; — for Garmin's binary &lt;code&gt;.fit&lt;/code&gt; files&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PapaParse&lt;/strong&gt; — CSV generation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Web Worker&lt;/strong&gt; — to keep the main thread responsive&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Square Payment Links&lt;/strong&gt; — payments (Apple Pay / Google Pay supported)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vercel&lt;/strong&gt; — hosting&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The key principle is &lt;strong&gt;"everything in the browser."&lt;/strong&gt; Garmin exports can be hundreds of megabytes to multiple gigabytes — there's no way you're streaming that to a server (Vercel's request body limit is 4.5MB; you're done before you start). Plus, on principle, I didn't want users' health data sitting on my server.&lt;/p&gt;

&lt;p&gt;So upload, parse, convert, and download all happen inside the user's browser. The only server-side code is a tiny API route that creates a Square payment link.&lt;/p&gt;

&lt;h2&gt;
  
  
  How Claude Code and Codex split the work
&lt;/h2&gt;

&lt;p&gt;This project ran on two AI agents with &lt;strong&gt;clearly separated roles&lt;/strong&gt;.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;th&gt;Agent&lt;/th&gt;
&lt;th&gt;Responsibilities&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;PM / Research&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Claude Code (Opus 4.7)&lt;/td&gt;
&lt;td&gt;Translate requests into GitHub Issues, manage labels, set priorities&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Implementation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Codex (GPT-5 Codex)&lt;/td&gt;
&lt;td&gt;Cut branches, write code, open Draft PRs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Code review&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Claude Code (Opus 4.7)&lt;/td&gt;
&lt;td&gt;Read PR diffs, leave sharp inline comments on GitHub&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Merge / deploy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Claude Code (Opus 4.7)&lt;/td&gt;
&lt;td&gt;Approve, squash-merge, verify production&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Approval / GO&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Me (the human)&lt;/td&gt;
&lt;td&gt;Just say "OK," "implement #X," "ship it"&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The actual flow looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;I describe what I want
  → Claude Code creates a GitHub Issue (label: triage)
  → I say "OK" → label changes to ready
  → I say "implement #X"
  → Codex cuts a branch, writes code, opens a Draft PR
  → I say "please review"
  → Claude Code posts a thorough review on GitHub
  → Codex addresses the comments
  → Claude Code re-reviews → LGTM, squash merge
  → Vercel auto-deploys
  → Claude Code curl-checks production
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What's interesting about this setup is that &lt;strong&gt;using a different model for review actually works&lt;/strong&gt;. When Claude Code reads code that Codex wrote, there's no self-confirmation bias — comments like "&lt;code&gt;disabled={!result}&lt;/code&gt; is dead code here," "you're using a private JSZip API," "the worker isn't terminated on reset" surface naturally. You don't get this when the same model writes and reviews.&lt;/p&gt;

&lt;p&gt;I never wrote a line of code. I only said "I want this," "OK," and "ship it."&lt;/p&gt;

&lt;h2&gt;
  
  
  Things that bit me
&lt;/h2&gt;

&lt;p&gt;I stepped on a few mines worth documenting.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Parsing 135MB JSON on the main thread froze the page
&lt;/h3&gt;

&lt;p&gt;Garmin's &lt;code&gt;*_summarizedActivities.json&lt;/code&gt; is over &lt;strong&gt;100MB&lt;/strong&gt; for heavy users. A naive &lt;code&gt;JSON.parse(text)&lt;/code&gt; blocks the main thread for ~20 seconds and the browser starts asking the user if they want to kill the page.&lt;/p&gt;

&lt;p&gt;The fix is two-layered.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;(a) Move it to a Web Worker&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/workers/garmin-converter.worker.ts&lt;/span&gt;
&lt;span class="nb"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;onmessage&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;convertGarminExportCoreBuffer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;postProgress&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nb"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;postMessage&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;done&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;transfer&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;buffer&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Putting the &lt;code&gt;ArrayBuffer&lt;/code&gt; in the &lt;code&gt;transfer&lt;/code&gt; list does a &lt;strong&gt;zero-copy&lt;/strong&gt; handoff to the main thread. Even a 100MB+ buffer crosses the worker boundary instantly because nothing is copied.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;(b) Stream-parse the JSON itself&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Use &lt;code&gt;@streamparser/json&lt;/code&gt; to &lt;strong&gt;iterate over the giant array element-by-element&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;JSONParser&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@streamparser/json&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;parser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;JSONParser&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;paths&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;$.*.summarizedActivitiesExport.*&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;parser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;onValue&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;extractActivityRow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;$.*&lt;/code&gt; prefix in &lt;code&gt;paths&lt;/code&gt; is a sneaky trap. The actual Garmin JSON shape is &lt;code&gt;[{"summarizedActivitiesExport": [...]}]&lt;/code&gt; — an &lt;strong&gt;array at the root&lt;/strong&gt;. So the path has to step into the root array's elements with &lt;code&gt;$.*&lt;/code&gt;; otherwise nothing matches and the CSV comes out empty. I forgot this on the first pass and spent way too long staring at empty output.&lt;/p&gt;
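&lt;p&gt;Stripped of the streaming machinery, the traversal that &lt;code&gt;$.*&lt;/code&gt; performs looks like this (a minimal sketch; the type and function names are illustrative, not from the app):&lt;/p&gt;

```typescript
// Illustrative sketch of the export shape that "$.*" has to step into:
// the root of the file is an array of wrapper objects.
type GarminExport = { summarizedActivitiesExport: unknown[] }[];

// Plain-object equivalent of the "$.*.summarizedActivitiesExport.*" path:
// iterate the root array, then each wrapper's inner activity array.
function collectActivities(root: GarminExport) {
  return root.flatMap((wrapper) => wrapper.summarizedActivitiesExport);
}
```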

&lt;h3&gt;
  
  
  2. Reaching into JSZip's private API and getting burned
&lt;/h3&gt;

&lt;p&gt;The first version of the code, written by Codex, used &lt;code&gt;zip.file().internalStream("uint8array")&lt;/code&gt;. It worked, but it's an undocumented JSZip internal API with no type definitions. Review caught it, and we rewrote it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;buffer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;entry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;async&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;uint8array&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chunkSize&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// 1MB&lt;/span&gt;
&lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;offset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;offset&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;buffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;offset&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;chunkSize&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// process 1MB at a time, yielding back to the event loop&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;yieldToEventLoop&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="nf"&gt;processChunk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;buffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;subarray&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;offset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;offset&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;chunkSize&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;yieldToEventLoop()&lt;/code&gt; is just &lt;code&gt;new Promise(resolve =&amp;gt; setTimeout(resolve, 0))&lt;/code&gt;. Without it, you're back to freezing the UI on huge files.&lt;/p&gt;
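&lt;p&gt;For completeness, that helper is exactly the one-liner described:&lt;/p&gt;

```typescript
// Zero-delay timeout: lets queued UI events run between 1MB chunks,
// which is what keeps the page responsive during a long parse.
function yieldToEventLoop() {
  return new Promise((resolve) => setTimeout(resolve, 0));
}
```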

&lt;h3&gt;
  
  
  3. The iOS Safari Blob + IndexedDB landmine
&lt;/h3&gt;

&lt;p&gt;For the payment gate, I needed to &lt;strong&gt;persist the converted ZIP somewhere&lt;/strong&gt; while the user bounces over to Square's checkout and back. The natural choice: stash a Blob in IndexedDB. But:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The biggest risk is the case where iOS Safari can't restore the Blob after payment.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This actually happens. iOS Safari has a &lt;strong&gt;long-standing bug where Blobs stored in IndexedDB come back corrupted&lt;/strong&gt; when retrieved, especially for larger Blobs. "Paid the money, can't get the file" is the worst possible UX, so:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Blob → ArrayBuffer before storing&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;buffer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;blob&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;arrayBuffer&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;put&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;buffer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;files&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// Reconstruct the Blob from the ArrayBuffer when retrieving&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;blob&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Blob&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nx"&gt;record&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;buffer&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;application/zip&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;ArrayBuffer round-trips cleanly through IndexedDB even on iOS Safari. Catching this in review saved a lot of pain.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. The worker kept running after a reset
&lt;/h3&gt;

&lt;p&gt;If a user clicks "Reset" while a conversion is in progress, you have to actually terminate the worker — otherwise it leaks memory and burns CPU.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;activeWorker&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Worker&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;abortConversion&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;activeWorker&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nf"&gt;terminate&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="nx"&gt;activeWorker&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Easy to forget. A review comment of "is this still running in the background after reset?" is what surfaced it.&lt;/p&gt;
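&lt;p&gt;The start side of that bookkeeping matters just as much: each new conversion should replace, and terminate, any worker still in flight. A minimal sketch, with &lt;code&gt;startConversion&lt;/code&gt; and the &lt;code&gt;Terminable&lt;/code&gt; interface as illustrative names rather than the app's actual code:&lt;/p&gt;

```typescript
// Anything with terminate(): a real Worker in the browser, a stub in tests.
interface Terminable {
  terminate(): void;
}

let activeWorker: Terminable | null = null;

// Spawn a worker for a new conversion, first terminating any run still in
// flight so two conversions never overlap.
function startConversion(spawn: () => Terminable) {
  activeWorker?.terminate();
  activeWorker = spawn();
  return activeWorker;
}

// Same teardown as the reset handler.
function abortConversion() {
  activeWorker?.terminate();
  activeWorker = null;
}
```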

&lt;h2&gt;
  
  
  Adding payments
&lt;/h2&gt;

&lt;p&gt;I gated the download behind a paywall.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why Square?
&lt;/h3&gt;

&lt;p&gt;Candidates were Stripe, PayPal, Square, Paddle. Reasons I picked Square:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Stripe rejected my application&lt;/strong&gt; (approval for sole proprietors takes time)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Apple Pay and Google Pay are included by default&lt;/strong&gt; (zero extra implementation)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Payment Links API gives you hosted checkout in seconds&lt;/strong&gt; — no card form on my own site, so no PCI DSS scope&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The implementation is shockingly small. Server-side, you just hit Square's API to mint a payment link:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/app/api/square/payment-link/route.ts&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;squareResponse&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nf"&gt;getSquareApiBaseUrl&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;&lt;span class="s2"&gt;/v2/online-checkout/payment-links`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;Authorization&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`Bearer &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;SQUARE_ACCESS_TOKEN&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Square-Version&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;2026-01-22&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;idempotency_key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;crypto&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;randomUUID&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
      &lt;span class="na"&gt;quick_pay&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Garmin AI Export&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;price_money&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;PRICE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;currency&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;JPY&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="na"&gt;location_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;SQUARE_LOCATION_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;checkout_options&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;allow_tipping&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;redirect_url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://garmin-ai-export.vercel.app?paid=true&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;}),&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Redirect the user to the returned &lt;code&gt;payment_link.url&lt;/code&gt;, they pay (card or Apple Pay or Google Pay) on Square's hosted checkout, and they come back to your site with &lt;code&gt;?paid=true&lt;/code&gt; appended.&lt;/p&gt;
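&lt;p&gt;On the client, the glue is equally small. A hedged sketch: &lt;code&gt;PaymentLinkResponse&lt;/code&gt; mirrors the &lt;code&gt;payment_link.url&lt;/code&gt; field named above, and the &lt;code&gt;fetch&lt;/code&gt; call is illustrative:&lt;/p&gt;

```typescript
// Response shape of the API route, following Square's Payment Links API:
// the hosted-checkout URL lives at payment_link.url.
interface PaymentLinkResponse {
  payment_link: { url: string };
}

// Pull out the URL the browser should be redirected to.
function extractCheckoutUrl(response: PaymentLinkResponse) {
  return response.payment_link.url;
}

// In the browser (illustrative):
//   const res = await fetch("/api/square/payment-link", { method: "POST" });
//   window.location.href = extractCheckoutUrl(await res.json());
```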

&lt;h3&gt;
  
  
  Soft gate, by design
&lt;/h3&gt;

&lt;p&gt;I do &lt;strong&gt;not&lt;/strong&gt; verify on the server that the user actually paid. If &lt;code&gt;?paid=true&lt;/code&gt; is in the URL, the download unlocks.&lt;/p&gt;

&lt;p&gt;This is intentional:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;I have no database and no user accounts, so there's no real way to verify (you could log purchases via webhook, but you'd need a way to identify users)&lt;/li&gt;
&lt;li&gt;For a tool meant for individual personal use, someone bypassing it by typing the URL is fine&lt;/li&gt;
&lt;li&gt;Philosophy: "If you feel like paying, please pay. If not, that's also fine."&lt;/li&gt;
&lt;/ul&gt;
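&lt;p&gt;The check itself reduces to a single query-string lookup (a sketch; &lt;code&gt;isUnlocked&lt;/code&gt; is an illustrative name):&lt;/p&gt;

```typescript
// The entire soft gate: unlock the download when ?paid=true is present.
function isUnlocked(search: string) {
  return new URLSearchParams(search).get("paid") === "true";
}

// In the page component (illustrative):
//   const unlocked = isUnlocked(window.location.search);
```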

&lt;h3&gt;
  
  
  About currency
&lt;/h3&gt;

&lt;p&gt;Square only lets you receive revenue in your &lt;strong&gt;account's home country currency&lt;/strong&gt;. My account is registered in Japan, so JPY only. When an overseas user pays via Apple Pay, Apple does the FX conversion on their side, and the merchant always sees JPY in the books. So an English UI with JPY pricing is perfectly workable for international users.&lt;/p&gt;

&lt;h2&gt;
  
  
  Deploy and cost
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Moving from Vercel Hobby to Pro immediately
&lt;/h3&gt;

&lt;p&gt;The Vercel Hobby plan is &lt;strong&gt;strictly "personal, non-commercial use."&lt;/strong&gt; Running a paid site on Hobby violates the ToS and risks account termination.&lt;/p&gt;

&lt;p&gt;→ I upgraded to &lt;strong&gt;Pro ($20/month)&lt;/strong&gt; right after wiring up payments.&lt;/p&gt;

&lt;h3&gt;
  
  
  Bandwidth is a non-issue
&lt;/h3&gt;

&lt;p&gt;I worried at first: "These Garmin ZIPs can be hundreds of MB — am I going to blow through bandwidth?" But once I thought about it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Uploading the Garmin ZIP → handled in-browser (never touches Vercel)&lt;/li&gt;
&lt;li&gt;Downloading the converted ZIP → from browser to local disk (never touches Vercel)&lt;/li&gt;
&lt;li&gt;Payment screen → on Square's domain (never touches Vercel)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Vercel only serves the &lt;strong&gt;initial page load (~2MB)&lt;/strong&gt; and the Square API call (~1KB). Pro's 1TB allotment covers ~500,000 users. The browser-only architecture really pays off here.&lt;/p&gt;

&lt;h2&gt;
  
  
  Google Analytics and a favicon
&lt;/h2&gt;

&lt;p&gt;Last touches: GA and a custom favicon.&lt;/p&gt;

&lt;p&gt;GA4 via &lt;code&gt;next/script&lt;/code&gt; with &lt;code&gt;afterInteractive&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;SHOULD_LOAD_GOOGLE_ANALYTICS&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="p"&gt;&amp;lt;&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Script&lt;/span&gt;
      &lt;span class="na"&gt;src&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;`https://www.googletagmanager.com/gtag/js?id=&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;GA_MEASUREMENT_ID&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
      &lt;span class="na"&gt;strategy&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"afterInteractive"&lt;/span&gt;
    &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Script&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"google-analytics"&lt;/span&gt; &lt;span class="na"&gt;strategy&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"afterInteractive"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;`window.dataLayer = window.dataLayer || [];
        function gtag(){dataLayer.push(arguments);}
        gtag('js', new Date());
        gtag('config', '&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;GA_MEASUREMENT_ID&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;');`&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Script&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="p"&gt;&amp;lt;/&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Favicons in Next.js App Router are &lt;strong&gt;convention-based&lt;/strong&gt; — drop &lt;code&gt;src/app/icon.tsx&lt;/code&gt; and &lt;code&gt;src/app/apple-icon.tsx&lt;/code&gt; into the tree and they're auto-served at &lt;code&gt;/icon&lt;/code&gt; and &lt;code&gt;/apple-icon&lt;/code&gt;. They're generated dynamically with &lt;code&gt;ImageResponse&lt;/code&gt;, so no external image file is required:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/app/app-icon-image.tsx&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;createAppIconResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;size&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nl"&gt;width&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nl"&gt;height&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ImageResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt; &lt;span class="na"&gt;style&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;background&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;#104c3f&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;borderRadius&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;20%&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="cm"&gt;/* ... */&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;svg&lt;/span&gt; &lt;span class="na"&gt;viewBox&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"0 0 24 24"&lt;/span&gt; &lt;span class="na"&gt;fill&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"none"&lt;/span&gt; &lt;span class="na"&gt;stroke&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"white"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="cm"&gt;/* FileArchive icon SVG paths */&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;svg&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="nx"&gt;size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Reflections on building with Claude Code + Codex
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What worked well&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The role-split is fast.&lt;/strong&gt; Claude Code focusing on requirement-shaping and review while Codex steadily implements is a great fit for "one human plus a fleet of AIs."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mixed-model review actually catches things.&lt;/strong&gt; When the same model both writes and reviews, it overlooks its own mistakes. A different model has no qualms about saying "wait, this is wrong."&lt;/li&gt;
&lt;li&gt;When I got stuck, Claude Code could surface fixes like "iOS Safari has a known IndexedDB Blob bug, switch to ArrayBuffer storage" without me prompting it.&lt;/li&gt;
&lt;li&gt;Codex is callable from CLI, so I can shoot off "implement this issue and open a PR" via Bash in one shot.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What I made sure to do&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Never let "I'd kinda want this" turn into "started implementing." I wrote an explicit rule in CLAUDE.md: implementation only starts on "implement #X."&lt;/li&gt;
&lt;li&gt;Reviews always go to a different model (Claude Code) so the implementer (Codex) doesn't grade its own homework.&lt;/li&gt;
&lt;li&gt;I don't say "merge it" — I say "ship it to main" / "release to prod" so deployments are an explicit step.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The "I'll just whip up a thing" mode is incredibly powerful. The flip side is that &lt;strong&gt;without checks, it'll happily over-build&lt;/strong&gt;, so the review-and-approval ritual is worth keeping.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wrap-up
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Half a day → a working web app with payments, analytics, and a custom favicon&lt;/li&gt;
&lt;li&gt;Browser-only architecture wins on both server cost and privacy&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Claude Code (PM/reviewer) + Codex (implementer)&lt;/strong&gt; as a two-agent setup gives you both speed and quality&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're a Garmin user curious to throw your own activity data at ChatGPT, Gemini, or Claude, give it a try.&lt;/p&gt;

&lt;p&gt;🔗 &lt;strong&gt;&lt;a href="https://garmin-ai-export.vercel.app" rel="noopener noreferrer"&gt;https://garmin-ai-export.vercel.app&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>claudecode</category>
      <category>nextjs</category>
      <category>typescript</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Building Production AI Agents with Google Cloud ADK + Claude [30-min Workshop]</title>
      <dc:creator>naoki_JPN</dc:creator>
      <pubDate>Thu, 07 May 2026 07:15:54 +0000</pubDate>
      <link>https://dev.to/bokuno_log/building-production-ai-agents-with-google-cloud-adk-claude-30-min-workshop-1lll</link>
      <guid>https://dev.to/bokuno_log/building-production-ai-agents-with-google-cloud-adk-claude-30-min-workshop-1lll</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; This article summarizes the following X post video (approx. 30 min) in English.&lt;br&gt;
Speaker: Ivan Nardini (Google Cloud Developer Relations Engineer, AI/ML) / Recorded at an Anthropic-hosted event.&lt;br&gt;
Original YouTube: &lt;a href="https://www.youtube.com/watch?v=TUysIAtxyrQ" rel="noopener noreferrer"&gt;Building AI agents with Claude in Google Cloud's Vertex AI | Code w/ Claude&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;You've built an AI agent — but can't ship it to production. That's the wall Ivan Nardini (Google Cloud) dismantles in this 30-minute workshop.&lt;/p&gt;

&lt;p&gt;Using ADK, MCP, Vertex AI Agent Engine, and A2A Protocol, he walks through building and deploying a multi-agent system powered by Claude — end to end.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why AI Agents Are Hard to Productionize
&lt;/h2&gt;

&lt;p&gt;Prototypes are easy. Production is hard. Three root causes:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Challenge&lt;/th&gt;
&lt;th&gt;Details&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Fragmented landscape&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Too many frameworks — unclear what to choose&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Hard to integrate&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Cross-framework agent communication is complex&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Lack of ops &amp;amp; governance&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Monitoring, logging, and scaling must all be hand-rolled&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Google Cloud's &lt;strong&gt;Agentic Stack&lt;/strong&gt; is designed to solve all three.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Google Cloud Agentic Stack
&lt;/h2&gt;

&lt;p&gt;Four layers, each targeting one of the above challenges:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Agent Development Kit (ADK)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Open-source, code-first agent development framework&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Model Context Protocol (MCP)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Open protocol standardizing how apps provide context to LLMs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Vertex AI Agent Engine&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Managed platform for deploying and scaling agents in production&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Agent2Agent (A2A) Protocol&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Open standard enabling cross-framework agent collaboration&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Demo 1: Build Your First Agent with 3 Files
&lt;/h2&gt;

&lt;p&gt;Using a birthday planner agent as the example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.adk.agents&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;LlmAgent&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.adk.models.anthropic_llm&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Claude&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.adk.models.registry&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;LLMRegistry&lt;/span&gt;

&lt;span class="n"&gt;root_agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LlmAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;birthday_planner&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-3-7-sonnet@20250219&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;An agent that helps plan birthday parties&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;instruction&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Handle guest lists, venue suggestions, and scheduling...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Just three files: &lt;code&gt;agent.py&lt;/code&gt;, &lt;code&gt;.env&lt;/code&gt;, &lt;code&gt;requirements.txt&lt;/code&gt;. One command to run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;adk run birthday_planner    &lt;span class="c"&gt;# CLI interaction&lt;/span&gt;
adk web                     &lt;span class="c"&gt;# Browser UI + debug view&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;ADK supports &lt;code&gt;LlmAgent&lt;/code&gt;, &lt;code&gt;SequentialAgent&lt;/code&gt;, and other patterns — compatible with Claude, Gemini, and more.&lt;/p&gt;




&lt;h2&gt;
  
  
  Demo 2: Go Multi-Agent with MCP
&lt;/h2&gt;

&lt;p&gt;To extend the birthday planner to also schedule calendar events, you add two more agents and an orchestrator:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;BirthdayPlannerAgent&lt;/strong&gt; — party suggestions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CalendarServiceAgent&lt;/strong&gt; — calendar operations via MCP server&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;EventOrganizerAgent&lt;/strong&gt; — routes requests to the right agent&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Connecting an MCP server is two lines:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;mcp_tools&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;exit_stack&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;MCPToolset&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;connection_params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;SseServerParams&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;MCP_CALENDAR_SERVER_URL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LlmAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CalendarServiceAgent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-3-7-sonnet@20250219&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;mcp_tools&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="bp"&gt;...&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Any existing MCP server can be plugged in as a tool. The orchestrator auto-routes requests based on agent descriptions.&lt;/p&gt;
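&lt;p&gt;To make the routing idea concrete, here's a rough plain-Python sketch (names and logic are illustrative; ADK's real orchestrator has the LLM choose a sub-agent based on the descriptions, which keyword overlap only approximates):&lt;/p&gt;

```python
import re

# Illustrative sketch only: ADK's orchestrator asks the LLM to pick a
# sub-agent from these descriptions; keyword overlap stands in for that.
AGENTS = {
    "BirthdayPlannerAgent": "suggest party themes, venues, and guest activities",
    "CalendarServiceAgent": "create, list, and update calendar events",
}

def _words(text: str) -> set:
    """Lowercase word set, punctuation stripped."""
    return set(re.findall(r"[a-z]+", text.lower()))

def route(request: str) -> str:
    """Pick the agent whose description overlaps most with the request."""
    return max(AGENTS, key=lambda name: len(_words(request) & _words(AGENTS[name])))

print(route("list my calendar events"))  # CalendarServiceAgent
print(route("suggest party themes"))     # BirthdayPlannerAgent
```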




&lt;h2&gt;
  
  
  Demo 3: Deploy to Vertex AI Agent Engine
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;agent_engines&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;root_agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;requirements&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;google-cloud-aiplatform[adk]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What you get automatically after deploy:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Observability&lt;/strong&gt; via Cloud Trace / Logging / Monitoring&lt;/li&gt;
&lt;li&gt;Session management (persistent conversation history)&lt;/li&gt;
&lt;li&gt;Integration with Vertex AI Evaluation Service for continuous improvement&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Works with LangGraph, LangChain, LlamaIndex, and CrewAI too — not just ADK.&lt;/p&gt;




&lt;h2&gt;
  
  
  Bonus: A2A Protocol — Cross-Framework Agent Communication
&lt;/h2&gt;

&lt;p&gt;When you need a LangChain agent and an ADK agent to collaborate, you need a shared language: &lt;strong&gt;Agent2Agent (A2A) Protocol&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Two core concepts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Agent Card&lt;/strong&gt;: A digital business card for the agent — lets other agents discover what it can do&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agent Skills&lt;/strong&gt;: Describes the agent's specific capabilities and API&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Built on HTTP and JSON-RPC, with enterprise-ready security included.&lt;/p&gt;
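&lt;p&gt;As a rough illustration, an Agent Card is just a small JSON document that other agents can fetch over HTTP. A minimal sketch (field names follow the spec's general shape and the endpoint URL is hypothetical; consult the A2A spec for the exact schema):&lt;/p&gt;

```python
import json

# Rough shape of an A2A Agent Card. Illustrative only: field names follow
# the spec's general structure, and the URL below is a hypothetical endpoint.
agent_card = {
    "name": "CalendarServiceAgent",
    "description": "Manages calendar events for the event organizer.",
    "url": "https://agents.example.com/calendar",  # hypothetical endpoint
    "version": "1.0.0",
    "skills": [
        {
            "id": "create_event",
            "name": "Create calendar event",
            "description": "Creates an event with a title, date, and guests.",
        }
    ],
}

# Other agents fetch this card over HTTP to discover what this agent can do.
print(json.dumps(agent_card, indent=2))
```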




&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Takeaway&lt;/th&gt;
&lt;th&gt;Detail&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;ADK: 3 files, 1 command&lt;/td&gt;
&lt;td&gt;Fastest path to a working agent&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MCP: 2 lines&lt;/td&gt;
&lt;td&gt;Plug in any existing MCP server as a tool&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Agent Engine: zero-ops deploy&lt;/td&gt;
&lt;td&gt;Observability, scaling, sessions — all managed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;A2A: break the framework wall&lt;/td&gt;
&lt;td&gt;Claude, Gemini, LangChain, CrewAI can coexist&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;ADK + MCP + Agent Engine + A2A gives you a complete stack from local dev to production scale.&lt;/p&gt;

</description>
      <category>googlecloud</category>
      <category>claude</category>
      <category>agents</category>
      <category>mcp</category>
    </item>
    <item>
      <title>Anthropic's Prompting 101 — A Practical Guide to Building Production-Quality Claude Prompts</title>
      <dc:creator>naoki_JPN</dc:creator>
      <pubDate>Fri, 01 May 2026 05:27:23 +0000</pubDate>
      <link>https://dev.to/bokuno_log/anthropics-prompting-101-a-practical-guide-to-building-production-quality-claude-prompts-23k4</link>
      <guid>https://dev.to/bokuno_log/anthropics-prompting-101-a-practical-guide-to-building-production-quality-claude-prompts-23k4</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; This article is an English translation of a Japanese summary of a ~25-minute video posted by &lt;a href="https://x.com/jota_snchez" rel="noopener noreferrer"&gt;@jota_snchez&lt;/a&gt; on X. Original video: &lt;a href="https://x.com/jota_snchez/status/2049898145346105395" rel="noopener noreferrer"&gt;https://x.com/jota_snchez/status/2049898145346105395&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Hannah Moran and Christian Ryan from Anthropic's Applied AI Team walk through prompt engineering best practices with live console demos.&lt;/p&gt;

&lt;p&gt;Using a real customer case — having Claude analyze Swedish car accident insurance forms — they show how to evolve a prompt through five versions, going from "Claude thinks it's a ski accident" to production-quality structured output. Every iteration is highly instructive.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Basic Prompt Structure
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjkzcy6cxecho0fykdvky.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjkzcy6cxecho0fykdvky.jpg" alt="Prompt structure with 5 elements (5:00)" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Anthropic recommends organizing prompts around 5 core elements:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;#&lt;/th&gt;
&lt;th&gt;Element&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Task description&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1–2 sentences defining Claude's role and the task&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Dynamic content&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Data, images, or retrieved information to process&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Detailed instructions&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Step-by-step guidance on how to approach the task&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Examples (optional)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Few-shot samples&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Reminder of critical points&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Restate the most important rules at the end&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; For long prompts, &lt;strong&gt;repeating critical instructions at the end&lt;/strong&gt; is especially effective.&lt;/p&gt;
&lt;/blockquote&gt;
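&lt;p&gt;The five elements can be assembled as a plain template. A minimal sketch (the section wording is invented for illustration):&lt;/p&gt;

```python
# Sketch of the 5-element structure as a plain template; the section text
# here is invented for illustration, not copied from the demo.
def build_prompt(form_image_ref: str, sketch_image_ref: str) -> str:
    return "\n\n".join([
        # 1. Task description
        "You are an assistant for an auto insurance company, analyzing "
        "Swedish accident report forms to determine fault.",
        # 2. Dynamic content, delimited with XML tags
        f"<form>{form_image_ref}</form>",
        f"<sketch>{sketch_image_ref}</sketch>",
        # 3. Detailed step-by-step instructions
        "First list every checked box on the form, then analyze the sketch, "
        "then deliver your verdict.",
        # 4. (Few-shot examples would go here.)
        # 5. Reminder of critical points, repeated at the end
        "Remember: do not make a determination unless you are confident.",
    ])

prompt = build_prompt("[form image]", "[sketch image]")
print(prompt)
```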




&lt;h2&gt;
  
  
  Organizing Information: XML Tags
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fchmqralhglingoowkfbm.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fchmqralhglingoowkfbm.jpg" alt="How to organize information in prompts with XML tags (10:00)" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Claude excels with structured information. Anthropic's top recommendation is using &lt;strong&gt;XML tags&lt;/strong&gt; as delimiters:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;user_preferences&amp;gt;&lt;/span&gt;
  {{USER_PREFERENCES}}
&lt;span class="nt"&gt;&amp;lt;/user_preferences&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Explicitly declares what's inside the tags&lt;/li&gt;
&lt;li&gt;Makes it easier for Claude to reference that information later in the prompt&lt;/li&gt;
&lt;li&gt;Clearer boundaries than Markdown, and more token-efficient&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Disorganized prompts are hard for Claude to parse and degrade output quality. XML tags alone can make a significant difference.&lt;/p&gt;




&lt;h2&gt;
  
  
  Live Demo: Building a Prompt Step by Step
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7qgxka0qmeiai7c1tba5.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7qgxka0qmeiai7c1tba5.jpg" alt="6 steps to build a great prompt from scratch in the console (15:00)" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The demo task: "Determine which vehicle is at fault from a Swedish car accident report form." They built the prompt by adding elements one at a time in the console.&lt;/p&gt;

&lt;h3&gt;
  
  
  V1 → V2: Adding Task Context and Tone
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;V1's problem&lt;/strong&gt;: Claude output "a ski accident occurred on Chapman Gotham Street" — a wild miss because there was zero background context.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What V2 added&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;This is an auto insurance claims processing system&lt;/li&gt;
&lt;li&gt;Inputs are a Swedish accident report form and a hand-drawn sketch&lt;/li&gt;
&lt;li&gt;Do not make a determination if not confident (hallucination prevention)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;→ Claude now correctly identifies it as a car accident, but the verdict is still vague due to missing information.&lt;/p&gt;

&lt;h3&gt;
  
  
  V3: Adding Background Information to the System Prompt
&lt;/h3&gt;

&lt;p&gt;Added the form's structure (17 checkboxes, two columns for Vehicle A and B) to the system prompt.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &lt;strong&gt;Static information belongs in the system prompt.&lt;/strong&gt;&lt;br&gt;
The form structure never changes. This type of static background is ideal for the system prompt — and &lt;strong&gt;maximizes prompt caching effectiveness&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;→ Form reading accuracy improved. Claude issued its first clear verdict: "Vehicle B is at fault."&lt;/p&gt;

&lt;h3&gt;
  
  
  V4: Detailed Step-by-Step Instructions (Order Matters)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. First, carefully examine the form and list every checked box
2. Then analyze the sketch (informed by what you learned from the form)
3. Deliver your final verdict
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;"Read the form before the sketch" is the critical ordering.&lt;/strong&gt; A hand-drawn sketch alone is meaningless — but once you've read the form and know you're dealing with a car accident, the sketch makes sense. Mirror the order a human would naturally work through this.&lt;/p&gt;

&lt;h3&gt;
  
  
  V5: Specifying Output Format
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F62cbj15n9a6xsv4k5am9.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F62cbj15n9a6xsv4k5am9.jpg" alt="Anthropic Console V5 demo (21:00)" width="800" height="450"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Wrap your final verdict in &amp;lt;final_verdict&amp;gt; XML tags.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;→ The application can now extract just the information it needs (the verdict) from the XML tag. &lt;strong&gt;Ski accident misread → ambiguous → confident structured output&lt;/strong&gt; — the evolution is complete.&lt;/p&gt;
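&lt;p&gt;On the application side, extraction then becomes a one-liner. A minimal sketch (the response text is invented for illustration):&lt;/p&gt;

```python
import re

# Sketch: once the verdict is wrapped in <final_verdict> tags, the app can
# pull out just that field and ignore the surrounding reasoning text.
def extract_verdict(response: str):
    match = re.search(r"<final_verdict>(.*?)</final_verdict>", response, re.DOTALL)
    return match.group(1).strip() if match else None

response = (
    "Based on the checked boxes and the sketch, Vehicle B turned across "
    "Vehicle A's path.\n<final_verdict>Vehicle B is at fault.</final_verdict>"
)
print(extract_verdict(response))  # Vehicle B is at fault.
```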




&lt;h2&gt;
  
  
  Additional Techniques
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Few-Shot Examples
&lt;/h3&gt;

&lt;p&gt;Label difficult edge cases with human annotations and add them as examples. Images can be Base64-encoded and included in the samples. Production systems often carry dozens to hundreds of examples.&lt;/p&gt;
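&lt;p&gt;A sketch of how an annotated example image might be packaged (the dict follows the general shape of Anthropic's image content blocks; check the current API docs for the exact schema):&lt;/p&gt;

```python
import base64

# Sketch of packaging an example image for a few-shot message. The dict
# follows the general shape of Anthropic's image content blocks; check the
# current API documentation for the exact schema.
fake_image_bytes = b"\xff\xd8\xff\xe0..."  # stand-in for real JPEG bytes
encoded = base64.b64encode(fake_image_bytes).decode("ascii")

example_block = {
    "type": "image",
    "source": {"type": "base64", "media_type": "image/jpeg", "data": encoded},
}

print(example_block["source"]["media_type"])  # image/jpeg
```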

&lt;h3&gt;
  
  
  Conversation History
&lt;/h3&gt;

&lt;p&gt;For user-facing applications, passing prior conversation history as context improves accuracy.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pre-fill (Specifying the Start of Output)
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fovxi21fjyi2cq2pzayvk.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fovxi21fjyi2cq2pzayvk.jpg" alt="Controlling response format with pre-fill (23:00)" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Set a starting string in the Assistant role to force Claude's output format:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;assistant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;final_verdict&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;  &lt;span class="c1"&gt;# ← pre-fill
&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Claude will continue from &lt;code&gt;&amp;lt;final_verdict&amp;gt;&lt;/code&gt;. The same works for forcing JSON output.&lt;/p&gt;
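&lt;p&gt;Mechanically, the API response continues from the assistant text you supplied, so the application reconstructs the full output by concatenation. A minimal sketch (the completion text is invented for illustration):&lt;/p&gt;

```python
# Sketch: with pre-fill, the model's response continues from the assistant
# text you supplied, so the full output is prefill + completion.
prefill = "<final_verdict>"
completion = "Vehicle B is at fault.</final_verdict>"  # illustrative model output

full_output = prefill + completion
verdict = full_output[len("<final_verdict>"):-len("</final_verdict>")]
print(verdict)  # Vehicle B is at fault.

# The same trick forces JSON: pre-fill with "{" and the model must continue
# with valid object members.
```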

&lt;h3&gt;
  
  
  Extended Thinking
&lt;/h3&gt;

&lt;p&gt;Available in Claude 3.7+. Claude's reasoning process appears in &lt;code&gt;&amp;lt;thinking&amp;gt;&lt;/code&gt; tags.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;⚠️ Warning:&lt;/strong&gt; Treat Extended Thinking as a &lt;strong&gt;diagnostic tool&lt;/strong&gt;, not a permanent crutch. Use it to identify where Claude struggles, then encode those reasoning steps as explicit instructions in the system prompt. That approach achieves the same quality without Extended Thinking — and uses fewer tokens.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Technique&lt;/th&gt;
&lt;th&gt;Effect&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Explicit task context&lt;/td&gt;
&lt;td&gt;Prevents off-base interpretations&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Static info in system prompt&lt;/td&gt;
&lt;td&gt;Maximizes prompt caching&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;XML tag structure&lt;/td&gt;
&lt;td&gt;Improves information retrieval accuracy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Specify processing order&lt;/td&gt;
&lt;td&gt;Mirrors human reasoning order&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Specify output format&lt;/td&gt;
&lt;td&gt;Simplifies app integration&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Few-shot examples&lt;/td&gt;
&lt;td&gt;Improves accuracy on hard cases&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pre-fill&lt;/td&gt;
&lt;td&gt;Forces output format&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Extended Thinking&lt;/td&gt;
&lt;td&gt;Visualizes reasoning for debugging&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Prompt engineering is an &lt;strong&gt;iterative empirical science&lt;/strong&gt;. Build test cases, find failure patterns, encode fixes into the system prompt — keep running this loop to reach production quality.&lt;/p&gt;




&lt;h2&gt;
  
  
  Full Video Transcript
&lt;/h2&gt;




&lt;h3&gt;
  
  
  Opening (0:00)
&lt;/h3&gt;

&lt;p&gt;Hey everyone, thank you for joining us today for Prompting 101. My name is Hannah, I'm part of the Applied AI Team at Anthropic. With me is Christian, also from the Applied AI Team. Today we're going to take you through some prompting best practices using a real-world scenario and build up a prompt together.&lt;/p&gt;

&lt;p&gt;Prompt Engineering is the practice of writing clear instructions for the model, giving the model the context it needs to complete a task, and thinking through how to arrange that information for the best result. The best way to learn this is just to practice doing it.&lt;/p&gt;

&lt;p&gt;We're using an example inspired by a real customer we worked with — analyzing images and having Claude make a judgment about what it finds there. I don't speak the language this content is in, but luckily Christian and Claude both do.&lt;/p&gt;




&lt;h3&gt;
  
  
  Scenario Introduction (1:00)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Christian:&lt;/strong&gt; Imagine you're working for an auto insurance company, dealing with car insurance claims daily. You have two pieces of information: a car accident report form in Swedish (17 checkboxes detailing what happened) and a hand-drawn sketch of how the accident occurred. We want to pass these to Claude and determine who is at fault.&lt;/p&gt;

&lt;p&gt;Let's start by just throwing them into the console and seeing what happens.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Console settings:&lt;/strong&gt; claude-sonnet (latest model), temperature 0, large max token budget.&lt;/p&gt;

&lt;p&gt;First prompt: "This is an accident report form. Determine what happened and who is at fault."&lt;/p&gt;

&lt;p&gt;Result: Claude thinks it's a ski accident on "Chapman Gotham Street" — apparently a very common street name in Sweden. You can understand this: the prompt does nothing to set the stage about what's actually taking place. Claude's first guess isn't terrible, but we have a lot of intuition we can bake in.&lt;/p&gt;




&lt;h3&gt;
  
  
  Best Practices: Prompt Structure (4:00)
&lt;/h3&gt;

&lt;p&gt;Prompt engineering is an iterative empirical science. We could start from a test case where Claude needs to understand it's in a vehicular environment, not a skiing one, and iteratively build the prompt from there.&lt;/p&gt;

&lt;p&gt;Anthropic's recommended structure:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Task description&lt;/strong&gt; — tell Claude what it's here to do, its role, what task it's trying to accomplish&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dynamic content&lt;/strong&gt; — in this case, the images; may also be information retrieved from another system&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Detailed instructions&lt;/strong&gt; — almost like a step-by-step list of how we want Claude to tackle the reasoning&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Examples&lt;/strong&gt; — here's an example piece of content; here's how you should respond&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Repeat critical instructions&lt;/strong&gt; — review the information with Claude, emphasize things that are extra critical, then tell Claude to go ahead&lt;/li&gt;
&lt;/ol&gt;
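&lt;p&gt;The five-part structure above can be sketched as a simple assembly function. The section contents here are illustrative placeholders for the insurance scenario, not text from the talk; XML-style tags are used as delimiters:&lt;/p&gt;

```python
# Assembles a prompt in Anthropic's recommended order. All section
# text is a hypothetical placeholder for the insurance scenario.

def build_prompt(form_description: str, examples: list[str]) -> str:
    parts = [
        # 1. Task description: role and goal
        "You assist a human claims adjuster reviewing Swedish "
        "car accident report forms.",
        # 2. Dynamic content, wrapped in clear delimiters
        f"<form>{form_description}</form>",
        # 3. Detailed step-by-step instructions
        "First list every checked box on the form. "
        "Then interpret the sketch using that list.",
        # 4. Few-shot examples of tricky, human-labeled cases
        "<examples>" + "\n".join(examples) + "</examples>",
        # 5. Repeat the critical instruction last
        "Remember: do not guess if you are not fully confident.",
    ]
    return "\n\n".join(parts)

prompt = build_prompt("Vehicle A: box 1. Vehicle B: box 12.", [])
```

&lt;p&gt;The ordering matters as much as the content: static context first, then the dynamic material, then instructions and examples, with the critical rule repeated at the end.&lt;/p&gt;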




&lt;h3&gt;
  
  
  Building V2 (6:00)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Christian:&lt;/strong&gt; Starting with task context. We want to give clearer instructions and make sure Claude understands what we're doing. We also add tone: Claude should be factual and confident. If Claude can understand what it's looking at, we want that assessment to be as clear and confident as possible.&lt;/p&gt;

&lt;p&gt;Back in the console, V2 explicitly labels the data — this is a car accident report form with Vehicle A and Vehicle B in left and right columns. The system prompt specifies that this AI system assists a human claims adjuster reviewing Swedish car accident report forms. It should not make an assessment if it's not fully confident.&lt;/p&gt;

&lt;p&gt;Running it: Claude now correctly identifies this as a car accident — not skiing. It picks up that Vehicle A checked box 1 and Vehicle B checked box 12. Scrolling down, Claude still says information is missing for a fully confident determination. This is great — it's behaving as instructed. But Claude is still missing a lot of information about what the form actually entails.&lt;/p&gt;




&lt;h3&gt;
  
  
  V3: Background Information and Structure (9:00)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Hannah:&lt;/strong&gt; Next we add background data, documents, and images. We actually know a lot about this form — it will be the same every single time. This is a great type of information to put into the system prompt, and a great candidate for prompt caching since it will always be the same. This helps Claude spend less time figuring out what the form is each time.&lt;/p&gt;

&lt;p&gt;Claude loves structure and organization. XML tags let you label what's inside them — &lt;code&gt;&amp;lt;user_preferences&amp;gt;&lt;/code&gt; tells Claude that everything wrapped in those tags relates to user preferences. Claude understands all types of delimiters; we prefer XML because its boundaries are clear and it's token-efficient.&lt;/p&gt;

&lt;p&gt;In V3, we tell Claude everything about the form: it's a Swedish car accident form, it'll have this title, two columns representing different vehicles, and what each of the 17 rows means. We also tell it that humans fill this out — so it won't be perfect, people might put a circle, might scribble, might not put an X in the box.&lt;/p&gt;

&lt;p&gt;Running it: Claude spends less time narrating the form to us because it already knows what it is. It gives us a list of what's checked, and Claude now confidently says Vehicle B is at fault based on the form and the sketch.&lt;/p&gt;




&lt;h3&gt;
  
  
  V4: Detailed Instructions (14:00)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Hannah:&lt;/strong&gt; One thing we really highlight: examples. Few-shot examples are a really powerful mechanism for steering Claude. You can bake in concrete accidents that were tricky for Claude to get right — with human-labeled correct conclusions. You can include visual examples using Base64-encoded images. This is how you push the limits of your LLM application. If you're building this for an insurance company, you might have tens, maybe hundreds of examples of difficult edge cases.&lt;/p&gt;

&lt;p&gt;Conversation history: not used here, but for user-facing apps with long history, this is the right place to bring that in.&lt;/p&gt;

&lt;p&gt;Next step: a reminder of the immediate task and important guidelines. Preventing hallucinations — we don't want Claude to invent details it's not finding in the data. If the sketch is unintelligible and even a human couldn't figure it out, we want Claude to be able to say that.&lt;/p&gt;

&lt;p&gt;In V4, we keep the system prompt the same and add a detailed task list. The order in which Claude analyzes this information is very important. You'd probably not look at the drawing first — it's just boxes and lines without context. But if you read the form first, understand we're talking about a car accident and see checkboxes indicating what vehicles were doing, then you know how to interpret the drawing.&lt;/p&gt;

&lt;p&gt;So: first, carefully examine the form, make sure you can tell what boxes are checked, make a list. Then move to the sketch, informed by what you learned.&lt;/p&gt;

&lt;p&gt;Running it: Claude now very carefully examines each and every box. It gives structured XML output: form analysis, accident summary, sketch analysis. It continues to say Vehicle B appears to be clearly at fault. With more complicated drawings and less clarity in forms, this step-by-step thinking is really impactful.&lt;/p&gt;




&lt;h3&gt;
  
  
  V5: Output Format and Pre-fill (19:00)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Christian:&lt;/strong&gt; Final step: we keep the system prompt the same and add important guidelines. The summary should be clear, concise, and accurate, and nothing should get in the way of Claude's assessment. Then output formatting: wrap the final verdict in &lt;code&gt;&amp;lt;final_verdict&amp;gt;&lt;/code&gt; XML tags so the application can extract just the verdict.&lt;/p&gt;

&lt;p&gt;Running it: much more succinct. At the end, the output is wrapped in &lt;code&gt;&amp;lt;final_verdict&amp;gt;&lt;/code&gt; tags. We've gone from a skiing accident, to an uncertain answer, to a cautious but correct assessment, and now to a strictly formatted, confident output we can build a real application around.&lt;/p&gt;
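&lt;p&gt;On the application side, extracting the tagged verdict can be as simple as a regex over the response. A sketch, with a made-up response string:&lt;/p&gt;

```python
import re

# Pull the final verdict out of a model response that wraps it in
# <final_verdict> tags, ignoring the reasoning text around it.
def extract_verdict(response: str):
    match = re.search(r"<final_verdict>(.*?)</final_verdict>",
                      response, flags=re.DOTALL)
    return match.group(1).strip() if match else None

response = (
    "Based on the form and the sketch...\n"
    "<final_verdict>Vehicle B is at fault.</final_verdict>"
)
print(extract_verdict(response))  # Vehicle B is at fault.
```

&lt;p&gt;Returning &lt;code&gt;None&lt;/code&gt; when the tags are missing gives the application a clean signal that the model failed to follow the format.&lt;/p&gt;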

&lt;p&gt;&lt;strong&gt;Christian:&lt;/strong&gt; Another key way to shape output is pre-filled responses. If you want structured JSON output, you just add that Claude needs to begin its output with a certain format. In the Assistant field, write &lt;code&gt;&amp;lt;final_verdict&amp;gt;&lt;/code&gt; or &lt;code&gt;{&lt;/code&gt; — Claude will continue from where you left off. This gives you greater control over output formatting without the preamble.&lt;/p&gt;
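&lt;p&gt;Concretely, a pre-filled request is just a messages list whose last turn is a partial assistant message; the model continues from that point. This is a payload sketch, not an exact API reference — the model name and field values are illustrative:&lt;/p&gt;

```python
# Sketch of a pre-filled request body: the conversation ends with a
# partial assistant turn, so the model continues from "{" and emits
# JSON with no preamble. The model name is a placeholder.
request_body = {
    "model": "claude-sonnet-latest",
    "max_tokens": 1024,
    "messages": [
        {"role": "user",
         "content": "Assess the attached accident report."},
        # Pre-fill: the model's reply starts exactly here.
        {"role": "assistant", "content": "{"},
    ],
}

assert request_body["messages"][-1]["role"] == "assistant"
```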

&lt;p&gt;Finally: Extended Thinking, available in Claude 3.7+. You can use it as a crutch for prompt engineering — enable it to make sure Claude has time to think. The beauty is that you can analyze the thinking transcript to understand how Claude works through the data, then bake those steps into the system prompt itself. That's more token-efficient, and it's key to making your system prompt a lot better.&lt;/p&gt;

&lt;p&gt;Thank you everyone for coming. We'll be around all day for questions. Don't miss "Prompting for Agents" and the Claude plays Pokémon demo!&lt;/p&gt;

</description>
      <category>claude</category>
      <category>promptengineering</category>
      <category>anthropic</category>
      <category>ai</category>
    </item>
    <item>
      <title>OpenAI Codex Desktop Complete Guide — Mastering Skills, Plugins &amp; Automations</title>
      <dc:creator>naoki_JPN</dc:creator>
      <pubDate>Thu, 30 Apr 2026 03:38:50 +0000</pubDate>
      <link>https://dev.to/bokuno_log/openai-codex-desktop-complete-guide-mastering-skills-plugins-automations-45k5</link>
      <guid>https://dev.to/bokuno_log/openai-codex-desktop-complete-guide-mastering-skills-plugins-automations-45k5</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; This article is an English translation of a Japanese summary of a ~103-minute video by &lt;a href="https://x.com/DeRonin_" rel="noopener noreferrer"&gt;@DeRonin_&lt;/a&gt; on X. Original video: &lt;a href="https://twitter.com/DeRonin_/status/2048823420977119727" rel="noopener noreferrer"&gt;https://twitter.com/DeRonin_/status/2048823420977119727&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;The OpenAI Codex desktop app is a comprehensive AI agent platform that goes far beyond coding assistance — covering design, document creation, research, and automation. This article summarizes the full 103-minute guide video.&lt;/p&gt;




&lt;h2&gt;
  
  
  Core Features of the Codex Desktop App
&lt;/h2&gt;

&lt;p&gt;Here are the key features introduced at the start of the video.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3quw9szx00d3ys0uzib2.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3quw9szx00d3ys0uzib2.jpg" alt="Codex app feature overview and demo screen (0:00)"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Project Management and File Organization
&lt;/h3&gt;

&lt;p&gt;Codex manages chats in "project" units, each linked 1:1 to a local folder on your computer. Files generated through chat are automatically saved to an &lt;code&gt;outputs/&lt;/code&gt; folder inside the project directory, and any file in that folder can be referenced with &lt;code&gt;@filename&lt;/code&gt;. You can open the folder instantly via the "Open in Finder" button.&lt;/p&gt;

&lt;h3&gt;
  
  
  Parallel Multitasking
&lt;/h3&gt;

&lt;p&gt;You can run multiple chat threads simultaneously. Even while one agent is working, you can start new tasks in another chat. A blue dot notification appears when a task completes, so you can check results and give the next instruction right away.&lt;/p&gt;

&lt;h3&gt;
  
  
  Skills and Plugins
&lt;/h3&gt;

&lt;p&gt;Skills are "reusable recipes"; plugins are "installable packages that bring those recipes into Codex." Hundreds of pre-built plugins exist for services like Google Calendar, Gmail, Figma, and Remotion. You can also combine external APIs with the &lt;code&gt;skill creator&lt;/code&gt; to build your own custom skills. Once created, skills can be invoked in future sessions with &lt;code&gt;/skill-name&lt;/code&gt; or &lt;code&gt;@skill-name&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Automations
&lt;/h3&gt;

&lt;p&gt;Set up recurring tasks with natural language — for example, "Every Friday at 4am, summarize my weekly calendar and send it via email." You can view, test, and edit automations from the Automations tab.&lt;/p&gt;

&lt;h3&gt;
  
  
  Computer Control
&lt;/h3&gt;

&lt;p&gt;The agent literally controls your mouse and keyboard. This enables working with GUI apps that have no API, such as building apps in Xcode or navigating a browser.&lt;/p&gt;

&lt;h3&gt;
  
  
  In-App Image Generation
&lt;/h3&gt;

&lt;p&gt;Generate images from prompts and use them directly in your workflow. The video demonstrated generating product images for a shoe brand and 10 iOS app icon variations. Transparent background generation is also supported.&lt;/p&gt;

&lt;h3&gt;
  
  
  Steer Feature
&lt;/h3&gt;

&lt;p&gt;Even while an agent is processing, you can paste text or images and immediately redirect it ("fix this part"). Normally prompts queue up and wait their turn, but the "Steer" button lets you interrupt instantly.&lt;/p&gt;

&lt;h3&gt;
  
  
  Terminal Integration (Claude Code)
&lt;/h3&gt;

&lt;p&gt;For design-heavy tasks, you can launch Claude Code from the terminal with &lt;code&gt;claude --dangerously-skip-permissions&lt;/code&gt;. In the video, Claude Code was used to finalize landing pages and slide decks when Codex's design precision reached its limits.&lt;/p&gt;

&lt;h3&gt;
  
  
  Canva Export
&lt;/h3&gt;

&lt;p&gt;Created PowerPoint files can be opened in Canva with one click for manual finishing of the last 5–10%.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Difference Between Skills and Plugins
&lt;/h2&gt;

&lt;p&gt;The video's mid-section used the Excalidraw skill to auto-generate a structure diagram.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7zrbxc78bjikcett8ies.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7zrbxc78bjikcett8ies.jpg" alt="Skill and Plugin structure diagrammed with Excalidraw (10:00)"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Skill&lt;/th&gt;
&lt;th&gt;Plugin&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Definition&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A reusable workflow package for specific tasks&lt;/td&gt;
&lt;td&gt;A unit that installs additional functionality into Codex&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Role&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Bundles instructions, resources, and scripts to extend Codex's task-handling ability&lt;/td&gt;
&lt;td&gt;Bundles skills, apps, MCP Servers, and integrations&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Purpose&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A recipe that ensures Codex executes workflows reliably&lt;/td&gt;
&lt;td&gt;Provides access to connected systems and packaged tools&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Simple way to remember:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Skill = reusable recipe&lt;/li&gt;
&lt;li&gt;Plugin = installable package that brings that recipe into Codex&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Design Tool Integration (Paper / Figma)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffc9bm3t2qyarauf2616r.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffc9bm3t2qyarauf2616r.jpg" alt="Landing page being auto-generated in Paper Alpha (30:00)"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Codex integrates with &lt;strong&gt;Paper (Alpha)&lt;/strong&gt;, a Figma-like design tool.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Demo flow:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Prompt: "Using the new Noo Shoo company logo image, create a landing page directly in Paper"&lt;/li&gt;
&lt;li&gt;Codex confirms Paper MCP actions and selects a transparent hero image&lt;/li&gt;
&lt;li&gt;Codex auto-decides design direction: editorial-tech, warm near-black neutral, cyan accents&lt;/li&gt;
&lt;li&gt;Auto-builds 4 sections: Hero, Performance Strip, Product Story, CTA/Footer&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Paper is a design tool built for AI agent collaboration, offering more intuitive operation than direct Figma editing.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Automations
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxgvdbib3cezucxt4u6yo.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxgvdbib3cezucxt4u6yo.jpg" alt="Automation settings screen (35:00)"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Automations can be created just by typing "do X every week" in chat. The video demonstrated two:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Weekly Calendar Summary&lt;/strong&gt;&lt;br&gt;
After connecting Google Calendar and Gmail plugins, just say "Every Friday at 4am, summarize this week's schedule and send it via email." Done. You can immediately see when the next run is scheduled.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Monthly YouTube Report&lt;/strong&gt;&lt;br&gt;
After creating a YouTube Researcher skill with the SuperData API, instruct: "On the last day of each month, use that skill to analyze this month's videos and compile them into a Word document." The resulting report includes hook analysis and a views-ranked table — delivered automatically.&lt;/p&gt;


&lt;h2&gt;
  
  
  Part 2 Highlights: 6 Projects in Parallel
&lt;/h2&gt;

&lt;p&gt;In the second half of the video, using "Chorus (an AI agent learning app)" as the subject, the following 6 projects were created simultaneously:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Task&lt;/th&gt;
&lt;th&gt;Tools Used&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;iOS App (design + implementation)&lt;/td&gt;
&lt;td&gt;Swift, Xcode, Supabase, mobile design skill&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Web Landing Page&lt;/td&gt;
&lt;td&gt;Tally, React, Claude Code, Vercel&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Launch Video&lt;/td&gt;
&lt;td&gt;Remotion plugin, Claude Code&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Investor Deck&lt;/td&gt;
&lt;td&gt;PowerPoint skill, Claude Code, Canva&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;X Post Automation&lt;/td&gt;
&lt;td&gt;Typefully skill&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Project Plan&lt;/td&gt;
&lt;td&gt;Markdown (checklist)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The key: after giving instructions to one task, move on to the next without waiting. Stacking tasks serially like this becomes effective multitasking.&lt;/p&gt;


&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;The Codex desktop app is a comprehensive AI agent platform covering &lt;strong&gt;not just coding, but design, documents, research, and automation&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Skills + Plugins&lt;/strong&gt; — automate any workflow&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automations&lt;/strong&gt; — fully automate recurring research and report creation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Design tool integration&lt;/strong&gt; — applicable to non-engineer workflows&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multitasking&lt;/strong&gt; (give instructions, then move on) is the core skill of the AI era&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Codex + Claude Code combination&lt;/strong&gt;: Codex for general orchestration, Claude Code for design-precision tasks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The ability to choose models and processing load based on task size and precision requirements is another strength of Codex.&lt;/p&gt;


&lt;h2&gt;
  
  
  Detailed Video Guide
&lt;/h2&gt;


&lt;h3&gt;
  
  
  Part 1 — Mastering the Basics
&lt;/h3&gt;
&lt;h4&gt;
  
  
  Download and Project Management
&lt;/h4&gt;

&lt;p&gt;Search "Codex app download" in your browser and download from chatgpt.com. The initial screen looks like ChatGPT's chat interface, but the internals are completely different.&lt;/p&gt;

&lt;p&gt;Codex's standout feature is &lt;strong&gt;project management linked to local folders&lt;/strong&gt;. Before starting a chat, you specify which folder to work in. That folder becomes the "project," and all files created by the agent are auto-saved to its &lt;code&gt;outputs/&lt;/code&gt; folder.&lt;/p&gt;

&lt;p&gt;From the project side panel you can open the folder in Finder or reference files with &lt;code&gt;@filename&lt;/code&gt;. Even with 30+ projects, Command+G search lets you find any chat instantly by name or content.&lt;/p&gt;

&lt;p&gt;In permission settings, "Full Access" mode lets the agent work without approval prompts. The recommended defaults are GPT-5.4 model and Extra High processing load.&lt;/p&gt;


&lt;h4&gt;
  
  
  Using Skills and Plugins
&lt;/h4&gt;

&lt;p&gt;Skills and plugins are often confused, but the essential difference is:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Skills&lt;/strong&gt; are "recipes" for the agent — step-by-step instructions for executing specific tasks. &lt;strong&gt;Plugins&lt;/strong&gt; are those recipes packaged into installable units. Think of it as "plugin = container for skills."&lt;/p&gt;

&lt;p&gt;To explore what a new plugin can do, the fastest approach is to open a new chat and ask: &lt;code&gt;@Figma tell me everything you can do with this plugin&lt;/code&gt;. Clicking the ▼ (caret) on the response shows the thinking process too.&lt;/p&gt;


&lt;h4&gt;
  
  
  Practical Demo: Automating with Google Calendar + Gmail
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7zrbxc78bjikcett8ies.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7zrbxc78bjikcett8ies.jpg" alt="Google Calendar integration and automation setup (15:00)"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Installing the Google Calendar plugin is as simple as selecting "Google Calendar" from Plugins and signing in via browser.&lt;/p&gt;

&lt;p&gt;After connecting, these operations complete in a single conversation:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;"List all my events this week" → All calendar events displayed&lt;/li&gt;
&lt;li&gt;"Send me a weekly summary by email" → Sent via Gmail immediately&lt;/li&gt;
&lt;li&gt;"Set this as an automation every Friday at 4am" → Registered as a weekly task&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The Automations tab shows next run time, status, and a test run button. After creation, you can edit with natural language like "always use the Gmail skill."&lt;/p&gt;


&lt;h4&gt;
  
  
  Generating Designs with Figma and Paper MCP
&lt;/h4&gt;

&lt;p&gt;The Figma plugin's main use is converting existing Figma boards to code. It's not suited for the reverse direction (having AI generate designs and place them in Figma).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Paper&lt;/strong&gt; (Alpha) fills that role. It's a design tool built for AI agent collaboration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"Using the new shoe PNG (no background), create a landing page in Paper"
↓
Codex calls Paper MCP and decides design direction
↓
Auto-builds 4 sections: Hero, Performance Strip, Product Story, CTA
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Setting chat to "mini-window" mode lets you float Codex minimized to the side while viewing Paper.&lt;/p&gt;

&lt;p&gt;Codex also has a &lt;strong&gt;Steer&lt;/strong&gt; feature. Normally prompts queue up while AI is working, but pressing "Steer" lets you interrupt instantly. You can paste a screenshot and say "this part is overlapping, fix it" — and the agent course-corrects mid-task.&lt;/p&gt;




&lt;h4&gt;
  
  
  Building a Custom Skill: YouTube Researcher
&lt;/h4&gt;

&lt;p&gt;By combining external APIs, you can add capabilities Codex doesn't have natively. Here's the process using a YouTube transcript-fetching skill as an example:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1: Find an API&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Ask Codex "give me the top 5 APIs for getting YouTube transcripts" — it suggests SuperData, Transcript API, YouTube Transcript.io, etc. SuperData is free up to 100 requests/month.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2: Create the skill&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In a new chat, enter:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Use skill creator to build a skill that fetches and summarizes
the latest 10 video transcripts from a specific channel using SuperData API.
API key: [paste here]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Typing &lt;code&gt;skill creator&lt;/code&gt; activates a skill-creation focused mode.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3: Use the skill&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;After creation, open a new chat and type "YouTube Researcher" to use it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Research Riley Brown's latest 10 YouTube videos,
get transcripts and compile them into a document.
Include which videos performed well, with hook (intro) analysis.
Add thumbnails too.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The resulting report includes hook win/loss analysis — "Claude is taking over" (urgency, big market shift) and "Claude Code Leak" rated highly, while vibe-coding videos showed low performance.&lt;/p&gt;

&lt;p&gt;Afterwards, it was automated: "On the last day of each month, use this skill to analyze this month's videos and auto-create a Word report."&lt;/p&gt;




&lt;h3&gt;
  
  
  Part 2 — Building 6 Projects in Parallel
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; From here, "Chorus (an AI agent learning app)" is used as the subject to demo building 6 projects simultaneously. The core is "give instructions, then move on to the next" — serial task accumulation.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h4&gt;
  
  
  How to Set Up a Project Plan
&lt;/h4&gt;

&lt;p&gt;First create a "My New Business" folder and start a new project. The plan is created from chat as a Markdown file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Attach a screenshot and say:

"Looking at this, create a checklist-format project plan.
 Items: iOS app, web landing page,
 mobile app design, launch video, investor deck,
 X post automation — 6 items total.
 Include the app idea at the top."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Chorus app concept: A platform for learning about AI agents. An iOS app providing tool comparisons, a skills library (copy-paste ready), and learning content.&lt;/p&gt;




&lt;h4&gt;
  
  
  iOS App: From Design to TestFlight
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Creating the screen design skill&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Just paste the instructions exported from claude.ai/design's new design tool into Codex and say "create a mobile design skill that can do the same thing" — your custom skill is complete.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"Using the mobile design skill, create screens for the Chorus app
 in basic Apple style"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The result shows a prototype link with a 4-tab mockup: Learn, Platforms, Skills, Saved.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Building in Xcode&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight swift"&gt;&lt;code&gt;&lt;span class="s"&gt;"Create a Swift mobile app called Chorus.
 For now just display 'Hello, this is Chorus' in the center of the screen.
 When done, open the Xcode project."&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pressing "Play" in Xcode + iOS Simulator (or real device) reflects the latest build each time. After integrating the screen designs, connect Supabase.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Supabase Connection and Auth&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Supabase has become a de facto standard database for AI agent workflows. After configuring MCP and restarting Codex, the connection is picked up. Post-restart, say "create all tables once connected" — skill categories, platforms, skills, and saved items tables are auto-generated.&lt;/p&gt;

&lt;p&gt;Authentication was implemented with email + password (Google Sign-In was attempted first, but Supabase's native email auth was the fastest path). Turn off email confirmation in Supabase and you can sign in immediately.&lt;/p&gt;

&lt;p&gt;The video walkthrough finished with an upload to TestFlight.&lt;/p&gt;




&lt;h4&gt;
  
  
  Web Landing Page: Tally + React + Vercel
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Preparing the form (tally.so)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Create a waitlist form in tally.so using a template with name and email fields. Copy the "embed code" when done.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Running as a React app&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;I'm using tally.so. Embed this form in the site
 and run it locally as a React app. We'll do design later.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Styling with Claude Code&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Since Codex struggles with design, call Claude Code from the terminal:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude &lt;span class="nt"&gt;--dangerously-skip-permissions&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"Forget all the styling on this page.
 Look at the Chorus app code and match the fonts and design.
 Keep the Tally embed as is.
 Minimal text, simple, conversion-focused."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Claude Code dramatically improves it in minutes. When done, "Deploy to Vercel and give me a public link" completes the process.&lt;/p&gt;




&lt;h4&gt;
  
  
  Launch Video: Remotion Plugin
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffc9bm3t2qyarauf2616r.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffc9bm3t2qyarauf2616r.jpg" alt="Creating motion graphics video with Remotion (55:00)"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Install the Remotion plugin and just type &lt;code&gt;@remotion&lt;/code&gt; in a new chat.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"Create a launch video for the Chorus app.
 As a test video: take the attached app screen screenshots,
 put them in iPhone mockups on a white background with animation.
 Get it running on localhost."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Opening &lt;code&gt;localhost:3031&lt;/code&gt; shows the timeline editor. Time is specified in &lt;strong&gt;seconds.frames&lt;/strong&gt; format (e.g., &lt;code&gt;2.20&lt;/code&gt; = 2 seconds 20 frames).&lt;/p&gt;
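&lt;p&gt;A quick sketch of how that notation maps to absolute frame numbers. This helper is not part of the Remotion plugin; it assumes a 30 fps composition (the actual rate depends on your composition settings):&lt;/p&gt;

```python
def to_frame(timestamp: str, fps: int = 30):
    """Convert a 'seconds.frames' timestamp (e.g. '2.20') to an absolute frame index."""
    seconds, _, frames = timestamp.partition(".")
    return int(seconds) * fps + int(frames or 0)

print(to_frame("2.20"))  # frame 80: 2 seconds * 30 fps + 20 frames
```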

&lt;p&gt;You can steer corrections at any point during processing. Display gridlines and pass coordinates to the agent (e.g., "X axis 1040, Y axis 540") for precise positioning.&lt;/p&gt;

&lt;p&gt;For elements that demand design precision (animations, color cards, cut quality), delegating to Claude Code produces dramatically better results. For BGM, just attach an MP3 file to the message and say "add this at 50% volume."&lt;/p&gt;




&lt;h4&gt;
  
  
  Investor Deck: Chat Fork and Canva Integration
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Fork the chat&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Right-click the mobile app chat and select "Fork into Local" to create a new chat inheriting the same context. Rename it "Investor Deck" and start working.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"Analyze the app's features, icon, and style,
 then create an investor slide deck with the same design.
 Use the PowerPoint skill.
 Research what investors want in April 2026 and match the style."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Refine with Claude Code&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude &lt;span class="nt"&gt;--dangerously-skip-permissions&lt;/span&gt;
&lt;span class="s2"&gt;"Look at this deck, reduce text, increase visuals.
 Add charts and diagrams for readability. Don't add more slides."&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Export to Canva&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A "Canva" icon appears next to the PowerPoint file. Click it and Canva opens for final touches. Animations can be added too.&lt;/p&gt;




&lt;h4&gt;
  
  
  X Post Automation: Typefully Skill
&lt;/h4&gt;

&lt;p&gt;Get an API key for Typefully (a scheduling tool for managing multiple X/Twitter accounts) and instruct:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"Research the Typefully API and create a skill for full control.
 Test with the Riley Brown account (identify with fruit emoji).
 API key: [paste here]"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After the skill is complete, automate it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"Set up an automation to create 3 X post drafts every morning.
 Use the Typefully control skill."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Final Results: All Tasks Summary
&lt;/h3&gt;

&lt;p&gt;Results achieved by the end of the video:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Task&lt;/th&gt;
&lt;th&gt;Result&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;iOS App&lt;/td&gt;
&lt;td&gt;Published to TestFlight (Learn, Platforms, Skills, Saved features)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Web Landing Page&lt;/td&gt;
&lt;td&gt;Live on Vercel, Tally form working&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Launch Video&lt;/td&gt;
&lt;td&gt;First draft complete with Remotion + Claude Code&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Investor Deck&lt;/td&gt;
&lt;td&gt;Exported to Canva, manually polished&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;X Post Automation&lt;/td&gt;
&lt;td&gt;3 daily drafts scheduled&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Project Plan&lt;/td&gt;
&lt;td&gt;All 6 items checked off&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Key lesson from the video:&lt;br&gt;
AI agents can take 1–2 hours per task. Instead of waiting for one to finish, give a new agent new instructions and move on, then repeat. Continuously stacking up tasks this way is the core of AI-era productivity.&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>openai</category>
      <category>codex</category>
      <category>ai</category>
      <category>automation</category>
    </item>
    <item>
      <title>How We Build Effective Agents — 3 Principles from Anthropic's Barry Zhang</title>
      <dc:creator>naoki_JPN</dc:creator>
      <pubDate>Tue, 28 Apr 2026 07:03:39 +0000</pubDate>
      <link>https://dev.to/bokuno_log/how-we-build-effective-agents-3-principles-from-anthropics-barry-zhang-gfo</link>
      <guid>https://dev.to/bokuno_log/how-we-build-effective-agents-3-principles-from-anthropics-barry-zhang-gfo</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Source:&lt;/strong&gt; This article summarizes the ~15-minute talk by Barry Zhang (Anthropic) posted by &lt;a href="https://x.com/Ronycoder/status/2048649743602237493" rel="noopener noreferrer"&gt;@Ronycoder&lt;/a&gt; on X. Presented at the "Agents at Work" event.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Barry Zhang at Anthropic co-authored the blog post "Building Effective Agents" with his colleague Erik Schluntz. In this talk, he goes deeper on three core ideas from that post and shares practical lessons on building AI agents in production.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft3kuw0w5dsovjvvflhre.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft3kuw0w5dsovjvvflhre.jpg" alt="How We Build Effective Agents - Barry Zhang / Anthropic" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Three core principles:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Don't build agents for everything&lt;/li&gt;
&lt;li&gt;Keep it simple&lt;/li&gt;
&lt;li&gt;Think like your agent&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The Evolution of AI Systems
&lt;/h2&gt;

&lt;p&gt;Barry opened with a recap of how AI systems have evolved.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6ezm6jh7ce3hhxpqe9ux.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6ezm6jh7ce3hhxpqe9ux.jpg" alt="Single-LLM Features → Workflows → Agents → ?" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Phase&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Single-LLM Features&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Summarization, classification, extraction — one LLM call, simple and complete&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Workflows&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Multiple LLM calls orchestrated by code in predefined control flows. Allows trading off cost vs. agency for better performance&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Agents&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;LLMs deciding their own trajectories, operating almost independently based on environment feedback&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Next phase (?)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;More general-purpose single agents, or multi-agent collaboration — still unknown&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The trend is clear: more agency means more capability and usefulness, but also higher cost, latency, and consequences of errors. This tension is the foundation for all three principles.&lt;/p&gt;

&lt;h2&gt;
  
  
  Principle 1: Don't Build Agents for Everything
&lt;/h2&gt;

&lt;p&gt;Agents are a way to scale complex and valuable tasks — not a drop-in upgrade for every use case. So when &lt;em&gt;should&lt;/em&gt; you build one? Barry presented a four-item checklist.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk7zqn30o6ovxcilkzi9a.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk7zqn30o6ovxcilkzi9a.jpg" alt="Idea 1: Don't build agents for everything" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Check 1: Task Complexity
&lt;/h3&gt;

&lt;p&gt;Agents thrive in &lt;strong&gt;ambiguous problem spaces&lt;/strong&gt;. If you can map out the entire decision tree fairly easily, just build that explicitly and optimize each node. It's more cost-effective and gives you more control.&lt;/p&gt;

&lt;h3&gt;
  
  
  Check 2: Task Value
&lt;/h3&gt;

&lt;p&gt;Agents consume a lot of tokens. If your budget per task is around 10 cents (e.g., a high-volume customer support system), that only affords 30–50 tool calls; use a workflow instead. On the other hand, if your first thought is "I don't care how many tokens I spend, I just want to get the task done," agents are the right call.&lt;/p&gt;
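&lt;p&gt;The arithmetic behind that rule of thumb, using an illustrative price that is an assumption here rather than any model's actual rate:&lt;/p&gt;

```python
# Assumed, illustrative pricing: $3 per million tokens.
price_per_token = 3.00 / 1_000_000

budget_usd = 0.10  # the 10-cents-per-task scenario
affordable_tokens = budget_usd / price_per_token
print(f"{affordable_tokens:,.0f} tokens")  # 33,333 tokens

# An agent re-reading a 10-20K-token context on every step exhausts
# that budget after only a few tool-call iterations, which is why a
# fixed workflow is the better fit at this price point.
tokens_per_step = 15_000  # assumed context size per inference step
print(int(affordable_tokens // tokens_per_step), "steps")  # 2 steps
```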

&lt;h3&gt;
  
  
  Check 3: De-risk Critical Capabilities
&lt;/h3&gt;

&lt;p&gt;Before running an agent, check for significant bottlenecks in its trajectory. For a coding agent: can it write good code, debug, and recover from errors? Gaps here aren't fatal but will multiply your cost and latency — reduce scope and try again.&lt;/p&gt;

&lt;h3&gt;
  
  
  Check 4: Cost of Error and Error Discovery
&lt;/h3&gt;

&lt;p&gt;If errors are high-stakes and hard to discover, it's difficult to trust an agent with autonomous actions. You can mitigate with read-only access or human-in-the-loop, but these also limit how well the agent scales.&lt;/p&gt;
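&lt;p&gt;One common shape for those mitigations is a thin gate in front of the tool layer: read-only tools run freely, while anything state-changing waits for a human. A minimal sketch; the tool names and the &lt;code&gt;approve&lt;/code&gt; callback are hypothetical:&lt;/p&gt;

```python
READ_ONLY_TOOLS = {"read_file", "search", "list_directory"}  # hypothetical tool names

def gated_call(tool_name, args, run_tool, approve):
    """Run read-only tools directly; route state-changing ones through a reviewer."""
    if tool_name in READ_ONLY_TOOLS:
        return run_tool(tool_name, args)
    if approve(tool_name, args):  # human-in-the-loop checkpoint
        return run_tool(tool_name, args)
    return f"Action '{tool_name}' was rejected by the reviewer."
```

&lt;p&gt;The trade-off from the talk shows up directly: every &lt;code&gt;approve&lt;/code&gt; call is a human interruption, which caps how far the agent can scale.&lt;/p&gt;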

&lt;h3&gt;
  
  
  Why Coding Is a Great Agent Use Case
&lt;/h3&gt;

&lt;p&gt;Barry walked through the checklist with coding as an example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Complexity&lt;/strong&gt;: Going from design doc to PR is clearly ambiguous and complex&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Value&lt;/strong&gt;: Good code is highly valuable&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Capability&lt;/strong&gt;: Models are already great at many parts of the coding workflow&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Verifiability&lt;/strong&gt;: Output is easily verified through unit tests and CI — this is probably why so many successful coding agents exist today&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Principle 2: Keep It Simple
&lt;/h2&gt;

&lt;p&gt;"Agents are models using tools in a loop."&lt;/p&gt;

&lt;p&gt;Barry defined agents with this single sentence and identified three components.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmeajo7y5awrpo0pf4gwr.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmeajo7y5awrpo0pf4gwr.jpg" alt="Agents are models using tools in a loop" width="800" height="450"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;env&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Environment&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;tools&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Tools&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;system_prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Goals, constraints, and how to act&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;action&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;system_prompt&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;action&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Environment&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The system the agent operates in (filesystem, browser, APIs, etc.)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Tools&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Interface for the agent to take actions and get feedback&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;System Prompt&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Defines the agent's goals, constraints, and ideal behavior&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;A coding agent, a search agent, and a Computer Use agent look completely different on the surface — but internally at Anthropic, they share &lt;strong&gt;almost the same code backbone&lt;/strong&gt;. Only the environment varies by use case; the real design decisions are which tools to offer and what the system prompt says.&lt;/p&gt;
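&lt;p&gt;The slide's pseudocode can be fleshed out into something runnable. Everything below is invented for illustration (the fake model, the toy environment); only the loop shape mirrors the talk:&lt;/p&gt;

```python
class Environment:
    """The system the agent operates in; here just a single state string."""
    def __init__(self):
        self.state = "start"

class Tools:
    """The interface through which the agent acts and gets feedback."""
    def __init__(self, env):
        self.env = env
    def run(self, action):
        self.env.state = f"observed result of: {action}"
        return self.env.state

def fake_llm(prompt):
    """Stand-in for a real model call; a production agent would hit an LLM API."""
    return "finish" if "observed" in prompt else "take_first_step"

env = Environment()
tools = Tools(env)
system_prompt = "Goals, constraints, and how to act. "

# The loop from the slide, with a termination condition so it halts.
for _ in range(10):
    action = fake_llm(system_prompt + env.state)
    if action == "finish":
        break
    env.state = tools.run(action)

print(env.state)  # observed result of: take_first_step
```

&lt;p&gt;Swapping the environment, the tool set, and the prompt is all it takes to turn this backbone into a coding, search, or Computer Use agent.&lt;/p&gt;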

&lt;h3&gt;
  
  
  Complexity Kills Iteration Speed
&lt;/h3&gt;

&lt;p&gt;The reason Barry emphasizes simplicity is empirical: any upfront complexity kills iteration speed. Build these three components first — the highest ROI comes from getting them right. Optimization comes later.&lt;/p&gt;

&lt;p&gt;Examples of optimization after the basics are solid:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Coding / Computer Use&lt;/strong&gt;: Cache the directory to reduce cost&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Search (many tools)&lt;/strong&gt;: Parallelize tool calls to reduce latency&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;All cases&lt;/strong&gt;: Present the agent's progress to gain user trust&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Principle 3: Think Like Your Agent
&lt;/h2&gt;

&lt;p&gt;"Put yourself in their &lt;del&gt;shoes&lt;/del&gt; context window"&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbaf6fdylzcbitgwwon5n.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbaf6fdylzcbitgwwon5n.jpg" alt="Put yourself in their context window — LLM agent loop diagram" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Agents can exhibit remarkably sophisticated behavior. But at each step, what the model is actually doing is &lt;strong&gt;running inference on a limited context of 10–20K tokens&lt;/strong&gt;. Everything the model knows about the current state of the world fits in that context window.&lt;/p&gt;

&lt;p&gt;Barry illustrated this with a Computer Use exercise:&lt;/p&gt;




&lt;p&gt;&lt;em&gt;"Imagine you are a Computer Use agent."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;You receive a static screenshot and a poorly written description. The only way you can affect the environment is through your tools. When you attempt a click, the wait for inference and tool execution is equivalent to &lt;strong&gt;closing your eyes for 3–5 seconds and using the computer in the dark&lt;/strong&gt;. Then your eyes open and you see another screenshot. Whatever you did could have worked, or you might have shut the computer down. You just don't know.&lt;/p&gt;




&lt;p&gt;After going through this (mildly uncomfortable) exercise, what the agent actually needs becomes obvious:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Screen resolution (so it knows where to click)&lt;/li&gt;
&lt;li&gt;Recommended actions and limitations (guardrails to avoid unnecessary exploration)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Using Claude to Evaluate Claude
&lt;/h3&gt;

&lt;p&gt;Barry shared a practical technique: throw the entire agent trajectory into Claude and ask:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Why do you think we made this decision here? Is there anything we could do to help you make better decisions?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You can also pass your system prompt directly to the model and ask "Is anything ambiguous? Can you follow this?" This shouldn't replace your own understanding of the context — but it gives you a much closer perspective on how the agent sees the world.&lt;/p&gt;
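&lt;p&gt;Mechanically this is just one prompt wrapping the whole trajectory. A sketch of building it; the trajectory text is a placeholder, and the actual API call is left as a comment so the example stays self-contained:&lt;/p&gt;

```python
def build_critique_prompt(trajectory: str):
    """Wrap a full agent trajectory in the two diagnostic questions from the talk."""
    return (
        "Below is the full trajectory of an agent run.\n\n"
        f"{trajectory}\n\n"
        "Why do you think we made this decision here? "
        "Is there anything we could do to help you make better decisions?"
    )

prompt = build_critique_prompt("[step 1] searched docs\n[step 2] clicked the wrong button")
# With the official anthropic SDK this would then be sent along the lines of:
# client.messages.create(model="...", max_tokens=1024,
#                        messages=[{"role": "user", "content": prompt}])
```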

&lt;h2&gt;
  
  
  Three Open Questions for the Future
&lt;/h2&gt;

&lt;p&gt;Barry closed with three things that are always on his mind as an AI engineer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Budget-aware agents&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Unlike workflows, we don't have good control over cost and latency for agents. Defining and enforcing budgets in terms of time, money, and tokens would unlock many more production use cases.&lt;/p&gt;
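&lt;p&gt;There is no standard mechanism for this yet, but a crude version can be bolted onto the basic loop by metering tokens and wall-clock time. All names here are hypothetical:&lt;/p&gt;

```python
import time

def run_with_budget(step, max_tokens=50_000, max_seconds=60.0):
    """Run a hypothetical agent `step` callable, which returns (tokens_used, done),
    until it finishes or one of the budgets is exhausted."""
    spent_tokens = 0
    start = time.monotonic()
    while True:
        tokens, done = step()
        spent_tokens += tokens
        if done:
            return "completed", spent_tokens
        if spent_tokens >= max_tokens:
            return "token_budget_exhausted", spent_tokens
        if time.monotonic() - start >= max_seconds:
            return "time_budget_exhausted", spent_tokens

# A fake step that never finishes, to show the token budget kicking in:
status, spent = run_with_budget(lambda: (20_000, False))
print(status, spent)  # token_budget_exhausted 60000
```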

&lt;p&gt;&lt;strong&gt;2. Self-evolving tools&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We're already using models to help iterate on tool descriptions. This should generalize into a meta-tool where agents design and improve their own tool ergonomics — making agents dramatically more general-purpose.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Multi-agent collaboration&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Barry is personally convinced we'll see significantly more multi-agent collaboration in production by end of 2026. The benefits are clear: parallelization, separation of concerns, sub-agents protecting the main agent's context window. But the open question is: how do agents actually communicate with each other? Moving from synchronous user-assistant turns to asynchronous communication and mutual recognition between agents is the next frontier.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Principle&lt;/th&gt;
&lt;th&gt;Key Point&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Don't build agents for everything&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Evaluate use cases on 4 dimensions: complexity, value, capability, error cost&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Keep it simple&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Start with just Environment + Tools + System Prompt, optimize later&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Think like your agent&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Step into the context window and experience the world as your agent does&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Getting agents to work reliably is still hard — but much of that difficulty comes from developers not understanding what the agent sees and doesn't see. These three principles are a practical starting point for bridging that gap.&lt;/p&gt;




&lt;h2&gt;
  
  
  Full Transcript
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;The following is a translated and lightly edited transcript of the talk. Some recognition errors may be present.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  [0:00–5:00] Part 1: The Evolution of Agents and the "Don't Build" Checklist
&lt;/h3&gt;

&lt;p&gt;My name is Barry, and today we're going to be talking about how we build effective agents. About two months ago, Eric and I wrote a blog post called "Building Effective Agents." In there, we shared some opinionated takes on what an agent is and isn't, and we gave some practical learnings that we've gained along the way. Today, I'd like to go deeper on three core ideas from the blog post and provide you with some personal musings at the end.&lt;/p&gt;

&lt;p&gt;Here are those ideas. First, don't build agents for everything. Second, keep it simple. And third, think like your agents.&lt;/p&gt;

&lt;p&gt;Let's first start with a recap of how we got here. Most of us probably started building very simple features — things like summarization, classification, extraction. Really simple things that felt like magic two to three years ago and have now become table stakes. Then, as we got more sophisticated and as products matured, we got more creative. One model call often wasn't enough. So we started orchestrating multiple model calls in predefined control flows. This basically gave us a way to trade off cost and agency for better performance, and we called these workflows. We believe this is the beginning of agentic systems.&lt;/p&gt;

&lt;p&gt;Now, models are even more capable. And we're seeing more and more domain-specific agents start to pop up in production. Unlike workflows, agents can decide their own trajectory and operate almost independently based on environment feedback. This is going to be our focus today.&lt;/p&gt;

&lt;p&gt;It's probably a little bit too early to name what the next phase of agentic systems is going to look like, especially in production. Single agents could become a lot more general purpose and more capable, or we could start to see collaboration and delegation in multi-agent settings. Regardless, the broad trend is that as we give these systems more agency, they become more useful and more capable — but as a result, the cost, latency, and consequences of errors also go up.&lt;/p&gt;

&lt;p&gt;That brings us to the first point. Don't build agents for everything. We think of agents as a way to scale complex and valuable tasks — they shouldn't be a drop-in upgrade for every use case.&lt;/p&gt;

&lt;p&gt;We talked a lot about workflows in the blog post because we really like them. They're a great, concrete way to deliver value today. So when should you build an agent? Here's our checklist.&lt;/p&gt;

&lt;p&gt;The first thing to consider is the complexity of your task. Agents really thrive in ambiguous problem spaces. If you can map out the entire decision tree pretty easily, just build that explicitly and optimize every node. It's a lot more cost-effective and gives you a lot more control.&lt;/p&gt;

&lt;p&gt;Next is the value of your task. Agents will cost you a lot of tokens, so the task really needs to justify the cost. If your budget per task is around 10 cents — for example, you're building a high-volume customer support system — that only affords you 30 to 50 tool calls. In that case, just use a workflow to solve the most common scenarios. You'll capture the majority of the value from there. On the other hand, if your first thought is "I don't care how many tokens I spend, I just want to get the task done" — please see me after the talk, the go-to-market team will love you.&lt;/p&gt;

&lt;p&gt;From there, we want to de-risk the critical capabilities. Make sure there are no significant bottlenecks in the agent's trajectory. If you're doing a coding agent, make sure it's able to write good code, debug, and recover from errors. If you do have gaps, they probably won't be fatal, but they will multiply your cost and latency. In that case, reduce the scope and try again.&lt;/p&gt;

&lt;p&gt;Finally, the last important thing to consider is the cost of error and error discovery. If your errors are going to be high-stakes and very hard to discover, it's going to be very difficult to trust the agent to take actions autonomously. You can mitigate this by limiting scope, using read-only access, or adding human-in-the-loop — but this will also limit how well you can scale.&lt;/p&gt;

&lt;p&gt;Let's see this checklist in action. Why is coding a great agent use case? First, going from a design doc to a PR is obviously a very ambiguous and complex task. Second, we know good code has a lot of value. Third, many of us already use Claude for coding, so we know it's great at many parts of the coding workflow.&lt;/p&gt;

&lt;h3&gt;
  
  
  [5:00–10:00] Part 2: Keep It Simple — The Three-Component Design
&lt;/h3&gt;

&lt;p&gt;And last, coding has this really nice property where the output is easily verifiable through unit tests and CI. That's probably why we're seeing so many creative and successful coding agents right now.&lt;/p&gt;

&lt;p&gt;Once you find a good use case for agents, this is the second core idea — keep it as simple as possible. Let me show you what I mean. This is what agents look like to us. They are models using tools in a loop. In this frame, three components define what an agent really looks like.&lt;/p&gt;

&lt;p&gt;First is the environment — the system that the agent is operating in. Then we have a set of tools, which offer an interface for the agent to take action and get feedback. Then we have the system prompt, which defines the goals, the constraints, and the ideal behavior for the agent to work in this environment. Then the model gets called in a loop — and that's agents.&lt;/p&gt;

&lt;p&gt;We've learned the hard way to keep this simple, because any complexity upfront is really going to kill iteration speed. Focusing on just these three basic components is going to give you by far the highest ROI. Optimizations can come later.&lt;/p&gt;

&lt;p&gt;There are three agent use cases that we've built for ourselves or our customers. They look very different on the product surface, very different in scope, very different in capability. But they share almost exactly the same backbone — almost the exact same code.&lt;/p&gt;

&lt;p&gt;The environment largely depends on your use case. So really, the only two design decisions are: what are the tools you want to offer to the agent, and what is the prompt you want to instruct your agent to follow.&lt;/p&gt;

&lt;p&gt;Once you've figured out these three basic components, you have a lot of optimization to do. For coding and computer use, you might want to cache the directory to reduce cost. For search, where you have a lot of tools, you can parallelize to reduce latency. And for almost all of these, we want to make sure to present the agent's progress in a way that gains user trust. But that's it. Keep it as simple as possible as you're iterating. Build these three components first, then optimize once the behavior is stable.&lt;/p&gt;

&lt;h3&gt;
  
  
  [10:00–14:46] Part 3: Think Like Your Agent &amp;amp; Open Questions
&lt;/h3&gt;

&lt;p&gt;All right, this is the last idea — think like your agent. I've seen a lot of builders, myself included, who develop agents from their own perspective and get confused when agents make a mistake. It seems counterintuitive to us. That's why we always recommend getting into the agent's context window.&lt;/p&gt;

&lt;p&gt;Agents can exhibit some really sophisticated behavior — they can look incredibly complex. But at each step, what the model is doing is just running inference on a very limited set of context. Everything the model knows about the current state of the world is explained in that 10 to 20K tokens. It's really helpful to limit ourselves to that context and see if it's actually sufficient and coherent. This gives you a much better understanding of how agents see the world and helps bridge the gap between our understanding and theirs.&lt;/p&gt;

&lt;p&gt;Imagine for a second that we're Computer Use agents now. What we're going to get is a static screenshot and a very poorly written description. We have tools and a task. We can think and reason all we want, but the only thing that's going to take effect in the environment is our tools.&lt;/p&gt;

&lt;p&gt;So we attempt a click without really seeing what's happening. While inference is running and tool execution is happening, this is basically equivalent to closing our eyes for three to five seconds and using the computer in the dark. Then we open our eyes and see another screenshot. Whatever we did could have worked — or we could have shut down the computer. We just don't know. It's a huge leap of faith, and the cycle starts again.&lt;/p&gt;

&lt;p&gt;I highly recommend trying to do a full task from the agent's perspective. It's a fascinating and only mildly uncomfortable experience. But once you go through it, it becomes very clear what the agent would have actually needed. It's crucial to know the screen resolution. It's helpful to have recommended actions and limitations — guardrails to avoid unnecessary exploration.&lt;/p&gt;

&lt;p&gt;Fortunately, we're building systems that speak our language. So we can just ask Claude to understand Claude. You can ask if any instructions in your system prompt are ambiguous. You can throw in a tool description and see whether the agent knows how to use the tool. And one thing we do quite frequently is throw the entire agent's trajectory into Claude and ask: "Why do you think we made this decision here? Is there anything we can do to help you make better decisions?"&lt;/p&gt;

&lt;p&gt;This shouldn't replace your own understanding of the context, but it will help you gain a much closer perspective on how the agent is seeing the world.&lt;/p&gt;

&lt;p&gt;All right, I've spent most of the talk on practical stuff. Let me indulge myself and share one slide of personal musings — my view on how this might evolve, and some open questions I think we need to answer together as AI engineers.&lt;/p&gt;

&lt;p&gt;First, I think we need to make agents a lot more budget-aware. Unlike workflows, we don't have great control over cost and latency for agents. Figuring this out will enable many more use cases by giving us the control we need to deploy them in production. The open question is: what's the best way to define and enforce a budget in terms of time, money, and tokens?&lt;/p&gt;

&lt;p&gt;Next is the concept of self-evolving tools. We're already using models to help iterate on tool descriptions. But this should generalize into a meta-tool where agents can design and improve their own tool ergonomics. This will make agents a lot more general-purpose as they adopt the tools they need for each use case.&lt;/p&gt;

&lt;p&gt;Finally — and I don't think this is a hot take anymore — I have a personal conviction that we'll see a lot more multi-agent collaboration in production by end of this year. They're well-parallelized, they have nice separation of concerns, and having sub-agents will really protect the main agent's context window. But the big open question is: how do these agents actually communicate with each other? We're currently in this very rigid frame of mostly synchronous user-assistant turns. How do we expand from there and build asynchronous communication? Roles that afford agents to communicate and recognize each other? I think that's going to be a big open question as we explore this multi-agent future.&lt;/p&gt;

&lt;p&gt;If you forget everything I said today, here are the three takeaways. First, don't build agents for everything. Second, if you do find a good use case, keep it as simple as possible for as long as possible. And finally, as you iterate, try to think like your agent: gain its perspective and help it do its job.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>llm</category>
      <category>anthropic</category>
    </item>
    <item>
      <title>The Complete Guide to OpenAI Codex Desktop: Skills, Plugins, Automations &amp; Parallel Multitasking</title>
      <dc:creator>naoki_JPN</dc:creator>
      <pubDate>Tue, 28 Apr 2026 05:32:09 +0000</pubDate>
      <link>https://dev.to/bokuno_log/the-complete-guide-to-openai-codex-desktop-skills-plugins-automations-parallel-multitasking-556p</link>
      <guid>https://dev.to/bokuno_log/the-complete-guide-to-openai-codex-desktop-skills-plugins-automations-parallel-multitasking-556p</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Source:&lt;/strong&gt; This article summarizes the ~103-minute video by &lt;a href="https://x.com/DeRonin_/status/2048823420977119727" rel="noopener noreferrer"&gt;@DeRonin_&lt;/a&gt;. All credit goes to the original creator.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;The OpenAI Codex desktop app is an AI agent platform that goes far beyond coding assistance, handling design, document creation, research, and automation in one place. This article is an English summary (translated from Japanese) of the complete ~103-minute guide video.&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Features of the Codex Desktop App
&lt;/h2&gt;

&lt;p&gt;Here are the features introduced in the video.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3quw9szx00d3ys0uzib2.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3quw9szx00d3ys0uzib2.jpg" alt="Codex app feature overview (around 0:00)" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Project Management &amp;amp; File Organization
&lt;/h3&gt;

&lt;p&gt;Codex manages chats in "project" units, each linked 1-to-1 with a local folder on your computer. Files created by the agent are automatically saved inside an &lt;code&gt;outputs/&lt;/code&gt; subfolder of the project folder, and any file in the folder can be referenced with &lt;code&gt;@filename&lt;/code&gt;. You can open the folder in Finder at any time from the side panel.&lt;/p&gt;

&lt;h3&gt;
  
  
  Parallel Multi-tasking
&lt;/h3&gt;

&lt;p&gt;Multiple chat threads can run simultaneously. While one agent is executing, you can start new tasks in separate chats. A blue dot notification appears when a task completes.&lt;/p&gt;

&lt;h3&gt;
  
  
  Skills &amp;amp; Plugins
&lt;/h3&gt;

&lt;p&gt;Skills are "reusable recipes." Plugins are installable packages that bring those recipes into Codex. Hundreds of pre-built plugins exist (Google Calendar, Gmail, Figma, Remotion, etc.), and you can create custom skills by combining external APIs with the &lt;code&gt;skill creator&lt;/code&gt; keyword. Skills are invoked with &lt;code&gt;/skill-name&lt;/code&gt; or &lt;code&gt;@skill-name&lt;/code&gt; in subsequent sessions.&lt;/p&gt;

&lt;h3&gt;
  
  
  Automations
&lt;/h3&gt;

&lt;p&gt;Set recurring tasks with plain language: "Every Friday at 4pm, summarize this week's calendar and email it to me." The Automations tab shows a list, allows test runs, and supports natural-language edits.&lt;/p&gt;

&lt;h3&gt;
  
  
  Computer Control
&lt;/h3&gt;

&lt;p&gt;The agent literally controls your mouse and keyboard. It can operate any GUI app — including Xcode builds and browser interactions — even without an API.&lt;/p&gt;

&lt;h3&gt;
  
  
  In-App Image Generation
&lt;/h3&gt;

&lt;p&gt;Generate images from prompts and use them directly in your workflow. The video demos generating product photos (for a futuristic shoe brand) and 10 iOS app icon options with transparent backgrounds.&lt;/p&gt;

&lt;h3&gt;
  
  
  Steer (Real-Time Steering)
&lt;/h3&gt;

&lt;p&gt;While the agent is running, you can paste text or a screenshot and redirect it immediately using the "Steer" button. Unlike normal queuing, this injects your prompt mid-execution.&lt;/p&gt;

&lt;h3&gt;
  
  
  Terminal Integration (Claude Code)
&lt;/h3&gt;

&lt;p&gt;For design-heavy tasks, launch Claude Code from the integrated terminal:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude &lt;span class="nt"&gt;--dangerously-skip-permissions&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The video shows switching to Claude Code when Codex's design quality hits its limits — landing pages and slide decks came out significantly better.&lt;/p&gt;

&lt;h3&gt;
  
  
  Canva Export
&lt;/h3&gt;

&lt;p&gt;Click the Canva icon next to any generated PowerPoint file to open it directly in Canva for final polish.&lt;/p&gt;




&lt;h2&gt;
  
  
  Skills vs. Plugins — The Difference
&lt;/h2&gt;

&lt;p&gt;The video used an Excalidraw skill to auto-generate this diagram:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7zrbxc78bjikcett8ies.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7zrbxc78bjikcett8ies.jpg" alt="Skills vs. Plugins diagram (around 10:00)" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Skill&lt;/th&gt;
&lt;th&gt;Plugin&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Definition&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Reusable workflow package for a specific task&lt;/td&gt;
&lt;td&gt;Installable unit that adds capabilities to Codex&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Role&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A recipe that tells Codex how to execute a workflow&lt;/td&gt;
&lt;td&gt;Bundles skills, apps, MCP Servers, and integrations&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Purpose&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Reliable execution recipe&lt;/td&gt;
&lt;td&gt;Provides access to connected systems and tools&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Simple way to remember: Skill = reusable recipe. Plugin = installable package that brings the recipe into Codex.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Design Tool Integration (Paper / Figma)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffc9bm3t2qyarauf2616r.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffc9bm3t2qyarauf2616r.jpg" alt="Paper (Alpha) auto-generating a landing page (around 30:00)" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Codex integrates with &lt;strong&gt;Paper (Alpha)&lt;/strong&gt;, a Figma-like design tool built specifically for AI agent workflows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Demo flow:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;"Using the Noo Shoo logo image, build a landing page directly on this Paper board"&lt;/li&gt;
&lt;li&gt;Codex calls Paper MCP, selects the transparent hero image&lt;/li&gt;
&lt;li&gt;Auto-decides design direction: editorial-tech, warm near-black neutral, cyan accent&lt;/li&gt;
&lt;li&gt;Builds 4 sections automatically: Hero, Performance Strip, Product Story, CTA/Footer&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Paper is designed specifically for AI agent collaboration. It's more intuitive than direct Figma editing for generative design tasks.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Automations in Practice
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxgvdbib3cezucxt4u6yo.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxgvdbib3cezucxt4u6yo.jpg" alt="Automations setup screen (around 35:00)" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Creating an automation is as simple as typing "do X every Y." The video demos two:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Weekly Calendar Summary&lt;/strong&gt;&lt;br&gt;
After connecting Google Calendar and Gmail plugins: "Every Friday at 4pm, summarize this week's events and email them to me." That's it — the automation is registered and the next scheduled run is visible immediately.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Monthly YouTube Report&lt;/strong&gt;&lt;br&gt;
After building a YouTube Researcher skill (SuperData API): "On the last day of every month, analyze that month's videos and generate a Word doc with hook analysis and a view-count table." Delivered automatically every month.&lt;/p&gt;


&lt;h2&gt;
  
  
  Part 2 Highlight: Building 6 Projects in Parallel
&lt;/h2&gt;

&lt;p&gt;In the second half of the video, six projects for the &lt;strong&gt;Chorus app&lt;/strong&gt; (an AI agent learning platform) were built simultaneously.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Task&lt;/th&gt;
&lt;th&gt;Tools Used&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;iOS App (design + implementation)&lt;/td&gt;
&lt;td&gt;Swift · Xcode · Supabase · Mobile Design Skill&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Web Landing Page&lt;/td&gt;
&lt;td&gt;Tally · React · Claude Code · Vercel&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Launch Video&lt;/td&gt;
&lt;td&gt;Remotion Plugin · Claude Code&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Investor Deck&lt;/td&gt;
&lt;td&gt;PowerPoint Skill · Claude Code · Canva&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;X Post Automation&lt;/td&gt;
&lt;td&gt;Typefully Skill&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Project Plan&lt;/td&gt;
&lt;td&gt;Markdown (checklist)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The key insight: once you send instructions to an agent, don't wait — move to the next task immediately. This "serial tasking" pattern is what makes AI-era multitasking work.&lt;/p&gt;


&lt;h2&gt;
  
  
  Detailed Video Guide
&lt;/h2&gt;


&lt;h3&gt;
  
  
  Part 1 — Mastering the Basics
&lt;/h3&gt;
&lt;h4&gt;
  
  
  Downloading &amp;amp; Project Management
&lt;/h4&gt;

&lt;p&gt;Search "Codex app download" in your browser and download from chatgpt.com. The first-time interface looks like ChatGPT, but the depth is entirely different.&lt;/p&gt;

&lt;p&gt;Codex's biggest feature is &lt;strong&gt;project management tied to local folders&lt;/strong&gt;. Before starting a chat, you specify which folder to work in. That folder becomes the "project," and all agent-created files are auto-saved to its &lt;code&gt;outputs/&lt;/code&gt; subfolder.&lt;/p&gt;

&lt;p&gt;From the side panel you can open the folder in Finder or reference files inside with &lt;code&gt;@filename&lt;/code&gt;. Even with 30+ projects, Command+G lets you search by chat name or content instantly.&lt;/p&gt;

&lt;p&gt;Set permissions to "Full Access" so the agent works without approval prompts. The video recommends the GPT-5.4 model with the Extra High effort level as defaults.&lt;/p&gt;


&lt;h4&gt;
  
  
  How to Use Skills and Plugins
&lt;/h4&gt;

&lt;p&gt;The core distinction:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Skills&lt;/strong&gt; are "recipes" — instructions telling the agent how to execute a specific task. &lt;strong&gt;Plugins&lt;/strong&gt; are installable packages that bring those recipes into Codex. Think of a plugin as "the container for a skill."&lt;/p&gt;

&lt;p&gt;To learn what a new plugin can do, open a new chat and ask: &lt;code&gt;@Figma tell me everything you can do with this plugin&lt;/code&gt;. Click the caret (▼) in the response to see the agent's reasoning.&lt;/p&gt;


&lt;h4&gt;
  
  
  Hands-On: Automating with Google Calendar + Gmail
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7zrbxc78bjikcett8ies.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7zrbxc78bjikcett8ies.jpg" alt="Google Calendar integration and automation setup (around 15:00)" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Installing the Google Calendar plugin takes seconds: Plugins → Google Calendar → sign in via browser.&lt;/p&gt;

&lt;p&gt;Once connected, this entire workflow happens in one conversation:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;"List this week's events for me" → Full calendar response&lt;/li&gt;
&lt;li&gt;"Email me a weekly summary" → Sent via Gmail immediately&lt;/li&gt;
&lt;li&gt;"Make this an automation for every Friday at 4pm" → Weekly task registered&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The Automations tab shows next run time, status, and a test button. You can edit automations later with natural language.&lt;/p&gt;


&lt;h4&gt;
  
  
  Generating Designs with Figma and Paper MCP
&lt;/h4&gt;

&lt;p&gt;Figma's main use case is converting an &lt;strong&gt;existing&lt;/strong&gt; Figma board into code. It's less suited for having the AI generate designs and place them into Figma.&lt;/p&gt;

&lt;p&gt;That's where &lt;strong&gt;Paper (Alpha)&lt;/strong&gt; comes in:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"Using the new shoe PNG (no background), create a landing page on the open Paper board"
↓
Codex calls Paper MCP, decides design direction
↓
Auto-builds: Hero · Performance Strip · Product Story · CTA
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Right-click a chat and select "Open in mini window" to float the chat sidebar while working in Paper.&lt;/p&gt;

&lt;p&gt;Codex also has a &lt;strong&gt;Steer&lt;/strong&gt; feature. Normally, typing a prompt while the agent is running queues it for later. Clicking "Steer" injects your message immediately — great for pasting a screenshot and saying "this button is overlapping, fix it while you work."&lt;/p&gt;




&lt;h4&gt;
  
  
  Building a Custom Skill: YouTube Researcher
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Step 1: Find an API&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Ask Codex: "Suggest the top 5 APIs for pulling YouTube transcripts." It returns SuperData, Transcript API, YouTube Transcript.io, and others; SuperData offers 100 free requests per month.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2: Create the skill&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In a new chat:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Use skill creator to build a skill that uses the SuperData API
to fetch and summarize the latest 10 videos from any YouTube channel.
API key: [paste here]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;skill creator&lt;/code&gt; keyword activates Codex's skill-building mode.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3: Use the skill&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Open a new chat and type &lt;code&gt;YouTube Researcher&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Look up Riley Brown's latest 10 YouTube videos,
pull the transcripts, and create a document.
Include which videos performed well, a hook analysis, and thumbnails.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The resulting report showed "Claude is taking over (high urgency, large market shift)" and "Claude Code Leak" as top-performing hooks. Vibe coding content underperformed. Then this was automated: "On the last day of every month, run this skill and create a Word report."&lt;/p&gt;




&lt;h3&gt;
  
  
  Part 2 — Building 6 Projects Simultaneously
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The following demos use "Chorus" (an AI agent learning iOS app) as the subject. The core idea: send instructions, then move to the next task without waiting.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h4&gt;
  
  
  Planning the Project
&lt;/h4&gt;

&lt;p&gt;Create a "My New Business" folder and start a new project. Create the plan as a Markdown file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[Attach screenshot]

"Create a checklist-style plan from this.
 Items: iOS app · Web landing page · Mobile app design ·
 Launch video · Investor deck · X post automation.
 Include the app idea at the top."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Chorus concept: A platform for learning about AI agents. Compare agent tools, access a copy-paste skill library, and learn fundamentals — all in an iOS app.&lt;/p&gt;




&lt;h4&gt;
  
  
  iOS App: From Design to TestFlight
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Screen Design Skill&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Export the workflow from claude.ai/design, paste it into Codex, and say "create a mobile design skill that does the same thing." That's all it takes to build the custom skill.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"Use the mobile design skill to create Chorus app screens
 in basic Apple style"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A prototype link appears showing Learn · Platforms · Skills · Saved in a 4-tab layout.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Xcode Build&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"Create a new Swift mobile app project called Chorus.
 For now, just display 'Hello, this is Chorus' in the center.
 Open the Xcode project when done."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Hit Play in Xcode and check each build in the iOS Simulator (or on a physical device). After integrating the screen designs, connect Supabase.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Supabase + Authentication&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Supabase has become the de facto database for AI-agent projects. After configuring the MCP server, restart Codex so the connection registers. Then: "If connected, create all tables." Tables for skill categories, platforms, skills, and saved items are generated automatically.&lt;/p&gt;

&lt;p&gt;Authentication was implemented with email + password (Google sign-in was initially tried, but Supabase's native email auth was the fastest path). Disable email confirmation in Supabase for instant login. The app eventually shipped to TestFlight.&lt;/p&gt;




&lt;h4&gt;
  
  
  Web Landing Page: Tally + React + Vercel
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Form setup (tally.so)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Create a waitlist form with name and email fields, then copy the embed code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Run as React app&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"I'm using tally.so. Embed this form in the site and
 run it locally as a React app. We'll design it later."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Styling with Claude Code&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Codex struggles with design, so bring in Claude Code from the terminal:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude &lt;span class="nt"&gt;--dangerously-skip-permissions&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"Forget the current page styling completely.
 Look at the Chorus app code and match its font and design.
 Keep the Tally embed. Minimal text, simple, conversion-focused."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Claude Code improves it dramatically in minutes. Then: "Deploy to Vercel and give me the public link."&lt;/p&gt;




&lt;h4&gt;
  
  
  Launch Video: Remotion Plugin
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffc9bm3t2qyarauf2616r.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffc9bm3t2qyarauf2616r.jpg" alt="Building a motion graphic launch video with Remotion (around 55:00)" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Install the Remotion plugin and type &lt;code&gt;@remotion&lt;/code&gt; in a new chat.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"Create a launch video for Chorus app.
 Start with a test: take app screenshots (attached),
 put them in iPhone mockups on a white background, and animate them.
 Run it locally."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;localhost:3031&lt;/code&gt; opens a timeline editor. Specify timing as &lt;strong&gt;seconds.frames&lt;/strong&gt; (e.g., &lt;code&gt;2.20&lt;/code&gt; = 2 seconds, 20 frames).&lt;/p&gt;
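&lt;p&gt;As a quick sanity check on that notation (an illustrative sketch, not from the video, and assuming a 30 fps composition), &lt;code&gt;seconds.frames&lt;/code&gt; converts to an absolute frame count like this:&lt;/p&gt;

```python
# Convert "seconds.frames" timing (e.g. "2.20" = 2 seconds + 20 frames)
# into an absolute frame count. fps=30 is an assumption for illustration;
# use your composition's actual frame rate.
def to_frames(timing: str, fps: int = 30) -> int:
    seconds, frames = timing.split(".")
    return int(seconds) * fps + int(frames)

print(to_frames("2.20"))  # 2 * 30 + 20 = 80
```

&lt;p&gt;So the &lt;code&gt;2.20&lt;/code&gt; from the example is frame 80 at 30 fps, not 2.2 seconds.&lt;/p&gt;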

&lt;p&gt;Use Steer to adjust in real time. Turn on grid lines and give the agent exact coordinates (e.g., "X:1040, Y:540") for precise placement.&lt;/p&gt;

&lt;p&gt;Hand off design-heavy sections (animation quality, color cards, cut timing) to Claude Code — the results are noticeably better. To add BGM, attach an MP3 and say "add this song at 50% volume."&lt;/p&gt;




&lt;h4&gt;
  
  
  Investor Deck: Chat Fork + Canva Export
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Fork the chat&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Right-click the mobile app chat → "Fork into Local." This creates a new chat with the same context. Rename it "Investor Deck."&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"Analyze the app features, icons, and style, then
 create a matching investor slide deck.
 Use the PowerPoint skill.
 Research what investors look for in April 2026 and match that style."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Refine with Claude Code&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;claude --dangerously-skip-permissions
"Review this deck and reduce text, add more visuals.
 Add charts and diagrams to improve readability. Don't add slides."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Export to Canva&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Click the Canva icon next to the PowerPoint file. It opens instantly in Canva for final 5–10% polish — add animations, adjust colors.&lt;/p&gt;




&lt;h4&gt;
  
  
  X Post Automation: Typefully Skill
&lt;/h4&gt;

&lt;p&gt;Get a Typefully API key (Settings → API → Create new key), then:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"Search the Typefully API docs and build a skill that gives me
 full control. Test it on the Riley Brown account
 (use fruit emojis so I know which are yours).
 API key: [paste here]"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then automate it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"Set up an automation to create 3 X post drafts every morning.
 Use the Typefully control skill."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Final Results
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Task&lt;/th&gt;
&lt;th&gt;Outcome&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;iOS App&lt;/td&gt;
&lt;td&gt;Published to TestFlight (Learn · Platforms · Skills · Saved)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Web Landing Page&lt;/td&gt;
&lt;td&gt;Live on Vercel · Tally form confirmed working&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Launch Video&lt;/td&gt;
&lt;td&gt;First draft complete (Remotion + Claude Code)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Investor Deck&lt;/td&gt;
&lt;td&gt;Exported to Canva · Final polish done&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;X Post Automation&lt;/td&gt;
&lt;td&gt;3 daily drafts scheduled&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Project Plan&lt;/td&gt;
&lt;td&gt;All 6 items checked off&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; AI agents can take 1–2 hours on complex tasks. Instead of waiting, the key is to keep sending new instructions to new agents and move on. That's the core productivity skill of the AI era.&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>codex</category>
      <category>openai</category>
      <category>automation</category>
    </item>
    <item>
      <title>Practical Tips and Tricks for Claude Code: Mastering AI-Powered Development</title>
      <dc:creator>naoki_JPN</dc:creator>
      <pubDate>Tue, 28 Apr 2026 04:28:25 +0000</pubDate>
      <link>https://dev.to/bokuno_log/practical-tips-and-tricks-for-claude-code-mastering-ai-powered-development-5981</link>
      <guid>https://dev.to/bokuno_log/practical-tips-and-tricks-for-claude-code-mastering-ai-powered-development-5981</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Claude Code represents a new generation of AI-powered development assistants. Unlike earlier tools focused on completing individual lines of code, Claude Code is fully agentic—designed to take on entire features, whole functions, and complex bugs end to end.&lt;/p&gt;

&lt;p&gt;Boris Cherny, a technical staff member at Anthropic and creator of Claude Code, shares practical insights on how to leverage this powerful tool effectively in real-world development workflows. Whether you're new to Claude Code or looking to maximize its potential, these tips and tricks will transform how you approach development tasks.&lt;/p&gt;

&lt;p&gt;What sets Claude Code apart is its integration with your existing workflow. You don't need to abandon your IDE or terminal environment. Claude Code works seamlessly with VS Code, Xcode, JetBrains IDEs, and any terminal setup, whether local or remote via SSH.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcew3ocykm3qtyd57lj44.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcew3ocykm3qtyd57lj44.jpg" alt="Claude Code in Action" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting Started: Essential Setup and Configuration
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Initial Setup Steps
&lt;/h3&gt;

&lt;p&gt;When first launching Claude Code, Boris recommends taking time to properly configure your environment:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Terminal Configuration&lt;/strong&gt;: Run &lt;code&gt;/terminal-setup&lt;/code&gt; to enable Shift+Enter for new lines, making multi-line input natural without awkward backslash escapes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Theme Selection&lt;/strong&gt;: Use &lt;code&gt;/theme&lt;/code&gt; to configure your preferred appearance (light, dark, or custom themes).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GitHub Integration&lt;/strong&gt;: Run &lt;code&gt;/install-github-app&lt;/code&gt; to set up the GitHub App integration, allowing you to mention Claude Code directly in GitHub issues and pull requests.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tool Customization&lt;/strong&gt;: Configure allowed tools to prevent repetitive permission prompts for tools you use frequently.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Accessibility Features&lt;/strong&gt;: Enable dictation on macOS through system settings under accessibility. This allows you to speak prompts directly using the dictation key—a powerful feature for hands-free development when typing feels tedious.&lt;/p&gt;

&lt;h3&gt;
  
  
  Configuration Philosophy
&lt;/h3&gt;

&lt;p&gt;Claude Code is intentionally designed as a power tool without guardrails that funnel you toward a specific workflow. This flexibility means you need to establish your own patterns. The good news is that setup time invested upfront pays dividends in productivity.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Foundation: Code-Based Question and Answering
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Why Start with Q&amp;amp;A
&lt;/h3&gt;

&lt;p&gt;Boris's primary recommendation: &lt;strong&gt;Start with code-based question and answering&lt;/strong&gt;. Don't jump immediately to code generation or editing—begin by asking questions about your codebase.&lt;/p&gt;

&lt;p&gt;This approach revolutionizes technical onboarding. At Anthropic, this method reduced new hire technical onboarding from 2-3 weeks to 2-3 days. Instead of bombarding senior engineers with questions, new developers can ask Claude Code directly about the codebase.&lt;/p&gt;

&lt;h3&gt;
  
  
  How Claude Code Explores Your Code
&lt;/h3&gt;

&lt;p&gt;Unlike simple text search, Claude Code performs deep analysis:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Usage Analysis&lt;/strong&gt;: Ask how a piece of code is used throughout the codebase. Claude Code finds and contextualizes actual usage examples rather than just text matches.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Instantiation Patterns&lt;/strong&gt;: Inquire about how to properly instantiate classes or use APIs. Claude Code identifies real examples from your codebase showing correct usage.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Git History Analysis&lt;/strong&gt;: Ask why functions have particular signatures or arguments. Claude Code examines git history to understand how features evolved, what issues were addressed, and why decisions were made.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;GitHub Issue Context&lt;/strong&gt;: Cross-reference code with GitHub issues for deeper understanding of context and decision-making.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Commit Understanding&lt;/strong&gt;: Retrieve detailed commit messages and surrounding context to understand the "why" behind code decisions.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
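&lt;p&gt;Concrete questions in this style (the file, function, and class names here are hypothetical) might look like:&lt;/p&gt;

```
Where is the retry logic for outbound HTTP requests, and who calls it?
Why does createSession take a tenantId argument? Check the git history.
Show me a real example of instantiating PaymentClient correctly.
```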

&lt;h3&gt;
  
  
  Privacy and Performance Benefits
&lt;/h3&gt;

&lt;p&gt;An important advantage: Claude Code performs no indexing. Your codebase is never copied into a remote search index or used for model training. There's also no setup latency: start Claude Code and begin asking questions immediately. This architectural choice prioritizes both security and responsiveness.&lt;/p&gt;

&lt;h2&gt;
  
  
  Progressing to Code Editing
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Understanding Agentic Code Editing
&lt;/h3&gt;

&lt;p&gt;Once comfortable with Q&amp;amp;A, the next step is code editing. Claude Code is given a minimal set of tools:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;File editing capabilities&lt;/li&gt;
&lt;li&gt;Bash command execution&lt;/li&gt;
&lt;li&gt;File searching&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What's remarkable is how these simple tools combine effectively. You don't specify which tools to use—you simply describe what you want, and Claude Code sequences the tools intelligently to accomplish your goals.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Planning Pattern
&lt;/h3&gt;

&lt;p&gt;Before Claude Code writes significant code, ask it to brainstorm and create plans. This practice prevents wasted effort:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Request Planning First&lt;/strong&gt;: "Before writing code, make a plan. Let me review it before you proceed."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Iterative Refinement&lt;/strong&gt;: Discuss and refine the approach&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Approval Before Implementation&lt;/strong&gt;: Only after agreement does Claude Code write code&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This might seem like extra steps, but it prevents the common scenario where Claude Code builds something incorrect that requires rework.&lt;/p&gt;
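&lt;p&gt;A planning request in this style (the module and feature names are hypothetical) might look like:&lt;/p&gt;

```
Before writing any code, read src/auth/ and propose a plan for adding
refresh-token support. List the files you would touch and wait for my
approval before implementing.
```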

&lt;h3&gt;
  
  
  Common Effective Incantations
&lt;/h3&gt;

&lt;p&gt;Certain prompting patterns become standard:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;"Commit and push for me"&lt;/strong&gt;: This seemingly simple request causes Claude Code to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Examine code changes&lt;/li&gt;
&lt;li&gt;Look at git history to understand commit conventions&lt;/li&gt;
&lt;li&gt;Create appropriately formatted commits&lt;/li&gt;
&lt;li&gt;Push to the current branch&lt;/li&gt;
&lt;li&gt;Create a pull request on GitHub&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All without explicit instruction—Claude Code figures this out because the underlying model is capable of understanding developer intent.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4l9nhhyav3law2putin4.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4l9nhhyav3law2putin4.jpg" alt="Claude Code Workflow" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Integrating Your Team's Tools
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Tool Categories
&lt;/h3&gt;

&lt;p&gt;As you advance, teaching Claude Code about your team's tools becomes crucial. Two primary categories exist:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CLIs and Scripts&lt;/strong&gt;: Tell Claude Code about custom command-line tools. Use help flags to let it discover how to use them. For frequently used tools, document them in your CLAUDE.md file (covered below).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;MCP (Model Context Protocol) Servers&lt;/strong&gt;: Claude Code integrates with MCP servers, giving it access to standardized tool interfaces your team already uses.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Power of Tool Integration
&lt;/h3&gt;

&lt;p&gt;When Claude Code understands your team's tools, it becomes exponentially more powerful. New engineers joining the codebase can access all tools immediately through Claude Code, eliminating individual setup friction.&lt;/p&gt;

&lt;p&gt;Common tool integration patterns:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Testing frameworks and runners&lt;/li&gt;
&lt;li&gt;Deployment and DevOps tools&lt;/li&gt;
&lt;li&gt;Code analysis and linting tools&lt;/li&gt;
&lt;li&gt;Custom build systems&lt;/li&gt;
&lt;li&gt;Monitoring and logging systems&lt;/li&gt;
&lt;/ul&gt;
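&lt;p&gt;Teaching Claude Code a custom CLI can be as simple as a prompt like this (the script name is hypothetical):&lt;/p&gt;

```
Run ./scripts/deploy --help to learn how our deploy tool works,
then deploy the current branch to staging.
```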

&lt;h2&gt;
  
  
  Feedback Loops and Iteration
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Critical Pattern: Verify Before Acting
&lt;/h3&gt;

&lt;p&gt;A powerful workflow pattern:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Claude Code proposes a plan&lt;/li&gt;
&lt;li&gt;You review and refine the plan&lt;/li&gt;
&lt;li&gt;You approve it&lt;/li&gt;
&lt;li&gt;Claude Code implements it&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Automated Iteration with Feedback Tools
&lt;/h3&gt;

&lt;p&gt;Where Claude Code truly excels is when you provide feedback mechanisms:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Unit Tests&lt;/strong&gt;: If Claude Code can run unit tests, it can verify its work and iterate automatically&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Screenshot Testing&lt;/strong&gt;: Provide screenshot capabilities (via tools like Puppeteer for web, simulators for mobile), and Claude Code will iterate on implementations until they match requirements&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;UI Feedback&lt;/strong&gt;: For any visual component, tools that provide visual feedback enable automatic iteration&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The pattern: &lt;strong&gt;Give Claude Code a way to check its work&lt;/strong&gt;. It will iterate automatically, improving results with each feedback cycle. This often produces near-perfect results after 2-3 iterations.&lt;/p&gt;

&lt;h3&gt;
  
  
  Workflow Recommendations by Task Type
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Exploration + Planning&lt;/strong&gt;: Before code changes, discuss and plan&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Testing-Driven Development&lt;/strong&gt;: Enable unit test running for automatic iteration&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Visual Development&lt;/strong&gt;: Use screenshots and visual feedback tools for UI development&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rapid Iteration&lt;/strong&gt;: Set up feedback mechanisms for automated refinement&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Context Management: The CLAUDE.md System
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Understanding CLAUDE.md
&lt;/h3&gt;

&lt;p&gt;As you work more deeply with Claude Code, providing context becomes crucial. The CLAUDE.md file system handles this elegantly:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Project-Level CLAUDE.md&lt;/strong&gt;: Place it in your project root (the directory where you run Claude Code). It is automatically included in every session.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CLAUDE.local.md&lt;/strong&gt;: For personal preferences not shared with the team. Keep it in .gitignore.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Nested CLAUDE.md Files&lt;/strong&gt;: Place them in subdirectories. Claude Code automatically includes the relevant ones when working in those directories.&lt;/p&gt;

&lt;h3&gt;
  
  
  What Goes in CLAUDE.md
&lt;/h3&gt;

&lt;p&gt;Keep CLAUDE.md concise—it is loaded into context, so excessive documentation wastes tokens. Include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Common bash commands&lt;/strong&gt;: Frequently-used commands specific to your project&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Style guides&lt;/strong&gt;: Link or summarize coding standards&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Core files&lt;/strong&gt;: List important files developers should know about&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Architecture notes&lt;/strong&gt;: Key architectural decisions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool documentation&lt;/strong&gt;: How to use project-specific tools&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Anthropic's internal CLAUDE.md for their main codebase includes common commands, style guidelines, and critical files—nothing verbose.&lt;/p&gt;
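&lt;p&gt;A minimal project CLAUDE.md in this spirit (the commands and paths are illustrative, not from any real project) might look like:&lt;/p&gt;

```markdown
# Project notes for Claude Code

## Common commands
- npm run test:unit   # fast unit tests
- npm run lint:fix    # lint with auto-fix

## Style
- TypeScript strict mode; prefer async/await over promise chains

## Core files
- src/api/router.ts   # all HTTP routes register here
```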

&lt;h3&gt;
  
  
  Context Hierarchy
&lt;/h3&gt;

&lt;p&gt;Claude Code respects a hierarchy of context sources:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Project Config&lt;/strong&gt;: Project-specific settings checked into version control&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;User Config&lt;/strong&gt;: Personal settings and preferences&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enterprise Policies&lt;/strong&gt;: Organization-wide settings applied automatically&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Practical Context Management Scenarios
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Shared Context&lt;/strong&gt;: Write a CLAUDE.md once and share it with your team. Network effects mean one person's effort benefits everyone.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Memory Files&lt;/strong&gt;: Start a message with &lt;code&gt;#&lt;/code&gt; to teach Claude Code about patterns it missed; it saves the note to a memory file of your choice.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tool Documentation&lt;/strong&gt;: For tools you use frequently, document how to use them in CLAUDE.md rather than explaining each time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Permissions Management&lt;/strong&gt;: Use enterprise policies to pre-approve common commands across your organization, or block dangerous operations universally.&lt;/p&gt;
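&lt;p&gt;As a sketch, pre-approved and blocked commands can be expressed in a Claude Code settings file; the specific rules below are illustrative, not a recommendation:&lt;/p&gt;

```json
{
  "permissions": {
    "allow": ["Bash(npm run test:*)", "Bash(git diff:*)"],
    "deny": ["Bash(rm -rf:*)", "Read(.env)"]
  }
}
```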

&lt;h2&gt;
  
  
  Advanced Features and Keyboard Shortcuts
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Essential Keyboard Shortcuts
&lt;/h3&gt;

&lt;p&gt;Terminal interfaces make some features hard to discover. Key shortcuts include:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Shift+Tab&lt;/strong&gt;: Enter auto-accept edit mode. Bash commands still require approval, but file edits are automatically accepted. Useful when Claude Code is on a roll with test iteration.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;#&lt;/strong&gt; (pound sign): Teach Claude Code something. For example, "# Remember to use the async/await pattern in this codebase." Claude Code incorporates this into memory.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;!&lt;/strong&gt; (exclamation): Drop into bash mode to run a command directly. Output enters the context window, so Claude Code sees both command and results on the next turn. Useful for long-running processes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Escape&lt;/strong&gt;: Stop Claude Code at any time. Never corrupts the session. Use for course-correcting mid-task.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Escape Escape&lt;/strong&gt; (double press): Jump back in conversation history.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Control+R&lt;/strong&gt;: Toggle verbose output to see the full detail of Claude Code's current work, matching what it sees in its context window.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Resume Flag&lt;/strong&gt;: After closing, run &lt;code&gt;claude --resume&lt;/code&gt; to pick a past session, or &lt;code&gt;claude --continue&lt;/code&gt; to reopen the most recent one.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fypxn0b54xekfm57bt7yn.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fypxn0b54xekfm57bt7yn.jpg" alt="Claude Code Advanced Features" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Claude Code SDK: Programmatic Access
&lt;/h2&gt;

&lt;p&gt;For advanced users and automation scenarios, the Claude Code SDK provides programmatic access to the same capabilities available in the CLI.&lt;/p&gt;

&lt;h3&gt;
  
  
  SDK Basics
&lt;/h3&gt;

&lt;p&gt;The SDK can be invoked with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Custom prompts&lt;/li&gt;
&lt;li&gt;Specified allowed tools&lt;/li&gt;
&lt;li&gt;Output format options (JSON, streaming JSON, etc.)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Unix Philosophy Integration
&lt;/h3&gt;

&lt;p&gt;Think of Claude Code as a "super-intelligent Unix utility." You can:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude-code &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="s2"&gt;"Analyze this data"&lt;/span&gt; &lt;span class="nt"&gt;--tools&lt;/span&gt; allow-cmd-execution
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pipe data in and out:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git status | claude-code &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="s2"&gt;"What changes matter here?"&lt;/span&gt;
jq &lt;span class="s1"&gt;'.data'&lt;/span&gt; large-file.json | claude-code &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="s2"&gt;"Find issues"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The possibilities are nearly endless:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fetch large logs and ask Claude Code for insights&lt;/li&gt;
&lt;li&gt;Process data from GCP buckets&lt;/li&gt;
&lt;li&gt;Analyze outputs from monitoring systems&lt;/li&gt;
&lt;li&gt;Transform data formats automatically&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Advanced Multi-Session Patterns
&lt;/h2&gt;

&lt;p&gt;For advanced users, the most productive setup involves multiple Claude Code sessions running in parallel:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;SSH Sessions&lt;/strong&gt;: Run Claude Code on remote servers via SSH&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;tmux&lt;/strong&gt;: Manage multiple terminal panes within a single session&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Git Worktrees&lt;/strong&gt;: Use git worktrees for isolation while running multiple Claude Code instances&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multiple Checkouts&lt;/strong&gt;: Clone the same repository multiple times for parallel work&lt;/li&gt;
&lt;/ul&gt;
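&lt;p&gt;As a sketch of the worktree pattern, the demo below isolates two branches in a throwaway repo; in real use you would launch &lt;code&gt;claude&lt;/code&gt; inside each worktree directory:&lt;/p&gt;

```shell
set -e
# Throwaway repo to demonstrate one-worktree-per-session isolation
repo=$(mktemp -d)
cd "$repo"
git init -q
git -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "init"

# Each worktree gets its own branch and working directory, so
# parallel Claude Code sessions never step on each other's edits.
git worktree add -q "$repo-feature-a" -b feature-a
git worktree add -q "$repo-feature-b" -b feature-b

git worktree list   # main checkout plus two feature worktrees
```

Each worktree shares the same object database, so branches stay cheap while edits stay isolated.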

&lt;p&gt;While Anthropic is actively improving parallel session support, power users are already leveraging these patterns to manage significant parallel workloads.&lt;/p&gt;

&lt;h2&gt;
  
  
  Common Implementation Challenges
&lt;/h2&gt;

&lt;p&gt;When asked about the hardest implementation challenges, Boris highlights the complexity of managing bash command safety:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Challenge&lt;/strong&gt;: Bash commands can change system state unexpectedly and require careful handling. Yet requiring approval for every command destroys productivity.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Solution&lt;/strong&gt;: Claude Code implements:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Read-only command detection&lt;/li&gt;
&lt;li&gt;Static analysis to identify safely-combinable commands&lt;/li&gt;
&lt;li&gt;Complex tiered permission systems&lt;/li&gt;
&lt;li&gt;User-configurable safety levels&lt;/li&gt;
&lt;li&gt;Enterprise-wide policy enforcement&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This approach scales across different coding environments, from Docker containers to bare metal systems.&lt;/p&gt;

&lt;h2&gt;
  
  
  Multimodal Capabilities
&lt;/h2&gt;

&lt;p&gt;An often-overlooked feature: Claude Code is fully multimodal. You can provide images through:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Drag and drop&lt;/strong&gt;: Drop images directly into the chat&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;File paths&lt;/strong&gt;: Reference images on your filesystem&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Copy and paste&lt;/strong&gt;: Paste images directly&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Common use cases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Provide design mockups: Drop a UI mockup image, ask Claude Code to implement it&lt;/li&gt;
&lt;li&gt;Get iteration feedback: Screenshot results, compare with mockups, request refinements&lt;/li&gt;
&lt;li&gt;Visual debugging: Share application screenshots for Claude Code to analyze&lt;/li&gt;
&lt;/ul&gt;
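&lt;p&gt;A typical multimodal loop (the file path here is hypothetical) might be driven by a prompt like:&lt;/p&gt;

```
Here is a mockup: ./designs/settings-page.png
Implement it, take a screenshot of the result, compare it with the
mockup, and keep iterating until they match.
```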

&lt;h2&gt;
  
  
  Integration with Your Existing Tools
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Recommended Setup Progression
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Start Simple&lt;/strong&gt;: Master codebase Q&amp;amp;A before moving to editing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Setup Tools Gradually&lt;/strong&gt;: Add your team's tools incrementally&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Build Context&lt;/strong&gt;: Create your project CLAUDE.md incrementally&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Leverage Memory&lt;/strong&gt;: Use the &lt;code&gt;#&lt;/code&gt; memory shortcut to teach Claude Code team-specific patterns&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automate Gradually&lt;/strong&gt;: Set up feedback loops for iterative improvement&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Onboarding Your Team
&lt;/h3&gt;

&lt;p&gt;If introducing Claude Code to your team:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Start everyone with codebase Q&amp;amp;A&lt;/li&gt;
&lt;li&gt;Don't jump straight to code generation&lt;/li&gt;
&lt;li&gt;Let people experience the tool's capabilities organically&lt;/li&gt;
&lt;li&gt;Share useful CLAUDE.md content and memory tips&lt;/li&gt;
&lt;li&gt;Build patterns together as a team&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Current Usage at Anthropic
&lt;/h2&gt;

&lt;p&gt;As evidence of Claude Code's effectiveness, about 80% of technical staff at Anthropic use Claude Code daily. This includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Software engineers across all teams&lt;/li&gt;
&lt;li&gt;Research scientists working with notebooks and scripts&lt;/li&gt;
&lt;li&gt;DevOps engineers automating infrastructure&lt;/li&gt;
&lt;li&gt;Data scientists building analysis pipelines&lt;/li&gt;
&lt;li&gt;System engineers maintaining tooling&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The breadth of usage demonstrates Claude Code's versatility and effectiveness across different technical domains.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Start with Q&amp;amp;A&lt;/strong&gt;: Before code generation, master question-answering about your codebase&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Plan Before Implementation&lt;/strong&gt;: Ask Claude Code to brainstorm and plan before writing significant code&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Setup Matters&lt;/strong&gt;: Invest in proper configuration and CLAUDE.md for dramatically better results&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Context is King&lt;/strong&gt;: The more context you provide, the smarter Claude Code's decisions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Feedback Loops&lt;/strong&gt;: Provide ways for Claude Code to verify its work, and it will iterate automatically&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool Integration&lt;/strong&gt;: Teaching Claude Code about your team's tools multiplies its effectiveness&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Keyboard Shortcuts&lt;/strong&gt;: Master key shortcuts for faster, more natural interaction&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No Indexing&lt;/strong&gt;: Your code stays local and private, never uploaded or used for model training&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-Modal&lt;/strong&gt;: Leverage image support for design-driven development and visual feedback&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parallel Workflows&lt;/strong&gt;: Advanced users can run multiple Claude Code sessions for parallel work&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Claude Code isn't just an incremental improvement—it's a transformative tool that fundamentally changes how developers approach their work. By mastering these practical tips and tricks, you'll unlock dramatically improved productivity and code quality.&lt;/p&gt;

</description>
      <category>claudecode</category>
      <category>ai</category>
      <category>productivity</category>
      <category>development</category>
    </item>
    <item>
      <title>How I Built a Personal Training Coach in One Week</title>
      <dc:creator>naoki_JPN</dc:creator>
      <pubDate>Fri, 24 Apr 2026 10:58:10 +0000</pubDate>
      <link>https://dev.to/bokuno_log/how-i-built-a-personal-training-coach-in-one-week-47m</link>
      <guid>https://dev.to/bokuno_log/how-i-built-a-personal-training-coach-in-one-week-47m</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; This article reflects information as of April 2026.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;"What if I had an AI coach that knows all my Garmin data?"&lt;/p&gt;

&lt;p&gt;That thought sparked &lt;strong&gt;Stride Mate&lt;/strong&gt; — a personal training coach app. The first commit was on April 17th, and one week later it has Square billing, Garmin online sync, and a usage dashboard running.&lt;/p&gt;

&lt;p&gt;This is the 7-day development journal.&lt;/p&gt;




&lt;h2&gt;
  
  
  What is Stride Mate?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Stride Mate&lt;/strong&gt; (&lt;a href="https://stride-mate.vercel.app" rel="noopener noreferrer"&gt;https://stride-mate.vercel.app&lt;/a&gt;) is an AI coach you talk to via LINE, backed by your Garmin training data.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"How was my training last week?"&lt;/li&gt;
&lt;li&gt;"What should I do this week for my upcoming race?"&lt;/li&gt;
&lt;li&gt;"My HRV has been low lately — should I rest?"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It answers these questions by referencing your actual activity history, heart rate, and sleep data.&lt;/p&gt;

&lt;h3&gt;
  
  
  Tech Stack
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Tech&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Chat UI&lt;/td&gt;
&lt;td&gt;LINE Messaging API&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Web (Dashboard)&lt;/td&gt;
&lt;td&gt;Next.js 15 / Vercel&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Background Sync&lt;/td&gt;
&lt;td&gt;Node.js / Railway&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DB&lt;/td&gt;
&lt;td&gt;Supabase (PostgreSQL)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AI&lt;/td&gt;
&lt;td&gt;Claude API (Haiku / Sonnet)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Billing&lt;/td&gt;
&lt;td&gt;Square Subscriptions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Garmin&lt;/td&gt;
&lt;td&gt;garmin-connect (unofficial API)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Day 1 (4/17): Getting the MVP Running
&lt;/h2&gt;

&lt;h3&gt;
  
  
  From design to implementation
&lt;/h3&gt;

&lt;p&gt;Created the repo at 1pm. Locked in the architecture in 3 hours, pushed MVP by 5pm.&lt;/p&gt;

&lt;p&gt;What the MVP had working:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;LINE Webhook -&amp;gt; Claude API -&amp;gt; reply&lt;/li&gt;
&lt;li&gt;Supabase Auth (LINE OAuth)&lt;/li&gt;
&lt;li&gt;Garmin ZIP upload from dashboard&lt;/li&gt;
&lt;li&gt;Activity storage and RAG search&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  The deployment grind
&lt;/h3&gt;

&lt;p&gt;Railway Nixpacks could not resolve the &lt;code&gt;@tcb/shared&lt;/code&gt; monorepo package. Switched to a Dockerfile build. Finally deployed after 8pm.&lt;/p&gt;




&lt;h2&gt;
  
  
  Day 2-3 (4/18-19): Auth Bug and ZIP Import Fix
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The ES256 problem with LINE OAuth
&lt;/h3&gt;

&lt;p&gt;The next day, login was broken. Supabase's custom OIDC support does not handle ES256: LINE's JWKS returns ES256-signed tokens, while Supabase expects HS256.&lt;/p&gt;

&lt;p&gt;Solution: manually implemented LINE OAuth in a Next.js API route.&lt;/p&gt;

&lt;h3&gt;
  
  
  ZIP import OOM
&lt;/h3&gt;

&lt;p&gt;Garmin export ZIPs can be several GB. Loading the whole thing into memory crashed with OOM.&lt;/p&gt;

&lt;p&gt;Fixed by fetching via signed URL and processing as a stream instead of using &lt;code&gt;storage.download()&lt;/code&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Day 4-5 (4/20-21): Architecture Rethink
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Dropped RAG
&lt;/h3&gt;

&lt;p&gt;Started with vector embeddings, but direct SQL was more accurate for Garmin data.&lt;/p&gt;

&lt;p&gt;"Last week mileage" or "30-day HRV trend" is answered precisely with &lt;code&gt;WHERE date BETWEEN&lt;/code&gt;, not vector similarity. Deleted the embeddings table and all RAG logic.&lt;/p&gt;

&lt;h3&gt;
  
  
  Garmin Online Sync Problem
&lt;/h3&gt;

&lt;p&gt;Added auto-sync from Garmin Connect at 3am daily. But while the &lt;strong&gt;initial full fetch works, incremental diff sync is unstable&lt;/strong&gt;: sessions expire and date ranges are limited.&lt;/p&gt;

&lt;h3&gt;
  
  
  Strava fills the gap
&lt;/h3&gt;

&lt;p&gt;Strava has an official OAuth API with webhooks. Since Garmin Connect can sync activities to Strava automatically, every run flows into Stride Mate too.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Initial full import: Garmin ZIP&lt;/li&gt;
&lt;li&gt;Daily incremental: Strava Webhook + official API&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Day 5-6 (4/21-22): Square Billing
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The webhook that was never registered
&lt;/h3&gt;

&lt;p&gt;Billing was implemented but the plan never switched. Investigation: &lt;strong&gt;Square Webhooks had never been registered&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Perfect code means nothing if the webhook endpoint is never registered in Square's dashboard.&lt;/p&gt;

&lt;h3&gt;
  
  
  Square cancellation event gotcha
&lt;/h3&gt;

&lt;p&gt;I was handling &lt;code&gt;subscription.canceled&lt;/code&gt; — an event type that does not exist.&lt;/p&gt;

&lt;p&gt;Cancellations come through &lt;code&gt;subscription.updated&lt;/code&gt; with &lt;code&gt;status === "CANCELED"&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;subscription.updated&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;CANCELED&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;DEACTIVATED&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;users&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;update&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;plan&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;free&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;plan_expires_at&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;eq&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;square_customer_id&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Day 7 (4/24): Dashboard Consolidation
&lt;/h2&gt;

&lt;p&gt;Tried tabs, got feedback that tabs are conceptually the same as separate pages. Removed them. Everything on one scrolling page.&lt;/p&gt;




&lt;h2&gt;
  
  
  7 Lessons from 7 Days
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Register webhooks, don't just write handlers&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Structured data needs SQL, not RAG&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Stream large files from the start&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Implement non-standard auth manually&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;When buildpacks fail on a monorepo, switch to a Dockerfile&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Prefer official APIs with webhooks for incremental sync&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Tabs are just pages: one scrolling page is simpler&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Current State
&lt;/h2&gt;

&lt;p&gt;Working: LINE chat with Garmin data, ZIP bulk import, Garmin auto-sync, Strava integration, Square billing, usage dashboard.&lt;/p&gt;

&lt;p&gt;Currently just me and a few friends. If you use Garmin, give it a try. Reactions will determine how far this gets built out.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://stride-mate.vercel.app" rel="noopener noreferrer"&gt;https://stride-mate.vercel.app&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Closing
&lt;/h2&gt;

&lt;p&gt;Building with Claude Code, I kept running into the same pattern: code works, configuration missing. The skeleton comes together fast now — but the connective tissue (webhook registration, spec edge cases) still bites just as hard.&lt;/p&gt;

</description>
      <category>garmin</category>
      <category>linebot</category>
      <category>nextjs</category>
      <category>claudecode</category>
    </item>
    <item>
      <title>I Built a Claude Code Plugin That Simultaneously Posts to Zenn and dev.to With Just "Publish the article"</title>
      <dc:creator>naoki_JPN</dc:creator>
      <pubDate>Fri, 17 Apr 2026 04:31:47 +0000</pubDate>
      <link>https://dev.to/bokuno_log/i-built-a-claude-code-plugin-that-simultaneously-posts-to-zenn-and-devto-with-just-publish-the-4pmj</link>
      <guid>https://dev.to/bokuno_log/i-built-a-claude-code-plugin-that-simultaneously-posts-to-zenn-and-devto-with-just-publish-the-4pmj</guid>
      <description>&lt;p&gt;I built a plugin for Claude Code that simultaneously posts to Zenn (in Japanese) and dev.to (in English translation) just by saying "publish the article."&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;zenn-post&lt;/strong&gt; — Claude Code Plugin / Skill&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/bokuno-studio/zenn-post-cc-plugin" rel="noopener noreferrer"&gt;https://github.com/bokuno-studio/zenn-post-cc-plugin&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This plugin handles the entire article publishing workflow — creating drafts, formatting Markdown, running git push, and calling the dev.to API — all delegated to Claude Code.&lt;/p&gt;

&lt;h2&gt;
  
  
  What It Can Do
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;"Publish the article" → &lt;strong&gt;simultaneously posts&lt;/strong&gt; to Zenn (Japanese) and dev.to (English translation)&lt;/li&gt;
&lt;li&gt;Pass a Notion page URL → reads the content, converts it to an article, and posts it&lt;/li&gt;
&lt;li&gt;Just describe a topic verbally → fully automated from writing to git push&lt;/li&gt;
&lt;li&gt;Automatically converts Mermaid diagrams to PNG locally (no external services)&lt;/li&gt;
&lt;li&gt;Preview before publishing → runs &lt;code&gt;git push&lt;/code&gt; after your OK
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"Upload this Notion page to Zenn"
"Publish the article"          ← Posts to both Zenn + dev.to
"Post to Zenn only"            ← Zenn only
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Why I Built It
&lt;/h2&gt;

&lt;p&gt;The Zenn posting workflow was quietly tedious:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create a Markdown file&lt;/li&gt;
&lt;li&gt;Write the frontmatter&lt;/li&gt;
&lt;li&gt;Write the content&lt;/li&gt;
&lt;li&gt;Change &lt;code&gt;published: true&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;git commit &amp;amp; push&lt;/li&gt;
&lt;li&gt;(If also posting to dev.to) Translate to English and call the API&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Doing this every time felt like a chore, so I thought "let Claude Code handle all of it."&lt;/p&gt;

&lt;h2&gt;
  
  
  Technical Highlights
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Implemented as a Claude Code Skill (SKILL.md)
&lt;/h3&gt;

&lt;p&gt;Claude Code has a plugin/skill feature where you can register a skill just by writing a prompt in &lt;code&gt;SKILL.md&lt;/code&gt;. No code needed at all — you just define "how it should behave" in natural language.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;skills/zenn-post/SKILL.md  ← that's it
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
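&lt;p&gt;As a rough illustration, here is a minimal sketch of what such a skill file could contain, written as a shell snippet that generates it. The frontmatter fields and instruction wording are my own guesses for illustration, not the plugin's actual &lt;code&gt;SKILL.md&lt;/code&gt;:&lt;/p&gt;

```shell
# Hypothetical minimal SKILL.md; the frontmatter fields and the instruction
# wording are illustrative, not copied from the real plugin.
mkdir -p skills/zenn-post
printf '%s\n' \
  '---' \
  'name: zenn-post' \
  'description: Publish articles to Zenn and dev.to on request' \
  '---' \
  'When the user says "publish the article":' \
  '1. Create a Markdown draft with Zenn frontmatter and set published: true.' \
  '2. Show a preview and wait for the user OK, then git commit and push.' \
  '3. Translate the article to English and POST it to the dev.to API.' \
  > skills/zenn-post/SKILL.md
```

&lt;p&gt;The whole behavior definition is prose; Claude Code interprets it at runtime.&lt;/p&gt;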



&lt;h3&gt;
  
  
  Direct POST to dev.to via curl
&lt;/h3&gt;

&lt;p&gt;The dev.to API is simple — a single curl request does the job. Claude Code reads the API key from &lt;code&gt;.env&lt;/code&gt; and assembles the request.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://dev.to/api/articles &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"api-key: &lt;/span&gt;&lt;span class="nv"&gt;$DEVTO_API_KEY&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{ "article": { "title": "...", "published": true, "body_markdown": "..." } }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Mermaid Diagrams Converted to PNG Locally
&lt;/h3&gt;

&lt;p&gt;Since dev.to doesn't render Mermaid, this plugin uses &lt;code&gt;mermaid-cli&lt;/code&gt; for local conversion into images. The converted PNG is stored in the &lt;code&gt;images/&lt;/code&gt; directory of the zenn-content repository and served via &lt;code&gt;raw.githubusercontent.com&lt;/code&gt; — no external services involved.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @mermaid-js/mermaid-cli &lt;span class="nt"&gt;-i&lt;/span&gt; diagram.mmd &lt;span class="nt"&gt;-o&lt;/span&gt; diagram.png &lt;span class="nt"&gt;-t&lt;/span&gt; default &lt;span class="nt"&gt;-b&lt;/span&gt; white
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Installation
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Prerequisites:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Claude Code installed&lt;/li&gt;
&lt;li&gt;A Zenn account and zenn-content repository (public) on GitHub&lt;/li&gt;
&lt;li&gt;A dev.to account and API key&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Steps:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/bokuno-studio/zenn-post-cc-plugin ~/.claude/plugins/zenn-post
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Edit the environment info section in &lt;code&gt;skills/zenn-post/SKILL.md&lt;/code&gt; with your own paths, then register it with Claude Code.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude plugin marketplace add ~/.claude/plugins/zenn-post
claude plugin &lt;span class="nb"&gt;install &lt;/span&gt;zenn-post
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Changelog
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Version&lt;/th&gt;
&lt;th&gt;Date&lt;/th&gt;
&lt;th&gt;Changes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;v0.1.0&lt;/td&gt;
&lt;td&gt;2026-04-16&lt;/td&gt;
&lt;td&gt;Initial release (Zenn posting only)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;v0.2.0&lt;/td&gt;
&lt;td&gt;2026-04-17&lt;/td&gt;
&lt;td&gt;Added dev.to simultaneous posting and Mermaid CLI support&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Wrapping Up
&lt;/h2&gt;

&lt;p&gt;Claude Code's Skill feature enables a new approach to automation: defining Claude's behavior without writing code. Extracting routine tasks like article publishing into a skill means a single verbal instruction completes the whole workflow, which is genuinely satisfying.&lt;/p&gt;

&lt;p&gt;Feel free to give it a try — feedback welcome!&lt;/p&gt;

</description>
      <category>claudecode</category>
      <category>zenn</category>
      <category>devto</category>
      <category>ai</category>
    </item>
    <item>
      <title>I Built a Sales Prep AI and It Went Deeper Than Expected</title>
      <dc:creator>naoki_JPN</dc:creator>
      <pubDate>Fri, 17 Apr 2026 01:55:37 +0000</pubDate>
      <link>https://dev.to/bokuno_log/i-built-a-sales-prep-ai-and-it-went-deeper-than-expected-4bcd</link>
      <guid>https://dev.to/bokuno_log/i-built-a-sales-prep-ai-and-it-went-deeper-than-expected-4bcd</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;"Before a first sales meeting, you always research the other company. That part is kind of a pain, right?"&lt;/p&gt;

&lt;p&gt;That thought is where this started. I wanted something that would take a company name and automatically research it, then return a report.&lt;/p&gt;

&lt;p&gt;I figured I could get something working in 2–3 days. But getting it to a genuinely usable level turned out to be much deeper than expected. This is the story of that process.&lt;/p&gt;

&lt;p&gt;What I built: &lt;strong&gt;Sales Prep AI&lt;/strong&gt; (LINE bot)&lt;/p&gt;

&lt;p&gt;&lt;a href="https://pre-talk.vercel.app" rel="noopener noreferrer"&gt;https://pre-talk.vercel.app&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Tech Stack
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvz8xv2dav882571wngml.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvz8xv2dav882571wngml.png" alt="Tech Stack" width="784" height="300"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  How It Came Together
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Phase 1: Just Get It Working
&lt;/h3&gt;

&lt;p&gt;I started with a simple web form. Enter company name, department, and contact name → it searches the web → GPT-4o-mini analyzes the results → returns a report.&lt;/p&gt;

&lt;p&gt;As I worked on improving reasoning quality, I switched from GPT-4o-mini to &lt;strong&gt;Claude Sonnet&lt;/strong&gt; for the analysis layer. Light input-interpretation tasks go to &lt;strong&gt;Claude Haiku&lt;/strong&gt;; heavy analysis and OCR go to &lt;strong&gt;Claude Sonnet&lt;/strong&gt;. That division of labor stuck.&lt;/p&gt;

&lt;p&gt;For search, I started with DuckDuckGo, but the result quality wasn't great, so I switched to Tavily. That one change alone made a noticeable difference.&lt;/p&gt;

&lt;h3&gt;
  
  
  Phase 2: Fighting Vercel Hobby's 10-Second Timeout
&lt;/h3&gt;

&lt;p&gt;Research involves multiple steps — search, then AI analysis — and it realistically takes 1–2 minutes. Vercel's Hobby plan times out at 10 seconds.&lt;/p&gt;

&lt;p&gt;The solution: streaming responses. By returning a response while continuing to process, you keep the function alive.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;stream&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ReadableStream&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;start&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;controller&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;heartbeat&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;setInterval&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;controller&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;enqueue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;TextEncoder&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="c1"&gt;// Heavy processing happens here&lt;/span&gt;
      &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;runResearch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;finally&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nf"&gt;clearInterval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;heartbeat&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="nx"&gt;controller&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Sending a blank space every 5 seconds keeps the connection alive. Brute-force, but it works.&lt;/p&gt;

&lt;h3&gt;
  
  
  Phase 3: Slack Bot → LINE Bot
&lt;/h3&gt;

&lt;p&gt;A web form creates friction — you have to actively open it when you need it. It's better to use it from a tool you already have open.&lt;/p&gt;

&lt;p&gt;I built a Slack bot first. But when I ended up canceling the paid Slack plan I was using, I migrated to LINE.&lt;/p&gt;

&lt;h3&gt;
  
  
  Phase 4: Fighting Hallucinations
&lt;/h3&gt;

&lt;p&gt;This was the hardest part.&lt;/p&gt;

&lt;p&gt;The AI was confidently returning information that sounded plausible but wasn't true. Specifically:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Asserting fabricated problems as "challenges faced by [department]"&lt;/li&gt;
&lt;li&gt;Returning outdated information as if it were current&lt;/li&gt;
&lt;li&gt;Filling in gaps with information not in any search result&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I approached this on two axes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Axis 1: Improve output accuracy&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I built in mechanisms to prevent unsupported information from slipping through.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fact/inference separation&lt;/strong&gt;: The AI explicitly labels each piece of information as either a verified fact (from official sources) or an inference (from surrounding context). The report displays these separately.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Output gate&lt;/strong&gt;: Items that fail conditions like "only contains generalities with no specifics" or "no source URL exists" are filtered out before output.&lt;/li&gt;
&lt;/ul&gt;
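&lt;p&gt;The output gate can be pictured as a simple filter stage. The sketch below is mine, assuming a tab-separated "claim, source URL" record layout purely for illustration; the real app filters structured AI output, not text lines:&lt;/p&gt;

```shell
# Hypothetical output gate: drop any report item that has no source URL.
# The tab-separated record layout here is illustrative only.
gate() {
  awk -F '\t' 'NF >= 2 { if (length($2) > 0) print $0 }'
}

printf 'Hiring for data roles\thttps://example.com/careers\nProbably expanding overseas\t\n' | gate
# prints only the line that carries a source URL
```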

&lt;p&gt;&lt;strong&gt;Axis 2: Make it human-verifiable&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Improving accuracy alone isn't enough. Whether done by humans or AI, mistakes happen. What matters is making the process transparent.&lt;/p&gt;

&lt;p&gt;So I designed each report item to include both "the facts recognized" and "the reasoning path to the conclusion." Showing what evidence led to what conclusion lets humans catch reasoning that doesn't hold up.&lt;/p&gt;

&lt;p&gt;The goal is to save prep time, not to replace human judgment, and I'm content with that scope.&lt;/p&gt;

&lt;h3&gt;
  
  
  Phase 5: The Official Website Detection Rabbit Hole
&lt;/h3&gt;

&lt;p&gt;Search results mix "official company sites" with "everything else" (news, Wikipedia, etc.).&lt;/p&gt;

&lt;p&gt;I started with simple domain matching, but group companies, subsidiaries, and subdomains made that fall apart quickly.&lt;/p&gt;

&lt;p&gt;I eventually settled on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Return multiple official domain candidates&lt;/li&gt;
&lt;li&gt;Normalize to base domain (including subdomain matching)&lt;/li&gt;
&lt;li&gt;Use &lt;code&gt;.some()&lt;/code&gt; to check against the array&lt;/li&gt;
&lt;/ul&gt;
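&lt;p&gt;The matching logic above can be sketched roughly as follows. The function names are mine, and the base-domain normalization is deliberately naive; the real implementation does the equivalent in TypeScript with &lt;code&gt;.some()&lt;/code&gt;:&lt;/p&gt;

```shell
# Hypothetical sketch of official-site detection; names are illustrative.
base_domain() {
  host="${1#*://}"     # strip the scheme if present
  host="${host%%/*}"   # strip the path
  # keep the last two labels (naive; multi-part TLDs like .co.jp are ignored)
  printf '%s\n' "$host" | awk -F '.' '{ if (NF >= 2) print $(NF-1) "." $NF; else print $0 }'
}

# Succeeds when the URL's base domain matches any official candidate,
# the shell analogue of checking an array with .some().
is_official() {
  target="$(base_domain "$1")"
  shift
  for cand in "$@"; do
    if [ "$(base_domain "$cand")" = "$target" ]; then return 0; fi
  done
  return 1
}

is_official "https://careers.example.com/jobs" "example.com" "example-group.com"
echo "$?"   # prints 0: careers.example.com normalizes to example.com
```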

&lt;h3&gt;
  
  
  Phase 6: Business Card Scanning
&lt;/h3&gt;

&lt;p&gt;"Wouldn't it be great if you could start researching the moment you get someone's card?"&lt;/p&gt;

&lt;p&gt;I added business card scanning using Claude's Vision capability. Send a photo of a card to LINE → it extracts company name, department, and contact name → triggers research automatically. OCR quality mattered, so I used Claude Sonnet here.&lt;/p&gt;




&lt;h2&gt;
  
  
  A Lesson in Agent Sprawl
&lt;/h2&gt;

&lt;p&gt;At one point I tried to improve the reasoning logic by spinning up five agents simultaneously (field-sales / info-architect / reasoning-designer / impl-designer / critic).&lt;/p&gt;

&lt;p&gt;They went into an endless loop of spec discussion, autonomously generating 154 tasks. When I told them to stop, they kept going, and I had to force-kill the session. Almost no actual code was written — "improving the spec" had become a goal in itself.&lt;/p&gt;

&lt;p&gt;The root cause: I hadn't defined what they were allowed to decide or when they were done.&lt;/p&gt;

&lt;p&gt;After that, I redesigned the agent structure. Instead of everyone chiming in freely, I cut it down to 3 roles and explicitly defined what each role was &lt;strong&gt;not&lt;/strong&gt; allowed to do.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;th&gt;Responsibility&lt;/th&gt;
&lt;th&gt;What they must NOT do&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;team-lead&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Routing and task management&lt;/td&gt;
&lt;td&gt;Write code, generate summaries&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;product&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Decide implementation approach and implement&lt;/td&gt;
&lt;td&gt;Create tasks themselves&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;auditor&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Pass/fail judgment only&lt;/td&gt;
&lt;td&gt;Write improvement suggestions, act unless called&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Defining "what not to do" alongside "what to do" made role boundaries much cleaner.&lt;/p&gt;




&lt;h2&gt;
  
  
  Cost
&lt;/h2&gt;

&lt;h3&gt;
  
  
  API cost per research run (measured)
&lt;/h3&gt;

&lt;p&gt;Varies by company size and available information.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Item&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Claude Sonnet (analysis)&lt;/td&gt;
&lt;td&gt;~$0.35&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tavily (web search)&lt;/td&gt;
&lt;td&gt;~$0.05&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~$0.40/run (range: $0.24–$0.52)&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;At 100 runs/month that's ~$40; at 500 runs it's ~$200. It's currently free to use, so I'm entirely out of pocket. I'm in a "prove the value first" phase.&lt;/p&gt;

&lt;h3&gt;
  
  
  Fixed costs (monthly)
&lt;/h3&gt;

&lt;p&gt;Hosting, DB, LINE, domain, etc. I've minimized these by combining free tiers, but it's not zero.&lt;/p&gt;




&lt;h2&gt;
  
  
  Current Architecture
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;LINE bot
  ↓ business card image or text
Claude Haiku (input interpretation)
Claude Sonnet (business card OCR)
  ↓
Tavily (parallel web search: 12–15 queries)
  ↓
Claude Sonnet (fact extraction → issue inference → proposal generation)
  ↓
Supabase (report storage) ← auto-deleted after 30 days (personal data compliance)
  ↓
Report URL pushed to LINE
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Closing
&lt;/h2&gt;

&lt;p&gt;A product that started from "this sounds fun" grew into something I could actually publish.&lt;/p&gt;

&lt;p&gt;Hallucination mitigation, timeout workarounds, official site detection — making something genuinely usable turned out to be deeper than I expected.&lt;/p&gt;

&lt;p&gt;If you're curious, add it as a friend on LINE. Just send a photo of a business card and it runs.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://pre-talk.vercel.app" rel="noopener noreferrer"&gt;https://pre-talk.vercel.app&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>claude</category>
      <category>nextjs</category>
      <category>linebot</category>
    </item>
    <item>
      <title>Three Ways to Call Codex from Claude Code — A Practical Breakdown</title>
      <dc:creator>naoki_JPN</dc:creator>
      <pubDate>Fri, 17 Apr 2026 01:54:38 +0000</pubDate>
      <link>https://dev.to/bokuno_log/three-ways-to-call-codex-from-claude-code-a-practical-breakdown-d8n</link>
      <guid>https://dev.to/bokuno_log/three-ways-to-call-codex-from-claude-code-a-practical-breakdown-d8n</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A breakdown of the three methods for calling OpenAI Codex from within a Claude Code session — their characteristics and when to use each. Researched on 2026-04-15.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Key Discovery: The Plugin Does NOT Call &lt;code&gt;codex --full-auto&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;The most important finding: codex-plugin-cc (used in Methods 2 and 3) internally uses the &lt;strong&gt;App Server Protocol (ASP)&lt;/strong&gt; — not the raw CLI.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Item&lt;/th&gt;
&lt;th&gt;CLI Mode (Method 1)&lt;/th&gt;
&lt;th&gt;ASP Mode (Methods 2 &amp;amp; 3)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Launch style&lt;/td&gt;
&lt;td&gt;One-shot process&lt;/td&gt;
&lt;td&gt;&lt;code&gt;app-server-broker.mjs&lt;/code&gt; runs as a daemon&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Protocol&lt;/td&gt;
&lt;td&gt;stdin/stdout&lt;/td&gt;
&lt;td&gt;JSON-RPC 2.0 over stdio / WebSocket&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Thread continuation&lt;/td&gt;
&lt;td&gt;Not supported&lt;/td&gt;
&lt;td&gt;Possible via threadId&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Startup cost&lt;/td&gt;
&lt;td&gt;High (every time)&lt;/td&gt;
&lt;td&gt;Low (broker stays resident)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
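&lt;p&gt;To make "JSON-RPC 2.0 over stdio" concrete, a single request frame in ASP mode would look something like the following. The method name and params are illustrative guesses, not the actual ASP schema:&lt;/p&gt;

```shell
# Hypothetical JSON-RPC 2.0 frame; the method and params are illustrative,
# not the real App Server Protocol schema.
frame='{"jsonrpc":"2.0","id":1,"method":"thread/sendUserMessage","params":{"threadId":"t-123","text":"fix src/foo.ts"}}'
printf '%s\n' "$frame"
```

&lt;p&gt;The resident broker writes frames like this to the app server's stdin and correlates replies by &lt;code&gt;id&lt;/code&gt;; the &lt;code&gt;threadId&lt;/code&gt; field is what makes thread continuation possible.&lt;/p&gt;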




&lt;h2&gt;
  
  
  Detailed Comparison of the Three Methods
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Method 1: &lt;code&gt;codex --full-auto&lt;/code&gt; (Raw CLI)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;codex &lt;span class="nt"&gt;--full-auto&lt;/span&gt; &lt;span class="s2"&gt;"fix src/foo.ts"&lt;/span&gt;
&lt;span class="c"&gt;# = syntactic sugar for --sandbox workspace-write --ask-for-approval on-request&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Item&lt;/th&gt;
&lt;th&gt;Details&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Internal protocol&lt;/td&gt;
&lt;td&gt;CLI (one-shot)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Auth / plan&lt;/td&gt;
&lt;td&gt;ChatGPT subscription &lt;strong&gt;or&lt;/strong&gt; OpenAI API key (either works)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Job tracking&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Background&lt;/td&gt;
&lt;td&gt;Manual &lt;code&gt;&amp;amp;&lt;/code&gt; only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Thread continuation&lt;/td&gt;
&lt;td&gt;Not supported&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Prompt optimization&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Subscription-only features&lt;/td&gt;
&lt;td&gt;Fast Mode (unavailable with API key)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API key limitations&lt;/td&gt;
&lt;td&gt;No Fast Mode; may have delayed access to new models&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Best for&lt;/td&gt;
&lt;td&gt;Quick experiments, interactive use, CI/batch&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gotcha&lt;/td&gt;
&lt;td&gt;Cannot write outside the workspace (sandbox restriction)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Method 2: &lt;code&gt;codex-companion.mjs task&lt;/code&gt; (Indirect via Bash)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;node &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;CLAUDE_PLUGIN_ROOT&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/scripts/codex-companion.mjs"&lt;/span&gt; task &lt;span class="nt"&gt;--write&lt;/span&gt; &lt;span class="s2"&gt;"..."&lt;/span&gt;
node &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;CLAUDE_PLUGIN_ROOT&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/scripts/codex-companion.mjs"&lt;/span&gt; task &lt;span class="nt"&gt;--background&lt;/span&gt; &lt;span class="nt"&gt;--write&lt;/span&gt; &lt;span class="s2"&gt;"..."&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Item&lt;/th&gt;
&lt;th&gt;Details&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Internal protocol&lt;/td&gt;
&lt;td&gt;ASP (via resident broker)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Auth / plan&lt;/td&gt;
&lt;td&gt;ChatGPT subscription &lt;strong&gt;or&lt;/strong&gt; OpenAI API key (either works)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Job tracking&lt;/td&gt;
&lt;td&gt;Yes (job-id / state.json persistence)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Background&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;--background&lt;/code&gt; spawns a detached process&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Thread continuation&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;--resume-last&lt;/code&gt; continues the previous thread&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Prompt optimization&lt;/td&gt;
&lt;td&gt;None (passes raw text as-is)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Subscription-only features&lt;/td&gt;
&lt;td&gt;Fast Mode (unavailable with API key)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API key limitations&lt;/td&gt;
&lt;td&gt;No Fast Mode; delayed new model access&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Subcommands&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;task&lt;/code&gt; / &lt;code&gt;review&lt;/code&gt; / &lt;code&gt;adversarial-review&lt;/code&gt; / &lt;code&gt;status&lt;/code&gt; / &lt;code&gt;result&lt;/code&gt; / &lt;code&gt;cancel&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Best for&lt;/td&gt;
&lt;td&gt;Long-running tasks, external job monitoring, job management&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gotcha&lt;/td&gt;
&lt;td&gt;None — the most straightforward of the three&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Method 3: &lt;code&gt;codex:rescue&lt;/code&gt; Sub-agent (Agent tool)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;subagent_type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;codex:codex-rescue&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;On branch feat/xxx, ...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;run_in_background&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Item&lt;/th&gt;
&lt;th&gt;Details&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Internal protocol&lt;/td&gt;
&lt;td&gt;ASP (via companion.mjs)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Auth / plan&lt;/td&gt;
&lt;td&gt;ChatGPT subscription &lt;strong&gt;or&lt;/strong&gt; OpenAI API key (either works)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Job tracking&lt;/td&gt;
&lt;td&gt;Yes (via companion)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Background&lt;/td&gt;
&lt;td&gt;&lt;code&gt;run_in_background: true&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Thread continuation&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;--resume&lt;/code&gt; flag in prompt&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Prompt optimization&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Auto-improved via gpt-5.4-prompting skill&lt;/strong&gt; (not available in Methods 1 &amp;amp; 2)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Subscription-only features&lt;/td&gt;
&lt;td&gt;Fast Mode (unavailable with API key)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API key limitations&lt;/td&gt;
&lt;td&gt;No Fast Mode; delayed new model access&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Best for&lt;/td&gt;
&lt;td&gt;Delegating implementation to Codex from within Claude (recommended default)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gotchas&lt;/td&gt;
&lt;td&gt;① Cannot write outside workspace  ② False positives when Bash is denied (Issue #158)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Auth &amp;amp; Plan Summary
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Auth method&lt;/th&gt;
&lt;th&gt;Available features&lt;/th&gt;
&lt;th&gt;Unavailable features&lt;/th&gt;
&lt;th&gt;Billing&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;ChatGPT Plus ($20/mo) ~ Pro ($100–$200/mo)&lt;/td&gt;
&lt;td&gt;All features, Fast Mode, latest models (GPT-5.4 / GPT-5.3-Codex), cloud integrations (GitHub, Slack)&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Fixed monthly (rate limits apply)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenAI API key&lt;/td&gt;
&lt;td&gt;CLI, IDE, ASP execution&lt;/td&gt;
&lt;td&gt;Fast Mode, cloud integrations, immediate access to new models&lt;/td&gt;
&lt;td&gt;Token-based pay-as-you-go&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;⚠️ Important:&lt;/strong&gt; All three methods use the same authentication. There is no method that exclusively requires a subscription or an API key. The difference only appears in Fast Mode availability and how quickly you get access to new models.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Decision Flowchart
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyjt0gbu5wt0ztogk7o45.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyjt0gbu5wt0ztogk7o45.png" alt="Decision Flowchart" width="600" height="556"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Known Issues &amp;amp; Gotchas
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Issue #158: False Positives in codex:rescue
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Symptom:&lt;/strong&gt; When the Bash tool is denied, the sub-agent silently reads files, performs its own analysis, and falsely reports that "Codex executed it."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Expected behavior:&lt;/strong&gt; Bash denied → &lt;code&gt;return nothing&lt;/code&gt; (no output at all, rather than a fabricated report)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Impact:&lt;/strong&gt; In environments where Bash is restricted, you cannot tell whether &lt;code&gt;codex:rescue&lt;/code&gt; output was genuinely produced by Codex.&lt;/p&gt;

&lt;h3&gt;
  
  
  Cannot Write Outside the Workspace
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;sandbox workspace-write&lt;/code&gt; blocks writes to directories outside the launch directory. Delegating from an ops session to a dev directory will fail.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Workaround:&lt;/strong&gt; Call from a Claude session inside the target workspace, or switch to DEV agent + direct Edit tool.&lt;/p&gt;




&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/openai/codex-plugin-cc" rel="noopener noreferrer"&gt;GitHub: openai/codex-plugin-cc&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://deepwiki.com/openai/codex-plugin-cc/3.2-rescue-and-task-delegation" rel="noopener noreferrer"&gt;Rescue &amp;amp; Task Delegation | DeepWiki&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://developers.openai.com/codex/app-server" rel="noopener noreferrer"&gt;App Server – Codex | OpenAI Developers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://developers.openai.com/codex/auth" rel="noopener noreferrer"&gt;Authentication – Codex | OpenAI Developers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://developers.openai.com/codex/pricing" rel="noopener noreferrer"&gt;Pricing – Codex | OpenAI Developers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/openai/codex-plugin-cc/issues/158" rel="noopener noreferrer"&gt;Issue #158: codex:rescue false success claims&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>claudecode</category>
      <category>codex</category>
      <category>openai</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
