<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: CloudX</title>
    <description>The latest articles on DEV Community by CloudX (cloudx).</description>
    <link>https://dev.to/cloudx</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Forganization%2Fprofile_image%2F1620%2Fe405dafe-8dad-4495-a111-d767102bcad6.png</url>
      <title>DEV Community: CloudX</title>
      <link>https://dev.to/cloudx</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/cloudx"/>
    <language>en</language>
    <item>
      <title>How I Built a Personal AI super-app by Wrapping Codex App Server</title>
      <dc:creator>German Burgardt</dc:creator>
      <pubDate>Fri, 19 Jun 2026 20:10:30 +0000</pubDate>
      <link>https://dev.to/cloudx/how-i-built-a-personal-ai-super-app-by-wrapping-codex-app-server-5fp6</link>
      <guid>https://dev.to/cloudx/how-i-built-a-personal-ai-super-app-by-wrapping-codex-app-server-5fp6</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F1k5lxvywlfxym4quojxn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F1k5lxvywlfxym4quojxn.png" alt="Hero banner showing a Codex App Server wrapper becoming a personal AI super-app" width="799" height="333"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For months I used Codex like everyone else: one terminal, one session, one long output. Then I found &lt;code&gt;codex app-server&lt;/code&gt;, the same engine exposed as JSON-RPC over stdio. It gave me a more useful idea, building my own interface for the work I actually do.&lt;/p&gt;

&lt;h2&gt;
  
  
  What a super-app means today
&lt;/h2&gt;

&lt;p&gt;OpenAI has been pretty explicit about this: a &lt;a href="https://openai.com/index/next-phase-of-enterprise-ai/" rel="noopener noreferrer"&gt;&lt;strong&gt;unified AI superapp&lt;/strong&gt;&lt;/a&gt; is a place where agents, tools, context, and history live together. Instead of jumping from chat to terminal, from terminal to browser, from browser to docs, everything happens in one working surface.&lt;/p&gt;

&lt;p&gt;Theo built a version of that intuition with &lt;a href="https://t3.chat" rel="noopener noreferrer"&gt;T3 Chat&lt;/a&gt;: he got tired of waiting for the perfect chat app and built his own. My version started from a similar feeling, Codex was already great, but the way I actually work needed a different interface on top.&lt;/p&gt;

&lt;h2&gt;
  
  
  I built my own super-app
&lt;/h2&gt;

&lt;p&gt;Some context first. Over the last months I've been building a desktop app that wraps Codex. It runs several agent sessions in parallel in a grid, improves my prompts before the agent sees them, explains the agent's output in plain language, and spawns subagents with one click.&lt;/p&gt;

&lt;p&gt;I never planned a product. I automated my own frictions, one at a time, until the wrapper became the place where I actually work. Theo did the same thing with &lt;a href="https://t3.chat" rel="noopener noreferrer"&gt;T3 Chat&lt;/a&gt;: no existing AI chat fit him, so he built his own.&lt;/p&gt;

&lt;p&gt;My point is that you can build yours too. The harness is already exposed, you just need to find the door.&lt;/p&gt;

&lt;h2&gt;
  
  
  The super-app starts with a small door
&lt;/h2&gt;

&lt;p&gt;Most people use Codex the way it's marketed, an agent chatting in your terminal. But the same binary ships another mode:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;codex app-server
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That subcommand turns the CLI into a server that speaks &lt;strong&gt;JSON-RPC over stdio&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;And you only need a little to do something real with it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;thread/start&lt;/code&gt;: open a session&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;turn/start&lt;/code&gt;: give it work&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;turn/steer&lt;/code&gt;: inject a message into a turn that's already running&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There are more like &lt;code&gt;thread/resume&lt;/code&gt;, &lt;code&gt;turn/interrupt&lt;/code&gt;, &lt;code&gt;thread/settings/update&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  First need: subagents in ~10 lines
&lt;/h2&gt;

&lt;p&gt;The feature I needed was simple: one click, and a fresh Codex instance inherits my session's context and chases a parallel idea while my main session keeps its focus.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;packTimeline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;parent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;timeline&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;48&lt;/span&gt;&lt;span class="nx"&gt;_000&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// newest-first&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;briefing&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;You are a sub-agent delegated from an active parent session…&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;The parent keeps editing in parallel; inspect current files and preserve its changes.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;houseRules&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;newTask&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;child&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;rpc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;thread/start&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;cwd&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;rpc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;turn/start&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;threadId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;child&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;input&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;briefing&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The important part was turning one of my frictions into a button I could press.&lt;/p&gt;

&lt;p&gt;That's the mental shift, a super-app can be small and concrete; it gathers the operations that used to be scattered across your head, your terminal, your notes, and your prompts.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fpf7ezmpf4wkvk205zost.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fpf7ezmpf4wkvk205zost.png" alt="Illustrative screenshot of a main Codex session delegating context to a subagent session" width="800" height="467"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Example: one main session keeps focus while a delegated subagent inherits context and tests edge cases.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Context is everything
&lt;/h2&gt;

&lt;p&gt;Spawning a second agent is easy. The hard part is giving it enough context to work without stepping on the parent session.&lt;/p&gt;

&lt;p&gt;The briefing packs project name, working directory, current focus, and a snapshot of the parent's timeline capped at &lt;strong&gt;48,000 characters&lt;/strong&gt;, newest entries first. The child's window shows only the task you typed; the runtime receives the full context.&lt;/p&gt;

&lt;p&gt;The briefing also tells the child something that matters a lot in practice: you are not alone in this repository. The parent session is still working next to you. Inspect the current file state, respect parallel changes, and keep your scope tight.&lt;/p&gt;

&lt;p&gt;And the door swings both ways. When the child is created, the parent gets a coordination note with &lt;code&gt;turn/steer&lt;/code&gt;. When the child finishes or gets stuck, it reports back explicitly: &lt;code&gt;task_completed&lt;/code&gt;, &lt;code&gt;blocked&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The timeline became the backbone
&lt;/h2&gt;

&lt;p&gt;The next shift was quieter.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;app-server&lt;/code&gt; gives you a stream of structured events: user messages, agent messages, commands, file changes, tool calls, approvals, status changes. My wrapper turns those events into a timeline and treats that timeline as working state.&lt;/p&gt;

&lt;p&gt;The real object is boring in the best way:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;timelineItem&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;itemType&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;thread_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;threadId&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;turn_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;turnId&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;cwd&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;output&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;changes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt;
  &lt;span class="na"&gt;awaiting_approval&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;From there, features share the same source of truth. The explainer receives the timeline as context, so it reads the session from the actual sequence of events:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;timelineContext&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;buildExplainerTimelineContext&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;codexNativeTimeline&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;explanationLlm&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nl"&gt;contextText&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;hasTimelineContext&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nx"&gt;timelineContext&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="nx"&gt;contextSource&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;hasTimelineContext&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;app_server_timeline&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Subagents use the same backbone. The child gets the parent's recent timeline packed into its first task, so it starts with the real session state: what was asked, what Codex tried, which commands ran, which files changed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;timelineContext&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;buildTimelineContext&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;codexNativeTimeline&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;SUBAGENT_CONTEXT_MAX_CHARS&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;contextText&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;timelineContext&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nf"&gt;buildSubagentHistoryContext&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="nx"&gt;contextText&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="s2"&gt;`COPIED SESSION CONTEXT:\n&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;contextText&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That detail is where the app starts to feel alive. Codex keeps being the engine. The product lives around the session: shared context, explanations, approvals, delegation, recovery. The timeline is the backbone that lets all of those features talk about the same work.&lt;/p&gt;

&lt;h2&gt;
  
  
  Needs that appear once the wrapper exists
&lt;/h2&gt;

&lt;p&gt;Subagents were the first one. Then I looked at the rest of my day and saw the same pattern: small repeated frictions that no tool was going to prioritize for me.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fgncw3d3yqdd50wrsrsni.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fgncw3d3yqdd50wrsrsni.png" alt="Conceptual illustration of a personal AI super-app ecosystem: connected modules around a central engine" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My prompts get improved before Codex ever sees them.&lt;/strong&gt; When I hit send, a dedicated investigator digs through the repo and extracts high-signal context. Then another model rewrites my messy draft into a precise prompt and shows me why it rewrote it that way.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Codex does the work; Claude translates around it.&lt;/strong&gt; Codex is great at moving through code, but its output often comes back raw: dense, literal, closer to a log than to an explanation. Claude sits one layer above. It explains Codex back to me, and it also explains my messy ideas to Codex before the turn starts.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F1k2y57f2eypbl0a6ce8q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F1k2y57f2eypbl0a6ce8q.png" alt="Example of a prompt transformer restructuring a vague user prompt into a clear XML-style task brief" width="800" height="467"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Example: how the transformer restructures a vague prompt into clarity.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fwoeigr8j0m0xre2k8i8c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fwoeigr8j0m0xre2k8i8c.png" alt="Example of Claude turning dense Codex output into a clear explanation with next steps" width="800" height="509"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Example: how Claude explains what Codex output was missing and how to improve it.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;That is the pattern I kept seeing: one model executes, another model translates, and the wrapper keeps the loop together. I can write rough intent, Claude turns it into something sharper, Codex executes, and Claude helps me understand what came back.&lt;/p&gt;

&lt;h2&gt;
  
  
  Build yours
&lt;/h2&gt;

&lt;p&gt;Start with a friction before thinking about a platform.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Launch &lt;code&gt;codex app-server&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Talk to it over JSON-RPC.&lt;/li&gt;
&lt;li&gt;Pick one action you repeat every day.&lt;/li&gt;
&lt;li&gt;Turn that action into an interface.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The big version of the AI world is moving toward super-apps: one place where agents, tools, and context mix. But the useful version can start on your machine, with a button that solves something that bothered you yesterday.&lt;/p&gt;

&lt;p&gt;A year ago I wrote that &lt;a href="https://dev.to/cloudx/forget-the-hype-agents-are-loops-1n3i"&gt;agents are just loops&lt;/a&gt;. Here's this year's version: &lt;strong&gt;a super-app is loops, context, and the right model in the right place&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Next time you think "I wish Codex could do this", it probably can. You just have to wrap it.&lt;/p&gt;

&lt;p&gt;Questions? Leave a comment below.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>codex</category>
      <category>agents</category>
      <category>cli</category>
    </item>
    <item>
      <title>One Brain, Many Hands: Building a Parallel Task Orchestrator for AI Agents</title>
      <dc:creator>Pablo Calofatti</dc:creator>
      <pubDate>Wed, 20 May 2026 15:39:00 +0000</pubDate>
      <link>https://dev.to/cloudx/one-brain-many-hands-building-a-parallel-task-orchestrator-for-ai-agents-3nb4</link>
      <guid>https://dev.to/cloudx/one-brain-many-hands-building-a-parallel-task-orchestrator-for-ai-agents-3nb4</guid>
      <description>&lt;h2&gt;
  
  
  The Problem No One Talks About
&lt;/h2&gt;

&lt;p&gt;AI coding assistants are fast. Absurdly fast. They can scaffold a service, write tests, and refactor a module in the time it takes me to finish my coffee.&lt;/p&gt;

&lt;p&gt;But they do it &lt;strong&gt;one thing at a time&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;I was working on a microservices audit logging system. Four tasks, all independent, all well-scoped: implement the event schema, build the ingestion endpoint, add the query API, write the integration tests. I could see the finish line from the start. And yet, I sat there feeding tasks one by one into my Claude Code session like I was hand-loading a washing machine with individual socks.&lt;/p&gt;

&lt;p&gt;Each task took 10-15 minutes of agent time. Four tasks, run sequentially: an hour.&lt;/p&gt;

&lt;p&gt;But they didn't &lt;em&gt;need&lt;/em&gt; to be sequential. They touched different files, different concerns, different parts of the codebase. If I had four developers, I'd assign all four tasks at once and go make lunch.&lt;/p&gt;

&lt;p&gt;So why couldn't I do that with AI agents?&lt;/p&gt;

&lt;h2&gt;
  
  
  The Stripe Spark
&lt;/h2&gt;

&lt;p&gt;Around early March 2026, Stripe published a blog post about their internal system called Minions: one-shot, end-to-end coding agents that handle tasks in parallel. The core insight hit me immediately:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You don't need smarter agents. You need an orchestrator that knows how to keep dumb-ish agents on rails.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Stripe's &lt;strong&gt;blueprint pattern&lt;/strong&gt; was elegant: wrap the &lt;em&gt;agentic&lt;/em&gt; steps (the creative, unpredictable part — writing code, fixing bugs) inside &lt;em&gt;deterministic&lt;/em&gt; guardrails (git operations, linting, testing). The agent can hallucinate all it wants during implementation, but if the tests don't pass, it loops back. If the lint fails, it fixes. And the whole thing is bounded: two attempts max, then stop.&lt;/p&gt;

&lt;p&gt;I wasn't trying to build AGI.&lt;/p&gt;

&lt;p&gt;I was building a &lt;strong&gt;foreman&lt;/strong&gt; for a construction crew.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Architecture: Markdown All the Way Down
&lt;/h2&gt;

&lt;p&gt;Here's where it gets a little unhinged. The entire orchestrator is &lt;strong&gt;pure markdown&lt;/strong&gt;. No TypeScript runtime. No Python glue. No code to maintain, compile, or debug. Just three markdown files that Claude Code loads as instructions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Orchestrator&lt;/strong&gt; (&lt;code&gt;commands/minion.md&lt;/code&gt;): the brain. Parses your task list, figures out dependency order, computes parallel waves, spawns workers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Worker&lt;/strong&gt; (&lt;code&gt;agents/minion-worker.md&lt;/code&gt;): the hands. Each worker gets a task, a git worktree, and a blueprint to follow.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Blueprint&lt;/strong&gt; (&lt;code&gt;skills/minion-blueprint/SKILL.md&lt;/code&gt;): the rails. A step-by-step execution pattern: branch, implement, lint, test, commit, report.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I call this &lt;strong&gt;zero-code prompt engineering&lt;/strong&gt;. The "code" &lt;em&gt;is&lt;/em&gt; the prompt. Claude Code interprets the markdown instructions and executes them using its built-in tools (bash, file editing, git). No SDK, no API calls, no infrastructure.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You (human)
  └── Claude Code session (orchestrator)
        ├── Worker-1 (worktree: .worktrees/task-1)
        ├── Worker-2 (worktree: .worktrees/task-2)
        └── Worker-3 (worktree: .worktrees/task-3)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each worker runs in an &lt;strong&gt;isolated git worktree&lt;/strong&gt;. That's the key. A worktree is like a parallel checkout of your repo: same git history, different working directory. Worker-1 can edit &lt;code&gt;src/api/events.ts&lt;/code&gt; while Worker-2 edits &lt;code&gt;src/api/queries.ts&lt;/code&gt; and they never step on each other's toes.&lt;/p&gt;

&lt;h2&gt;
  
  
  How It Actually Works
&lt;/h2&gt;

&lt;p&gt;You invoke the orchestrator with a task list:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/minion

Tasks:
1. Implement event schema with Zod validation in src/events/schema.ts
2. Build ingestion POST endpoint at /api/events (depends on: 1)
3. Add query GET endpoint at /api/events with filtering (depends on: 1)
4. Write integration tests for both endpoints (depends on: 2, 3)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The orchestrator reads these, builds a &lt;strong&gt;dependency graph&lt;/strong&gt;, and computes &lt;strong&gt;waves&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Wave 1&lt;/strong&gt;: Task 1 (no dependencies)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Wave 2&lt;/strong&gt;: Tasks 2 and 3 (both depend on 1, but not each other, so they run &lt;strong&gt;in parallel&lt;/strong&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Wave 3&lt;/strong&gt;: Task 4 (depends on 2 and 3)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Before spawning anything, it runs &lt;strong&gt;conflict detection&lt;/strong&gt; — scanning the file paths each task is likely to touch. If two tasks in the same wave would edit the same file, it flags the conflict and serializes them.&lt;/p&gt;

&lt;p&gt;Then it spawns workers. Each one gets:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A fresh worktree branched from the current HEAD&lt;/li&gt;
&lt;li&gt;The task description with full context&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;domain agent&lt;/strong&gt; overlay, auto-selected by keywords. "Endpoint" and "API" gets &lt;code&gt;backend-architect&lt;/code&gt;. "React component" gets &lt;code&gt;frontend-architect&lt;/code&gt;. "Dockerfile" gets &lt;code&gt;devops-engineer&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;role overlay&lt;/strong&gt; for the current phase. Planning? You get &lt;code&gt;researcher&lt;/code&gt;. Implementing? &lt;code&gt;tdd-developer&lt;/code&gt;. Reviewing? &lt;code&gt;code-reviewer&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Auto-detected project tools: it reads your &lt;code&gt;package.json&lt;/code&gt; to find your lint command, test runner, and build script.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here's how it looks in practice, running a real 12-task HAL refactor with wave-based parallel execution:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvz04fbhsupnkn7gc5ipa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvz04fbhsupnkn7gc5ipa.png" alt="Minion orchestrator running a 12-task HAL refactor with wave-based parallel execution" width="765" height="1382"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The blueprint each worker follows looks like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create branch from base&lt;/li&gt;
&lt;li&gt;Implement the task (agentic — this is where creativity happens)&lt;/li&gt;
&lt;li&gt;Run lint. If fails, fix (1 attempt)&lt;/li&gt;
&lt;li&gt;Run tests. If fails, fix (1 attempt)&lt;/li&gt;
&lt;li&gt;If still failing after fixes, STOP and report partial progress&lt;/li&gt;
&lt;li&gt;Commit with conventional commit message&lt;/li&gt;
&lt;li&gt;Push branch, create PR&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The &lt;strong&gt;two-iteration maximum&lt;/strong&gt; is sacred. Without it, an agent can loop forever trying to fix a cascading lint error. With it, the worst case is a partially complete PR with a clear status report.&lt;/p&gt;

&lt;p&gt;And here's something I learned: &lt;strong&gt;incomplete runs are still an excellent starting point.&lt;/strong&gt; A PR that's 80% done with a clear "tests fail because X" note is infinitely better than an agent spinning its wheels for 20 minutes.&lt;/p&gt;

&lt;p&gt;When the full workflow runs with plan, implement, and review phases, it looks like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftc8m2cokkro179d4qqcc.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftc8m2cokkro179d4qqcc.jpeg" alt="Minion workflow showing plan, implement, and review phases with 22 tests passing and approved review" width="800" height="938"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The First Real Test (And the First Real Crash)
&lt;/h2&gt;

&lt;p&gt;I tested minion-toolkit on a real project — a microservices tracing system called &lt;code&gt;traza-microservicios&lt;/code&gt;. Two tasks, two parallel workers.&lt;/p&gt;

&lt;p&gt;Worker-2 nailed it. Clean branch, passing tests, PR ready.&lt;/p&gt;

&lt;p&gt;Worker-1... corrupted git.&lt;/p&gt;

&lt;p&gt;Turns out, running concurrent git worktree operations on macOS can trigger a &lt;code&gt;SIGBUS&lt;/code&gt; signal — a low-level memory bus error. The git index file gets corrupted when two processes try to update refs simultaneously. This had nothing to do with prompt engineering or AI. It was a &lt;strong&gt;filesystem concurrency&lt;/strong&gt; problem that human developers rarely hit because they don't create three worktrees in the same second.&lt;/p&gt;

&lt;p&gt;The fix was unglamorous: add a small delay between worktree creation, use &lt;code&gt;--no-optional-locks&lt;/code&gt; on git operations, and as a last resort, clone fresh to &lt;code&gt;/tmp&lt;/code&gt; and work from there.&lt;/p&gt;

&lt;p&gt;This is the kind of lesson you only learn by shipping. No amount of design documents would have predicted "macOS git worktrees have a race condition under parallel agent spawning."&lt;/p&gt;

&lt;h2&gt;
  
  
  Lessons Learned the Hard Way
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Merge conflicts are the #1 issue.&lt;/strong&gt; Not AI hallucinations. Not wrong code. Not failing tests. The most common failure mode is two workers editing the same file. Conflict detection before spawning was the single most impactful feature I added.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Workers create extra files.&lt;/strong&gt; An agent told to "add an endpoint" might also create a utility module, update a barrel export, or add a type file. These out-of-scope changes cause merge conflicts downstream. The solution: &lt;code&gt;git cherry-pick&lt;/code&gt; specific commits instead of merging entire branches.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;gh pr checks&lt;/code&gt; lies to you.&lt;/strong&gt; GitHub's CLI reports check status using a &lt;code&gt;bucket&lt;/code&gt; field, not &lt;code&gt;conclusion&lt;/code&gt;. I spent an embarrassing amount of time wondering why "passing" PRs were being flagged as failed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dry-run mode saves sanity.&lt;/strong&gt; Before spawning 4 parallel workers that each burn tokens, preview what &lt;em&gt;would&lt;/em&gt; happen — which waves, which agents, which files. &lt;code&gt;--dry-run&lt;/code&gt; was an afterthought that became a daily habit.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cost tracking matters.&lt;/strong&gt; Parallel workers multiply your API usage. Knowing that a 4-task run costs ~$2.50 vs. a single sequential run at ~$1.80 helps you decide when parallelism is worth it. (Spoiler: when time matters more than tokens, it almost always is.)&lt;/p&gt;

&lt;h2&gt;
  
  
  The Evolution
&lt;/h2&gt;

&lt;p&gt;What started as a 200-line markdown file on March 4th grew into a proper open-source tool by March 8th:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Version&lt;/th&gt;
&lt;th&gt;What Changed&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1.0&lt;/td&gt;
&lt;td&gt;Basic orchestrator, manual task parsing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2.0&lt;/td&gt;
&lt;td&gt;Cross-phase memory, dry-run, worker health monitoring&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2.1&lt;/td&gt;
&lt;td&gt;Conflict prevention, smart context gathering, cost tracking&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2.2&lt;/td&gt;
&lt;td&gt;MCP optional delegation, CLI installer (&lt;code&gt;npx minion-toolkit install&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2.3&lt;/td&gt;
&lt;td&gt;Workflows, domain agents, role overlays, intent capture, stall detection&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The CLI installer copies the markdown assets into your &lt;code&gt;~/.claude/&lt;/code&gt; directory and configures the plugins. One command, and your Claude Code session becomes a parallel orchestrator:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx minion-toolkit &lt;span class="nb"&gt;install&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After that, &lt;code&gt;/minion&lt;/code&gt; is available in any Claude Code session.&lt;/p&gt;

&lt;h2&gt;
  
  
  Who Is This For?
&lt;/h2&gt;

&lt;p&gt;If you use Claude Code (or plan to) and regularly work on tasks that can be parallelized — feature development across multiple files, multi-service changes, test suites, documentation batches — this tool turns your single-threaded AI session into a coordinated team.&lt;/p&gt;

&lt;p&gt;It's especially useful for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Solo developers&lt;/strong&gt; who want to simulate a small team&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tech leads&lt;/strong&gt; who want to delegate a batch of well-scoped tasks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anyone&lt;/strong&gt; who's tired of feeding tasks one at a time into an AI session&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It's &lt;em&gt;not&lt;/em&gt; for tasks that are deeply intertwined — where every file depends on every other file. The orchestrator can handle dependencies between waves, but within a wave, tasks must be independent. If your tasks can't be parallelized, a single session is still the right tool.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;

&lt;p&gt;The whole project is open source:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/pablocalofatti/minion-toolkit" rel="noopener noreferrer"&gt;pablocalofatti/minion-toolkit&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;npm&lt;/strong&gt;: &lt;code&gt;npx minion-toolkit install&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Star it if the idea resonates. Open an issue if something breaks (something will break — that's the fun part). And if you build something cool with it, I want to hear about it.&lt;/p&gt;

&lt;p&gt;Claude Code is already a remarkably capable developer. But even the best developer can only type on one keyboard. minion-toolkit gives them a crew.&lt;/p&gt;

&lt;p&gt;One brain. Many hands. That's the whole idea.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>opensource</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Your agent follows instructions. Until it doesn't.</title>
      <dc:creator>Daniel Diaz</dc:creator>
      <pubDate>Tue, 19 May 2026 19:08:52 +0000</pubDate>
      <link>https://dev.to/cloudx/your-agent-follows-instructions-until-it-doesnt-11m1</link>
      <guid>https://dev.to/cloudx/your-agent-follows-instructions-until-it-doesnt-11m1</guid>
      <description>&lt;h2&gt;
  
  
  Hooks over flowers
&lt;/h2&gt;

&lt;p&gt;Instructions suggest. Skills guide. Hooks &lt;strong&gt;enforce&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;That's the whole story, but it took me a while to understand why it matters.&lt;/p&gt;

&lt;p&gt;You've set up your agent well. Custom instructions for coding standards. Skills for your team's specific patterns. Copilot knows your stack, follows your conventions, generates code that looks like yours.&lt;/p&gt;

&lt;p&gt;Then one day it decides to run &lt;code&gt;rm -rf dist/&lt;/code&gt; before rebuilding. Or it edits a file and moves on without running the formatter. Or it completes a task without touching the test suite. All technically valid moves. All not what you wanted.&lt;/p&gt;

&lt;p&gt;The problem isn't instructions or skills failing. It's that they &lt;em&gt;guide&lt;/em&gt; the agent. They inform its decisions. But an agent that's informed can still decide differently. Instructions are context. They're not guarantees.&lt;/p&gt;

&lt;p&gt;Hooks are guarantees.&lt;/p&gt;

&lt;h2&gt;
  
  
  What are Agent Hooks?
&lt;/h2&gt;

&lt;p&gt;Agent Hooks are shell commands that VS Code runs at specific lifecycle points during an agent session. Not suggested. Not requested. Run.&lt;/p&gt;

&lt;p&gt;They live in your repo at:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;.github/
└── hooks/
    └── hooks.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The filename can be anything. VS Code loads all &lt;code&gt;*.json&lt;/code&gt; files inside &lt;code&gt;.github/hooks/&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;A hook that auto-formats every file the agent edits looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"PostToolUse"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx prettier --write &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;$TOOL_INPUT_FILE_PATH&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Save the file. VS Code picks it up automatically. The next time the agent edits any file, Prettier runs. No prompt needed, no skill invocation, no agent cooperation required.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;VS Code added agent-scoped hooks in &lt;a href="https://code.visualstudio.com/updates/v1_111" rel="noopener noreferrer"&gt;version 1.111&lt;/a&gt; (March 2026). Workspace hooks via &lt;code&gt;.github/hooks/&lt;/code&gt; were available earlier. Use &lt;code&gt;/create-hook&lt;/code&gt; in chat to scaffold a hook configuration.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Meet the eight
&lt;/h2&gt;

&lt;p&gt;This is where hooks get interesting. VS Code exposes eight points in an agent session where your code can run:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Event&lt;/th&gt;
&lt;th&gt;When it fires&lt;/th&gt;
&lt;th&gt;What you can do&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;SessionStart&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;User submits the first prompt&lt;/td&gt;
&lt;td&gt;Inject context, validate project state&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;UserPromptSubmit&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;User submits any prompt&lt;/td&gt;
&lt;td&gt;Audit requests, add system context&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;PreToolUse&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Before the agent invokes any tool&lt;/td&gt;
&lt;td&gt;Block dangerous operations, require approval&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;PostToolUse&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;After a tool completes&lt;/td&gt;
&lt;td&gt;Run formatters, trigger follow-up actions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;PreCompact&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Before conversation context is compacted&lt;/td&gt;
&lt;td&gt;Export state before it gets truncated&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;SubagentStart&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;A subagent is spawned&lt;/td&gt;
&lt;td&gt;Initialize subagent resources&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;SubagentStop&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;A subagent completes&lt;/td&gt;
&lt;td&gt;Aggregate results, clean up&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Stop&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Agent session ends&lt;/td&gt;
&lt;td&gt;Run tests, generate reports, send notifications&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Most people, when they first hear about hooks, think &lt;code&gt;PostToolUse&lt;/code&gt;: run a formatter. That's valid and immediately useful.&lt;/p&gt;

&lt;p&gt;But look at &lt;code&gt;PreToolUse&lt;/code&gt;. That's where you can &lt;em&gt;stop&lt;/em&gt; the agent before it does something.&lt;/p&gt;

&lt;p&gt;And &lt;code&gt;SessionStart&lt;/code&gt;. That's where you can inject context that the agent doesn't have yet: current branch, environment, version, whether production deploys are frozen.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;PreCompact&lt;/code&gt; is the one nobody talks about. When a long session gets too big for the context window, VS Code compacts it. You can hook into that moment to export state, save important context to a file, or log what's about to be truncated.&lt;/p&gt;

&lt;h2&gt;
  
  
  Stop. Before anything happens
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;PreToolUse&lt;/code&gt; fires before every tool invocation. Your hook receives the tool name and its input via stdin as JSON:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"hookEventName"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"PreToolUse"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"tool_name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"run_in_terminal"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"tool_input"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"rm -rf dist/"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your hook can return one of three &lt;code&gt;permissionDecision&lt;/code&gt; values:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;"allow"&lt;/code&gt;: proceed without asking the user&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;"deny"&lt;/code&gt;: block the operation entirely&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;"ask"&lt;/code&gt;: stop and require explicit user confirmation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A simple shell script that blocks destructive commands:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="nv"&gt;input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;cat&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;command&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$input&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | jq &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s1"&gt;'.tool_input.command // ""'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$command&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-qE&lt;/span&gt; &lt;span class="s1"&gt;'(rm -rf|DROP TABLE|git push --force)'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'{"hookSpecificOutput": {"hookEventName": "PreToolUse", "permissionDecision": "deny", "permissionDecisionReason": "Destructive command blocked by policy"}}'&lt;/span&gt;
  &lt;span class="nb"&gt;exit &lt;/span&gt;0
&lt;span class="k"&gt;fi

&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'{"hookSpecificOutput": {"hookEventName": "PreToolUse", "permissionDecision": "allow"}}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Wire it up:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"PreToolUse"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"./scripts/validate-command.sh"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The agent never sees the operation. It doesn't try to negotiate. The hook returns &lt;code&gt;deny&lt;/code&gt;, the tool call stops.&lt;/p&gt;

&lt;p&gt;There's also a simpler way to block without writing any JSON at all. Every script, when it finishes, returns a number to whoever called it. That's an exit code. &lt;code&gt;0&lt;/code&gt; means success. VS Code specifically watches for exit code &lt;code&gt;2&lt;/code&gt; as a "stop this right now" signal. You just write your reason to stderr and exit:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Production deploys are frozen"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&amp;amp;2
&lt;span class="nb"&gt;exit &lt;/span&gt;2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;VS Code passes that message to the model as context and blocks the tool call. No JSON, no &lt;code&gt;hookSpecificOutput&lt;/code&gt;, no ceremony.&lt;/p&gt;

&lt;p&gt;One more thing worth knowing: &lt;code&gt;continue: false&lt;/code&gt; in the JSON output stops the entire agent session, not just one tool call. Think of exit code &lt;code&gt;2&lt;/code&gt; as hitting the brakes on a single operation. &lt;code&gt;continue: false&lt;/code&gt; is turning off the engine. Use the second one carefully.&lt;/p&gt;

&lt;p&gt;If you define more than one hook for the same event, VS Code runs all of them. When multiple &lt;code&gt;PreToolUse&lt;/code&gt; hooks return conflicting &lt;code&gt;permissionDecision&lt;/code&gt; values, the most restrictive wins: &lt;code&gt;deny&lt;/code&gt; beats &lt;code&gt;ask&lt;/code&gt;, and &lt;code&gt;ask&lt;/code&gt; beats &lt;code&gt;allow&lt;/code&gt;. You can stack a validation hook and a logging hook on the same event without worrying about one canceling the other.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Full protocol reference: &lt;a href="https://code.visualstudio.com/docs/copilot/customization/hooks#_hook-input-and-output" rel="noopener noreferrer"&gt;Hook input and output, VS Code docs&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The morning briefing
&lt;/h2&gt;

&lt;p&gt;Instructions are static. They were written at the time you created the file. But some context only exists at runtime: the current git branch, which environment is active, whether the CI pipeline is currently broken.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;SessionStart&lt;/code&gt; fires before the agent does anything. Your hook can inject that information directly into the session:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="nv"&gt;branch&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;git branch &lt;span class="nt"&gt;--show-current&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;node_version&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;node &lt;span class="nt"&gt;-v&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;env_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;NODE_ENV&lt;/span&gt;&lt;span class="k"&gt;:-&lt;/span&gt;&lt;span class="nv"&gt;development&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;

jq &lt;span class="nt"&gt;-n&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--arg&lt;/span&gt; branch &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$branch&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--arg&lt;/span&gt; node &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$node_version&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--arg&lt;/span&gt; &lt;span class="nb"&gt;env&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$env_name&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="s1"&gt;'{
    hookSpecificOutput: {
      hookEventName: "SessionStart",
      additionalContext: ("Branch: " + $branch + " | Node: " + $node + " | Env: " + $env)
    }
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every session the agent starts, it already knows what branch it's on and what environment it's working in. No need to tell it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Not every hook is for everyone
&lt;/h2&gt;

&lt;p&gt;Workspace hooks apply to every agent interaction. But sometimes you want a hook that only runs for a specific agent.&lt;/p&gt;

&lt;p&gt;Since VS Code 1.111, you can define hooks directly in a custom agent's frontmatter. The &lt;code&gt;hooks&lt;/code&gt; field here uses YAML syntax because it lives inside the &lt;code&gt;.agent.md&lt;/code&gt; frontmatter, but the structure maps directly to what you'd write in a JSON config file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PM&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Explainer"&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Explains&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;to&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;your&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;PM&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;why&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;it's&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;taking&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;longer.&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Also&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;formats&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;code."&lt;/span&gt;
&lt;span class="na"&gt;hooks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;PostToolUse&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;command&lt;/span&gt;
      &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;./scripts/format-and-lint.sh"&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;

You are a code editing agent. After making changes, files are formatted and linted automatically.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Enable &lt;code&gt;chat.useCustomAgentHooks&lt;/code&gt; to use this. The hook only runs when this specific agent is active, either selected by the user or invoked as a subagent.&lt;/p&gt;

&lt;h2&gt;
  
  
  The part everyone skips (don't)
&lt;/h2&gt;

&lt;p&gt;Hooks run shell commands with the same permissions as VS Code. That's the &lt;em&gt;power&lt;/em&gt; and the &lt;em&gt;responsibility&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Before enabling hooks from any source, read the scripts&lt;/em&gt;&lt;/strong&gt;. Hooks receive JSON input from the agent. Validate and sanitize that input before using it. Never hardcode secrets in hook scripts; use environment variables.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;One specific risk&lt;/em&gt;&lt;/strong&gt;: if the agent has access to edit scripts that your hooks run, it can modify those scripts during a session and execute code it writes. Use &lt;a href="https://code.visualstudio.com/docs/copilot/customization/hooks#_safety" rel="noopener noreferrer"&gt;&lt;code&gt;chat.tools.edits.autoApprove&lt;/code&gt;&lt;/a&gt; to prevent the agent from editing hook scripts without manual approval.&lt;/p&gt;

&lt;h2&gt;
  
  
  When your hook does nothing
&lt;/h2&gt;

&lt;p&gt;If a hook runs but nothing seems to happen, check the &lt;strong&gt;GitHub Copilot Chat Hooks&lt;/strong&gt; output channel. Open the Output panel in VS Code, select that channel from the list, and you'll see which hooks were loaded, which fired, their exit codes, and any stdout or stderr output.&lt;/p&gt;

&lt;p&gt;Hook scripts that exit with code &lt;code&gt;1&lt;/code&gt; produce a non-blocking warning: the agent continues, but the issue shows up in the channel. Exit code &lt;code&gt;2&lt;/code&gt; is the one that blocks and surfaces the message to the model. If your hook is supposed to stop something and it isn't, check which exit code your script is returning.&lt;/p&gt;

&lt;h2&gt;
  
  
  Already have Claude hooks? You're closer than you think
&lt;/h2&gt;

&lt;p&gt;If you're coming from Claude Code, VS Code reads hook configurations from &lt;code&gt;.claude/settings.json&lt;/code&gt; and &lt;code&gt;.claude/settings.local.json&lt;/code&gt; by default. The format is the same, with two differences to watch for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Tool input property names&lt;/strong&gt;: Claude Code uses &lt;code&gt;snake_case&lt;/code&gt; (e.g. &lt;code&gt;tool_input.file_path&lt;/code&gt;), VS Code uses &lt;code&gt;camelCase&lt;/code&gt; (e.g. &lt;code&gt;tool_input.filePath&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool names&lt;/strong&gt;: different. Claude uses &lt;code&gt;Write&lt;/code&gt;/&lt;code&gt;Edit&lt;/code&gt;, VS Code uses &lt;code&gt;create_file&lt;/code&gt;/&lt;code&gt;replace_string_in_file&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Where Hooks Fit
&lt;/h2&gt;

&lt;p&gt;Hooks are the third piece of a larger customization system:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Instructions&lt;/strong&gt; (&lt;code&gt;copilot-instructions.md&lt;/code&gt; / &lt;code&gt;.github/instructions/&lt;/code&gt;): always-on context. Good for project-wide rules.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Skills&lt;/strong&gt; (&lt;code&gt;.github/skills/&lt;/code&gt;): domain-specific, activated on demand. Good for encoding your conventions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hooks&lt;/strong&gt; &lt;em&gt;← you're here&lt;/em&gt;: &lt;strong&gt;&lt;em&gt;deterministic&lt;/em&gt;&lt;/strong&gt; automation at lifecycle points. Good for guarantees.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MCP Servers&lt;/strong&gt;: external tools: database access, browser automation, external APIs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Plugins&lt;/strong&gt;: installable bundles that package all of the above together.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft7vmm4dvc6s848eic0vu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft7vmm4dvc6s848eic0vu.png" alt="VS Code Agent Customization Stack" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Instructions tell the agent &lt;strong&gt;&lt;em&gt;how&lt;/em&gt;&lt;/strong&gt; to work. Skills tell the agent &lt;strong&gt;&lt;em&gt;what patterns to follow&lt;/em&gt;&lt;/strong&gt;. Hooks make sure certain things happen regardless of what the agent decides.&lt;/p&gt;

&lt;p&gt;They're not interchangeable. They're complementary.&lt;/p&gt;

&lt;h2&gt;
  
  
  Takeaways
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The difference between guidance and guarantee matters.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Instructions and skills are &lt;strong&gt;&lt;em&gt;probabilistic&lt;/em&gt;&lt;/strong&gt;. They shift the agent's behavior in the right direction. Hooks are &lt;strong&gt;&lt;em&gt;deterministic&lt;/em&gt;&lt;/strong&gt;. Your code runs, period. Both have their place. Knowing which to reach for is the skill.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Start with one hook.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;One &lt;code&gt;PostToolUse&lt;/code&gt; Prettier hook that auto-formats on every file edit. Ship it, live with it for a week, see what else you want to automate. The scope expands naturally.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This is the second post in a series on VS Code agent customizations: Skills, Hooks, MCP Servers, CLI, and Plugins.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Previous post: &lt;a href="https://dev.to/cloudx/stop-getting-generic-output-from-copilot-teach-it-your-patterns-1358"&gt;I stopped fighting Copilot's conventions. I taught it mine instead.&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Official docs: &lt;a href="https://code.visualstudio.com/docs/copilot/customization/hooks" rel="noopener noreferrer"&gt;Agent Hooks (VS Code)&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>vscode</category>
      <category>githubcopilot</category>
      <category>agentskills</category>
      <category>ai</category>
    </item>
    <item>
      <title>Stop getting generic output from Copilot. Teach it your patterns.</title>
      <dc:creator>Daniel Diaz</dc:creator>
      <pubDate>Thu, 23 Apr 2026 13:11:44 +0000</pubDate>
      <link>https://dev.to/cloudx/stop-getting-generic-output-from-copilot-teach-it-your-patterns-1358</link>
      <guid>https://dev.to/cloudx/stop-getting-generic-output-from-copilot-teach-it-your-patterns-1358</guid>
      <description>&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;You use Copilot. You ask it to build something, and it does sort of. It follows your prompt, generates working code, and you ship it.&lt;/p&gt;

&lt;p&gt;Then you do it again the next day. And the day after.&lt;/p&gt;

&lt;p&gt;A month later, your codebase has class names in PascalCase next to camelCase functions, three different error handling styles, two ways to structure the same kind of module, and hooks that work differently from each other for no clear reason.&lt;/p&gt;

&lt;p&gt;All of it generated by Copilot. All of it reviewed and accepted by you.&lt;/p&gt;

&lt;p&gt;The problem isn't that Copilot generates bad code. It generates &lt;em&gt;generic&lt;/em&gt; code. It doesn't know your stack, your team's decisions, or the patterns you settled on six months ago. It works from its training data. And your codebase slowly starts to look like it was written by the internet.&lt;/p&gt;




&lt;h2&gt;
  
  
  What are Agent Skills?
&lt;/h2&gt;

&lt;p&gt;Agent Skills are Markdown files that tell Copilot what &lt;em&gt;your&lt;/em&gt; conventions are, for a specific domain, task, or workflow. Persistent context the agent reads before generating anything in that area.&lt;/p&gt;

&lt;p&gt;They live in your repo at:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;.github/
└── skills/
    └── your-skill-name/
        └── SKILL.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's the entire setup. No config files, no registration, no CLI step. The file being there is enough for VS Code to discover it.&lt;/p&gt;

&lt;p&gt;When you mention the skill in a chat prompt like &lt;code&gt;"Use the component-structure skill to create a new button"&lt;/code&gt;, Copilot reads that SKILL.md first, then generates code that follows your conventions.&lt;/p&gt;

&lt;p&gt;This is what separates skills from regular prompts: &lt;strong&gt;they're reusable, versioned, and shared through the repo.&lt;/strong&gt; Your conventions stop living in someone's head or a doc nobody reads; they live next to the code they govern.&lt;/p&gt;

&lt;p&gt;With a well-written &lt;code&gt;description&lt;/code&gt;, Copilot can automatically load the relevant skill based on your request without explicit mention. Explicitly referencing skills is more common with smaller models that need extra guidance to identify the right context.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;VS Code added native skill support and the &lt;code&gt;/create-skill&lt;/code&gt; command in &lt;a href="https://code.visualstudio.com/updates/v1_110" rel="noopener noreferrer"&gt;version 1.110&lt;/a&gt;. In &lt;a href="https://code.visualstudio.com/updates/v1_113" rel="noopener noreferrer"&gt;version 1.113&lt;/a&gt;, a dedicated Chat Customizations editor was added, providing a centralized UI to manage skills, instructions, and agents from a single place.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  A minimal SKILL.md
&lt;/h2&gt;

&lt;p&gt;Here's the minimum a skill needs to actually work:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;.github/skills/component-structure/SKILL.md&lt;/code&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Frontmatter:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;component-structure&lt;/span&gt;
&lt;span class="na"&gt;description: Guidelines for creating React components following team conventions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;named exports, props interface, no inline JSX logic.&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Body:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Component Structure Skill&lt;/span&gt;

&lt;span class="gu"&gt;## When to use&lt;/span&gt;
Creating new React components or reviewing component patterns.

&lt;span class="gu"&gt;## Conventions&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; One component per file
&lt;span class="p"&gt;-&lt;/span&gt; Props interface defined above the component
&lt;span class="p"&gt;-&lt;/span&gt; Named exports only (no default exports)
&lt;span class="p"&gt;-&lt;/span&gt; No inline logic in JSX, extract to handlers or custom hooks
&lt;span class="p"&gt;-&lt;/span&gt; Error boundaries at page level, not inside components

&lt;span class="gu"&gt;## Never do this&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Default exports
&lt;span class="p"&gt;-&lt;/span&gt; Logic directly in JSX
&lt;span class="p"&gt;-&lt;/span&gt; Props without an explicit interface
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pattern example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;ButtonProps&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;label&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;onClick&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;void&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;disabled&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="nx"&gt;boolean&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;Button&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;label&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;onClick&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;disabled&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt; &lt;span class="p"&gt;}:&lt;/span&gt; &lt;span class="nx"&gt;ButtonProps&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;handleClick&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;disabled&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nf"&gt;onClick&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;button&lt;/span&gt; &lt;span class="nx"&gt;onClick&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;handleClick&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="nx"&gt;disabled&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;disabled&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;label&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/button&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;  &lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;name&lt;/code&gt; field must match the directory name exactly. The &lt;code&gt;description&lt;/code&gt; is what Copilot reads to decide whether to load the skill, so be specific. Full frontmatter reference: &lt;a href="https://code.visualstudio.com/docs/copilot/customization/agent-skills#_skillmd-file-format" rel="noopener noreferrer"&gt;SKILL.md file format&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Four sections. One pattern. A few explicit don'ts.&lt;/p&gt;

&lt;p&gt;When Copilot reads this before generating a component, it follows the same structure every time not because it guessed right, but because you told it.&lt;/p&gt;

&lt;p&gt;The skill doesn't need to be exhaustive. &lt;strong&gt;It needs to be specific.&lt;/strong&gt; A skill that covers one thing well beats one that tries to cover everything and ends up covering nothing.&lt;/p&gt;




&lt;h2&gt;
  
  
  Creating a skill from a conversation
&lt;/h2&gt;

&lt;p&gt;One of the more useful additions in VS Code 1.110 is &lt;code&gt;/create-skill&lt;/code&gt;. After debugging a problem across several chat turns and landing on the right approach, you type:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/create-skill
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The agent extracts the pattern from your conversation and scaffolds a SKILL.md for you. You review, adjust, commit.&lt;/p&gt;

&lt;p&gt;This is how skills actually get created in practice. &lt;strong&gt;You don't design a skill from scratch.&lt;/strong&gt; You solve a problem, recognize you'll solve it the same way every time, and capture that. &lt;code&gt;/create-skill&lt;/code&gt; removes the friction of doing the capture manually.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Ecosystem
&lt;/h2&gt;

&lt;p&gt;Skills are one piece of a larger customization system. Here's a quick map:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb2pg9dlm1xy0vre71373.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb2pg9dlm1xy0vre71373.png" alt="How a general agent loads skills to become specialized" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Instructions&lt;/strong&gt; (&lt;code&gt;copilot-instructions.md&lt;/code&gt; / &lt;code&gt;.github/instructions/&lt;/code&gt;): always-on context loaded in every session. Good for project-wide rules.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Skills&lt;/strong&gt; ← you're here: domain-specific, activated on demand when mentioned in chat.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hooks&lt;/strong&gt;: scripts that run at specific points in the agent lifecycle: before a response, after a file write, on session start.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MCP Servers&lt;/strong&gt;: external tools that give the agent capabilities your codebase doesn't have natively, like database access, browser automation, or external APIs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Plugins&lt;/strong&gt;: installable bundles that package all of the above together.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each of these has its own post coming. Skills are the easiest starting point entirely local to your repo, zero infrastructure, and the effect on the agent is immediate.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I Learned
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The model doesn't know your stack. You have to teach it.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Copilot is trained on public code. Your team's specific decisions aren't in that data. Skills are the mechanism for closing that gap.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;One good skill beats ten prompts.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A well-written skill invoked consistently produces better results than re-explaining your conventions in each prompt. The repeatability is the point.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The skill itself is documentation.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Writing a SKILL.md forces you to articulate things that previously existed informally. Once it's committed, the team has a shared reference, not just for Copilot.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The real value shows up at review time.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When everyone uses the same skill, code review stops being about style. You argue about logic instead. That's the conversation worth having.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This is the first post in a series on VS Code agent customizations: Skills, MCP Servers, CLI, Hooks, and Plugins.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Official docs: &lt;a href="https://code.visualstudio.com/docs/copilot/customization/agent-skills" rel="noopener noreferrer"&gt;Agent Skills VS Code&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>vscode</category>
      <category>githubcopilot</category>
      <category>agentskills</category>
      <category>ai</category>
    </item>
    <item>
      <title>What Backend Engineers Get Wrong About AI Integration</title>
      <dc:creator>Juan M. Altamirano</dc:creator>
      <pubDate>Mon, 13 Apr 2026 17:26:14 +0000</pubDate>
      <link>https://dev.to/cloudx/what-backend-engineers-get-wrong-about-ai-integration-2g1j</link>
      <guid>https://dev.to/cloudx/what-backend-engineers-get-wrong-about-ai-integration-2g1j</guid>
      <description>&lt;p&gt;So you've been asked to "add AI" to your project. Maybe it's a chatbot, maybe it's a smart search, maybe it's just your PM watching too many YouTube videos about agents. Either way, here you are.&lt;/p&gt;

&lt;p&gt;And look, as backend engineers, we're actually well-positioned for this. We know how to design APIs, handle async flows, think about failure modes. The problem is that LLMs don't behave like anything we've integrated before, and our instincts sometimes work against us.&lt;/p&gt;

&lt;p&gt;Here, I will try to cover eight common mistakes that I've seen (and made).&lt;/p&gt;




&lt;h2&gt;
  
  
  1. Treating the LLM like a deterministic function
&lt;/h2&gt;

&lt;p&gt;This is the big one.&lt;/p&gt;

&lt;p&gt;We're used to thinking in terms of: same input → same output. Call a database, get a row. Call an API, get a JSON response. Write a unit test, pin the behavior forever.&lt;/p&gt;

&lt;p&gt;LLMs don't work like that. Under the hood, they're predicting the next most probable token based on everything that came before, which means there's inherent randomness added into every response. Same prompt, different temperature, different day, and you might get a slightly different answer. Or a wildly different one.&lt;/p&gt;

&lt;p&gt;Think of it less like calling a function and more like asking a very smart contractor to do some work for you. They'll usually get it right, but you need to review the output, define what "right" looks like, and have a plan for when they hand you something unexpected.&lt;/p&gt;

&lt;p&gt;The practical consequence: &lt;strong&gt;don't build flows that assume the model always returns what you expect&lt;/strong&gt;. Validate the output. Have fallbacks. Don't let a malformed response crash your service.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Not handling retries and timeouts
&lt;/h2&gt;

&lt;p&gt;If you call an external REST API and it times out, you probably have retry logic. Maybe exponential backoff. Maybe a circuit breaker.&lt;/p&gt;

&lt;p&gt;For some reason, when we start calling LLM APIs, all of that discipline disappears.&lt;/p&gt;

&lt;p&gt;LLM calls are &lt;em&gt;slow&lt;/em&gt;. We're talking in the order of seconds depending on the model and the response length. And they fail. Rate limits, provider outages, network issues, it all happens.&lt;/p&gt;

&lt;p&gt;Set a timeout. Retry with backoff. If you're building something user-facing, stream the response so it doesn't feel like the app is frozen. Treat the LLM provider like what it is: an external dependency that can and will go down.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. Ignoring token costs in loops
&lt;/h2&gt;

&lt;p&gt;Token costs have a way of sneaking up on you.&lt;/p&gt;

&lt;p&gt;A common pattern that seems harmless: you have a list of items, you loop over them, you call the LLM for each one. 50 items? 50 calls. Each one dragging along the same massive system prompt like dead weight. The model isn't doing 50x the thinking. You're just paying 50x for the setup.&lt;/p&gt;

&lt;p&gt;It's like printing the entire company handbook before every meeting just to discuss one item.&lt;/p&gt;

&lt;p&gt;Some things to think about:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Can you batch items into a single prompt? Sometimes yes, sometimes the quality drops.&lt;/li&gt;
&lt;li&gt;Is the system prompt really 2000 tokens? Can it be 400?&lt;/li&gt;
&lt;li&gt;Are you sending context the model doesn't need for that specific call?&lt;/li&gt;
&lt;li&gt;Could you cache responses for identical or near-identical inputs?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Token costs are easy to ignore in development (small dataset, free tier) and painful to discover in production.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. Breaking prompt caching without realizing it
&lt;/h2&gt;

&lt;p&gt;This one is especially relevant if you're building agents.&lt;/p&gt;

&lt;p&gt;Most providers cache prompts automatically and give you significant discounts when a request matches a previously seen prefix. In an agent loop where you're continuously resending the system prompt, full conversation history and tool calls, that cache is what keeps your bill reasonable.&lt;/p&gt;

&lt;p&gt;The catch: think of it like a prefix match. The moment something differs from the previous call, the cache stops there and you pay full price from that point on.&lt;/p&gt;

&lt;p&gt;A classic example is injecting the current timestamp into your system prompt. Seems harmless, but since it changes on every call, nothing after it ever gets cached, and you're paying full price every time.&lt;/p&gt;

&lt;p&gt;Treat everything before the dynamic part of your prompt as sacred. Same order, same content, same whitespace. Push dynamic context as far down the message history as possible, after the static parts that can be cached.&lt;/p&gt;

&lt;p&gt;This leads to decisions that look counterintuitive: sometimes you deliberately send more tokens just to keep the cacheable prefix intact. A 2000-token cached call is cheaper than a 1500-token cache miss. Once you internalize that, it changes how you structure your payloads entirely.&lt;/p&gt;




&lt;h2&gt;
  
  
  5. Treating prompt engineering as someone else's problem
&lt;/h2&gt;

&lt;p&gt;"The AI team handles the prompts."&lt;/p&gt;

&lt;p&gt;If you're building the integration, you need to understand how prompts work. Not at a research level, but enough to know why your endpoint is returning garbage.&lt;/p&gt;

&lt;p&gt;The prompt is part of your application logic. It deserves the same attention as your SQL queries or your request validation. A bad prompt will produce bad output no matter how clean your surrounding code is.&lt;/p&gt;

&lt;p&gt;At minimum, know the difference between system and user messages, understand what temperature does to your outputs, and learn when to use structured outputs, instead of trying to parse free-form text. Which brings me to...&lt;/p&gt;




&lt;h2&gt;
  
  
  6. Trusting free-form text when you don’t have to
&lt;/h2&gt;

&lt;p&gt;I've seen code like this in the wild:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;complete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Analyze this and return: risk level, explanation, recommendation&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;lines&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;risk&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Please don't do this.&lt;/p&gt;

&lt;p&gt;Modern LLM APIs support structured outputs — you define a schema, the model returns valid JSON that matches it. Every time. No parsing, no &lt;code&gt;IndexError&lt;/code&gt; because the model decided to add an extra line.&lt;/p&gt;

&lt;p&gt;In Python, pair it with Pydantic. In Typescript, use Zod for validation and parsing. In Go, define a struct and unmarshal directly. Your future self will thank you.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AnalysisResult&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;risk_level&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Literal&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;HIGH&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;MEDIUM&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;LOW&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;NONE&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;explanation&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;recommendation&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;

&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;complete_structured&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;schema&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;AnalysisResult&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# result.risk_level is always there. Always a string. Always one of the four values.
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  7. Putting everything in the context window and calling it RAG
&lt;/h2&gt;

&lt;p&gt;RAG (Retrieval Augmented Generation) is genuinely useful. The idea is simple: instead of stuffing all your knowledge into the prompt, you retrieve only the relevant bits and include those.&lt;/p&gt;

&lt;p&gt;But a lot of implementations I've seen are just... dumping documents into the prompt and calling it RAG.&lt;/p&gt;

&lt;p&gt;Real RAG has a retrieval step. You embed the user's query, find the most semantically similar chunks from your knowledge base, and only send those. If you're sending 50 pages of documentation on every call, you're not doing RAG, you're doing expensive copy-paste.&lt;/p&gt;

&lt;p&gt;The retrieval quality matters as much as the generation. Bad retrieval means irrelevant context, and irrelevant context means bad answers (no matter how good your model is).&lt;/p&gt;




&lt;h2&gt;
  
  
  8. Not thinking about what happens when the model is wrong
&lt;/h2&gt;

&lt;p&gt;LLMs hallucinate. They confidently state things that are false. This isn't a bug that will get patched, it's just how these models work.&lt;/p&gt;

&lt;p&gt;So the question isn't "what if the model is wrong?"... it's "what happens in my system when the model is wrong?"&lt;/p&gt;

&lt;p&gt;If you're using an LLM to triage support tickets, a wrong answer means someone spends extra time investigating. Annoying, not catastrophic. But if you're using it to generate financial reports or medical summaries, wrong answers have a very different impact.&lt;/p&gt;

&lt;p&gt;Design your system with the assumption that the model will sometimes be wrong. Add human review steps where it matters. Log the inputs and outputs so you can audit. Don't pipe raw LLM output into anything irreversible.&lt;/p&gt;




&lt;h2&gt;
  
  
  Wrapping up
&lt;/h2&gt;

&lt;p&gt;None of this means LLMs aren't useful, they absolutely are. But they're a new kind of dependency with failure modes we're not used to. The good news is that most of the solid engineering practices we already have (validation, retries, observability, defensive design) apply here too. We just need to remember to actually use them.&lt;/p&gt;

&lt;p&gt;One last thing: this space moves fast. What's a workaround today might be a built-in feature tomorrow. If you're already the kind of engineer who keeps up with tech, just make sure AI topics are in the mix. It's worth it.&lt;/p&gt;

&lt;p&gt;If you've run into other footguns that aren't on this list, drop them in the comments. I'm curious what patterns other backend engineers have hit.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>backend</category>
      <category>engineering</category>
      <category>programming</category>
    </item>
    <item>
      <title>From 'The Bench' to 'Ready to Ship': How AI Redefined My Learning Curve</title>
      <dc:creator>Joaquin Islas</dc:creator>
      <pubDate>Tue, 07 Apr 2026 19:46:52 +0000</pubDate>
      <link>https://dev.to/cloudx/from-the-bench-to-ready-to-ship-how-ai-redefined-my-learning-curve-47fo</link>
      <guid>https://dev.to/cloudx/from-the-bench-to-ready-to-ship-how-ai-redefined-my-learning-curve-47fo</guid>
      <description>&lt;h2&gt;
  
  
  The Hook: The Bench Moment
&lt;/h2&gt;

&lt;p&gt;We’ve all been there: you’re on the bench, and the pressure is mounting. Two potential assignments land on your desk, but there’s a catch—they are built on tech stacks you’ve only touched on the surface. The traditional anxiety sets in: &lt;em&gt;How fast can I get up to speed without slowing the team down?&lt;/em&gt; In the past, that moment felt like staring at a mountain you had to climb alone. But recently, I realized the climb has changed.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Old Way vs. The New Way
&lt;/h2&gt;

&lt;p&gt;Before AI, the process was linear and often isolating. You’d spend days digging through dense documentation and watching generic tutorials. Now, the dynamic is flipped. It’s no longer about consuming static content; it’s about &lt;strong&gt;on-demand, interactive tutoring&lt;/strong&gt;.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Traditional Training&lt;/th&gt;
&lt;th&gt;AI-Powered Learning&lt;/th&gt;
&lt;th&gt;Impact&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Learning Curve&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Weeks of passive courses.&lt;/td&gt;
&lt;td&gt;
&lt;em&gt;Just-in-Time&lt;/em&gt; learning.&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;40% reduction&lt;/strong&gt; in non-productive time.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Problem Solving&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Waiting for a mentor.&lt;/td&gt;
&lt;td&gt;Instant 24/7 tutoring.&lt;/td&gt;
&lt;td&gt;Saves hours of &lt;strong&gt;Senior&lt;/strong&gt; developers' time.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Contextualization&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Generic examples.&lt;/td&gt;
&lt;td&gt;Adapts to company standards.&lt;/td&gt;
&lt;td&gt;Lower initial error rate.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Feedback Quality&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Errors found in PR reviews.&lt;/td&gt;
&lt;td&gt;Real-time logic suggestions.&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Less rework&lt;/strong&gt; and debt.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Real-World Case: Bridging Java and Go
&lt;/h2&gt;

&lt;p&gt;To illustrate this, I recently had to pivot from my Java background into a Go project. Instead of starting from zero, I used AI as a "paradigm translator."&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;💡 &lt;strong&gt;The Strategy:&lt;/strong&gt; I asked the AI: &lt;em&gt;"I'm used to Java's Spring Boot; how do I achieve this same pattern idiomatically in Go?"&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here is a concrete example of how I mapped a common Java pattern (Service/Repository) to idiomatic Go in minutes, thanks to the AI's guidance:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// What I knew: Java Implementation (Spring-like)&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;UserService&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;UserRepository&lt;/span&gt; &lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;// Standard constructor injection&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;UserService&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;UserRepository&lt;/span&gt; &lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;repo&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;User&lt;/span&gt; &lt;span class="nf"&gt;getUser&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// Linear stream-like flow&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findById&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;orElseThrow&lt;/span&gt;&lt;span class="o"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;UserNotFoundException&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// What I learned (Idiomatic Go): Guided by AI&lt;/span&gt;
&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="n"&gt;service&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="s"&gt;"errors"&lt;/span&gt;

&lt;span class="c"&gt;// AI suggested defining an interface here for the dependency, &lt;/span&gt;
&lt;span class="c"&gt;// emphasizing Go's implicit interface satisfaction.&lt;/span&gt;
&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;UserRepository&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;FindByID&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;User&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;UserService&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;repo&lt;/span&gt; &lt;span class="n"&gt;UserRepository&lt;/span&gt; &lt;span class="c"&gt;// AI explained composition over inheritance&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// AI helped me write the constructor function&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;NewUserService&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="n"&gt;UserRepository&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;UserService&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;UserService&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;UserService&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;GetUser&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;User&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FindByID&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="c"&gt;// AI emphasized Go's explicit error handling pattern&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"user not found"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;✅ &lt;strong&gt;The Result:&lt;/strong&gt; I quickly understood the shift from Object-Oriented inheritance to Go's composition and interfaces. This didn't just teach me syntax; it gave me the &lt;strong&gt;confidence&lt;/strong&gt; to show up as a contributor, not a beginner, from week one.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Metrics: Why This Matters for the Company
&lt;/h2&gt;

&lt;p&gt;The shift isn't just a feeling; the data supports it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🚀 &lt;strong&gt;Productivity Boost:&lt;/strong&gt; According to &lt;a href="https://github.blog/2022-09-07-research-quantifying-github-copilots-impact-on-developer-productivity-and-happiness/" rel="noopener noreferrer"&gt;GitHub Research&lt;/a&gt;, developers using AI complete tasks up to &lt;strong&gt;55% faster&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;⚖️ &lt;strong&gt;The Leveling Effect:&lt;/strong&gt; A study by &lt;a href="https://www.nber.org/papers/w31161" rel="noopener noreferrer"&gt;MIT and Stanford&lt;/a&gt; suggests that AI helps developers close the skills gap &lt;strong&gt;43% faster&lt;/strong&gt;, acting as a "great leveler" for the whole team.&lt;/li&gt;
&lt;li&gt;📈 &lt;strong&gt;Continuous Learning:&lt;/strong&gt; As highlighted by the &lt;a href="https://hbr.org/2023/10/how-generative-ai-is-changing-the-way-we-learn" rel="noopener noreferrer"&gt;Harvard Business Review&lt;/a&gt;, AI-powered training can reduce time-to-productivity by &lt;strong&gt;40%&lt;/strong&gt; compared to traditional, passive methods.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  ⚠ The Limits: Being Honest
&lt;/h2&gt;

&lt;p&gt;However, let’s be clear: AI is not a silver bullet. It can teach you syntax and explain complex concepts, but it cannot replace &lt;strong&gt;human judgment&lt;/strong&gt;. It doesn't know the specific business context or the long-term architectural risks of our project. AI provides the tools, but &lt;strong&gt;you&lt;/strong&gt; provide the engineering intuition. Experience still matters; AI just lets you acquire it faster.&lt;/p&gt;

&lt;h2&gt;
  
  
  🛠️ A Practical Framework
&lt;/h2&gt;

&lt;p&gt;If you're looking to replicate this, here is the routine I used:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🌉 &lt;strong&gt;Bridge the Gap:&lt;/strong&gt; Always start by asking the AI to compare the new technology to a stack you already master.&lt;/li&gt;
&lt;li&gt;🔀 &lt;strong&gt;The "English + AI" Combo:&lt;/strong&gt; Use AI to simulate technical interviews or draft documentation in English while you learn the tech—it's a double win for professional growth.&lt;/li&gt;
&lt;li&gt;💪 &lt;strong&gt;Contribution as Practice:&lt;/strong&gt; Don't just watch videos. Take a small internal spike or task and use your AI assistant to guide your implementation.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  🎯 The Closing Thought
&lt;/h2&gt;

&lt;p&gt;In 2026, "being prepared" no longer means knowing everything by heart. It means being agile enough to adapt. AI has changed the definition of a Senior Engineer: it’s less about having all the answers and more about knowing how to leverage tools to bridge the gap between "I don't know this" and "I’m ready to ship."&lt;/p&gt;




&lt;h2&gt;
  
  
  References &amp;amp; Sources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;EdTech Global (2025):&lt;/strong&gt; AI in Corporate Education: Efficiency and ROI Statistics.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GitHub / Microsoft (2023):&lt;/strong&gt; &lt;a href="https://github.blog/2022-09-07-research-quantifying-github-copilots-impact-on-developer-productivity-and-happiness/" rel="noopener noreferrer"&gt;The impact of AI on developer productivity&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MIT &amp;amp; Stanford (NBER, 2024):&lt;/strong&gt; &lt;a href="https://www.nber.org/papers/w31161" rel="noopener noreferrer"&gt;Generative AI at Work and the Leveling Effect&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Harvard Business Review (2024):&lt;/strong&gt; &lt;a href="https://hbr.org/2023/10/how-generative-ai-is-changing-the-way-we-learn" rel="noopener noreferrer"&gt;How Generative AI Is Changing the Way We Learn&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>career</category>
      <category>productivity</category>
      <category>go</category>
    </item>
    <item>
      <title>Stop Wasting Time on CVEs That Don't Affect You</title>
      <dc:creator>Juan M. Altamirano</dc:creator>
      <pubDate>Mon, 16 Mar 2026 18:47:53 +0000</pubDate>
      <link>https://dev.to/cloudx/stop-wasting-time-on-cves-that-dont-affect-you-1ln9</link>
      <guid>https://dev.to/cloudx/stop-wasting-time-on-cves-that-dont-affect-you-1ln9</guid>
      <description>&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;Aren't you tired of pushing new code and then a few days later receiving an alert from Github's Dependabot? Well, I am.&lt;br&gt;
The most annoying part is looking for the CVE, reviewing your code and then detecting that you aren't using the affected part.&lt;br&gt;
Rinse and repeat for every single alert.&lt;/p&gt;


&lt;h2&gt;
  
  
  The solution?
&lt;/h2&gt;

&lt;p&gt;That's why I built dep_shield — a CLI that I can plug into my common workflow (lint -&amp;gt; dep_shield -&amp;gt; tests -&amp;gt; sonar) and get a straight answer: "this CVE affects you" or "relax, you're fine."&lt;/p&gt;


&lt;h2&gt;
  
  
  How dep_shield Works
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2Fcloudx-labs%2Fposts%2Fmain%2Fposts%2Fjuanmaalt%2Fassets%2Fdep_shield_flow.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2Fcloudx-labs%2Fposts%2Fmain%2Fposts%2Fjuanmaalt%2Fassets%2Fdep_shield_flow.png" alt="dep_shield analysis flow" width="800" height="2071"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The flow is straightforward:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Parse dependencies&lt;/strong&gt; — Read &lt;code&gt;requirements.txt&lt;/code&gt; or &lt;code&gt;pyproject.toml&lt;/code&gt;, extract packages and versions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Check for CVEs&lt;/strong&gt; — Query the OSV database for known vulnerabilities
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Find usage in code&lt;/strong&gt; — Scan your Python files to see where you import vulnerable packages&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI-powered analysis&lt;/strong&gt; — Send the CVE description + your import context to an LLM and ask: "Does this actually affect me?"&lt;/li&gt;
&lt;/ol&gt;


&lt;h2&gt;
  
  
  The Interesting Parts
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Parsing Dependencies (Both Formats)
&lt;/h3&gt;

&lt;p&gt;The tool supports &lt;code&gt;requirements.txt&lt;/code&gt; and &lt;code&gt;pyproject.toml&lt;/code&gt; — it figures out which one you're using.&lt;br&gt;
Why both? Well, &lt;code&gt;requirements.txt&lt;/code&gt; is still widely used, but &lt;code&gt;uv&lt;/code&gt; is taking over fast.&lt;/p&gt;
&lt;h3&gt;
  
  
  Querying OSV (the free vulnerability database)
&lt;/h3&gt;

&lt;p&gt;OSV over NVD or Snyk? No API key, no rate limits, no pricing tiers. NVD is slow and wants you to parse CPE identifiers. Snyk needs auth and has usage limits.&lt;br&gt;
OSV just works — POST a package name, get vulnerabilities back. For a CLI tool meant to run locally and fast, that's exactly what I needed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;OSV_API_URL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.osv.dev/v1/query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;query_vulnerabilities&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;package_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;version&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Vulnerability&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;package&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;package_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ecosystem&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PyPI&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;version&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;version&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;version&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;OSV_API_URL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;10.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;parse_vulnerabilities&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"vulns"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"GHSA-cpwx-vrp4-4pq7"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"summary"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Jinja2 vulnerable to sandbox breakout through attr filter selecting format method"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"details"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"An oversight in how the Jinja sandboxed environment interacts with the `|attr` filter allows an attacker that controls the content of a template to execute arbitrary Python code.&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;To exploit the vulnerability, an attacker needs to control the content of a template. Whether that is the case depends on the type of application using Jinja. This vulnerability impacts users of applications which execute untrusted templates.&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;Jinja's sandbox does catch calls to `str.format` and ensures they don't escape the sandbox. However, it's possible to use the `|attr` filter to get a reference to a string's plain format method, bypassing the sandbox. After the fix, the `|attr` filter no longer bypasses the environment's attribute lookup."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"aliases"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"CVE-2025-27516"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"modified"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-02-04T04:14:58.595738Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"published"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2025-03-05T20:40:14Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"affected"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
                &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
                    &lt;/span&gt;&lt;span class="nl"&gt;"package"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"jinja2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"ecosystem"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"PyPI"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"purl"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pkg:pypi/jinja2"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
                    &lt;/span&gt;&lt;span class="nl"&gt;"ranges"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ECOSYSTEM"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"events"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"introduced"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"0"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"fixed"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"3.1.6"&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt;
                &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;The full response has more fields (references, versions, etc.), but these are the relevant ones.&lt;/em&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  Finding Where You Actually Use the Package
&lt;/h3&gt;

&lt;p&gt;It's easy to forget where you actually use a dependency — especially in large projects with dozens of packages. And not all usage is equal: importing requests in your core API is very different from importing it in a one-off migration script.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;scan_file_for_package&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file_path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;package_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;CodeUsage&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="n"&gt;pattern_import&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;rf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;^import\s+&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;package_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;(\s|,|$|\.)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;pattern_from&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;rf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;^from\s+&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;package_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;(\s|\.)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;encoding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ignore&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;line_num&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;#&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="k"&gt;continue&lt;/span&gt;

            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;match&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pattern_import&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="n"&gt;import_type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;import&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;match&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pattern_from&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="n"&gt;import_type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;from&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;continue&lt;/span&gt;

            &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;CodeUsage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;file_path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file_path&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="n"&gt;line_number&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;line_num&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;line_content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;import_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;import_type&lt;/span&gt;
            &lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  The AI Part: Asking A Model If It Matters
&lt;/h3&gt;

&lt;p&gt;Now, here's where it gets interesting. I send the LLM:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The CVE description&lt;/li&gt;
&lt;li&gt;The import statements from your code&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And ask: "Given how this code uses the package, does this vulnerability apply?"&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Important:&lt;/strong&gt; Only import lines are sent to the LLM — not your actual business logic. The model sees &lt;code&gt;from requests import Session&lt;/code&gt;, not the body of your functions. Your code stays local.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;_SYSTEM_PROMPT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;You are a security analyst reviewing Python dependency vulnerabilities.
You will be given a CVE description and import-level code evidence.
Your job is to determine if the vulnerability realistically affects this codebase.

Risk level definitions:
- HIGH: The imports directly expose the vulnerable code path
- MEDIUM: Package is imported but no clear evidence the vulnerable feature is used
- LOW: Package only imported in tests or dev tooling
- NONE: CVE conditions are not plausible in this codebase&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I use Pydantic to force the model into a consistent format, no free-form text that I'd have to parse.&lt;br&gt;
It either gives me a valid ImpactAnalysis or it fails. No "well, it depends..." answers.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ImpactAnalysis&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;risk_level&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Literal&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;HIGH&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;MEDIUM&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;LOW&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;NONE&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;explanation&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;recommendation&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Making It Smarter Over Time (RAG)
&lt;/h3&gt;

&lt;p&gt;Here's the thing — the LLM doesn't know your codebase. It analyzes each CVE in isolation. But what if it could remember past analyses?&lt;/p&gt;

&lt;p&gt;That's where ChromaDB comes in. Every time the tool analyzes a vulnerability, it stores the result. Next time a similar CVE shows up (same package, similar vulnerability type), it retrieves past analyses as context.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;store_analysis&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vulnerability_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;analysis&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ImpactAnalysis&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;code_context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;vulnerability_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;analysis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;explanation&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;metadatas&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;risk&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;analysis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;risk_level&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;package&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;package_name&lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;
        &lt;span class="n"&gt;ids&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;vulnerability_id&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_similar_analyses&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vulnerability_description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;query_texts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;vulnerability_description&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;n_results&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;limit&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The result? The tool gets better the more you use it. If you analyzed a &lt;code&gt;requests&lt;/code&gt; CVE last month and a new one appears, the model sees how you handled it before.&lt;/p&gt;




&lt;h2&gt;
  
  
  What the Output Looks Like
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm5e6e3ghsppwylzhm2tl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm5e6e3ghsppwylzhm2tl.png" alt="TERMINAL OUTPUT" width="800" height="220"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  What I Learned
&lt;/h2&gt;

&lt;p&gt;Honestly? It reinforced an ancient rule: &lt;em&gt;context is everything&lt;/em&gt;&lt;br&gt;
A CVE that sounds critical in the abstract becomes irrelevant when you realize you only import that package in a test helper. OSV turned out to be the perfect data source — free, fast, no API keys, solid Python coverage. And while the LLM does a surprisingly good job at triage, I still treat its output as a suggestion, not a verdict. The RAG layer (ChromaDB) was an afterthought, but it's become one of the most useful parts: the tool genuinely gets better as it sees more of your codebase.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;I have a few ideas in mind:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;More languages — JavaScript and Go are probably the next ones&lt;/li&gt;
&lt;li&gt;CI/CD integration — run it in your pipeline, make it fail the build if there is any 'HIGH' impact&lt;/li&gt;
&lt;li&gt;Offline mode — run it against a local LLM&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Try It Yourself
&lt;/h2&gt;

&lt;p&gt;Give it a shot and let me know what breaks. Seriously — I'd appreciate any feedback.&lt;br&gt;
&lt;a href="https://github.com/juanmaalt/dep_shield" rel="noopener noreferrer"&gt;dep_shield&lt;/a&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>security</category>
      <category>ai</category>
      <category>cli</category>
    </item>
    <item>
      <title>Who tests the tests?</title>
      <dc:creator>Lucas Gabriel Sánchez</dc:creator>
      <pubDate>Fri, 20 Feb 2026 17:05:58 +0000</pubDate>
      <link>https://dev.to/cloudx/who-tests-the-tests-nhc</link>
      <guid>https://dev.to/cloudx/who-tests-the-tests-nhc</guid>
      <description>&lt;p&gt;This post is based on a &lt;a href="https://www.gophercon.com/" rel="noopener noreferrer"&gt;Gophercon&lt;/a&gt; talk by &lt;a href="https://github.com/danicat" rel="noopener noreferrer"&gt;Daniela Petruzalek&lt;/a&gt;: &lt;a href="https://www.youtube.com/watch?v=av2f2okDOyk" rel="noopener noreferrer"&gt;Who tests the tests?&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  A little bit of history
&lt;/h2&gt;

&lt;p&gt;In the beginning, we checked our code manually, running the application and trying different inputs: we called that manual testing. Then we discovered that we could write code to test application code to check if it's correct: we called that automatic testing (unit, integration, functional, etc.)&lt;/p&gt;

&lt;p&gt;Now we are in the AI era where we write fewer tests and even less code, we need a way to swiftly check that the tests are correct and that they are testing the cases we expect in our application.&lt;/p&gt;

&lt;h2&gt;
  
  
  How can we know if code written by AI is correct?
&lt;/h2&gt;

&lt;p&gt;You can use automatic tests in the same way we used to check code written by humans.&lt;/p&gt;

&lt;p&gt;One option would be to let the AI write the application code and you write the tests, &lt;a href="https://dev.to/marabesi/ai-and-tdd-a-match-that-can-work-1d59"&gt;you can even use TDD&lt;/a&gt; where you, the human, write the tests and let the AI write the implementation after.&lt;/p&gt;

&lt;p&gt;Another option is to let the AI write the application code and the tests, but then how can you be sure that those tests are testing the things you want or need? Reading and understanding the tests would be the best, but can we do that automatically?&lt;/p&gt;

&lt;h2&gt;
  
  
  Enter: mutation tests
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://en.wikipedia.org/wiki/Mutation_testing" rel="noopener noreferrer"&gt;Mutation testing&lt;/a&gt; is a way to check that the tests you have are testing the code the way you want by making small changes to the application code and checking that the tests fail as expected.&lt;/p&gt;

&lt;p&gt;It's based on a concept known as a mutant: a version of your application with a small change.&lt;/p&gt;

&lt;h3&gt;
  
  
  How does it work?
&lt;/h3&gt;

&lt;p&gt;The mutation testing cycle is this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create a mutant: change the application code by applying just one change&lt;/li&gt;
&lt;li&gt;Run your test suite&lt;/li&gt;
&lt;li&gt;Check how many mutants were killed&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;After running your tests, the ones that failed are said to have killed the mutant, those tests are testing something related to the code you changed and are, from the perspective of that change, good tests.&lt;/p&gt;

&lt;p&gt;After many changes, if a test never failed it means that test didn't kill any mutant and is a weak test. You should probably delete that test or write a better one.&lt;/p&gt;

&lt;p&gt;If a mutant is never killed, then that code is not being tested (no coverage) or is being tested poorly, you have an opportunity to write a test to check that piece of code if needed.&lt;/p&gt;

&lt;h3&gt;
  
  
  Do I have to make these changes manually?
&lt;/h3&gt;

&lt;p&gt;You can do it manually, but there are some tools to aid you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;C#, TypeScript and Scala: &lt;a href="https://stryker-mutator.io/" rel="noopener noreferrer"&gt;Stryker&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Go: &lt;a href="https://github.com/go-gremlins/gremlins" rel="noopener noreferrer"&gt;go-gremlins&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Java: &lt;a href="https://pitest.org/" rel="noopener noreferrer"&gt;pitest&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Python: &lt;a href="https://mutatest.readthedocs.io/en/latest/" rel="noopener noreferrer"&gt;mutatest&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Rust: &lt;a href="https://mutants.rs/welcome.html" rel="noopener noreferrer"&gt;mutants.rs&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These tools provide a way to make those changes automatically and some of them run the tests for you.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why do we need mutation testing?
&lt;/h2&gt;

&lt;p&gt;Let's see a very simple example in Python:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;divide&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;can&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;t divide by 0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A simple test suite we can have is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;unittest&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;divide&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;divide&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TestDivide&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;unittest&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TestCase&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_divide_error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;assertRaises&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="nf"&gt;divide&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_divide_success&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;assertEqual&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;divide&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;unittest&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run the tests and everything is fine:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;python tests.py
..
&lt;span class="nt"&gt;----------------------------------------------------------------------&lt;/span&gt;
Ran 2 tests &lt;span class="k"&gt;in &lt;/span&gt;0.000s

OK
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here we are testing both flows, when an error is raised and when we complete a successful operation, coverage is at 100% but those tests are not great, and here's why: let's change the implementation of divide from a/b to &lt;code&gt;a*b&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;divide&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;can&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;t divide by 0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Running the tests should fail, right? They don't, because &lt;code&gt;1*1&lt;/code&gt; is still 1:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;python tests.py
..
&lt;span class="nt"&gt;----------------------------------------------------------------------&lt;/span&gt;
Ran 2 tests &lt;span class="k"&gt;in &lt;/span&gt;0.000s

OK
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is a problematic test:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_divide_success&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;assertEqual&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;divide&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Even when you have 100% test coverage, that doesn't mean you have 100% test case coverage; some cases may be missing or not being tested correctly.&lt;/p&gt;

&lt;p&gt;What we did here was a mutation: from &lt;code&gt;a/b&lt;/code&gt; to &lt;code&gt;a*b&lt;/code&gt;, that mutation gives us information about our tests that we didn't have before.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to read the results of mutation testing?
&lt;/h2&gt;

&lt;p&gt;When you run your test suite against mutated code, each test can do one of two things: &lt;em&gt;kill&lt;/em&gt; the mutant (the test fails, so it detected the change) or &lt;em&gt;survive&lt;/em&gt; (the test still passes, so it missed the change). After many mutation, you can summarize how often each test killed mutants, for example:&lt;/p&gt;

&lt;p&gt;After 200 Mutations:&lt;br&gt;
TestA: 140 Kills, 60 Survived&lt;br&gt;
TestB: 200 Kills&lt;br&gt;
TestC: 30 Kills, 170 Survived&lt;/p&gt;

&lt;p&gt;How to interpret this:&lt;/p&gt;

&lt;p&gt;TestA: is good test because on most mutations it was killed.&lt;br&gt;
TestB: is your strongest test, it detects every mutation you ran.&lt;br&gt;
TestC: is the weak one, it usually doesn't detect mutations.&lt;/p&gt;

&lt;h2&gt;
  
  
  When not to use mutation testing?
&lt;/h2&gt;

&lt;p&gt;In some projects or languages, running mutation tests can take hours. For each mutant, the tool has to compile (if needed) and run the full test suite. If your test suite takes 10 minutes, each mutation will take at least that long, so keep this in mind.&lt;/p&gt;

&lt;p&gt;Use this approach on small projects or ones where the "edit, compile, test" cycle is small.&lt;/p&gt;

&lt;h2&gt;
  
  
  Closing
&lt;/h2&gt;

&lt;p&gt;So who tests the tests? In practice, mutation testing does that: by checking that your tests react when the code is deliberately broken.&lt;/p&gt;

</description>
      <category>testing</category>
      <category>go</category>
      <category>programming</category>
    </item>
    <item>
      <title>Bicycles Are All Your AI Agents Need</title>
      <dc:creator>Federico Pascarella</dc:creator>
      <pubDate>Thu, 13 Nov 2025 12:47:16 +0000</pubDate>
      <link>https://dev.to/cloudx/bicycles-are-all-your-ai-agents-need-33cc</link>
      <guid>https://dev.to/cloudx/bicycles-are-all-your-ai-agents-need-33cc</guid>
      <description>&lt;h2&gt;
  
  
  From Condors to Code
&lt;/h2&gt;

&lt;p&gt;Somewhere between a condor and a keyboard lies human genius. Steve Jobs once told a story about how humans are terrible movers compared to animals. The condor beats us easily in the race of energy efficiency, but put a person on a bicycle and they fly.&lt;br&gt;
The bicycle, Jobs said, is "a tool that amplifies our efficiency." Computers, he added, are bicycles for the mind.&lt;br&gt;
That thought never left me. And now, with AI agents evolving super-fast, I can't help seeing the same pattern repeat.&lt;br&gt;
My humble view is that tools are still the key. Only this time, the cyclists are our AI agents with its brain (the LLM), and the bicycles are the functions we build for them.&lt;br&gt;
With the right tools, an agent moves with purpose. With clumsy tools, it stalls.&lt;/p&gt;
&lt;h2&gt;
  
  
  The Engineering of Great Tools
&lt;/h2&gt;

&lt;p&gt;Great agents sit on top of small, sharp Python functions. They are plain, predictable, and fast.&lt;/p&gt;
&lt;h3&gt;
  
  
  1. Single Responsibility
&lt;/h3&gt;

&lt;p&gt;Specialize each function. Do one job well, then compose.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Bad: Swiss-army function
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_user&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;send_welcome&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;send_welcome&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;send_email&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Welcome!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;User &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; created&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;

&lt;span class="c1"&gt;# Good: Focused, composable tools
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;save_user&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;send_welcome_email&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;send_email&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Welcome!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Clear interfaces
&lt;/h3&gt;

&lt;p&gt;Name things so intent is obvious. Keep arguments explicit. Return data instead of printing.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Bad: Vague names and side effects
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;discCalc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Discount applied: $&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Good: Straight names and returns
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;calculate_discount&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;price&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;percentage&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;price&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;price&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;percentage&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Structured outputs
&lt;/h3&gt;

&lt;p&gt;Agents prefer structure. Return dicts or JSON, not prose.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Bad: Unstructured string
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_weather&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;temp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;fetch_temperature&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;It&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s about &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;temp&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; degrees in &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;, partly cloudy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="c1"&gt;# Good: MCP tool with schema
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pydantic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Field&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mcp.server&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Server&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mcp.types&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Tool&lt;/span&gt;

&lt;span class="n"&gt;server&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;weather&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;WeatherData&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;City name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Temperature in Celsius&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;condition&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Weather condition&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;humidity&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Humidity percentage&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nd"&gt;@server.call_tool&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_weather&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;WeatherData&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;temp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch_temperature&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;condition&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch_condition&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;WeatherData&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;temp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;condition&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;condition&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;humidity&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;65&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. Efficiency
&lt;/h3&gt;

&lt;p&gt;Use built-ins, cache where it helps, and profile before optimizing.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Bad: Manual loops
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;filter_active_users&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;active&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;

&lt;span class="c1"&gt;# Good: Built-ins plus caching
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;functools&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;lru_cache&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Tuple&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;

&lt;span class="nd"&gt;@lru_cache&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;maxsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;filter_active_users&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;users_tuple&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Tuple&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;u&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;users_tuple&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;active&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  5. Robustness
&lt;/h3&gt;

&lt;p&gt;Validate inputs and fail loudly with helpful errors.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Bad: No validation
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;read_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Good: Validation and clear errors
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;read_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Path must be a non-empty string&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;FileNotFoundError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;FileNotFoundError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;File not found: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  6. The micro-tooling mindset
&lt;/h3&gt;

&lt;p&gt;Break big jobs into small tools you can test and swap. MCP benefits from chains of simple, named steps.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Bad: Monolith
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;process_user_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;validated&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;validate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;enriched&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;api&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;enrich&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;validated&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;enriched&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Good: Composable steps
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fetch_user&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;validate_user&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;validate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;enrich_user&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;api&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;enrich&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  7. Trade-offs
&lt;/h3&gt;

&lt;p&gt;Hundreds of tiny tools can create orchestration overhead. Clear names, steady input and output shapes, and basic docs keep things manageable.&lt;/p&gt;

&lt;h2&gt;
  
  
  Show, Don't Tell: Two Decision Flows
&lt;/h2&gt;

&lt;p&gt;A concrete example makes the difference clear. Here is the same task, done with weak tools and with strong tools.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Task:&lt;/strong&gt; Extract newly signed customers from a CSV in cloud storage, enrich each with firmographic data, and email an account summary.&lt;/p&gt;

&lt;h3&gt;
  
  
  Agent with poorly designed tools
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Calls a generic &lt;code&gt;process_file()&lt;/code&gt; that auto-detects type and tries to parse everything.&lt;/li&gt;
&lt;li&gt;Uses one do everything &lt;code&gt;enrich_user()&lt;/code&gt; that accepts many flags, then times out on third party rate limits.&lt;/li&gt;
&lt;li&gt;Prints logs to stdout, returns a mixed string summary, and the agent fails to decide what to send.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Decision flow with weak tools
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Input: blob path&lt;/li&gt;
&lt;li&gt;Branch: auto-detect format, guess schema&lt;/li&gt;
&lt;li&gt;Loop: enrich with side effects&lt;/li&gt;
&lt;li&gt;Output: unstructured string&lt;/li&gt;
&lt;li&gt;Failure mode: retries loop, hallucinates missing fields, no clear errors&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Agent with well designed tools
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;load_csv(path, schema)&lt;/code&gt; returns a typed dataframe.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;batch_enrich(users, provider, rate_limit)&lt;/code&gt; yields structured rows with retry metadata.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;render_account_summary(users)&lt;/code&gt; returns JSON for &lt;code&gt;send_email(to, subject, body_html)&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Decision flow with strong tools
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Input: explicit path and schema&lt;/li&gt;
&lt;li&gt;Transform: strict parser&lt;/li&gt;
&lt;li&gt;Enrich: idempotent, rate limited, returns status per row&lt;/li&gt;
&lt;li&gt;Render: deterministic template&lt;/li&gt;
&lt;li&gt;Output: email send result with IDs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Result: same goal, three clean steps, easy to test and to explain.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;I believe that innovation often hides in simplicity. Building efficient AI agents isn't about giving them infinite intelligence; it's about giving them great tools. Write them clean, focused, and well-documented; think in micro-tooling: small parts, big impact.&lt;br&gt;
So, next time you're debugging that stubborn Python function, just remember: you're not fixing a bug. You're tuning a bicycle for the mind of an AI.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>productivity</category>
      <category>technology</category>
    </item>
    <item>
      <title>WhatsApp + MCP: automatic audio transcription</title>
      <dc:creator>German Burgardt</dc:creator>
      <pubDate>Mon, 29 Sep 2025 19:39:53 +0000</pubDate>
      <link>https://dev.to/cloudx/whatsapp-mcp-automatic-audio-transcription-jbh</link>
      <guid>https://dev.to/cloudx/whatsapp-mcp-automatic-audio-transcription-jbh</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;MCP (Model Context Protocol) can look complicated until you ship something real with it. Let's use it on something practical: expose your WhatsApp voice notes with your own MCP server and turn them into transcripts.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is MCP?
&lt;/h2&gt;

&lt;p&gt;MCP is a connection standard that connects AI agents with external systems.&lt;/p&gt;

&lt;p&gt;It has a server and a client, and they have two different ways to talk to each other:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;stdio (stdin/stdout): the standard Unix mechanism for a process to receive or send data to the environment or another process.&lt;/li&gt;
&lt;li&gt;Server-Sent Events (SSE): an HTTP mechanism where the server keeps the connection open and streams events to the client (one-way).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fln8mevhsqwmsveb51lnn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fln8mevhsqwmsveb51lnn.png" alt="STDIN/STDOUT vs SSE in MCP" width="600" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Quick comparison of stdio and SSE transports in MCP.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  MCP architecture
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Host: Claude Desktop / Cursor / any AI agent. It coordinates the LLM, spins up MCP clients, and shows results.&lt;/li&gt;
&lt;li&gt;MCP Client: an implementation embedded in the host that connects to your server. It speaks the protocol, opens/manages the connection, and sends/receives requests.&lt;/li&gt;
&lt;li&gt;MCP Server: your program that exposes tools. It runs actions and returns data/events to the client.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;An MCP server can expose different capabilities, but in this project we stick to &lt;strong&gt;tools&lt;/strong&gt; (actions like transcribing audio). MCP also supports resources or prompts; we skip them here to keep the flow simple.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftdtl4gk4p7n7kn6ojish.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftdtl4gk4p7n7kn6ojish.png" alt="MCP architecture: Host -&gt; MCP Client -&gt; MCP Server" width="600" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Diagram of the Host → MCP Client → MCP Server flow.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Building the WhatsApp MCP
&lt;/h2&gt;

&lt;p&gt;WhatsApp Desktop on macOS stores everything locally: an SQLite database with chats and folders containing the media files.&lt;/p&gt;

&lt;p&gt;Our MCP server will:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Read the WhatsApp database&lt;/li&gt;
&lt;li&gt;Find audio files per contact&lt;/li&gt;
&lt;li&gt;Transcribe them with Whisper&lt;/li&gt;
&lt;li&gt;Send the text back to the Client (Cursor in this case)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The working code lives in the repository: &lt;a href="https://github.com/GBurgardt/mcp-whatsapp-whisper" rel="noopener noreferrer"&gt;mcp-whatsapp-whisper&lt;/a&gt;. Let's walk through the key pieces.&lt;/p&gt;

&lt;h3&gt;
  
  
  The STDIN/STDOUT connection
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;StdioServerTransport&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@modelcontextprotocol/sdk/server/stdio.js&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;transport&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;StdioServerTransport&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;server&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;transport&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With that the server listens to every client request on STDIN and replies through STDOUT.&lt;/p&gt;

&lt;p&gt;We pick stdio because this MCP server runs locally. It's the simplest and most stable transport on desktop/CLI: no open ports, no HTTP dependency, avoids CORS/firewalls, and hosts (Claude Desktop/Cursor) support it natively. SSE makes sense when the server lives remotely behind HTTP.&lt;/p&gt;

&lt;h3&gt;
  
  
  Exposing capabilities
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;server&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;whatsapp-audio-mcp&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;1.0.0&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;capabilities&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{},&lt;/span&gt; &lt;span class="c1"&gt;// We will expose actions&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Designing the tools
&lt;/h3&gt;

&lt;p&gt;The server lives on three tools each with a specific role:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;getRecentAudio(contactName, count?)&lt;/code&gt;: pulls the latest audio paths for a contact.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;searchAudios(query, date?)&lt;/code&gt;: narrows the list by name or date when the history is large. We get filtering without touching SQLite directly.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;transcribeAudio(audioPath)&lt;/code&gt;: turns a path into text with Whisper. It finishes the loop by delivering the result we care about.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The goal was a minimal set: find, refine, transcribe. Each tool lines up with one of those stages.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;transcribeAudio&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Transcribe an audio file using OpenAI Whisper (SDK)&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;inputSchema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nl"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;object&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;properties&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nl"&gt;audioPath&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;string&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Path to the audio file&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="nx"&gt;required&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;audioPath&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The schema follows JSON Schema. With it, Cursor knows which parameters to send.&lt;/p&gt;

&lt;h2&gt;
  
  
  Accessing WhatsApp
&lt;/h2&gt;

&lt;p&gt;WhatsApp Desktop keeps everything under predictable paths:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;dbPath&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="nx"&gt;homeDir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Library/Group Containers/group.net.whatsapp.WhatsApp.shared/ChatStorage.sqlite&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;mediaPath&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="nx"&gt;homeDir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Library/Group Containers/group.net.whatsapp.WhatsApp.shared/Message/Media&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The database is SQLite:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`
  SELECT DISTINCT 
    ZCONTACTJID as jid,
    ZPARTNERNAME as name,
    ZLASTMESSAGEDATE as lastMessageDate
  FROM ZWACHATSESSION
  WHERE ZPARTNERNAME IS NOT NULL
  AND ZCONTACTJID NOT LIKE '%@g.us'  -- Exclude groups
`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Audio files are organized per contact. We scan recursively:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;audioExtensions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;.opus&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;.m4a&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;.mp3&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;.aac&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;.wav&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;scanDirectory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;dir&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="k"&gt;void&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;entries&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;fs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;readdir&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;dir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;withFileTypes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;entry&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;audioExtensions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;some&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;ext&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;entry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;endsWith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;ext&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="c1"&gt;// Found an audio file&lt;/span&gt;
      &lt;span class="nx"&gt;audioFiles&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;fullPath&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;entry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;modifiedDate&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;mtime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toISOString&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
      &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The transcription: FFmpeg + Whisper
&lt;/h2&gt;

&lt;p&gt;WhatsApp ships audio in Opus, but OpenAI Whisper prefers MP3. We use FFmpeg:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ffmpeg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;spawn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ffmpeg&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;-i&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;inputPath&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// WhatsApp Opus audio&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;-acodec&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;mp3&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;-b:a&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;128k&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;outputPath&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// Temporary MP3&lt;/span&gt;
&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then we transcribe with OpenAI Whisper (SDK):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;OpenAI&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;openai&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;fs&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;node:fs&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;OPENAI_API_KEY&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;transcription&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;audio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;transcriptions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;file&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;fs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createReadStream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;outputPath&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="c1"&gt;// Temporary MP3&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;whisper-1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;transcriptionText&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;transcription&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Configuring Cursor (the client)
&lt;/h2&gt;

&lt;p&gt;In the Cursor config (&lt;code&gt;~/.cursor/mcp.json&lt;/code&gt;) we add:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"whatsapp"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"node"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"/path/to/mcp-whatsapp-whisper/dist/server.js"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"env"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"OPENAI_API_KEY"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"YOUR_OPENAI_KEY"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Cursor can now invoke our server whenever it needs to.&lt;/p&gt;

&lt;h2&gt;
  
  
  MCP in action
&lt;/h2&gt;

&lt;p&gt;The user asks Cursor:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Send me the transcript of Elian's last audio."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Cursor automatically:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Calls &lt;code&gt;getRecentAudio(contactName: "elian")&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Receives the audio file path&lt;/li&gt;
&lt;li&gt;Calls &lt;code&gt;transcribeAudio(audioPath: "/path/to/audio.opus")&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Receives the transcription&lt;/li&gt;
&lt;li&gt;Summarizes or shows the full text&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The transcription flows through the OpenAI API; the temporary MP3 is sent to get the text back. Cursor orchestrates; your server prepares the file and makes the call.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwgkiuwpuuc8y7qy1o2hy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwgkiuwpuuc8y7qy1o2hy.png" alt="Cursor + MCP WhatsApp: transcription example" width="800" height="442"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Cursor showing the transcription returned by the WhatsApp MCP server.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Limitations: macOS only
&lt;/h2&gt;

&lt;p&gt;This server is macOS only. The WhatsApp paths are specific to Mac.&lt;/p&gt;

&lt;p&gt;It depends on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;WhatsApp Desktop installed&lt;/li&gt;
&lt;li&gt;FFmpeg (&lt;code&gt;brew install ffmpeg&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;OpenAI SDK (&lt;code&gt;npm i openai&lt;/code&gt;) with &lt;code&gt;OPENAI_API_KEY&lt;/code&gt; configured&lt;/li&gt;
&lt;li&gt;Internet connection&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We also skip Prompts and Resource Templates.&lt;/p&gt;

&lt;p&gt;Security depends on the host. Cursor can ask for approval before it runs tools.&lt;/p&gt;

&lt;h2&gt;
  
  
  Keep it running with PM2
&lt;/h2&gt;

&lt;p&gt;Build the project once (&lt;code&gt;npm run build&lt;/code&gt;) and keep the server alive with &lt;code&gt;pm2 start ecosystem.config.cjs&lt;/code&gt;. The provided config watches the compiled &lt;code&gt;dist/server.js&lt;/code&gt; and restarts it if it crashes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Your AI agent can now reach your data, use your tools, work in your context.&lt;/p&gt;

&lt;p&gt;The WhatsApp server is just one idea. Once you realize any program that speaks STDIN/STDOUT can be an MCP server, the possibilities get wild.&lt;/p&gt;

&lt;p&gt;Next time you think "I wish Cursor could access...", remember: it probably can. You just need to build the bridge.&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>ai</category>
      <category>whisper</category>
      <category>automation</category>
    </item>
    <item>
      <title>How AI Reflects Your Thinking</title>
      <dc:creator>German Burgardt</dc:creator>
      <pubDate>Tue, 26 Aug 2025 13:36:38 +0000</pubDate>
      <link>https://dev.to/cloudx/how-ai-reflects-your-thinking-5714</link>
      <guid>https://dev.to/cloudx/how-ai-reflects-your-thinking-5714</guid>
      <description>&lt;p&gt;When we code using AI we ask ourselves: "what's the best prompt?" or "what magic prompt should I use?".&lt;/p&gt;

&lt;p&gt;We'd be better off asking: "what kind of interaction is this?". Trying to understand the nature of the interaction between us and the model.&lt;/p&gt;

&lt;p&gt;Maybe the problem isn't the technology, but us.&lt;/p&gt;

&lt;h2&gt;
  
  
  An Analogy
&lt;/h2&gt;

&lt;p&gt;Imagine you hire a remote programmer. Brilliant, but with some quirks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Never worked on your project before (0 context)&lt;/li&gt;
&lt;li&gt;Extremely literal. If you don't explicitly tell them, they never assume anything.&lt;/li&gt;
&lt;li&gt;Doesn't infer context&lt;/li&gt;
&lt;li&gt;Completely loses their memory every day, returning to their initial state&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;How would you communicate with them?&lt;/p&gt;

&lt;p&gt;You'd probably:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Explain all the necessary context, very detailed&lt;/li&gt;
&lt;li&gt;Be very specific with requirements&lt;/li&gt;
&lt;li&gt;Not assume they'll "figure out" anything. You explain everything&lt;/li&gt;
&lt;li&gt;Expect some iterations before the final result&lt;/li&gt;
&lt;li&gt;Maybe save context files to resend them every day&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's the best way to interact with an AI model.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdda3rxzav04bj1jf7y4h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdda3rxzav04bj1jf7y4h.png" alt="Communicating with AI requires the same clarity as working with a remote programmer" width="600" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  AI As a Mirror
&lt;/h2&gt;

&lt;p&gt;The model isn't just a task executor. It's also a mirror of your clarity when communicating a problem.&lt;/p&gt;

&lt;p&gt;If you give it vague instructions, you get vague results simply because it faithfully reflects how vague your thinking was.&lt;/p&gt;

&lt;p&gt;Most of the time when the model "doesn't understand" the problem isn't the model. It's that we ourselves weren't clear about what we wanted.&lt;/p&gt;

&lt;h2&gt;
  
  
  Clarity As a Skill
&lt;/h2&gt;

&lt;p&gt;The real skill isn't "writing good prompts". It's thinking clearly about problems and communicating that clarity. This is a fundamental skill for any programmer.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm3z8fou5qu5ud33t79un.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm3z8fou5qu5ud33t79un.png" alt="Vague thinking leads to vague results, while clear communication produces precise outcomes" width="600" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Example
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What we usually do:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Optimize this function
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why it fails:&lt;/strong&gt; Optimize in what sense? Speed? Memory? Readability? There's no success criteria.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What we should do:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;The processOrders() function in orders.js takes 5 seconds with 1000 orders.
I need it to take less than 1 second.
Orders come from the database already sorted by date.
You can assume there are no duplicate orders.
Logs: &amp;lt;&amp;lt;detailed logs&amp;gt;&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is much clearer and less abstract. It describes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The problem (5 seconds is too much)&lt;/li&gt;
&lt;li&gt;The measurable goal (less than 1 second)&lt;/li&gt;
&lt;li&gt;Constraints (already sorted)&lt;/li&gt;
&lt;li&gt;Assumptions (no duplicates)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Breaking Down Problems
&lt;/h2&gt;

&lt;p&gt;One of the skills that improves working with AI is breaking problems down into smaller pieces. AI won't save you the work of thinking. The clarification process itself is valuable work in programming.&lt;/p&gt;

&lt;p&gt;Instead of:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Implement a complete authentication system
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You learn to think:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Step 1: Define the User model with minimum required fields: &amp;lt;fields&amp;gt;
Step 2: Create the registration endpoint with basic validation (validation type, etc)
[etc...]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The Limitations
&lt;/h2&gt;

&lt;p&gt;AI can only handle 3-4 files well at a time. It's a limitation but with its bright side:&lt;/p&gt;

&lt;p&gt;It forces you to keep responsibilities separated and create clear interfaces. You need to avoid coupling and think in small modules.&lt;/p&gt;

&lt;p&gt;It incentivizes you to follow good architecture practices.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Importance of Context
&lt;/h2&gt;

&lt;p&gt;AI needs all the context possible, don't skimp.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CONTEXT: Users report the checkout page hangs
SYMPTOM: The "Pay" button stays in "Processing..." state indefinitely
FILE: checkout.js, handlePayment() function
SUSPICION: Probably missing a catch to handle API errors
TASK: Add robust error handling and visual feedback to the user
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The Value of Programming with AI
&lt;/h2&gt;

&lt;p&gt;Programming with AI trains you in thinking clearly and communicating precisely. It forces you to break problems into manageable pieces and be explicit with your requirements while constantly verifying results.&lt;/p&gt;

&lt;p&gt;These seem like fundamental skills for any dev regardless of language.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final Reflection
&lt;/h2&gt;

&lt;p&gt;AI doesn't save you from thinking, or at least you shouldn't use it that way. It's the opposite, every prompt you write is an opportunity to clarify your understanding. Every response you receive is feedback on your clarity. Every iteration is a chance to improve.&lt;/p&gt;

&lt;p&gt;Next time you use AI and don't get the expected result, before blaming the model, ask yourself:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Did I really have clarity on what I wanted?&lt;/li&gt;
&lt;li&gt;Did I break down the problem into manageable parts?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These models are honest, literal collaborators. They give you exactly what you ask for, but they demand clarity. Learning to be clear is learning to think well. AI used properly makes you a better programmer.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>productivity</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Automate Any Repetitive Task with MCP</title>
      <dc:creator>German Burgardt</dc:creator>
      <pubDate>Mon, 28 Jul 2025 17:30:01 +0000</pubDate>
      <link>https://dev.to/cloudx/automate-any-repetitive-task-with-mcp-o9n</link>
      <guid>https://dev.to/cloudx/automate-any-repetitive-task-with-mcp-o9n</guid>
      <description>&lt;h2&gt;
  
  
  The Problem: Repetitive Detailed Prompting
&lt;/h2&gt;

&lt;p&gt;Every time I start a new task in Claude Code / Cursor, I type a detailed prompt to guide the AI through an internal monologue before proceeding. For example:&lt;/p&gt;

&lt;p&gt;"You will generate an internal monologue of 200 numbered lines where two thinkers debate the approach:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pragmatic focuses on functionality and efficiency&lt;/li&gt;
&lt;li&gt;Creative on innovation and elegance&lt;/li&gt;
&lt;li&gt;Follow these rules: exactly 200 lines, each starting with [Pragmatic] or [Creative]&lt;/li&gt;
&lt;li&gt;Be specific about code without abstractions&lt;/li&gt;
&lt;li&gt;Reflect and question without solutions&lt;/li&gt;
&lt;li&gt;Mention files/functions/variables&lt;/li&gt;
&lt;li&gt;Consider edge cases/performance/maintainability/user experience&lt;/li&gt;
&lt;li&gt;Debate simplicity vs functionality&lt;/li&gt;
&lt;li&gt;Question decisions, no repeats, end without conclusion&lt;/li&gt;
&lt;li&gt;Then address the task: [actual task here]."&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Typing this repeatedly 20+ times a day wastes time and disrupts focus.&lt;/p&gt;

&lt;p&gt;As someone researching practical AI applications, we can fix that.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Before: 200+ word prompt every time&lt;/span&gt;
&lt;span class="c1"&gt;// After: "internal monologue 200 lines - implement auth system"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Enter MCPs: The Missing Link
&lt;/h2&gt;

&lt;p&gt;Model Context Protocols (MCPs) allow extending AI agents with custom tools. While common examples include fetching data, web browsing, or integrating with Slack, I used it in a novel way to automate my repetitive prompt.&lt;/p&gt;

&lt;h2&gt;
  
  
  From Repetition to Automation
&lt;/h2&gt;

&lt;p&gt;I built an MCP server in my Remix app (essentially the same as plain Node.js) that generates these monologues on demand. Now, Claude detects the trigger and handles it automatically.&lt;/p&gt;

&lt;p&gt;Here's a glimpse of what it generates:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. [Pragmatic] We need to implement auth - start with basic JWT in middleware.js
2. [Creative] But what about OAuth? Users expect social login nowadays...
3. [Pragmatic] OAuth adds complexity - first nail down password flow, then extend
...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjmi1ao6dc7pkccc5aqkj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjmi1ao6dc7pkccc5aqkj.png" alt="Before and After MCP Automation Flowchart" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The difference:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Before:&lt;/strong&gt; Type the full detailed prompt each time, then describe the task.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;After:&lt;/strong&gt; Simply say "internal monologue 200 lines about X - [task]", and Claude generates the monologue via the tool, then proceeds.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Time saved:&lt;/strong&gt; ~2 minutes per task&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Characters typed:&lt;/strong&gt; 300+ → 40&lt;/p&gt;


&lt;h2&gt;
  
  
  Building Your Own Monologue MCP
&lt;/h2&gt;

&lt;p&gt;Here's how to implement it in a Node.js server (adaptable from my Remix example).&lt;/p&gt;
&lt;h3&gt;
  
  
  Step 1: Install Dependencies
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @modelcontextprotocol/sdk zod @anthropic-ai/sdk
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  Step 2: Create the MCP Server Handler
&lt;/h3&gt;

&lt;p&gt;Create &lt;code&gt;app/lib/mcp-server.ts&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Server&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@modelcontextprotocol/sdk/server/index.js&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;SSEServerTransport&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@modelcontextprotocol/sdk/server/sse.js&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;zod&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;Anthropic&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@anthropic-ai/sdk&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;anthropic&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Anthropic&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;createMCPServer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;server&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;monologue-mcp&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;1.0.0&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;capabilities&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{},&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// Define the monologue tool&lt;/span&gt;
  &lt;span class="nx"&gt;server&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setRequestHandler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;tools/list&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;generate-monologue&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
          &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Generate a reflective internal monologue in the style of Pragmatic vs Creative thinker&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;inputSchema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;object&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="na"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
              &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;number&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
              &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Number of lines in the monologue (default: 100)&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
              &lt;span class="na"&gt;default&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="na"&gt;context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
              &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;string&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
              &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Current conversation context&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="na"&gt;task&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
              &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;string&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
              &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Description of the task to perform&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
          &lt;span class="p"&gt;},&lt;/span&gt;
          &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;task&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="p"&gt;}));&lt;/span&gt;

  &lt;span class="c1"&gt;// The actual tool implementation&lt;/span&gt;
  &lt;span class="nx"&gt;server&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setRequestHandler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;tools/call&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;generate-monologue&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ArgsSchema&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;object&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="na"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;number&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="k"&gt;default&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="na"&gt;context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2000&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;optional&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="k"&gt;default&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;""&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="na"&gt;task&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
      &lt;span class="p"&gt;});&lt;/span&gt;

      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;task&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;ArgsSchema&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arguments&lt;/span&gt;
      &lt;span class="p"&gt;);&lt;/span&gt;

      &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;systemPrompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`You are two thinkers having an internal dialogue about programming.
Pragmatic is focused on functionality and efficiency.
Creative is obsessive about innovation and elegance.

STRICT RULES:
1. Generate EXACTLY &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; numbered lines
2. Each line must start with [Pragmatic] or [Creative]
3. NO abstractions - be specific about the code
4. NO complete solutions - REFLECT and QUESTION
5. Mention specific files, functions, variables when relevant
6. Think about: edge cases, performance, maintainability, user experience
7. Debate simplicity vs functionality
8. Question every technical decision
9. NO repeated ideas - each line must add new value
10. End without a definitive conclusion - it's reflection, not decision`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;userPrompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;
          &lt;span class="nx"&gt;context&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="s2"&gt;`Previous context:\n&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;\n\n`&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;""&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;Current task: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;task&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;

Generate an internal monologue of EXACTLY &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; numbered lines where the two thinkers debate the best way to approach this task.`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;anthropic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
          &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;claude-opus-4-20250514&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;max_tokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;32000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;temperature&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;system&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;systemPrompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
              &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
              &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;userPrompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
          &lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="p"&gt;});&lt;/span&gt;

        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;monologue&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
              &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
              &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;monologue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
          &lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="p"&gt;};&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;any&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
              &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
              &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`Error generating monologue: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
          &lt;span class="p"&gt;],&lt;/span&gt;
          &lt;span class="na"&gt;isError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;};&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Unknown tool: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;server&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: Create the API Route
&lt;/h3&gt;

&lt;p&gt;Create &lt;code&gt;app/routes/api.mcp.ts&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;The MCP server needs to be exposed as an HTTP endpoint. We use Bearer authentication to secure it. Only Claude (or other authorized clients) with the correct API key can access your server. This prevents random people from using your tools.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;LoaderFunctionArgs&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@remix-run/node&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;createMCPServer&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;~/lib/mcp-server&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;SSEServerTransport&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@modelcontextprotocol/sdk/server/sse.js&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// SSE (Server Sent Events) keeps an open connection between Claude and your server&lt;/span&gt;
&lt;span class="c1"&gt;// This allows Claude to call your tools in real time without polling&lt;/span&gt;

&lt;span class="c1"&gt;// Simple auth check&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;verifyAuth&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nx"&gt;boolean&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;authHeader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Authorization&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;expectedKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;MCP_API_KEY&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;your-secret-key&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;authHeader&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="s2"&gt;`Bearer &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;expectedKey&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;loader&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;request&lt;/span&gt; &lt;span class="p"&gt;}:&lt;/span&gt; &lt;span class="nx"&gt;LoaderFunctionArgs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nf"&gt;verifyAuth&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Unauthorized&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;401&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;responseHeaders&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Headers&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text/event-stream&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Cache-Control&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;no-cache&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;Connection&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;keep-alive&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Access-Control-Allow-Origin&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;*&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Access-Control-Allow-Headers&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Authorization, Content-Type&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;server&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;createMCPServer&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;transport&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;SSEServerTransport&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;endpoint&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;/api/mcp&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;requestHeaders&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fromEntries&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt;
    &lt;span class="na"&gt;responseHeaders&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fromEntries&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;responseHeaders&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;stream&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ReadableStream&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;start&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;controller&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;server&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;transport&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="c1"&gt;// Keep connection alive (SSE connections timeout after 30 seconds of silence)&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;keepAlive&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;setInterval&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="nx"&gt;controller&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;enqueue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;TextEncoder&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;: keepalive&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="mi"&gt;30000&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;signal&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addEventListener&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;abort&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="nf"&gt;clearInterval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;keepAlive&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
          &lt;span class="nx"&gt;controller&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="p"&gt;});&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;controller&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;responseHeaders&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 4: Configure Environment Variables
&lt;/h3&gt;

&lt;p&gt;Add to your &lt;code&gt;.env&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-anthropic-api-key
&lt;span class="nv"&gt;MCP_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;a-secret-key-for-your-mcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;ANTHROPIC_API_KEY&lt;/code&gt; lets your server call Claude's API to generate monologues. The &lt;code&gt;MCP_API_KEY&lt;/code&gt; is your own secret, it's what Claude will use to authenticate with your server.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 5: Deploy and Connect
&lt;/h3&gt;

&lt;p&gt;Deploy your changes (I use Vercel, but any platform works):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git add &lt;span class="nb"&gt;.&lt;/span&gt;
git commit &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="s2"&gt;"Add MCP server for internal monologues"&lt;/span&gt;
git push
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then connect from Claude:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude mcp add &lt;span class="nt"&gt;--transport&lt;/span&gt; sse monologue https://yourdomain.com/api/mcp &lt;span class="nt"&gt;--header&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer your-secret-key"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;sse&lt;/code&gt; transport tells Claude to use Server Sent Events (the streaming connection type we set up). Replace &lt;code&gt;your-secret-key&lt;/code&gt; with the same MCP_API_KEY from your &lt;code&gt;.env&lt;/code&gt; file.&lt;/p&gt;




&lt;h2&gt;
  
  
  How It Works in Practice
&lt;/h2&gt;

&lt;p&gt;Now, when working in Claude:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt; internal monologue 150 lines - design user experience for login flow
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Claude detects the phrase, calls the MCP tool, generates the detailed monologue (e.g., a debate on intuitive interfaces vs secure processes, navigation logic, etc.), and uses it to design the feature thoughtfully.&lt;/p&gt;

&lt;p&gt;A sample monologue excerpt:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. [Creative] Login flow should be innovative and seamless – perhaps biometric integration for delight?
2. [Pragmatic] Biometrics add complexity; focus on reliable password handling in auth.js first.
3. [Creative] But user experience suffers with forms – question if we can animate transitions smoothly.
4. [Pragmatic] Animations might impact performance on mobile; consider edge cases in responsive design.
...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Finj50qff2o8aq1b9w165.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Finj50qff2o8aq1b9w165.png" alt="Monologue Generation Screen Illustration" width="800" height="472"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Matters
&lt;/h2&gt;

&lt;p&gt;This MCP setup boosts programming efficiency by leveraging AI tools for consistent planning and productivity gains, while experimenting with a non typical application to explore MCPs more creatively and deeply.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's Next?
&lt;/h2&gt;

&lt;p&gt;One could build other creative tools, such as one that fetches and analyzes server logs directly, or another that integrates with external APIs for real time data checks.&lt;/p&gt;

&lt;h2&gt;
  
  
  Your Turn
&lt;/h2&gt;

&lt;p&gt;What repetitive tasks do you deal with in your daily work? Maybe you can create an MCP. The code is ready to adapt and build something.&lt;/p&gt;




&lt;p&gt;Questions? Leave a comment below and I'll be happy to help!&lt;/p&gt;

</description>
      <category>ai</category>
      <category>mcp</category>
      <category>javascript</category>
      <category>tutorial</category>
    </item>
  </channel>
</rss>
