<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: speedy_devv</title>
    <description>The latest articles on DEV Community by speedy_devv (@speedy_devv).</description>
    <link>https://dev.to/speedy_devv</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3882972%2F4d901804-cbc3-4bae-95ee-901bf0ed6631.jpg</url>
      <title>DEV Community: speedy_devv</title>
      <link>https://dev.to/speedy_devv</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/speedy_devv"/>
    <language>en</language>
    <item>
      <title>I Shipped a SaaS MVP With Three Emails. Then I Watched It Die.</title>
      <dc:creator>speedy_devv</dc:creator>
      <pubDate>Sun, 03 May 2026 15:16:06 +0000</pubDate>
      <link>https://dev.to/speedy_devv/i-shipped-a-saas-mvp-with-three-emails-then-i-watched-it-die-3dej</link>
      <guid>https://dev.to/speedy_devv/i-shipped-a-saas-mvp-with-three-emails-then-i-watched-it-die-3dej</guid>
      <description>&lt;p&gt;I just shipped a SaaS MVP last month. Auth worked. Payments worked. The product did the thing it was supposed to do. I opened the dashboard on a Monday morning, coffee in hand, ready to watch the trials convert.&lt;/p&gt;

&lt;p&gt;Three weeks later my trial-to-paid was sitting at 12%. Users were signing up, clicking around for a day, and vanishing. Nobody was churning loudly. They were just leaving. Quietly. Like the app was a restaurant they walked into, looked at the menu, and walked out of without ordering.&lt;/p&gt;

&lt;p&gt;I looked at my email setup. A welcome email, a password reset, a payment receipt. That was it. Three emails covering maybe 10% of the lifecycle. The other 90% was silence, and that silence was costing me every single trial that did not convert on its own.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgn0qeiqvmn6kl05fcjum.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgn0qeiqvmn6kl05fcjum.png" alt=" " width="800" height="420"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/.%2Fhero.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/.%2Fhero.png" alt="AI Email Sequences" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Number That Made Me Stop Coding and Start Reading
&lt;/h2&gt;

&lt;p&gt;I went down a research rabbit hole for two days. I read every email marketing study I could find. One number kept showing up, and it kept me up at night.&lt;/p&gt;

&lt;p&gt;Automated email flows make up about 2% of total email volume but drive 37% of email revenue. Two percent. Thirty-seven percent. That ratio is insane. The emails most founders skip are the ones generating most of the money.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Two percent of your email volume drives 37% of your revenue. That is the gap where silent SaaS products go to die.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I had shipped the product. I had not shipped the lifecycle. And the lifecycle was the product, because nobody reaches the product if nobody leads them to it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why I Stopped Trusting My Email Tool
&lt;/h2&gt;

&lt;p&gt;The standard advice is: open your email platform, pick a template, write the copy, set a 3-day delay, repeat. I tried that. I hated it. Here is what I kept running into.&lt;/p&gt;

&lt;h3&gt;
  
  
  Timers Ignore What Users Actually Do
&lt;/h3&gt;

&lt;p&gt;Most tools schedule emails on fixed delays. Send email 2 three days after email 1. That logic treats every user the same. A user who activated on day 1 gets the same nudge as someone who never logged in. It is spam disguised as personalization.&lt;/p&gt;

&lt;p&gt;The research is blunt about this. Triggered emails, meaning emails sent based on what a user does, get 76% higher open rates and 152% higher click-through rates than timed sends. If I wanted the 37% revenue, I had to build triggered, not timed.&lt;/p&gt;

&lt;h3&gt;
  
  
  HTML Templates Look Professional and Underperform
&lt;/h3&gt;

&lt;p&gt;I had been building rich HTML emails with my logo, images, layout grids, the whole thing. They looked great in preview. They also tanked in practice.&lt;/p&gt;

&lt;p&gt;Plain text emails get 42% more clicks than HTML. Gmail and Apple Mail render plain text well. Heavy HTML triggers spam filters. Half the devices your users carry render your beautiful layout like a ransom note. I had been optimizing for what looked good in my design tool, not for what clicked in real inboxes.&lt;/p&gt;

&lt;h3&gt;
  
  
  External Tools Have No Idea What Your App Is Doing
&lt;/h3&gt;

&lt;p&gt;This is the one that killed me. My email platform could not check if a user had already activated before sending the activation nudge. It could not skip the upgrade email for someone who upgraded yesterday. The email system and the product lived in separate worlds. I was sending people upgrade prompts after they had upgraded.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Actually Wanted to Build
&lt;/h2&gt;

&lt;p&gt;I wrote down what a correct email system would do. It would:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Read my product, my user journeys, my brand voice, and design sequences that sounded like my product&lt;/li&gt;
&lt;li&gt;Cover all six major lifecycle stages, not just welcome and billing&lt;/li&gt;
&lt;li&gt;React to what users do, not when they signed up&lt;/li&gt;
&lt;li&gt;Check the database between every email so I never sent something stupid&lt;/li&gt;
&lt;li&gt;Render both HTML and plain text so every client got the version it preferred&lt;/li&gt;
&lt;li&gt;Wire triggers to events my app already fires (signup, limit hit, cancellation)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then I looked at what that would take to build by hand. Seventeen templates. Six background job functions. Trigger wiring across three codebases. Type checking. Tests. Maybe a week of focused work if nothing went sideways.&lt;/p&gt;

&lt;p&gt;I did not have a week. I had a conversion problem right now.&lt;/p&gt;

&lt;p&gt;So I built it differently.&lt;/p&gt;

&lt;h2&gt;
  
  
  One Command, Six Agents, Seventeen Emails
&lt;/h2&gt;

&lt;p&gt;I built a slash command called &lt;code&gt;/emails&lt;/code&gt; that runs a pipeline of Claude Code agents. One command reads my product, designs every sequence, waits for approval, then builds in parallel.&lt;/p&gt;

&lt;p&gt;The command runs in two phases.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Phase 1 is design.&lt;/strong&gt; One agent reads my product overview, user journeys, brand guidelines, feature map, auth flow, and billing setup. From that it builds an Email Brief: the product name, the "aha moment" (the specific action where the product clicks), pricing model, top features to educate about, and upgrade triggers. Then it designs all 17 emails. Subject lines, timing, goals, copy angles.&lt;/p&gt;

&lt;p&gt;Nothing gets built yet. The plan goes to me first. I can cut sequences, rewrite subject lines, change timing, or kill emails I do not want. The build phase uses my approved plan as its spec.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Phase 2 is build.&lt;/strong&gt; Six agents work in parallel. One base agent goes first and creates the shared layout and sending utility. Then five agents build their sequences at the same time. Templates are React components. Background jobs use Inngest (a job engine that handles scheduling, retries, and event-driven workflows).&lt;/p&gt;

&lt;p&gt;Twenty minutes of wall-clock time. Six parallel agents. Seventeen typed templates. Six functions. One type check at the end.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/.%2Fslide-02.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/.%2Fslide-02.png" alt="Email Sequences Overview" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Six Sequences Every SaaS Is Missing
&lt;/h2&gt;

&lt;p&gt;Here is the lifecycle the system builds. I did not invent this. I read about 40 email marketing studies and pulled the pattern that worked.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Welcome (4 emails, days 0 to 7)
&lt;/h3&gt;

&lt;p&gt;This is the highest-value sequence you will ever send. First welcome emails average 55 to 70% open rates. Top-performing SaaS companies hit 75% plus. That is the best attention you will ever get from a user. Four emails over the first seven days, each one guiding them closer to the moment your product clicks for them.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Activation Nudge (+48h if no key action)
&lt;/h3&gt;

&lt;p&gt;Someone finishes onboarding, sees the dashboard, and leaves. This is the single biggest drop-off point in most SaaS products. One email 48 hours later, focused on one specific action, pulls a chunk of them back.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Feature Education (days 14 to 28)
&lt;/h3&gt;

&lt;p&gt;The user is active but only uses one feature. Three emails, each highlighting something they have not touched yet. Segmented campaigns like this drive up to 760% more revenue than broadcast blasts.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Upgrade Prompt (on limit hit)
&lt;/h3&gt;

&lt;p&gt;This one fires when the user hits a real limit, not on a timer. They tried to do something the free plan does not allow. That is the moment they actually understand the value. Three emails over five days: what happened, what the paid plan includes, and an offer.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Churn Prevention (7 to 30 days inactive)
&lt;/h3&gt;

&lt;p&gt;Runs on a daily cron job. Every morning at 9 AM, the system queries the database for users inactive for 7 plus days. Three emails spread across the next 30 days, each using a different angle: here is what you are missing, your data is still here, tell us what went wrong.&lt;/p&gt;

&lt;h3&gt;
  
  
  6. Win-Back (0 to 30 days after cancellation)
&lt;/h3&gt;

&lt;p&gt;Starts when Stripe sends a cancellation webhook. Three emails over 30 days. The first acknowledges the cancellation. The second highlights what improved. The third makes an offer.&lt;/p&gt;

&lt;p&gt;Here is the part that floored me. A study of 38 SaaS companies found that only 2 covered all of these stages. Two out of thirty-eight. Most stop at welcome and billing. The other 36 are leaving 37% of their email revenue on the table.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Only 2 of 38 SaaS companies cover all six lifecycle stages. Everyone else is bleeding revenue politely.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The Part That Actually Matters: Behavioral Branching
&lt;/h2&gt;

&lt;p&gt;The thing that separates a real lifecycle system from a timer-based drip is branching. Reacting to what users do, not when they signed up.&lt;/p&gt;

&lt;p&gt;Inngest has a feature called &lt;code&gt;step.waitForEvent&lt;/code&gt;. In plain terms, it pauses a sequence and waits for a specific event to happen before continuing. If that event does not arrive within a time limit, the sequence takes a different path.&lt;/p&gt;

&lt;p&gt;Here is how that plays out for a new user.&lt;/p&gt;

&lt;p&gt;They sign up. The welcome sequence starts. Email 1 sends immediately. Then the function calls &lt;code&gt;step.waitForEvent&lt;/code&gt; and waits up to 48 hours for a &lt;code&gt;user.activated&lt;/code&gt; event, meaning the user completed the product's core action.&lt;/p&gt;

&lt;p&gt;If the event arrives, the sequence skips the activation nudge and jumps straight to feature education. The user already found the value. Nudging them would be noise.&lt;/p&gt;

&lt;p&gt;If the event does not arrive, the activation nudge fires. One email focused on the specific action they have not taken.&lt;/p&gt;

&lt;p&gt;That same branching happens inside every sequence. The upgrade prompt checks if the user upgraded between emails. Churn prevention checks if they logged back in. Win-back checks if they re-subscribed. Between every &lt;code&gt;step.sleep()&lt;/code&gt; call, the function opens a fresh database connection and checks the user's current state. If the user already converted, the sequence stops.&lt;/p&gt;
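&lt;p&gt;The branch-then-check shape is small enough to sketch outside of Inngest. Here is a minimal TypeScript sketch of the welcome-sequence branch (all names here, &lt;code&gt;UserState&lt;/code&gt; and &lt;code&gt;nextWelcomeStep&lt;/code&gt;, are illustrative, not from the real system; in production this decision lives inside an Inngest function between &lt;code&gt;step.waitForEvent&lt;/code&gt; and the next send):&lt;/p&gt;

```typescript
// Hypothetical model of the user state the sequence re-checks.
type UserState = {
  activated: boolean; // has the user completed the product's core action?
  upgraded: boolean;  // has the user moved to a paid plan?
};

// After email 1, the function waited up to 48h for a "user.activated" event.
// Decide which branch the welcome sequence takes next.
function nextWelcomeStep(eventArrived: boolean, state: UserState): string {
  if (eventArrived || state.activated) {
    // The user already found the value; a nudge would be noise.
    return "feature-education";
  }
  return "activation-nudge";
}
```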

&lt;p&gt;No one gets an upgrade email the day after they upgraded. No one gets a churn email the morning they came back. The email system knows what the product knows, because it is built inside the same app.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Copy Angle Nobody Teaches You: Loss Framing
&lt;/h2&gt;

&lt;p&gt;I want to single out one specific thing the research changed about my copy. The win-back sequence.&lt;/p&gt;

&lt;p&gt;Most founders write cancellation follow-ups that are sunny and optimistic. "Come back, we miss you, look at all the great new things we built." Positive framing. It feels right.&lt;/p&gt;

&lt;p&gt;It underperforms. By a lot.&lt;/p&gt;

&lt;p&gt;Loss-framed messaging, copy built around what the user is giving up, converts 21 to 32% better than positive framing. My win-back emails now lead with what the user loses by staying away. The data they cannot access. The automations still running in the background waiting for them. The integration they set up that is about to go stale.&lt;/p&gt;

&lt;p&gt;It feels counterintuitive. Loss framing sounds mean. But humans are wired to avoid loss more than to seek gain, and email copy is the place where that wiring shows up most clearly.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Happened When I Actually Used This
&lt;/h2&gt;

&lt;p&gt;I rolled out the full onboarding arc on my MVP.&lt;/p&gt;

&lt;p&gt;Trial-to-paid went from 12% to 22% over six weeks. Monthly churn dropped from 8% to 4.8%. Those are the two numbers that compound. A 10-point conversion lift plus a 3-point churn reduction changes the math on every marketing dollar I spend. Every channel I was running suddenly paid back more, because the funnel behind it stopped leaking at the bottom.&lt;/p&gt;

&lt;p&gt;The other thing I did not expect: my support volume went down. People stopped asking me basic feature questions. The education sequence was answering them before they hit the contact form.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Trial-to-paid went from 12% to 22%. Churn went from 8% to 4.8%. Support volume dropped. The emails did the work my onboarding was supposed to do.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Six Rules I Would Keep Even If I Rebuilt This
&lt;/h2&gt;

&lt;p&gt;The pattern works with any email provider, any job engine, any framework. Here is what I would carry forward.&lt;/p&gt;

&lt;h3&gt;
  
  
  Read the product before designing emails
&lt;/h3&gt;

&lt;p&gt;The most common mistake is writing emails that sound generic because the person writing them did not understand the product deeply enough. Feed the system your product docs, user journeys, and brand voice before it touches a template.&lt;/p&gt;

&lt;h3&gt;
  
  
  Design before you build
&lt;/h3&gt;

&lt;p&gt;Present the full plan. Get approval. Then build. Changing a subject line in a plan is free. Changing it in a built template means re-rendering, re-testing, re-deploying.&lt;/p&gt;

&lt;h3&gt;
  
  
  Check user state between every email
&lt;/h3&gt;

&lt;p&gt;Never assume the user is still in the state they were in when the sequence started. A &lt;code&gt;step.sleep("3d")&lt;/code&gt; means three days of potential changes. Query the database before sending.&lt;/p&gt;
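&lt;p&gt;A sketch of what that guard looks like, reduced to a single predicate evaluated against a fresh database read (the &lt;code&gt;User&lt;/code&gt; shape and function name are hypothetical):&lt;/p&gt;

```typescript
// Hypothetical user record, re-fetched from the database after each sleep.
type User = { id: string; plan: string };

// Only send the next upgrade email if the user is still on the free plan.
// The sequence started days ago; the user may have upgraded in the meantime.
function shouldSendUpgradeEmail(user: User): boolean {
  return user.plan === "free";
}
```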

&lt;h3&gt;
  
  
  One action per email
&lt;/h3&gt;

&lt;p&gt;Not two buttons. Not three links. One clear next step. Emails with a single call to action get higher click-through rates because there is no decision to make.&lt;/p&gt;

&lt;h3&gt;
  
  
  Build plain text alongside HTML
&lt;/h3&gt;

&lt;p&gt;Do not treat plain text as an afterthought. Render both versions from the same template. Plain text emails get 42% more clicks in head-to-head tests, and you get it for free if you design for both.&lt;/p&gt;

&lt;h3&gt;
  
  
  Wire triggers to events your app already fires
&lt;/h3&gt;

&lt;p&gt;Signup already happens. Cancellation already happens. Limit hits already happen. The email system hooks into events that already exist. That is zero new infrastructure.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Pattern Works Beyond Email
&lt;/h2&gt;

&lt;p&gt;The parallel-agents pattern (one base agent builds shared infrastructure, then specialized agents build on top of it at the same time) is not specific to email.&lt;/p&gt;

&lt;p&gt;I have since used the same shape for notification systems (push, in-app, SMS), onboarding flows (modal, tour, checklist, empty states), and documentation (structure and shared components, then sections written in parallel). Same coordination. Same speed win.&lt;/p&gt;

&lt;p&gt;Read the product context first. Design before building. Build in parallel with shared infrastructure. Check state before acting. Type-check everything at the end. That is the entire recipe.&lt;/p&gt;

&lt;h2&gt;
  
  
  If You Want the Whole Thing
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;/emails&lt;/code&gt; command is part of Build This Now. It is an AI-powered build system for shipping production SaaS in 48 hours. Eighteen specialist agents, 55+ skills, a full production codebase with auth, payments, storage, analytics, and security built in. One-time payment. No subscriptions.&lt;/p&gt;

&lt;p&gt;If you want to stop shipping products that die quietly three weeks after launch, start here:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Full system: &lt;a href="https://buildthisnow.com" rel="noopener noreferrer"&gt;buildthisnow.com&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Read the full technical breakdown on the blog: &lt;a href="https://buildthisnow.com/blog/ai-email-sequences" rel="noopener noreferrer"&gt;buildthisnow.com/blog/ai-email-sequences&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The MVP is the product. The lifecycle is the business. Build both.&lt;/p&gt;




</description>
      <category>ai</category>
      <category>claudecode</category>
      <category>saas</category>
      <category>emailmarketing</category>
    </item>
    <item>
      <title>MCP Tool Hooks in Claude Code</title>
      <dc:creator>speedy_devv</dc:creator>
      <pubDate>Fri, 24 Apr 2026 13:22:05 +0000</pubDate>
      <link>https://dev.to/speedy_devv/mcp-tool-hooks-in-claude-code-24f6</link>
      <guid>https://dev.to/speedy_devv/mcp-tool-hooks-in-claude-code-24f6</guid>
      <description>&lt;p&gt;Your hooks run shell scripts. Every time a hook needs to call an MCP server, it spawns a subprocess, wires up transport, handles auth, parses the response, and formats JSON back to stdout. For a security check that fires on every file write, that overhead adds up fast.&lt;/p&gt;

&lt;p&gt;As of Claude Code v2.1.118, there is a cleaner path. Hooks have a new type that calls MCP tools directly. No subprocess. The MCP server is already running. The hook calls straight into its RPC connection.&lt;/p&gt;

&lt;p&gt;Add this to &lt;code&gt;.claude/settings.json&lt;/code&gt; to run a security scan after every file write:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7ijs867pybukqry3xh6y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7ijs867pybukqry3xh6y.png" alt=" " width="800" height="420"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"PostToolUse"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"matcher"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Write|Edit"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"mcp_tool"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"server"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"semgrep"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"tool"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"scan_file"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"input"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"path"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"${tool_input.file_path}"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is the whole thing. The tool's text output goes through the same JSON decision parser as any command hook. No shell, no PATH problems, no jq dependency.&lt;/p&gt;

&lt;h2&gt;
  
  
  What type: "mcp_tool" actually is
&lt;/h2&gt;

&lt;p&gt;Before v2.1.118, hooks had four handler types: &lt;code&gt;command&lt;/code&gt;, &lt;code&gt;http&lt;/code&gt;, &lt;code&gt;prompt&lt;/code&gt;, and &lt;code&gt;agent&lt;/code&gt;. Now there are five:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;command&lt;/code&gt; — shell subprocess (stdin/stdout)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;http&lt;/code&gt; — POST to a URL endpoint&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;mcp_tool&lt;/code&gt; — direct RPC call to a connected MCP server&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;prompt&lt;/code&gt; — single-turn LLM evaluation (Haiku default)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;agent&lt;/code&gt; — multi-turn subagent with Read/Grep/Glob access&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;code&gt;mcp_tool&lt;/code&gt; type works on every hook event, same as &lt;code&gt;command&lt;/code&gt; and &lt;code&gt;http&lt;/code&gt;. One practical caveat: &lt;code&gt;SessionStart&lt;/code&gt; and &lt;code&gt;Setup&lt;/code&gt; fire while servers are still connecting. Those hooks may get a "server not connected" error on first run. Subsequent runs are fine.&lt;/p&gt;

&lt;h2&gt;
  
  
  The full schema
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"mcp_tool"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"server"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"my-mcp-server"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"tool"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tool_name"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"input"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"arg1"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"${tool_input.file_path}"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"arg2"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"${session_id}"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"timeout"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"statusMessage"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Checking..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"if"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Edit(*.ts|*.tsx)"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three fields are specific to &lt;code&gt;mcp_tool&lt;/code&gt; hooks: &lt;code&gt;server&lt;/code&gt;, &lt;code&gt;tool&lt;/code&gt;, and &lt;code&gt;input&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;server&lt;/code&gt; must exactly match the server name in your MCP configuration. One character of difference and the hook fails with a non-blocking error that is easy to miss.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;input&lt;/code&gt; values support &lt;code&gt;${field.path}&lt;/code&gt; dot-notation into the hook's full event JSON. For a &lt;code&gt;PostToolUse&lt;/code&gt; hook on a &lt;code&gt;Write&lt;/code&gt; call, the event includes &lt;code&gt;tool_input.file_path&lt;/code&gt;, &lt;code&gt;tool_input.content&lt;/code&gt;, &lt;code&gt;session_id&lt;/code&gt;, &lt;code&gt;cwd&lt;/code&gt;, &lt;code&gt;duration_ms&lt;/code&gt;, and more. Any field is reachable.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;if&lt;/code&gt; uses permission-rule syntax. &lt;code&gt;"Edit(*.py|*.ts|*.js)"&lt;/code&gt; means the hook only fires when the edited file matches one of those extensions. On a docs-heavy project with constant markdown edits, this is a real performance difference.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this matters vs. command hooks
&lt;/h2&gt;

&lt;p&gt;Two concrete differences, not just speed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stateful servers.&lt;/strong&gt; A shell subprocess starts fresh every time. An MCP server is a live process with its own state: loaded configs, open connections, caches, accumulated session context. A linting MCP that pre-parsed your &lt;code&gt;tsconfig.json&lt;/code&gt; on startup does not re-parse it on every file write. A command hook does.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No shell environment dependency.&lt;/strong&gt; Command hooks fail silently when &lt;code&gt;PATH&lt;/code&gt; is wrong, when &lt;code&gt;jq&lt;/code&gt; is not installed, when &lt;code&gt;~/.zshrc&lt;/code&gt; prints something to stdout on non-interactive shells. MCP tool hooks bypass all of that. The call goes from Claude Code to the server over the existing RPC connection.&lt;/p&gt;

&lt;h2&gt;
  
  
  How the output is processed
&lt;/h2&gt;

&lt;p&gt;The MCP tool's text content is treated exactly like a command hook's stdout. If it parses as valid JSON, Claude Code acts on the decision fields. If not, the text becomes context for Claude.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"decision"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"block"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"reason"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Security issue found in src/api.ts: SQL injection risk on line 42."&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Return this from a &lt;code&gt;PostToolUse&lt;/code&gt; MCP tool hook and Claude gets the message and fixes the file. For blocking before a tool runs, use &lt;code&gt;PreToolUse&lt;/code&gt; and return &lt;code&gt;permissionDecision: "deny"&lt;/code&gt;.&lt;/p&gt;
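&lt;p&gt;For the &lt;code&gt;PreToolUse&lt;/code&gt; case, the returned JSON has a different shape. A sketch based on the standard hook output schema (the reason string is illustrative):&lt;/p&gt;

```json
{
  "hookSpecificOutput": {
    "hookEventName": "PreToolUse",
    "permissionDecision": "deny",
    "permissionDecisionReason": "Scan flagged this file. Do not write it."
  }
}
```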

&lt;p&gt;One field is exclusive to &lt;code&gt;mcp_tool&lt;/code&gt; hooks on &lt;code&gt;PostToolUse&lt;/code&gt;: &lt;code&gt;updatedMCPToolOutput&lt;/code&gt;. It replaces what Claude sees as the tool's output before it enters the conversation. A running MCP server can post-process another tool's result before Claude reads it.&lt;/p&gt;
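&lt;p&gt;As a sketch, a hook that redacts a secret from another tool's result before Claude reads it might return something like this (the output string is illustrative):&lt;/p&gt;

```json
{
  "updatedMCPToolOutput": "Contents of .env: [redacted by security hook]"
}
```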

&lt;h2&gt;
  
  
  Pattern 1: Security scanning on every write
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"PostToolUse"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"matcher"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Write|Edit"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"mcp_tool"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"server"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"semgrep"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"tool"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"scan_file"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"if"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Write(*.ts|*.py|*.js|*.go)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"input"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"path"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"${tool_input.file_path}"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"statusMessage"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Scanning..."&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The scan runs on the server's cached ruleset. Not a fresh subprocess parse on every write. If the tool finds something, return &lt;code&gt;decision: "block"&lt;/code&gt; with the finding. Claude reworks the file before continuing.&lt;/p&gt;
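&lt;p&gt;A minimal sketch of the tool body behind that config (the substring rules stand in for Semgrep's real ruleset; the response shape follows the &lt;code&gt;decision&lt;/code&gt; field described above):&lt;/p&gt;

```python
def scan_file(source, cached_rules):
    """Hypothetical scan_file MCP tool body: match the server's cached
    ruleset against file contents already read into `source`. The
    substring rule format is illustrative only."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for rule_id, pattern in cached_rules.items():
            if pattern in line:
                findings.append(f"{rule_id} on line {lineno}")
    if findings:
        # blocking response: Claude reworks the file before continuing
        return {"decision": "block", "reason": "; ".join(findings)}
    return {}  # empty response lets the session proceed
```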

&lt;h2&gt;
  
  
  Pattern 2: Stop hook with external verification
&lt;/h2&gt;

&lt;p&gt;A Stop hook that calls a Linear MCP to check whether the related ticket is actually closed before Claude declares done:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Stop"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"mcp_tool"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"server"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"linear"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"tool"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"get_issue_status"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"input"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"issue_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"${tool_input.issue_id}"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Always check &lt;code&gt;stop_hook_active&lt;/code&gt; in your Stop hook logic. The event JSON includes this field as &lt;code&gt;"true"&lt;/code&gt; when Claude is already continuing from a previous Stop hook firing. A server that ignores this creates an infinite loop. Build the guard into the MCP tool: if &lt;code&gt;stop_hook_active&lt;/code&gt; is &lt;code&gt;"true"&lt;/code&gt; in the input, return empty output and exit cleanly.&lt;/p&gt;
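&lt;p&gt;A sketch of that guard, with the ticket check reduced to a boolean parameter (hypothetical; a real server would call Linear's &lt;code&gt;get_issue_status&lt;/code&gt; here):&lt;/p&gt;

```python
def stop_hook_tool(event, ticket_closed):
    """Hypothetical Stop hook tool body. ticket_closed stands in for
    the real Linear get_issue_status call."""
    # Guard first: if Claude is already continuing from a previous Stop
    # hook firing, return empty output so we never loop.
    if event.get("stop_hook_active") in (True, "true"):
        return {}
    if not ticket_closed:
        return {"decision": "block",
                "reason": "Linear ticket is still open. Close it or finish the work."}
    return {}
```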

&lt;h2&gt;
  
  
  Pattern 3: Production error check before stopping
&lt;/h2&gt;

&lt;p&gt;After Claude finishes a feature, check whether anything new broke in staging before marking the session complete:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Stop"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"mcp_tool"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"server"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sentry"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"tool"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"get_new_errors_since"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"input"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"minutes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"5"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"skip_if_active"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"${stop_hook_active}"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If new errors appeared in the last five minutes, the MCP tool returns them with &lt;code&gt;decision: "block"&lt;/code&gt;. Claude reads the error details and fixes the regression before stopping.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pattern 4: Auto-inject docs before every prompt
&lt;/h2&gt;

&lt;p&gt;A &lt;code&gt;UserPromptSubmit&lt;/code&gt; hook with a Context7 MCP fetches live documentation for any library mentioned in the prompt, before Claude processes it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"UserPromptSubmit"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"mcp_tool"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"server"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"context7"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"tool"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"get_library_docs"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"input"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"prompt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"${prompt}"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"timeout"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Previously this required Claude to explicitly call the MCP tool. Now it happens on every prompt automatically. Claude starts with current docs instead of training data.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pattern 5: Policy enforcement for agent teams
&lt;/h2&gt;

&lt;p&gt;When running multi-agent workflows, a shared policy MCP server can enforce which agent writes to which directories:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"PreToolUse"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"matcher"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Write|Edit"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"mcp_tool"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"server"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"policy-server"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"tool"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"check_write_permission"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"input"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
              &lt;/span&gt;&lt;span class="nl"&gt;"agent"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"${agent_name}"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
              &lt;/span&gt;&lt;span class="nl"&gt;"path"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"${tool_input.file_path}"&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Update the server once and every agent in every project inherits the new rules. No touching individual &lt;code&gt;settings.json&lt;/code&gt; files.&lt;/p&gt;
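&lt;p&gt;The server side can be as small as a prefix allow-list (the agent names and paths below are hypothetical; the response shape follows the &lt;code&gt;permissionDecision&lt;/code&gt; field described earlier):&lt;/p&gt;

```python
# Hypothetical policy map: agent name to writable path prefixes.
POLICY = {
    "backend-developer": ("src/api/", "src/db/"),
    "frontend-developer": ("src/ui/",),
}

def check_write_permission(agent, path):
    """Deny the Write before it runs unless the path falls under one of
    the agent's allowed prefixes."""
    allowed = POLICY.get(agent, ())
    if any(path.startswith(prefix) for prefix in allowed):
        return {}  # permitted: let the tool call proceed
    return {"permissionDecision": "deny",
            "reason": f"{agent} is not allowed to write to {path}"}
```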

&lt;h2&gt;
  
  
  Pattern 6: MCP tool hooks in agent frontmatter
&lt;/h2&gt;

&lt;p&gt;Hooks can live in an agent's YAML frontmatter, scoped to that agent's lifecycle:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;backend-developer&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Builds&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;API&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;endpoints&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;and&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;database&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;logic"&lt;/span&gt;
&lt;span class="na"&gt;hooks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;PostToolUse&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;matcher&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Write"&lt;/span&gt;
      &lt;span class="na"&gt;hooks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;mcp_tool&lt;/span&gt;
          &lt;span class="na"&gt;server&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;semgrep&lt;/span&gt;
          &lt;span class="na"&gt;tool&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;scan_file&lt;/span&gt;
          &lt;span class="na"&gt;input&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;path"&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;${tool_input.file_path}"&lt;/span&gt; &lt;span class="pi"&gt;}&lt;/span&gt;
  &lt;span class="na"&gt;Stop&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;hooks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;agent&lt;/span&gt;
          &lt;span class="na"&gt;prompt&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Verify&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;all&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;API&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;endpoints&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;have&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;corresponding&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;tests.&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Block&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;if&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;any&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;are&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;missing."&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each specialist agent in an orchestrated team carries its own validation logic. The backend agent scans for security issues. The frontend agent checks accessibility. Neither needs a global hook that applies to everyone.&lt;/p&gt;

&lt;h2&gt;
  
  
  MCP servers worth pairing with hooks
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Server&lt;/th&gt;
&lt;th&gt;Event&lt;/th&gt;
&lt;th&gt;What it does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Semgrep&lt;/td&gt;
&lt;td&gt;PostToolUse: Write&lt;/td&gt;
&lt;td&gt;Security scan on every write&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sentry&lt;/td&gt;
&lt;td&gt;Stop&lt;/td&gt;
&lt;td&gt;Check for new staging errors before completing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Linear / Jira&lt;/td&gt;
&lt;td&gt;Stop&lt;/td&gt;
&lt;td&gt;Verify ticket status, update on completion&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Context7&lt;/td&gt;
&lt;td&gt;UserPromptSubmit&lt;/td&gt;
&lt;td&gt;Auto-fetch live docs for mentioned libraries&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ElevenLabs&lt;/td&gt;
&lt;td&gt;Stop&lt;/td&gt;
&lt;td&gt;TTS audio on task completion&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Slack&lt;/td&gt;
&lt;td&gt;Notification, Stop&lt;/td&gt;
&lt;td&gt;Team alerts without curl boilerplate&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;E2B&lt;/td&gt;
&lt;td&gt;Stop&lt;/td&gt;
&lt;td&gt;Run generated scripts in a sandbox before marking done&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;claude-mem&lt;/td&gt;
&lt;td&gt;PostCompact, SessionStart&lt;/td&gt;
&lt;td&gt;Restore session context after compaction&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;n8n&lt;/td&gt;
&lt;td&gt;TaskCompleted&lt;/td&gt;
&lt;td&gt;Trigger an external workflow on completion&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Known issue: PostToolUse + MCP events + additionalContext
&lt;/h2&gt;

&lt;p&gt;There is an open bug (GitHub issue #24788) where &lt;code&gt;additionalContext&lt;/code&gt; from hooks gets silently dropped when the triggering event was an MCP tool call. This affects &lt;code&gt;type: "command"&lt;/code&gt; hooks that respond to MCP tool events, not &lt;code&gt;mcp_tool&lt;/code&gt; hooks themselves.&lt;/p&gt;

&lt;p&gt;The distinction matters: hooks that are MCP invocations work fine. Hooks that respond to MCP tool calls and return &lt;code&gt;additionalContext&lt;/code&gt; do not. The workaround is &lt;code&gt;exit 2&lt;/code&gt; plus stderr for critical messages. The blocking pattern works. Advisory injection does not.&lt;/p&gt;
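&lt;p&gt;A sketch of the workaround in a hook script (the helper is hypothetical; the exit-code-2-plus-stderr convention is the blocking channel described above):&lt;/p&gt;

```python
import sys

def hook_exit(finding):
    """Map a scan finding to the (exit_code, stderr_message) pair a
    command hook should emit. Exit code 2 blocks and surfaces stderr
    to Claude, which still works where additionalContext is dropped."""
    if finding:
        return (2, finding)  # critical path: block with the message
    return (0, "")           # clean: allow the action

def emit(code, message):
    """Write the message to stderr, then exit with the given code."""
    if message:
        print(message, file=sys.stderr)
    sys.exit(code)
```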

&lt;h2&gt;
  
  
  The hook system's last missing piece
&lt;/h2&gt;

&lt;p&gt;Before this, hooks were a safety net. Shell commands that could block dangerous things or run formatters. Stateless, process-local, disconnected from everything your MCP servers already know.&lt;/p&gt;

&lt;p&gt;After: hooks are a deterministic orchestration layer. Any event, any MCP tool, full decision control, with state that persists across calls and no subprocess overhead.&lt;/p&gt;

&lt;p&gt;PreToolUse validates. PostToolUse formats and scans. PostToolBatch runs tests. Stop verifies with real external data. Every step can be an MCP tool invocation. None of them require a shell script.&lt;/p&gt;

&lt;p&gt;Full reference with schema, substitution syntax, and all event types: &lt;a href="https://buildthisnow.com/blog/tools/hooks/mcp-tool-hooks" rel="noopener noreferrer"&gt;https://buildthisnow.com/blog/tools/hooks/mcp-tool-hooks&lt;/a&gt;&lt;/p&gt;

</description>
      <category>claude</category>
      <category>cli</category>
      <category>mcp</category>
      <category>tooling</category>
    </item>
    <item>
      <title>5 Frontier Models Compared</title>
      <dc:creator>speedy_devv</dc:creator>
      <pubDate>Thu, 23 Apr 2026 19:43:03 +0000</pubDate>
      <link>https://dev.to/speedy_devv/5-frontier-models-compared-3mgb</link>
      <guid>https://dev.to/speedy_devv/5-frontier-models-compared-3mgb</guid>
      <description>&lt;p&gt;I kept picking the wrong model.&lt;/p&gt;

&lt;p&gt;Not because I didn't know the benchmarks. Because the benchmarks don't tell you what a model actually costs when you're running it daily, or whether it holds up across a 3-hour agent session, or whether it can fit your whole codebase without truncating half of it.&lt;/p&gt;

&lt;p&gt;Five frontier models shipped in early 2026. All of them are good. None of them is good at everything. DeepSeek V3.2 costs $1 per million input tokens. GPT-5.4 costs $2.50 for the same volume. That is a 2.5x spread at the top of the lineup, and the cheaper model is not always worse for the job at hand.&lt;/p&gt;

&lt;p&gt;Here is what I learned after running them all.&lt;/p&gt;




&lt;h2&gt;
  
  
  The price gap nobody talks about
&lt;/h2&gt;

&lt;p&gt;The headline numbers first, because they set the frame.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Input / Output (per 1M tokens)&lt;/th&gt;
&lt;th&gt;Context Window&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Claude Opus 4.7&lt;/td&gt;
&lt;td&gt;$5 / $25&lt;/td&gt;
&lt;td&gt;1M tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GPT-5.4&lt;/td&gt;
&lt;td&gt;$2.50 / $15&lt;/td&gt;
&lt;td&gt;256K tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Kimi K2.6&lt;/td&gt;
&lt;td&gt;$3 / $15&lt;/td&gt;
&lt;td&gt;512K tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemini 3.1 Pro&lt;/td&gt;
&lt;td&gt;$2 / $12&lt;/td&gt;
&lt;td&gt;2M tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DeepSeek V3.2&lt;/td&gt;
&lt;td&gt;$1 / $4&lt;/td&gt;
&lt;td&gt;128K tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The price gap is real. DeepSeek V3.2 costs a fifth of what Opus 4.7 costs per input token. Context windows vary by 16x from smallest to largest. DeepSeek's 128K window handles a medium codebase. Gemini's 2M window fits an entire monorepo.&lt;/p&gt;

&lt;p&gt;These gaps are not footnotes. For the right workloads, they are the whole decision.&lt;/p&gt;

&lt;h2&gt;
  
  
  Coding: where the separation actually shows up
&lt;/h2&gt;

&lt;p&gt;The standard benchmark is SWE-Bench, real GitHub issues where the model writes a fix that passes the test suite. Good benchmark. It skews toward clean, well-specified problems.&lt;/p&gt;

&lt;p&gt;CursorBench runs a different evaluation. Real prompts from Cursor users. Messy, underspecified, half-broken codebases. The kind of problems actual developers bring to an AI every day.&lt;/p&gt;

&lt;p&gt;Opus 4.7 leads CursorBench at 70%. GPT-5.4 comes close at 68% on SWE-Bench. On clean, well-defined problems the two are nearly even. On messy problems, the gap widens.&lt;/p&gt;

&lt;p&gt;What makes Opus 4.7 different on hard coding tasks is self-correction. Most models generate code, declare it done, and move on. Opus 4.7 reviews what it just wrote, spots the type error or logic gap, and fixes it in the same pass. One fewer debugging loop per session adds up across a week of engineering work. I noticed it first on a nasty legacy codebase with no tests and inconsistent patterns: Opus 4.7 held the thread across multiple refactoring steps where others started drifting.&lt;/p&gt;

&lt;p&gt;Gemini 3.1 Pro scores 63% on SWE-Bench and is a solid coding model when the task requires pulling context from a large codebase. The 2M window means it can read the whole thing. Where it falls behind is on complex reasoning chains where the model has to hold a long chain of logic without losing it.&lt;/p&gt;

&lt;p&gt;DeepSeek V3.2 at 52% is surprisingly capable on standard implementation tasks for its price. Clear prompt, unambiguous problem, it delivers. It does not belong on hard, ambiguous work, and it mostly knows that.&lt;/p&gt;

&lt;h2&gt;
  
  
  Long documents: two different dimensions
&lt;/h2&gt;

&lt;p&gt;Context window size and document reasoning quality are separate things. A huge window is useless if the model loses the plot. Strong reasoning is limited if the document doesn't fit.&lt;/p&gt;

&lt;p&gt;Gemini 3.1 Pro's 2M context is genuinely useful for real workloads: a large monorepo, a full set of legal contracts, a year of financial filings. Nothing gets truncated. If the task is "read everything and extract what matters," Gemini is the right tool.&lt;/p&gt;

&lt;p&gt;Opus 4.7's edge is accuracy over what it reads. On dense source material, it produces 21% fewer errors than its predecessor. That gap shows up most clearly in legal and financial work where a wrong clause or misread number has consequences. You can fit more raw text into Gemini, but Opus 4.7 does more with the text it reads.&lt;/p&gt;

&lt;p&gt;A practical combination for large, high-stakes documents: Gemini 3.1 Pro for the initial pass across the full document, Opus 4.7 for the sections that require careful reasoning. Full picture from Gemini, accuracy from Opus on the parts that matter.&lt;/p&gt;

&lt;h2&gt;
  
  
  Multi-step agents: where the real separation is
&lt;/h2&gt;

&lt;p&gt;Agent tasks are where the gap between models becomes undeniable. A model that is great at one-shot prompts can fall apart when it has to run for 20 steps, use tools, and keep track of what it already did.&lt;/p&gt;

&lt;p&gt;The failure mode looks the same across models: the agent starts losing coherence around step 10 to 15. It forgets what it already checked. It tries an approach it already tried. It produces a "done" message when the task is half-finished.&lt;/p&gt;

&lt;p&gt;Opus 4.7 stays coherent across hours of work. It has the lowest tool error rate of the group. When a tool call returns an unexpected result, it adjusts rather than proceeding on a false assumption. The practical payoff: you can set Opus 4.7 on a multi-hour task, walk away, and come back to actual results.&lt;/p&gt;

&lt;p&gt;GPT-5.4 is strong on short chains: 3 to 5 well-defined steps, executed fast. It is the fastest model in this group, which matters for interactive workflows where you are watching and course-correcting in real time. At the long end, reliability drops compared to Opus 4.7.&lt;/p&gt;

&lt;p&gt;DeepSeek V3.2 is the right call for lightweight agent work at volume. Bulk tagging, classification pipelines, structured extraction from well-formatted documents. Running 10M tokens through DeepSeek instead of Opus saves about $61 per batch.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it actually costs per real workload
&lt;/h2&gt;

&lt;p&gt;Headline prices only tell half the story. The actual cost depends on what you are running.&lt;/p&gt;

&lt;p&gt;Daily coding sessions (roughly 200K tokens each):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Cost per Session&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;DeepSeek V3.2&lt;/td&gt;
&lt;td&gt;$0.26&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemini 3.1 Pro&lt;/td&gt;
&lt;td&gt;$0.75&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Kimi K2.6&lt;/td&gt;
&lt;td&gt;$0.90&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GPT-5.4&lt;/td&gt;
&lt;td&gt;$1.60&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Opus 4.7&lt;/td&gt;
&lt;td&gt;$1.75&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For coding sessions, DeepSeek is nearly 7x cheaper than Opus 4.7. GPT-5.4 and Opus are actually close in per-session cost: GPT-5.4 wins on speed, Opus wins on hard problems.&lt;/p&gt;
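&lt;p&gt;To sanity-check these numbers against your own traffic, the arithmetic is just the table prices applied to an input/output split (the 180K/20K split below is an assumption for a 200K-token session, not a measured profile):&lt;/p&gt;

```python
# Per-1M-token (input, output) prices from the comparison table above.
PRICES = {
    "deepseek-v3.2": (1.00, 4.00),
    "gemini-3.1-pro": (2.00, 12.00),
    "kimi-k2.6": (3.00, 15.00),
    "gpt-5.4": (2.50, 15.00),
    "opus-4.7": (5.00, 25.00),
}

def session_cost(model, input_tokens, output_tokens):
    """Dollar cost of one session from raw token counts."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000

# Example: a 200K-token session split 180K input / 20K output.
# session_cost("deepseek-v3.2", 180_000, 20_000) is 0.26 dollars.
```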

&lt;p&gt;High-volume automation (10M tokens per month):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Monthly Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;DeepSeek V3.2&lt;/td&gt;
&lt;td&gt;$14&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemini 3.1 Pro&lt;/td&gt;
&lt;td&gt;$35&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Kimi K2.6&lt;/td&gt;
&lt;td&gt;$39&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Opus 4.7&lt;/td&gt;
&lt;td&gt;$75&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GPT-5.4&lt;/td&gt;
&lt;td&gt;$78&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;At bulk volumes, DeepSeek is in a different price category. $14 versus $78 for the same token volume is a fundamentally different operating cost. Gemini 3.1 Pro at $35/month is the surprise here: 2M context at less than half the price of Opus.&lt;/p&gt;

&lt;h2&gt;
  
  
  The default pair for most builders
&lt;/h2&gt;

&lt;p&gt;Opus 4.7 handles the tasks where quality decides the outcome: hard coding, debugging legacy code, long agent runs, precise document analysis. DeepSeek V3.2 handles the tasks where volume and cost decide the outcome: bulk automation, classification, templated generation, anything with a clear spec.&lt;/p&gt;

&lt;p&gt;Those two together cover 90% of what most builders actually need.&lt;/p&gt;

&lt;p&gt;The other three have specific edges worth knowing. Gemini 3.1 Pro for any workload that needs a 2M context window at a price Opus cannot match. GPT-5.4 for fast interactive work on clean codebases. Kimi K2.6 for Chinese-language documents at a competitive price.&lt;/p&gt;

&lt;p&gt;The question is never "which model is best." It is "which model is right for this task." Get that right and you spend less, finish faster, and fix fewer mistakes on the other side.&lt;/p&gt;
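
&lt;p&gt;That task-first framing is easy to encode. A minimal routing sketch; the task categories and model IDs here are illustrative labels for this post, not an official routing API:&lt;/p&gt;

```python
# Illustrative task-to-model router. Category names and model IDs are
# assumptions for the sketch, not part of any provider SDK.

ROUTES = {
    "hard_coding": "opus-4.7",
    "legacy_debugging": "opus-4.7",
    "long_agent_run": "opus-4.7",
    "bulk_classification": "deepseek-v3.2",
    "templated_generation": "deepseek-v3.2",
    "structured_extraction": "deepseek-v3.2",
    "huge_context": "gemini-3.1-pro",   # 2M-token window
    "fast_interactive": "gpt-5.4",
    "chinese_documents": "kimi-k2.6",
}

def pick_model(task_type: str) -> str:
    # Default to the cheap workhorse when the task has a clear spec.
    return ROUTES.get(task_type, "deepseek-v3.2")

print(pick_model("legacy_debugging"))     # -> opus-4.7
print(pick_model("bulk_classification"))  # -> deepseek-v3.2
```

&lt;p&gt;The table is the whole point: making the routing decision explicit and cheap to change beats hardcoding one model everywhere.&lt;/p&gt;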

&lt;p&gt;Full breakdown with all the benchmark tables and cost scenarios is here: &lt;a href="https://buildthisnow.com/blog/models/2026-04-21-opus47-vs-frontier" rel="noopener noreferrer"&gt;buildthisnow.com/blog/models/2026-04-21-opus47-vs-frontier&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>claudecode</category>
      <category>saas</category>
      <category>buildinpublic</category>
    </item>
    <item>
      <title>Claude Opus 4.7: What Actually Changed for Agentic Coding</title>
      <dc:creator>speedy_devv</dc:creator>
      <pubDate>Thu, 16 Apr 2026 17:46:11 +0000</pubDate>
      <link>https://dev.to/speedy_devv/claude-opus-47-what-actually-changed-for-agentic-coding-4i27</link>
      <guid>https://dev.to/speedy_devv/claude-opus-47-what-actually-changed-for-agentic-coding-4i27</guid>
      <description>&lt;p&gt;Anthropic shipped Opus 4.7 on April 16, 2026. Same $5/$25 pricing as 4.6. Same 1M context. Same 128K output ceiling. Different model entirely when you put it in an agentic coding loop.&lt;/p&gt;

&lt;p&gt;If you run Claude Code or build agent pipelines on the Claude API, this post is the migration brief I wish someone had handed me on release day. Benchmarks, the five behavioral shifts that actually matter, the new API features, and the breaking changes you will hit if you just flip the model ID.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Benchmark Table First
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;| Benchmark              | Opus 4.6 | Opus 4.7              |
|------------------------|----------|-----------------------|
| CursorBench            | 58%      | 70%                   |
| Rakuten-SWE-Bench      | Baseline | 3x resolution         |
| XBOW visual-acuity     | 54.5%    | 98.5%                 |
| OfficeQA Pro errors    | Baseline | 21% fewer             |
| BigLaw Bench (Harvey)  | Lower    | 90.9% at high effort  |
| Notion Agent errors    | Baseline | 1/3 the errors        |
| Factory Droids         | Baseline | +10-15% task success  |
| Bolt long-running apps | Baseline | +10% best case        |
| CodeRabbit recall      | Baseline | +10%                  |
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Rakuten's 3x is the one I keep coming back to. That is real production SWE tasks, not a synthetic eval. CursorBench moving 58 to 70 is what you actually feel in a Claude Code session: fewer rounds per feature.&lt;/p&gt;

&lt;h2&gt;
  
  
  Five Behavioral Shifts
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Self-Verification Is New
&lt;/h3&gt;

&lt;p&gt;Prior Claude models did not verify their own assumptions before acting unless you prompted them to. 4.7 does.&lt;/p&gt;

&lt;p&gt;Vercel reports the model runs proofs on systems code before starting work. Hex reports it flags missing data instead of inventing plausible-but-wrong fallbacks, and resists dissonant-data traps that 4.6 falls for.&lt;/p&gt;

&lt;p&gt;In a Claude Code session this looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Before I write this migration, let me verify the actual shape
of the response object, because my assumption here might be wrong.

[reads file]
[runs grep]

Confirmed. Writing migration now.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You did not ask for that step. The model added it.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Long Runs Stay Coherent
&lt;/h3&gt;

&lt;p&gt;Devin reports 4.7 works coherently for hours on hard problems instead of giving up. Genspark measured loop rates: prior models looped indefinitely on roughly 1 in 18 queries, while 4.7 posts the highest quality-per-tool-call ratio they have ever measured.&lt;/p&gt;

&lt;p&gt;Notion saw tool errors cut to a third of 4.6's rate on multi-step workflows. That is not a small optimization. That is a different failure mode.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Fewer Tool Calls, More Thinking
&lt;/h3&gt;

&lt;p&gt;Default behavior shifted. 4.7 thinks more and acts less. Ramp reports less need for step-by-step guidance on cross-tool, cross-codebase debugging.&lt;/p&gt;

&lt;p&gt;If you want the old tool-heavy behavior, raise effort:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude &lt;span class="nt"&gt;--model&lt;/span&gt; claude-opus-4-7 &lt;span class="nt"&gt;--effort&lt;/span&gt; high
&lt;span class="c"&gt;# or&lt;/span&gt;
claude &lt;span class="nt"&gt;--model&lt;/span&gt; claude-opus-4-7 &lt;span class="nt"&gt;--effort&lt;/span&gt; xhigh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. Instructions Get Read Literally
&lt;/h3&gt;

&lt;p&gt;Prompts that relied on 4.6 quietly filling in the gaps will hit unexpected behavior on 4.7. The fix is usually shorter and more explicit, not longer.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Fewer Subagents by Default
&lt;/h3&gt;

&lt;p&gt;If your orchestrator relied on 4.6 fanning out aggressively to specialists, 4.7 will pull that back. You can still request parallel subagent work explicitly.&lt;/p&gt;

&lt;h2&gt;
  
  
  The &lt;code&gt;xhigh&lt;/code&gt; Effort Tier
&lt;/h2&gt;

&lt;p&gt;Adaptive thinking gained a fifth setting. Ordering is now:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;low &amp;lt; medium &amp;lt; high &amp;lt; xhigh &amp;lt; max
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;xhigh&lt;/code&gt; sits between &lt;code&gt;high&lt;/code&gt; and &lt;code&gt;max&lt;/code&gt;. Claude Code raised the default to &lt;code&gt;xhigh&lt;/code&gt; on all plans.&lt;/p&gt;

&lt;p&gt;For API work, Anthropic recommends starting at &lt;code&gt;xhigh&lt;/code&gt; and dialing down:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;anthropic&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;anthropic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Anthropic&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-opus-4-7&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;8192&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;thinking&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;adaptive&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;effort&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;xhigh&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Your agentic task here&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Hex's quote is the cost-math line worth memorizing: "low-effort Opus 4.7 is roughly equivalent to medium-effort Opus 4.6." Same quality, lower tier, fewer tokens.&lt;/p&gt;

&lt;h2&gt;
  
  
  Task Budgets (Public Beta)
&lt;/h2&gt;

&lt;p&gt;Task budgets are the feature I keep recommending first. An advisory cap on a full agentic loop: thinking, tool calls, tool results, final output. The model sees a running countdown and self-paces against it.&lt;/p&gt;

&lt;p&gt;Minimum value is 20k tokens. Distinct from &lt;code&gt;max_tokens&lt;/code&gt;, which is a hard ceiling the model never sees.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;anthropic&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;anthropic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Anthropic&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-opus-4-7&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;8192&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;extra_headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;anthropic-beta&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;task-budgets-2026-03-13&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;task_budget&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;200000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;thinking&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;adaptive&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;effort&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;xhigh&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Build this feature end to end&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The model can now plan: "I have 200k tokens to spend. I will use 30k on exploration, 90k on writing, 40k on review, and leave a buffer." A hard &lt;code&gt;max_tokens&lt;/code&gt; just cuts off mid-thought.&lt;/p&gt;

&lt;h2&gt;
  
  
  Vision: 3x Resolution
&lt;/h2&gt;

&lt;p&gt;First Claude model with high-resolution image support. Max image resolution went from 1568px / 1.15MP on prior models to 2576px / 3.75MP on 4.7. Roughly 3x the pixel budget.&lt;/p&gt;

&lt;p&gt;What changes in practice:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dense screenshots where small text has to be readable&lt;/li&gt;
&lt;li&gt;Complex diagrams with nested labels&lt;/li&gt;
&lt;li&gt;High-fidelity design mockups&lt;/li&gt;
&lt;li&gt;Pointing, measuring, counting, bounding-box localization&lt;/li&gt;
&lt;li&gt;Coordinates returned are 1-to-1 with real image pixels (no scale conversion)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Cost note: a full-resolution image can consume up to roughly 4,784 tokens, versus roughly 1,600 on prior models. Downsample when you do not need the fidelity.&lt;/p&gt;
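
&lt;p&gt;A rough way to decide when to downsample, using the pixels-to-tokens approximation Anthropic documents for vision inputs (tokens &amp;#8776; width &amp;#215; height / 750; treat the constant and the threshold below as estimates, not exact accounting):&lt;/p&gt;

```python
# Estimate vision token cost and flag images worth downsampling.
# tokens ~= (width * height) / 750 is the documented approximation;
# exact counts vary, so treat the results as estimates only.

def image_tokens(width: int, height: int) -> int:
    return int(width * height / 750)

def should_downsample(width: int, height: int, budget: int = 1600) -> bool:
    """True when the image would cost more tokens than the budget."""
    return image_tokens(width, height) > budget

print(image_tokens(1568, 734))    # ~1.15MP-class image -> 1534
print(image_tokens(2576, 1456))   # ~3.75MP-class image -> 5000
print(should_downsample(2576, 1456))  # -> True
```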

&lt;h2&gt;
  
  
  Breaking Changes in the API
&lt;/h2&gt;

&lt;p&gt;This is the section you have to read before flipping the model ID.&lt;/p&gt;

&lt;h3&gt;
  
  
  Extended Thinking Is Removed
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# 4.6 and earlier (still valid on 4.6)
&lt;/span&gt;&lt;span class="n"&gt;thinking&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;enabled&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;budget_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;8192&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# 4.7 (old shape returns 400)
&lt;/span&gt;&lt;span class="n"&gt;thinking&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;adaptive&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;effort&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;xhigh&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Adaptive thinking is now off by default on 4.7. A request with no &lt;code&gt;thinking&lt;/code&gt; field runs with no thinking at all.&lt;/p&gt;

&lt;h3&gt;
  
  
  Sampling Parameters Are Removed
&lt;/h3&gt;

&lt;p&gt;Setting &lt;code&gt;temperature&lt;/code&gt;, &lt;code&gt;top_p&lt;/code&gt;, or &lt;code&gt;top_k&lt;/code&gt; to non-default values returns a 400:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Returns 400 on 4.7
&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-opus-4-7&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# &amp;lt;-- remove this
&lt;/span&gt;    &lt;span class="n"&gt;top_p&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;        &lt;span class="c1"&gt;# &amp;lt;-- remove this
&lt;/span&gt;    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[...]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Drop them. Use prompting to shape behavior.&lt;/p&gt;
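
&lt;p&gt;If you share request-building code across models, a small shim that strips the removed parameters before a 4.7 call saves you a round of 400s. The helper name here is mine, not part of the SDK:&lt;/p&gt;

```python
# Strip sampling parameters that Opus 4.7 rejects with a 400.
# sanitize_for_opus_47 is an illustrative helper, not an SDK function.

REMOVED_PARAMS = {"temperature", "top_p", "top_k"}

def sanitize_for_opus_47(request_kwargs: dict) -> dict:
    dropped = REMOVED_PARAMS & request_kwargs.keys()
    if dropped:
        print(f"dropping unsupported params: {sorted(dropped)}")
    return {k: v for k, v in request_kwargs.items() if k not in REMOVED_PARAMS}

kwargs = {"model": "claude-opus-4-7", "max_tokens": 8192,
          "temperature": 0.7, "top_p": 0.9}
clean = sanitize_for_opus_47(kwargs)
print(sorted(clean))  # -> ['max_tokens', 'model']
```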

&lt;h3&gt;
  
  
  Thinking Content Is Omitted by Default
&lt;/h3&gt;

&lt;p&gt;Thinking blocks still appear in the response stream but their &lt;code&gt;thinking&lt;/code&gt; field is empty unless you opt in:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;thinking&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;adaptive&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;effort&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;xhigh&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;display&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;summarized&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If your UI streamed thinking to users on 4.6, the new default reads as a long pause before output begins. Add &lt;code&gt;display: "summarized"&lt;/code&gt; back.&lt;/p&gt;

&lt;h3&gt;
  
  
  Prefill Is Blocked
&lt;/h3&gt;

&lt;p&gt;Prefilling assistant messages returns a 400. Use structured outputs or &lt;code&gt;output_config.format&lt;/code&gt; instead. This one carries over from 4.6.&lt;/p&gt;

&lt;h3&gt;
  
  
  New Tokenizer
&lt;/h3&gt;

&lt;p&gt;The same input can now map to 1.0x to 1.35x the prior token count, content-dependent. Re-run &lt;code&gt;/v1/messages/count_tokens&lt;/code&gt; against your representative payloads and re-baseline:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;max_tokens&lt;/code&gt; ceilings&lt;/li&gt;
&lt;li&gt;Compaction triggers&lt;/li&gt;
&lt;li&gt;Client-side token estimates&lt;/li&gt;
&lt;li&gt;Cost projections&lt;/li&gt;
&lt;/ul&gt;
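
&lt;p&gt;The re-baselining itself is simple arithmetic once you have measured your actual ratio with &lt;code&gt;/v1/messages/count_tokens&lt;/code&gt; on representative payloads. The 1.35x below is the documented worst case, not a measurement, and the helper name is mine:&lt;/p&gt;

```python
# Re-baseline token-derived limits for the new tokenizer.
# measured_ratio should come from re-running count_tokens on your own
# payloads; 1.35 is only the documented worst case, used here as a stand-in.
import math

def rebaseline(limits: dict, measured_ratio: float) -> dict:
    """Scale token ceilings and triggers so they cover the same content size."""
    return {name: math.ceil(value * measured_ratio) for name, value in limits.items()}

old_limits = {"max_tokens": 8192, "compaction_trigger": 150_000}
print(rebaseline(old_limits, 1.35))
```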

&lt;h3&gt;
  
  
  Automated Migration
&lt;/h3&gt;

&lt;p&gt;Inside Claude Code, most of the above is handled for you. On the API side:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude /claude-api migrate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Applies the changes across the codebase.&lt;/p&gt;

&lt;h2&gt;
  
  
  New Product Features
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;/ultrareview&lt;/code&gt;&lt;/strong&gt; runs a review session reading through changes and flagging bugs a careful human reviewer would catch. Pro and Max users get three free ultrareviews. I have been using it as a pre-commit gate.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Auto mode for Max.&lt;/strong&gt; Previously gated to Team and Enterprise. Now extends to Max subscribers. Longer runs with fewer interruptions, and with less risk than fully skipping permissions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;File-system memory improvements.&lt;/strong&gt; Agents that keep a scratchpad, notes file, or structured memory store across turns now write cleaner notes and actually leverage them on later tasks.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pricing Did Not Move
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Input:        $5.00 / 1M tokens
Output:      $25.00 / 1M tokens
Prompt cache: up to 90% savings
Batch:        50% savings
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two things will nudge your bill up:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;New tokenizer (up to 1.35x input token count on some content)&lt;/li&gt;
&lt;li&gt;High-resolution images (up to ~4,784 tokens each)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Task budgets and the effort parameter are how you trade back.&lt;/p&gt;
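
&lt;p&gt;To see how those two factors move a bill, here is the arithmetic on a hypothetical month. The volumes, image count, and the 1.2x tokenizer ratio are made-up inputs for illustration; only the $5/$25 rates and the ~4,784-token image figure come from above:&lt;/p&gt;

```python
# Estimate the bill delta from the new tokenizer and high-res images.
# All volumes and the 1.2x ratio are hypothetical inputs, not measurements.

INPUT_RATE = 5.00 / 1_000_000    # $ per input token
OUTPUT_RATE = 25.00 / 1_000_000  # $ per output token

def monthly_cost(input_tokens, output_tokens, images=0, tokens_per_image=4784):
    image_tokens = images * tokens_per_image
    return (input_tokens + image_tokens) * INPUT_RATE + output_tokens * OUTPUT_RATE

before = monthly_cost(10_000_000, 2_000_000)
after = monthly_cost(int(10_000_000 * 1.2), 2_000_000, images=500)
print(round(before, 2), round(after, 2))
```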

&lt;p&gt;On GitHub Copilot, 4.7 went GA the same day on Pro+, Business, and Enterprise, carrying a 7.5x premium request multiplier through April 30 as promotional pricing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Switching Claude Code to 4.7
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Set as default&lt;/span&gt;
claude config &lt;span class="nb"&gt;set &lt;/span&gt;model claude-opus-4-7

&lt;span class="c"&gt;# Or override per session&lt;/span&gt;
claude &lt;span class="nt"&gt;--model&lt;/span&gt; claude-opus-4-7
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;opus&lt;/code&gt; alias in Claude Code points at it on current releases. Available on claude.ai, Claude Platform API, AWS Bedrock (research preview), Google Cloud Vertex AI, Microsoft Foundry, and GitHub Copilot.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to Reach for It
&lt;/h2&gt;

&lt;p&gt;Reach for 4.7 when reasoning depth, long agent runs, or high-resolution screenshots are what the job needs. Sonnet 4.6 is still the right call for smaller, faster work where speed and cost decide the tradeoff.&lt;/p&gt;

&lt;p&gt;Full breakdown on the webapp: &lt;a href="https://buildthisnow.com/blog/models/claude-opus-4-7" rel="noopener noreferrer"&gt;buildthisnow.com/blog/models/claude-opus-4-7&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Build your own agentic SaaS with Claude Code in 48 hours: &lt;a href="https://buildthisnow.com" rel="noopener noreferrer"&gt;buildthisnow.com&lt;/a&gt;&lt;/p&gt;

</description>
      <category>claude</category>
      <category>ai</category>
      <category>webdev</category>
    </item>
  </channel>
</rss>
