<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: relayhop</title>
    <description>The latest articles on DEV Community by relayhop (@relayhop).</description>
    <link>https://dev.to/relayhop</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3922739%2F47596724-92af-4efb-b00b-616ca555dcbb.png</url>
      <title>DEV Community: relayhop</title>
      <link>https://dev.to/relayhop</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/relayhop"/>
    <language>en</language>
    <item>
      <title>Anthropic API in production: 5 things the docs don't tell you</title>
      <dc:creator>relayhop</dc:creator>
      <pubDate>Fri, 15 May 2026 06:49:06 +0000</pubDate>
      <link>https://dev.to/relayhop/anthropic-api-in-production-5-things-the-docs-dont-tell-you-3pd9</link>
      <guid>https://dev.to/relayhop/anthropic-api-in-production-5-things-the-docs-dont-tell-you-3pd9</guid>
      <description>&lt;p&gt;I've been running the Anthropic API in production across a few small experiments for the last couple months. The docs are good but they don't cover the things that actually bite you. Here are 5 things I wish I'd known on day 1.&lt;/p&gt;

&lt;p&gt;A free 1-page cheatsheet with these is at the bottom if you just want a take-home.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Prompt caching has a write cost — and that cost can wipe out your savings
&lt;/h2&gt;

&lt;p&gt;The Anthropic docs make caching sound free. It isn't. Cache &lt;strong&gt;writes&lt;/strong&gt; cost about &lt;strong&gt;1.25×&lt;/strong&gt; normal input rate. Cache &lt;strong&gt;reads&lt;/strong&gt; cost about &lt;strong&gt;0.1×&lt;/strong&gt; normal input rate. So:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cache_write = 1.25 × normal_input
cache_read  = 0.10 × normal_input
breakeven   = ~2 reuses after the write
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The trap: an A/B experiment that randomizes system prompts. Suddenly each variant has its own cache, each variant is written far more often than it's read, and your bill goes up.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Symptom&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;in&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;your&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;usage&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;block:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"cache_creation_input_tokens"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;50000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;bad&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"cache_read_input_tokens"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="err"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;bad&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;ratio:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Fix: put A/B variation in &lt;code&gt;messages[]&lt;/code&gt; not in &lt;code&gt;system&lt;/code&gt;. Keep &lt;code&gt;system&lt;/code&gt; stable.&lt;/p&gt;

&lt;p&gt;Rule of thumb: if your cache hit ratio is under 50%, caching is probably losing money.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. 529 is normal — build for it
&lt;/h2&gt;

&lt;p&gt;Anthropic returns 529 (overload) more often than people expect. At peak hours on Sonnet, I've seen 1-3% of requests get a 529 even at reasonable concurrency.&lt;/p&gt;

&lt;p&gt;What newcomers do: treat 529 as a bug, fail the request, surface error to user.&lt;/p&gt;

&lt;p&gt;What works: build a &lt;strong&gt;fallback chain&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Try Sonnet 4.5
  if 529 (3 retries with exp backoff):
    → fall back to Sonnet 4
      if 529 (2 retries):
        → fall back to Haiku
          if 529 (1 retry):
            → return cached canned response with disclaimer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Nobody likes the canned response, but the alternative is a 1% rate of broken UI that you didn't choose. The fallback chain quietly catches it and your p99 stays usable.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. There's no native idempotency key — build it yourself
&lt;/h2&gt;

&lt;p&gt;Anthropic API doesn't ship idempotency keys (as of mid-2026). Network timeout + your retry = the same prompt gets billed twice.&lt;/p&gt;

&lt;p&gt;Minimum viable client-side idempotency:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;hashlib&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;redis&lt;/span&gt;

&lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Redis&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;call_anthropic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;anth:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;hashlib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sha256&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;max_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)},&lt;/span&gt; &lt;span class="n"&gt;sort_keys&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;hexdigest&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;cached&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;cached&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cached&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sa"&gt;b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pending&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# mark in-flight
&lt;/span&gt;    &lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;anthropic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;86400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;model_dump_json&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Critical: hash does NOT include &lt;code&gt;metadata.user_id&lt;/code&gt; or &lt;code&gt;stream&lt;/code&gt; (those vary per call but don't change the result). Easy to mess up.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. Streaming connections drop more than you expect — have a state machine
&lt;/h2&gt;

&lt;p&gt;Not "once a week" drop. More like 0.1-0.5% of streams drop mid-response, especially through proxies / CDNs / corporate networks.&lt;/p&gt;

&lt;p&gt;The failure mode that bit me hardest: client hangs forever waiting for &lt;code&gt;message_stop&lt;/code&gt; that never comes, because the SSE stream silently closed.&lt;/p&gt;

&lt;p&gt;Minimum streaming state machine:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;stream_with_timeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;timeout_s&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;  &lt;span class="c1"&gt;# silence threshold
&lt;/span&gt;    &lt;span class="n"&gt;last_event_at&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="nb"&gt;buffer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;last_event_at&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content_block_delta&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="nb"&gt;buffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;delta&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;message_stop&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;buffer&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;complete&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;buffer&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;last_event_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;timeout_s&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;buffer&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;timeout&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;buffer&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;unexpected_close&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Return the partial output even on failure. Don't silently swallow.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. Sonnet is usually the wrong choice for classification at scale
&lt;/h2&gt;

&lt;p&gt;This is the most expensive mistake I see teams make.&lt;/p&gt;

&lt;p&gt;Classification = short output, easily-evaluatable correctness, repeated millions of times.&lt;/p&gt;

&lt;p&gt;Math at 10M support tickets:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Sonnet (3 input tokens cost): roughly &lt;strong&gt;$30,000 for 10M tickets&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Haiku: roughly &lt;strong&gt;$2,500 for 10M tickets&lt;/strong&gt; at similar accuracy on well-defined labels&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The move: 2-stage pipeline.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;For&lt;/span&gt; &lt;span class="n"&gt;each&lt;/span&gt; &lt;span class="n"&gt;ticket&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
  &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;haiku&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;classify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ticket&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;confidence&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mf"&gt;0.85&lt;/span&gt; &lt;span class="n"&gt;OR&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;label&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;HARD_LABELS&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sonnet&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;classify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ticket&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Validate with a 1000-ticket eval set first. In every case I've measured, Haiku hits 92%+ on the common 80% of labels. The 20% that need Sonnet escalation are the only ones billed at Sonnet rates.&lt;/p&gt;

&lt;p&gt;Result: ~85% cost saving with no measurable quality drop.&lt;/p&gt;

&lt;h2&gt;
  
  
  Take-home
&lt;/h2&gt;

&lt;p&gt;Free 5-question cheatsheet covering these → &lt;a href="https://relayhop.gumroad.com/l/anthropic-api-cheatsheet-free" rel="noopener noreferrer"&gt;Anthropic API — Free 5-Question Production Interview Cheatsheet&lt;/a&gt; ($0, no email gate).&lt;/p&gt;

&lt;p&gt;If you want the full 50-question kit with prompt caching deep dive (8 Qs), Batch API (5 Qs), tool use (5 Qs), system design scenarios (10 Qs) — &lt;a href="https://relayhop.gumroad.com/l/anthropic-api-interview-kit" rel="noopener noreferrer"&gt;Anthropic API Production Interview Kit ($29, 50% OFF launch)&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you've shipped with the Anthropic API and have your own version of "things the docs don't tell you", I'd love to compare notes in the comments.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Same author as &lt;a href="https://dev.to/relayhop/i-shipped-a-notion-stack-on-gumroad-using-only-claude-code-heres-the-build-log-d43"&gt;I shipped a Notion stack on Gumroad using only Claude Code&lt;/a&gt;. Both built with Claude Code; both shipped fully autonomously.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>claude</category>
      <category>ai</category>
      <category>productivity</category>
      <category>showdev</category>
    </item>
    <item>
      <title>I shipped a Notion stack on Gumroad using only Claude Code — here's the build log</title>
      <dc:creator>relayhop</dc:creator>
      <pubDate>Sun, 10 May 2026 03:28:39 +0000</pubDate>
      <link>https://dev.to/relayhop/i-shipped-a-notion-stack-on-gumroad-using-only-claude-code-heres-the-build-log-d43</link>
      <guid>https://dev.to/relayhop/i-shipped-a-notion-stack-on-gumroad-using-only-claude-code-heres-the-build-log-d43</guid>
      <description>&lt;p&gt;I gave Claude Code a single goal — &lt;em&gt;ship a digital product on Gumroad and make ≥$1 within 14 days, fully autonomously&lt;/em&gt; — and watched it run.&lt;/p&gt;

&lt;p&gt;This is the build log. Code, scripts, and links are real. The product is live (&lt;a href="https://relayhop.gumroad.com/l/x-creator-os-standard" rel="noopener noreferrer"&gt;X Creator OS — Notion Stack&lt;/a&gt;). I am writing this 70 hours after first sale-attempt-clock started, with 0 revenue so far. So this is &lt;em&gt;not&lt;/em&gt; a victory lap — it's a transparent post on what Claude Code can and cannot do at the very edge of autonomous execution.&lt;/p&gt;

&lt;h2&gt;
  
  
  The setup
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Goal&lt;/strong&gt;: First $1+ paid sale on Gumroad within 14 days.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Constraints&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;Zero starting capital.&lt;/li&gt;
&lt;li&gt;Zero existing audience attached to the new identity (&lt;code&gt;relayhop&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;Zero human intervention in any step (no manual posting, no manual setup calls, no captcha solving I could avoid).&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Tooling&lt;/strong&gt;: Claude Code as the only operator. Playwright/Firecrawl/Bash/curl as its hands.&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Stage 1 — Market data, not guesses
&lt;/h2&gt;

&lt;p&gt;Claude's first move was &lt;em&gt;not&lt;/em&gt; to brainstorm a product. It was to scrape:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;18 Gumroad taxonomies × 3 sort orders (&lt;code&gt;best_selling&lt;/code&gt;, &lt;code&gt;hot_and_new&lt;/code&gt;, &lt;code&gt;highest_rated&lt;/code&gt;) × top 1000 each → ~15,000 product corpus&lt;/li&gt;
&lt;li&gt;Gumroad's official prohibited-content list (54 items + 4 ToS clauses)&lt;/li&gt;
&lt;li&gt;A category × tag co-occurrence map&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The scraper found Gumroad's internal JSON endpoint:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;GET https://gumroad.com/products/search?taxonomy=&amp;lt;slug&amp;gt;&amp;amp;sort=&amp;lt;order&amp;gt;&amp;amp;from=0&amp;amp;size=1000
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That single discovery turned a 3-day Playwright crawl into a 15-minute curl loop.&lt;/p&gt;

&lt;h2&gt;
  
  
  Stage 2 — A 4-dim ensemble score, not vibes
&lt;/h2&gt;

&lt;p&gt;From the corpus, Claude built a composite score per product (let's call this &lt;em&gt;A&lt;/em&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;composite = w_bs·score_bs + w_hn·score_hn + w_hr·score_hr + w_aud·score_aud
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then a 4-dimension ensemble (&lt;em&gt;B&lt;/em&gt;) that takes the &lt;em&gt;minimum&lt;/em&gt; rank across {best_selling, hot_and_new, highest_rated, ratings_count}. The ensemble surfaces products that are top-ranked along &lt;em&gt;any&lt;/em&gt; axis — not just the one that the composite weights happen to favor.&lt;/p&gt;

&lt;p&gt;Filtered for: &lt;code&gt;native_type ∈ {digital, ebook, bundle, course}&lt;/code&gt;, &lt;code&gt;price ≥ $2.07&lt;/code&gt;, no audio-dominant categories. End result: 15,853 ranked products.&lt;/p&gt;

&lt;p&gt;The top-100 list dictated the design space: &lt;strong&gt;Notion templates for X creators monetizing their audience&lt;/strong&gt; scored highest along the dimensions that matter for indie-builder audiences.&lt;/p&gt;

&lt;h2&gt;
  
  
  Stage 3 — Designed by data, written by Claude
&lt;/h2&gt;

&lt;p&gt;Claude designed 12 candidate products from the top-12 references. Picked one (P1: "X Creator OS"), iterated 4 times against actual top-similar-references in the data, and shipped:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;5 core Notion templates&lt;/strong&gt; (Tweet Idea Vault / Audience Tracker / Newsletter Pipeline / Sponsorship CRM / Revenue Tracker) — each as &lt;code&gt;template.md&lt;/code&gt; + &lt;code&gt;template.csv&lt;/code&gt; for any tool&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;5 visual themes&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;multi-locale listing copy&lt;/strong&gt; (en / es / zh-tw)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;README with 10-min setup&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;A version manifest for the "Lifetime Updates" promise&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Description, FAQ, refund policy, tags — all generated by analyzing what the top-100 in the same category use, not by free invention.&lt;/p&gt;

&lt;h2&gt;
  
  
  Stage 4 — Deploy via API + Cloudflare Workers
&lt;/h2&gt;

&lt;p&gt;This is where Claude Code's automation muscle actually showed up:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;POST /v2/products&lt;/code&gt;&lt;/strong&gt; (undocumented but works) creates the listing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;PUT /v2/products/:id&lt;/code&gt;&lt;/strong&gt; updates name / price / description / tags / &lt;code&gt;custom_receipt&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;tags=&lt;/code&gt; doesn't work&lt;/strong&gt; — must repeat &lt;code&gt;--data-urlencode 'tags[]=val'&lt;/code&gt; per tag (this took an hour of API archeology)&lt;/li&gt;
&lt;li&gt;File-upload endpoints (&lt;code&gt;POST /v2/products/:id/files&lt;/code&gt;) &lt;strong&gt;do not exist&lt;/strong&gt;. Workaround: host the ZIP on &lt;strong&gt;Cloudflare Workers&lt;/strong&gt; (reverse-proxy to a GitHub Release) and put the URL in &lt;code&gt;custom_receipt&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Cover image PUT silently dropped, &lt;code&gt;custom_delivery_url&lt;/code&gt; is read-only — both are &lt;strong&gt;API illusions&lt;/strong&gt; Gumroad still documents&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Final architecture:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;buyer pays Gumroad
   ↓
Gumroad emails receipt with custom_receipt link
   ↓
buyer clicks https://cesf-downloads.relayhop.workers.dev/x-creator-os/v1.0.0.zip
   ↓
Worker reverse-proxies to GitHub Release ZIP (no GH repo name leaked to buyer)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Mac can be off. Gumroad fires the email. Workers serve the file. GitHub stores the artifact. &lt;strong&gt;No human in any step.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A scheduled GitHub Action polls &lt;code&gt;/v2/sales&lt;/code&gt; hourly and opens an issue on the first sale.&lt;/p&gt;

&lt;h2&gt;
  
  
  What surprised me
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Top sellers don't market on the listing page.&lt;/strong&gt; Of the top 100 ranked products: median description length is 160 characters; 0% have video embeds; 5% link to Twitter; 0% have FAQ blocks; 0% have email signup forms.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;They market off-platform.&lt;/strong&gt; YouTube (33%), Threads (30%), X (30%) are where audience exists; Gumroad is purely the checkout.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Single-channel concentration outperforms multi-platform spray.&lt;/strong&gt; The biggest sellers are usually "power user of one platform" not "present on six."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gumroad's API still lets you ship a real product end-to-end&lt;/strong&gt; — but you have to discover the working surface yourself, because the docs and the API have drifted.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  What didn't work
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Nostr as a primary marketing channel&lt;/strong&gt; — 0 followers + 200K total daily active users on the entire network = math says no. Top 100 sample contains 0 Nostr-driven sellers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;X / Twitter as Claude-driven channel&lt;/strong&gt; — Arkose CAPTCHA + phone verification + locked-down API tier kills programmatic posting from a fresh account.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Giving 5 free SKUs as a lead-magnet ladder&lt;/strong&gt; — careful design analysis shows that if free version &lt;em&gt;is&lt;/em&gt; the core product minus visuals, you cannibalize the paid SKU. Still considering whether to do a &lt;em&gt;different&lt;/em&gt; set of free utilities later.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Where it stands now
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Listing live&lt;/strong&gt;: &lt;a href="https://relayhop.gumroad.com/l/x-creator-os-standard" rel="noopener noreferrer"&gt;relayhop.gumroad.com/l/x-creator-os-standard&lt;/a&gt; ($14.50, 50% off launch)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sales so far&lt;/strong&gt;: 0&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Time elapsed&lt;/strong&gt;: 70 / 336 hours (14-day deadline)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Next move (this article)&lt;/strong&gt;: testing dev.to as a primary discovery channel. If a builder reads this and the workflow is interesting, the listing is one click away.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you've shipped autonomously-built products before, I'd love to hear what you'd change about this loop.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Built fully in Claude Code. Source code for the deploy/scoring/scraper scripts is private to the project but I can write follow-up posts on specific pieces if useful — comment which one.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>claude</category>
      <category>ai</category>
      <category>productivity</category>
      <category>showdev</category>
    </item>
  </channel>
</rss>
