<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Akram Bakhouche</title>
    <description>The latest articles on DEV Community by Akram Bakhouche (@akram_bakhouche).</description>
    <link>https://dev.to/akram_bakhouche</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F361309%2F1b2e660c-bd6a-47f9-becc-9aec3da47db8.jpg</url>
      <title>DEV Community: Akram Bakhouche</title>
      <link>https://dev.to/akram_bakhouche</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/akram_bakhouche"/>
    <language>en</language>
    <item>
      <title>Prompt caching in production: the 4 patterns that cut my Anthropic bill (and when not to bother)</title>
      <dc:creator>Akram Bakhouche</dc:creator>
      <pubDate>Thu, 28 May 2026 16:14:58 +0000</pubDate>
      <link>https://dev.to/akram_bakhouche/prompt-caching-in-production-the-4-patterns-that-cut-my-anthropic-bill-and-when-not-to-bother-1h5e</link>
      <guid>https://dev.to/akram_bakhouche/prompt-caching-in-production-the-4-patterns-that-cut-my-anthropic-bill-and-when-not-to-bother-1h5e</guid>
      <description>&lt;p&gt;The first month I ran Career-OS in production, the Anthropic bill was bigger&lt;br&gt;
than my coffee budget. After I wired prompt caching properly into the scorer,&lt;br&gt;
the drafter, and the digest, it dropped under it. Same calls. Same model.&lt;br&gt;
Same outputs. Roughly an 80% cost reduction in one afternoon.&lt;/p&gt;

&lt;p&gt;Prompt caching is the single highest-leverage knob in the Claude SDK. It's&lt;br&gt;
also the one I see misconfigured most often in client code — usually because&lt;br&gt;
people read the docs, slap &lt;code&gt;cache_control&lt;/code&gt; on something, and assume they're&lt;br&gt;
caching when they're not.&lt;/p&gt;

&lt;p&gt;Here are the 4 patterns I ship in production, with the cost math, and the 4&lt;br&gt;
cases where caching genuinely does not help so you don't waste a day on it.&lt;/p&gt;
&lt;h2&gt;
  
  
  What prompt caching actually does
&lt;/h2&gt;

&lt;p&gt;The mechanics, in three lines, because you need to know this to use it right:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A &lt;strong&gt;cached block&lt;/strong&gt; (added with &lt;code&gt;"cache_control": { "type": "ephemeral" }&lt;/code&gt;)
is stored on Anthropic's side after the first call. Subsequent calls with
an identical cached block hit the cache instead of re-processing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;First call to a cache block costs 1.25× the base input price&lt;/strong&gt; (cache
write). Every subsequent call within the TTL costs &lt;strong&gt;0.1× the base
price&lt;/strong&gt; (cache read). The break-even is at the second call.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Default TTL is 5 minutes.&lt;/strong&gt; A 1-hour TTL is available at 2× write cost.
Plan for the TTL — it shapes the entire pattern.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If your workload calls the same prompt twice within 5 minutes, caching pays&lt;br&gt;
off. If you call it once an hour with no warmup, you're paying the write&lt;br&gt;
penalty for nothing.&lt;/p&gt;
&lt;h2&gt;
  
  
  Pattern 1 — Cache the system block
&lt;/h2&gt;

&lt;p&gt;The pattern everyone reaches for first, and the one that gives the biggest&lt;br&gt;
win in 90% of cases.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// app/api/agent/route.ts&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;Anthropic&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@anthropic-ai/sdk&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;claude&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Anthropic&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;POST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;question&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;reply&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;claude&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;claude-sonnet-4-6&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;max_tokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;system&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;SYSTEM_PROMPT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                    &lt;span class="c1"&gt;// 2,400 tokens of context&lt;/span&gt;
        &lt;span class="na"&gt;cache_control&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ephemeral&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;   &lt;span class="c1"&gt;// ← the magic&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="na"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;question&lt;/span&gt; &lt;span class="p"&gt;}],&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;reply&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The math, for a 2,400-token system prompt called 100 times in 5 minutes (the&lt;br&gt;
realistic shape of a busy support endpoint):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Without caching&lt;/strong&gt;: 100 × 2,400 × $3/M input = &lt;strong&gt;$0.72&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;With caching&lt;/strong&gt;: 1 × 2,400 × $3.75/M (write) + 99 × 2,400 × $0.30/M (read) = &lt;strong&gt;$0.08&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Savings: ~89%.&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The break-even is between the 1st and 2nd call. After call 2 you're already&lt;br&gt;
ahead. After call 100 you've collapsed an 89% chunk of your bill into&lt;br&gt;
operating expense.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cache hits are silent.&lt;/strong&gt; The API returns &lt;code&gt;cache_creation_input_tokens&lt;/code&gt; and&lt;br&gt;
&lt;code&gt;cache_read_input_tokens&lt;/code&gt; in the usage block. Log them. If you're not seeing&lt;br&gt;
reads, you're not caching:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;cache_write&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;reply&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cache_creation_input_tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;cache_read&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="nx"&gt;reply&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cache_read_input_tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;uncached&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;    &lt;span class="nx"&gt;reply&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;input_tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A single dashboard tile showing cache_read / (cache_read + uncached) tells&lt;br&gt;
you whether your caching is working. Mine sits at 94% for the Career-OS&lt;br&gt;
scorer during morning crawl runs.&lt;/p&gt;
&lt;h2&gt;
  
  
  Pattern 2 — Cache long documents (the RAG-adjacent pattern)
&lt;/h2&gt;

&lt;p&gt;The pattern that actually changes which architectures are economically viable.&lt;/p&gt;

&lt;p&gt;Say you have a 30,000-token product manual, customer policy document, or&lt;br&gt;
codebase. Without caching, every customer question costs you ~$0.09 in input&lt;br&gt;
tokens alone. With caching, your &lt;em&gt;first&lt;/em&gt; question of the day costs you&lt;br&gt;
~$0.11, and every subsequent question costs $0.01.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# document_qa.py
&lt;/span&gt;
&lt;span class="n"&gt;reply&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-sonnet-4-6&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;600&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;system&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;LIGHT_INSTRUCTIONS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;         &lt;span class="c1"&gt;# 200 tokens, uncached
&lt;/span&gt;        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;SHOP_POLICY_DOCUMENT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;       &lt;span class="c1"&gt;# 30,000 tokens, CACHED
&lt;/span&gt;            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cache_control&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ephemeral&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;user_question&lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What this kills: most of the use cases people built RAG for. If your&lt;br&gt;
"retrieval over a fixed corpus" use case fits inside Claude's 200K context,&lt;br&gt;
caching the full document is often cheaper and &lt;em&gt;always more accurate&lt;/em&gt; than&lt;br&gt;
embedding-based retrieval. No chunking. No top-k tuning. No vector DB&lt;br&gt;
operational burden.&lt;/p&gt;

&lt;p&gt;The catch: the corpus has to be relatively stable. If your "document" is&lt;br&gt;
yesterday's database dump, you're paying the cache write fee every single&lt;br&gt;
day. Use cache for things that change weekly, not hourly.&lt;/p&gt;
&lt;h2&gt;
  
  
  Pattern 3 — Cache tool definitions
&lt;/h2&gt;

&lt;p&gt;Tool use blocks are tokens. They count. And they're identical across every&lt;br&gt;
call to the same agent.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;TOOLS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search_orders&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;input_schema&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{...}},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;issue_refund&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;input_schema&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{...}},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;lookup_user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;input_schema&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{...}},&lt;/span&gt;
    &lt;span class="c1"&gt;# … 12 tools in total, ~3,500 tokens of schema
&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="n"&gt;reply&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-sonnet-4-6&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;TOOLS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;system&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;SYSTEM_PROMPT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cache_control&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ephemeral&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[...],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When you cache the system block, &lt;strong&gt;tool definitions get cached too&lt;/strong&gt; if&lt;br&gt;
they're declared in the same call. They become part of the cached prefix.&lt;br&gt;
You don't need a separate &lt;code&gt;cache_control&lt;/code&gt; on the tools array — the cache&lt;br&gt;
boundary extends through everything in the system block and the tools.&lt;/p&gt;

&lt;p&gt;This is a 3,500-token win you get for free when you're already caching the&lt;br&gt;
system block. Most of the time it's already happening and you don't realize&lt;br&gt;
it. Worth confirming with the cache_creation_input_tokens log line.&lt;/p&gt;
&lt;h2&gt;
  
  
  Pattern 4 — Conversation prefix caching for multi-turn agents
&lt;/h2&gt;

&lt;p&gt;The pattern that makes long-running agentic loops affordable.&lt;/p&gt;

&lt;p&gt;Multi-turn agents — the ones that loop through &lt;code&gt;assistant → tool_use → tool_result → assistant → tool_use → …&lt;/code&gt; — re-send the entire conversation&lt;br&gt;
history on every call. By turn 8, you're sending 12,000+ tokens of history,&lt;br&gt;
most of which is unchanged from turn 7.&lt;/p&gt;

&lt;p&gt;Cache the prefix.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;agent_loop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;initial_message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;initial_message&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;turn&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_turns&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# Cache everything up to the last assistant turn
&lt;/span&gt;        &lt;span class="n"&gt;cached_messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;mark_last_message_cached&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;reply&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-sonnet-4-6&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;TOOLS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;system&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;SYSTEM_PROMPT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cache_control&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ephemeral&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;}],&lt;/span&gt;
            &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;cached_messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;reply&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stop_reason&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;end_turn&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;reply&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;

        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;assistant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;reply&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;run_tools&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reply&lt;/span&gt;&lt;span class="p"&gt;)})&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;mark_last_message_cached&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Add cache_control to the last user message so the whole prefix caches.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;out&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;last&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;copy&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;last&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;last&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;last&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]}]&lt;/span&gt;
        &lt;span class="n"&gt;last&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cache_control&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ephemeral&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;last&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;out&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each new turn extends the cached prefix by the previous turn's content. By&lt;br&gt;
turn 10, ~95% of your input tokens hit cache reads. An agent loop that would&lt;br&gt;
cost $0.40 to run uncached costs $0.05 with this pattern.&lt;/p&gt;

&lt;h2&gt;
  
  
  The 4 cases where caching does NOT help
&lt;/h2&gt;

&lt;p&gt;This is where I see clients waste afternoons. Be honest about whether your&lt;br&gt;
workload fits.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Your prompts vary too much.&lt;/strong&gt; If each call has a different system&lt;br&gt;
prompt (you're concatenating user-specific data into it, or A/B-testing&lt;br&gt;
prompt variants), there's no shared cache prefix to hit. Either restructure&lt;br&gt;
to push the variation into the messages block (keeping the system stable),&lt;br&gt;
or accept that caching isn't your lever.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Your volume is low.&lt;/strong&gt; If you call the model 5 times an hour spread&lt;br&gt;
evenly, the 5-minute TTL means you almost never hit a warm cache. The&lt;br&gt;
1-hour TTL helps but doubles the write cost. At extremely low volumes the&lt;br&gt;
math sometimes works out to "uncached is cheaper."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Your prompts are short.&lt;/strong&gt; Below ~1,024 tokens of cacheable content (the&lt;br&gt;
Anthropic minimum), caching just doesn't activate. The write cost is paid;&lt;br&gt;
no cache is created. Quietly. Check the usage block.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Your content is per-user and short-lived.&lt;/strong&gt; If the cached content is&lt;br&gt;
specific to one user and they only make one or two calls, you're paying the&lt;br&gt;
write penalty without ever hitting the cache. Aggregation across users or&lt;br&gt;
sessions doesn't apply.&lt;/p&gt;

&lt;h2&gt;
  
  
  Operational hygiene
&lt;/h2&gt;

&lt;p&gt;The three things to wire up &lt;em&gt;before&lt;/em&gt; you ship cached calls:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Log &lt;code&gt;cache_creation_input_tokens&lt;/code&gt; and &lt;code&gt;cache_read_input_tokens&lt;/code&gt; for
every call.&lt;/strong&gt; Without this, you have no idea if caching is working.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Alert on cache hit rate dropping.&lt;/strong&gt; If your dashboard shows 90% hits
on Monday and 12% on Tuesday, something changed in your prompt
structure. Tuesday's bill will reflect it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Don't put PII or per-user secrets in the cached block.&lt;/strong&gt; Cached
content is reused. Anything you put in there is shared across every call
that hits the same cache key. Put per-user context in the user messages
block where it belongs.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  What this is worth, dollars and time
&lt;/h2&gt;

&lt;p&gt;For Career-OS, the four patterns above collapsed the morning crawl-and-score&lt;br&gt;
run from "noticeable on the bill" to "rounding error." Setup time: one&lt;br&gt;
afternoon. Ongoing maintenance: the three log lines + one dashboard tile.&lt;/p&gt;

&lt;p&gt;For an inbound support agent handling 20,000 queries a month: easily&lt;br&gt;
$200–$400/month saved versus uncached, every month, forever, with the same&lt;br&gt;
quality of output.&lt;/p&gt;

&lt;p&gt;For a documentation-QA endpoint over a stable corpus: the difference between&lt;br&gt;
"too expensive to ship to all users" and "an obvious feature." I've watched&lt;br&gt;
this single decision unblock entire roadmap items.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to call
&lt;/h2&gt;

&lt;p&gt;If you have a Claude-powered feature in production today and you do not have&lt;br&gt;
a dashboard tile showing cache hit rate, that's the bug. Cache misses are&lt;br&gt;
silent and your bill is paying for them.&lt;/p&gt;

&lt;p&gt;This is a 1–3 day scoped audit + fix that I take on:&lt;br&gt;
&lt;a href="https://dev.to/hire-me"&gt;the shape is on the hire-me page&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;For the full context where these patterns ship, see the&lt;br&gt;
&lt;a href="https://dev.to/blog/career-os-architecture"&gt;Career-OS architecture walkthrough&lt;/a&gt;.&lt;br&gt;
For the upstream patterns — where to bolt the Claude call onto your stack&lt;br&gt;
in the first place — see the&lt;br&gt;
&lt;a href="https://dev.to/blog/5-places-to-bolt-ai-onto-laravel"&gt;5 places to bolt AI onto Laravel&lt;/a&gt;&lt;br&gt;
and the&lt;br&gt;
&lt;a href="https://dev.to/blog/claude-agent-prestashop-5-files"&gt;PrestaShop 5-file pattern&lt;/a&gt;.&lt;br&gt;
And before any of this ships to production, the&lt;br&gt;
&lt;a href="https://dev.to/blog/evaluating-claude-features-before-production"&gt;eval harness post&lt;/a&gt;&lt;br&gt;
is the discipline that catches the regressions caching alone can't.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://bak-dev.com/blog/prompt-caching-claude-sdk-production-patterns" rel="noopener noreferrer"&gt;bak-dev.com&lt;/a&gt;. Find more build-in-public posts at &lt;a href="https://bak-dev.com/blog" rel="noopener noreferrer"&gt;bak-dev.com/blog&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>claudesdk</category>
      <category>promptengineering</category>
      <category>costoptimization</category>
      <category>aiagents</category>
    </item>
    <item>
      <title>How I evaluate Claude SDK features before shipping them to production</title>
      <dc:creator>Akram Bakhouche</dc:creator>
      <pubDate>Thu, 28 May 2026 13:34:22 +0000</pubDate>
      <link>https://dev.to/akram_bakhouche/how-i-evaluate-claude-sdk-features-before-shipping-them-to-production-1eac</link>
      <guid>https://dev.to/akram_bakhouche/how-i-evaluate-claude-sdk-features-before-shipping-them-to-production-1eac</guid>
      <description>&lt;p&gt;The riskiest line of code in your codebase is not the database transaction or&lt;br&gt;
the third-party API call. It's the LLM prompt — because when it silently&lt;br&gt;
regresses, &lt;strong&gt;nothing visible breaks&lt;/strong&gt;. The endpoint still returns 200. The&lt;br&gt;
JSON still parses. Test suite still green. The model just started giving&lt;br&gt;
worse answers, and you find out from a customer.&lt;/p&gt;

&lt;p&gt;I have shipped enough Claude SDK features in production to be afraid of this.&lt;br&gt;
Here is the eval harness pattern I now ship with every LLM-powered feature.&lt;br&gt;
It takes about an hour to build per feature and has saved me from at least&lt;br&gt;
three subtle regressions in Career-OS alone.&lt;/p&gt;
&lt;h2&gt;
  
  
  The risk you can't see
&lt;/h2&gt;

&lt;p&gt;A normal regression looks like this: you change a function, a test fails,&lt;br&gt;
you fix it.&lt;/p&gt;

&lt;p&gt;An LLM regression looks like this: you tweak the system prompt to fix a tone&lt;br&gt;
issue on Monday. The fix works. Three weeks later, your scoring drift down by&lt;br&gt;
12% on a class of inputs you didn't think about. No test failed. No alert&lt;br&gt;
fired. You only notice because a downstream metric (apply-rate, conversion,&lt;br&gt;
support-ticket close time) is off, and that's &lt;em&gt;if&lt;/em&gt; you're watching.&lt;/p&gt;

&lt;p&gt;The four common silent regressions I have personally hit:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Tweaking the system prompt broke a tone you weren't testing.&lt;/strong&gt; You
added "be friendly" and lost "be precise about prices."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A new Anthropic model version (e.g. Sonnet 4.5 → 4.6) calibrated
differently.&lt;/strong&gt; Same prompt, same input, different score by 8 points.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A library upgrade silently dropped a parameter&lt;/strong&gt; (max_tokens, temperature).
Defaults kicked in. Outputs got longer and looser.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cache invalidation drift.&lt;/strong&gt; You added a field to the system prompt;
the cache hash changed; cost per call jumped 6× before you noticed.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;All four are invisible to unit tests. All four are caught by a single eval&lt;br&gt;
harness that takes a few hours to build.&lt;/p&gt;
&lt;h2&gt;
  
  
  The pattern
&lt;/h2&gt;

&lt;p&gt;You need three things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;A fixtures file&lt;/strong&gt; — frozen inputs with expected outputs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A scorer&lt;/strong&gt; — measures how far each actual output drifted from expected.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A runner&lt;/strong&gt; — replays every fixture through the live model, fails the
build if any fixture drifted past tolerance.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That's it. No fancy framework. Anything more is YAGNI.&lt;/p&gt;
&lt;h2&gt;
  
  
  Step 1 — Fixtures
&lt;/h2&gt;

&lt;p&gt;Pick 8–15 hand-labeled cases. Cover the edge cases you actually care about,&lt;br&gt;
not just the happy path. For a job-fit scorer, mine looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// tests/fixtures/scored_jobs.jsonl

{"id":"f001","input":{...},"expected":{"fit":88,"angle":"strong fullstack + AI"},"tolerance":{"fit":5}}
{"id":"f002","input":{...},"expected":{"fit":35,"angle":"junior role, decline"},"tolerance":{"fit":5}}
{"id":"f003","input":{...},"expected":{"fit":72,"angle":"good freelance fit"},"tolerance":{"fit":7}}
{"id":"f004","input":{...},"expected":{"fit":15,"angle":"crypto, disqualify"},"tolerance":{"fit":3}}
// …
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Per-fixture tolerance is the move I most often see skipped. Different inputs&lt;br&gt;
have different acceptable variance. A clear "strong fit" might tolerate ±5&lt;br&gt;
points. A clear "disqualify" should be tight — ±3 — because we want certainty&lt;br&gt;
on rejection. A borderline case might tolerate ±10 because it's genuinely&lt;br&gt;
ambiguous.&lt;/p&gt;

&lt;p&gt;Label them yourself. Don't generate fixtures with another LLM. The whole&lt;br&gt;
point is that you, the human, know what right looks like.&lt;/p&gt;
&lt;h2&gt;
  
  
  Step 2 — Scorer
&lt;/h2&gt;

&lt;p&gt;For structured-output features, the scorer is plain arithmetic.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# tests/eval/score_fit.py
&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actual&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;expected&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tolerance&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;EvalResult&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;fit_delta&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;abs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actual&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fit&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;expected&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fit&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;fit_ok&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fit_delta&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;tolerance&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fit&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;angle_overlap&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;jaccard&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="nf"&gt;tokenize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actual&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;angle&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
        &lt;span class="nf"&gt;tokenize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;expected&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;angle&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;angle_ok&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;angle_overlap&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.3&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;EvalResult&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;passed&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;fit_ok&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;angle_ok&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;fit_delta&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;fit_delta&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;angle_overlap&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;angle_overlap&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For free-form outputs (cover letters, descriptions), don't try to match the&lt;br&gt;
text exactly. Score for &lt;strong&gt;constraints&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Did it use the requested language?&lt;/li&gt;
&lt;li&gt;Did it stay under the word count?&lt;/li&gt;
&lt;li&gt;Did it mention every required fact?&lt;/li&gt;
&lt;li&gt;Did it avoid the forbidden phrases?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are deterministic checks against the output text. They're easier to&lt;br&gt;
write than people fear. For Career-OS's outreach drafter:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;score_outreach&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actual&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;expected&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;EvalResult&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;EvalResult&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;passed&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
            &lt;span class="nf"&gt;language_matches&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actual&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;expected&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;language&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
            &lt;span class="nf"&gt;word_count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actual&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;expected&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;max_words&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fact&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;actual&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;fact&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;expected&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;required_facts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
            &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nf"&gt;any&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;phrase&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;actual&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;phrase&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;expected&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;forbidden_phrases&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
        &lt;span class="p"&gt;]),&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can also do an "LLM-as-judge" pattern for fuzzy criteria — but only as a&lt;br&gt;
last resort. It's slower, more expensive, and adds another silent-regression&lt;br&gt;
surface (the judge model can drift too). Use deterministic checks whenever&lt;br&gt;
the criterion is decidable.&lt;/p&gt;
&lt;h2&gt;
  
  
  Step 3 — Runner
&lt;/h2&gt;

&lt;p&gt;A 30-line script that loops fixtures, calls the feature live, scores each one,&lt;br&gt;
prints a summary table.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# tests/eval/run.py
&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pathlib&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;rich.table&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Table&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;rich.console&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Console&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;career_os.scorer&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;score_job&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;tests.eval.score_fit&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;eval_one&lt;/span&gt;

&lt;span class="n"&gt;console&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Console&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;fixtures&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nc"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tests/fixtures/scored_jobs.jsonl&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;read_text&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;splitlines&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()]&lt;/span&gt;

&lt;span class="n"&gt;table&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Table&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Eval results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;table&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_column&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="n"&gt;table&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_column&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;expected_fit&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="n"&gt;table&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_column&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;actual_fit&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="n"&gt;table&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_column&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;delta&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="n"&gt;table&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_column&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;failed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;fx&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;fixtures&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;actual&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;score_job&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fx&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;input&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;           &lt;span class="c1"&gt;# the live call
&lt;/span&gt;    &lt;span class="n"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;eval_one&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actual&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;fx&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;expected&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;fx&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tolerance&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[green]PASS&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;passed&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[red]FAIL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;passed&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;failed&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="n"&gt;table&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_row&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fx&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fx&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;expected&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fit&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actual&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fit_delta&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;table&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;[red]&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;failed&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;[/red] failed of &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fixtures&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;failed&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[green]all &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fixtures&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; passed[/green]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;failed&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;exit(1 if failed else 0)&lt;/code&gt; is what makes this an actual test, not just a&lt;br&gt;
debugging tool. Wire it into CI.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wiring it into CI
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# .github/workflows/eval.yml&lt;/span&gt;

&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;eval&lt;/span&gt;
&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;pull_request&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;paths&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;src/career_os/scorer/**'&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;tests/fixtures/**'&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;workflow_dispatch&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;eval&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/setup-python@v5&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{&lt;/span&gt; &lt;span class="nv"&gt;python-version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;3.12'&lt;/span&gt; &lt;span class="pi"&gt;}&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pip install -e ".[dev]"&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;python tests/eval/run.py&lt;/span&gt;
        &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.ANTHROPIC_API_KEY }}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This runs the eval on every PR that touches the scorer or the fixtures. It&lt;br&gt;
costs a few cents per run. It catches all four silent-regression categories.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to extend the harness
&lt;/h2&gt;

&lt;p&gt;You will hit cases the eval can't catch. That's fine — every catch is a new&lt;br&gt;
fixture. The discipline:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Bug reported in production?&lt;/strong&gt; Add a fixture that reproduces it. Fix.
Eval now catches the regression next time.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You hand-tuned the prompt for 3 hours?&lt;/strong&gt; Save 5 of those tuning cases as
fixtures with the now-correct expected outputs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anthropic ships a new model?&lt;/strong&gt; Run the eval against the new model before
bumping it in production. The delta tells you whether the new model is a
free upgrade or a calibration redo.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The fixtures grow with the system. After a year, you have a regression suite&lt;br&gt;
that encodes most of what you know about how the feature &lt;em&gt;should&lt;/em&gt; behave —&lt;br&gt;
and any future engineer (or future-you, three months from now) can change&lt;br&gt;
the prompt with confidence because the eval will catch them if they break it.&lt;/p&gt;

&lt;h2&gt;
  
  
  What this is worth to a client
&lt;/h2&gt;

&lt;p&gt;Most teams shipping LLM features today do not have this. They ship a Claude&lt;br&gt;
call, watch metrics, hope. When the system slowly degrades, they can't tell&lt;br&gt;
whether it was the prompt change, the model bump, the user-input distribution&lt;br&gt;
shift, or something else entirely. Triage time on a silent LLM regression in&lt;br&gt;
production is measured in days.&lt;/p&gt;

&lt;p&gt;A 4-hour investment in an eval harness collapses that to minutes. If you&lt;br&gt;
ship a Claude feature without one, you are building technical debt you can't&lt;br&gt;
see and can't measure.&lt;/p&gt;

&lt;p&gt;If you have a Laravel / PrestaShop / Python app shipping an AI feature and&lt;br&gt;
nobody has built the eval harness for it yet, that is a 1-week scoped&lt;br&gt;
engagement I take on. The shape is on the &lt;a href="https://dev.to/hire-me"&gt;hire-me page&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;For the full architecture context where this pattern lives — including the&lt;br&gt;
fixture file format and the eval runner — see the&lt;br&gt;
&lt;a href="https://dev.to/blog/career-os-architecture"&gt;Career-OS architecture walkthrough&lt;/a&gt;. The same&lt;br&gt;
discipline applies to the Laravel patterns in&lt;br&gt;
&lt;a href="https://dev.to/blog/5-places-to-bolt-ai-onto-laravel"&gt;5 places to bolt AI&lt;/a&gt; and the&lt;br&gt;
PrestaShop module in&lt;br&gt;
&lt;a href="https://dev.to/blog/claude-agent-prestashop-5-files"&gt;the 5-file pattern&lt;/a&gt; — every Claude&lt;br&gt;
call in those posts is a place where the eval harness pattern earns its&lt;br&gt;
keep.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://bak-dev.com/blog/evaluating-claude-features-before-production" rel="noopener noreferrer"&gt;bak-dev.com&lt;/a&gt;. Find more build-in-public posts at &lt;a href="https://bak-dev.com/blog" rel="noopener noreferrer"&gt;bak-dev.com/blog&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>claudesdk</category>
      <category>aiagents</category>
      <category>testing</category>
      <category>promptengineering</category>
    </item>
    <item>
      <title>How to add a Claude agent to a PrestaShop store: a 5-file pattern</title>
      <dc:creator>Akram Bakhouche</dc:creator>
      <pubDate>Thu, 28 May 2026 13:21:46 +0000</pubDate>
      <link>https://dev.to/akram_bakhouche/how-to-add-a-claude-agent-to-a-prestashop-store-a-5-file-pattern-45p5</link>
      <guid>https://dev.to/akram_bakhouche/how-to-add-a-claude-agent-to-a-prestashop-store-a-5-file-pattern-45p5</guid>
      <description>&lt;p&gt;Liquid syntax error: Unknown tag 'endraw'&lt;/p&gt;
</description>
      <category>prestashop</category>
      <category>php</category>
      <category>claudesdk</category>
      <category>ecommerce</category>
    </item>
    <item>
      <title>Career-OS architecture — how I built a job-search agent with Claude SDK</title>
      <dc:creator>Akram Bakhouche</dc:creator>
      <pubDate>Thu, 28 May 2026 12:58:20 +0000</pubDate>
      <link>https://dev.to/akram_bakhouche/career-os-architecture-how-i-built-a-job-search-agent-with-claude-sdk-24g5</link>
      <guid>https://dev.to/akram_bakhouche/career-os-architecture-how-i-built-a-job-search-agent-with-claude-sdk-24g5</guid>
      <description>&lt;p&gt;I built Career-OS to solve a real problem for myself: spending two hours a day&lt;br&gt;
scrolling job boards is a terrible way to find your next role, and existing&lt;br&gt;
"AI job applier" tools are either spam machines or vapor.&lt;/p&gt;

&lt;p&gt;The system is open-source on&lt;br&gt;
&lt;a href="https://github.com/akrambak/career-os" rel="noopener noreferrer"&gt;github.com/akrambak/career-os&lt;/a&gt;.&lt;br&gt;
This post is the technical walkthrough: what it does, why each layer exists,&lt;br&gt;
and the design decisions that made it actually ship instead of dying as a&lt;br&gt;
weekend prototype.&lt;/p&gt;
&lt;h2&gt;
  
  
  What it does, in one paragraph
&lt;/h2&gt;

&lt;p&gt;Every morning, Career-OS crawls five sources (RemoteOK, two WeWorkRemotely&lt;br&gt;
feeds, Remotive, and two Hacker News monthly threads). It scores each posting&lt;br&gt;
against my profile using Claude with structured-output JSON, drafts a tailored&lt;br&gt;
outreach email when fit ≥ 70, and emails me a digest with the top picks. I&lt;br&gt;
review the digest at breakfast, decide which to send, and run&lt;br&gt;
&lt;code&gt;career-os apply&lt;/code&gt; from the terminal to track the pipeline through to offer or&lt;br&gt;
rejection.&lt;/p&gt;
&lt;h2&gt;
  
  
  The architecture
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌──────────────┐   ┌──────────────┐   ┌──────────────┐
│ 5 scrapers   │──▶│ SQLite store │──▶│ Claude       │
│ (RemoteOK,   │   │ (jobs,       │   │ scorer       │
│  WWR, HN…)   │   │  scores)     │   │ (fit 0-100)  │
└──────────────┘   └──────────────┘   └──────────────┘
                          │                   │
                          ▼                   ▼
                   ┌──────────────┐   ┌──────────────┐
                   │ CLI          │   │ Drafter      │
                   │ (fetch/score │   │ (outreach    │
                   │  /draft/…)   │   │  per fit)    │
                   └──────────────┘   └──────────────┘
                          │                   │
                          ▼                   ▼
                   ┌──────────────────────────────────┐
                   │ Daily email digest (Resend /     │
                   │ Postmark / SMTP)                 │
                   └──────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Five layers, each replaceable: scrapers, store, scorer, drafter, digest. No&lt;br&gt;
framework. Just Python modules and one Anthropic SDK dependency.&lt;/p&gt;
&lt;h2&gt;
  
  
  Layer 1 — Scrapers
&lt;/h2&gt;

&lt;p&gt;Each scraper is a 50-line file. The shape is intentionally boring:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# src/career_os/scrapers/remoteok.py
&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;RemoteOKScraper&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseScraper&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;source_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;remoteok&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;RawJob&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;AsyncClient&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://remoteok.com/api&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;raise_for_status&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;normalize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;is_job&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;normalize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;RawJob&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;RawJob&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;external_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
            &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;position&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="n"&gt;company&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;company&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="n"&gt;posted_at&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;parse_date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;date&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
            &lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;source_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The design rule: &lt;strong&gt;a scraper does fetching + normalization, nothing else.&lt;/strong&gt; No&lt;br&gt;
filtering, no scoring, no business logic. Adding a new source — say, a French&lt;br&gt;
freelance board — is a 50-line file that the registry picks up automatically.&lt;/p&gt;

&lt;p&gt;The HN "Who is hiring?" and "Seeking freelancer?" scrapers use the public&lt;br&gt;
Algolia API instead of scraping HTML, which is rude. Algolia returns&lt;br&gt;
structured fields (stack, budget, location, contact email) that would be&lt;br&gt;
brittle to parse from prose.&lt;/p&gt;
&lt;h2&gt;
  
  
  Layer 2 — Store
&lt;/h2&gt;

&lt;p&gt;SQLite with a Postgres-shaped schema. Four tables: &lt;code&gt;jobs&lt;/code&gt;, &lt;code&gt;scores&lt;/code&gt;,&lt;br&gt;
&lt;code&gt;applications&lt;/code&gt;, &lt;code&gt;drafts&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The Postgres-shaped part matters. When this becomes a multi-tenant SaaS, the&lt;br&gt;
swap to Postgres is one connection-string change and one &lt;code&gt;npm i pg&lt;/code&gt; — no&lt;br&gt;
schema migration, no rewriting queries. SQLite is fast enough for one user&lt;br&gt;
and gives me file-based ergonomics (&lt;code&gt;scp career_os.db&lt;/code&gt; to back up, &lt;code&gt;sqlite3&lt;br&gt;
career_os.db&lt;/code&gt; to inspect).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# src/career_os/db/schema.sql
&lt;/span&gt;
&lt;span class="n"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;TABLE&lt;/span&gt; &lt;span class="nf"&gt;jobs &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nb"&gt;id&lt;/span&gt;           &lt;span class="n"&gt;INTEGER&lt;/span&gt; &lt;span class="n"&gt;PRIMARY&lt;/span&gt; &lt;span class="n"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;external_id&lt;/span&gt;  &lt;span class="n"&gt;TEXT&lt;/span&gt; &lt;span class="n"&gt;NOT&lt;/span&gt; &lt;span class="n"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;source&lt;/span&gt;       &lt;span class="n"&gt;TEXT&lt;/span&gt; &lt;span class="n"&gt;NOT&lt;/span&gt; &lt;span class="n"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;title&lt;/span&gt;        &lt;span class="n"&gt;TEXT&lt;/span&gt; &lt;span class="n"&gt;NOT&lt;/span&gt; &lt;span class="n"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;company&lt;/span&gt;      &lt;span class="n"&gt;TEXT&lt;/span&gt; &lt;span class="n"&gt;NOT&lt;/span&gt; &lt;span class="n"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt;  &lt;span class="n"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;url&lt;/span&gt;          &lt;span class="n"&gt;TEXT&lt;/span&gt; &lt;span class="n"&gt;NOT&lt;/span&gt; &lt;span class="n"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;posted_at&lt;/span&gt;    &lt;span class="n"&gt;DATETIME&lt;/span&gt; &lt;span class="n"&gt;NOT&lt;/span&gt; &lt;span class="n"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;seen_at&lt;/span&gt;      &lt;span class="n"&gt;DATETIME&lt;/span&gt; &lt;span class="n"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;CURRENT_TIMESTAMP&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nc"&gt;UNIQUE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;external_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="n"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;TABLE&lt;/span&gt; &lt;span class="nf"&gt;scores &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nb"&gt;id&lt;/span&gt;            &lt;span class="n"&gt;INTEGER&lt;/span&gt; &lt;span class="n"&gt;PRIMARY&lt;/span&gt; &lt;span class="n"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;job_id&lt;/span&gt;        &lt;span class="n"&gt;INTEGER&lt;/span&gt; &lt;span class="n"&gt;REFERENCES&lt;/span&gt; &lt;span class="nf"&gt;jobs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;fit&lt;/span&gt;           &lt;span class="n"&gt;INTEGER&lt;/span&gt; &lt;span class="n"&gt;NOT&lt;/span&gt; &lt;span class="n"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;pros&lt;/span&gt;          &lt;span class="n"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;cons&lt;/span&gt;          &lt;span class="n"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;angle&lt;/span&gt;         &lt;span class="n"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;scored_at&lt;/span&gt;     &lt;span class="n"&gt;DATETIME&lt;/span&gt; &lt;span class="n"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;CURRENT_TIMESTAMP&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nc"&gt;UNIQUE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;job_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;(external_id, source)&lt;/code&gt; as a composite key dedupes across re-runs. The same&lt;br&gt;
RemoteOK job won't get scored twice.&lt;/p&gt;
&lt;h2&gt;
  
  
  Layer 3 — The Claude scorer
&lt;/h2&gt;

&lt;p&gt;The expensive layer, in both money and engineering effort. The scorer&lt;br&gt;
decides which jobs deserve a draft, which deserve a glance, and which to&lt;br&gt;
skip. Mis-calibrate it and you either drown in noise or miss good fits.&lt;/p&gt;

&lt;p&gt;Two design decisions made this work:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Decision 1: prompt caching on the system block.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The system prompt is ~2,400 tokens — my profile, the scoring rubric, examples&lt;br&gt;
of what "fit 80" vs "fit 50" looks like. It's identical for every job. Putting&lt;br&gt;
it inside a cached block means I pay full price for the first job of the day&lt;br&gt;
and ~10% for every subsequent one.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# src/career_os/scorer/claude_scorer.py
&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;score_job&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Job&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;JobScore&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-sonnet-4-6&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;512&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;system&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;SCORER_SYSTEM_PROMPT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cache_control&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ephemeral&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;render_job&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;)},&lt;/span&gt;
        &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;parse_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Decision 2: structured JSON output, not free-form.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I want &lt;code&gt;{"fit": 87, "pros": [...], "cons": [...], "angle": "..."}&lt;/code&gt; — not a&lt;br&gt;
paragraph of prose I have to regex. The system prompt asks for valid JSON&lt;br&gt;
matching a schema, with a few-shot example. I use &lt;code&gt;pydantic&lt;/code&gt; to parse the&lt;br&gt;
response and fail loudly if the shape's wrong. Failure rate after the first&lt;br&gt;
week of tuning: under 0.5%.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;JobScore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ge&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;le&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;pros&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;min_length&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_length&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;cons&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;min_length&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_length&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;angle&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;  &lt;span class="c1"&gt;# the one-line pitch to lead the outreach with
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Layer 4 — The drafter
&lt;/h2&gt;

&lt;p&gt;When fit ≥ 70, the drafter writes a tailored outreach email. It takes the&lt;br&gt;
job, the score, and a draft type (FT-cover or freelance-pitch) and produces&lt;br&gt;
something I'd actually send.&lt;/p&gt;

&lt;p&gt;The interesting bit: &lt;strong&gt;two completely separate system prompts.&lt;/strong&gt; A cover&lt;br&gt;
letter for a senior FT role and a freelance pitch are different jobs with&lt;br&gt;
different tone requirements. I tried one prompt that "knew" which to write.&lt;br&gt;
It produced muddled outputs. Splitting them — different system prompts loaded&lt;br&gt;
based on the &lt;code&gt;draft_type&lt;/code&gt; — fixed it overnight.&lt;/p&gt;

&lt;p&gt;Hard rules baked into both prompts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Never invent metrics. If the candidate (me) hasn't shipped X, don't claim X.&lt;/li&gt;
&lt;li&gt;Freelance: no engagements under 2 weeks.&lt;/li&gt;
&lt;li&gt;Freelance: no hourly rates under €60/hr equivalent.&lt;/li&gt;
&lt;li&gt;FT: no roles requiring on-site days outside Europe.&lt;/li&gt;
&lt;li&gt;No emoji in the body. (Personal taste. Survives prompt-injection attempts
because it's in the cached system block.)
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;draft_outreach&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Job&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;JobScore&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;system_prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;FT_COVER_PROMPT&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ft_cover&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="n"&gt;FREELANCE_PITCH_PROMPT&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-sonnet-4-6&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;600&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;system&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;system_prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                 &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cache_control&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ephemeral&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}}],&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;render_brief&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="p"&gt;)}],&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h2&gt;
  
  
  Layer 5 — The eval harness
&lt;/h2&gt;

&lt;p&gt;The part of the system most "AI agent" projects skip. The scorer is the&lt;br&gt;
center of the system — if it silently regresses (because I tweaked the system&lt;br&gt;
prompt, because Anthropic shipped a new model version, because the&lt;br&gt;
temperature setting drifted), I'd ship bad drafts for weeks before noticing.&lt;/p&gt;

&lt;p&gt;So: ten hand-labeled job fixtures with expected fit scores ±5. The eval&lt;br&gt;
command (&lt;code&gt;career-os eval&lt;/code&gt;) replays them through the scorer and reports any&lt;br&gt;
that fell outside tolerance.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# tests/fixtures/scored_jobs.jsonl
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;job_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fixture_001&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;expected_fit&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;88&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tolerance&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...}&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;job_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fixture_002&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;expected_fit&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;35&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tolerance&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...}&lt;/span&gt;
&lt;span class="bp"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run before every commit to the scorer or any prompt change. Took about an hour&lt;br&gt;
to build and has saved me from at least three subtle regressions.&lt;/p&gt;

&lt;p&gt;This is non-negotiable for any LLM-in-production work. &lt;strong&gt;If you can't measure&lt;br&gt;
quality, you can't iterate.&lt;/strong&gt; Don't ship a Claude-powered feature without an&lt;br&gt;
eval harness if the cost of being wrong exceeds the cost of building the&lt;br&gt;
harness.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it cost to build
&lt;/h2&gt;

&lt;p&gt;Six weeks of evening + weekend work to get to the MVP I'm running today.&lt;br&gt;
The Anthropic bill: under $40/month including the eval re-runs. Less than my&lt;br&gt;
coffee budget.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's next
&lt;/h2&gt;

&lt;p&gt;Three things on the roadmap:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;A LinkedIn / X cross-poster&lt;/strong&gt; that publishes commit milestones with one
command — &lt;code&gt;career-os post --milestone "5 scrapers live"&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;An applications tracker UI&lt;/strong&gt; beyond the current CLI. Probably a Next.js
dashboard pulling from the same SQLite store.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-tenant mode&lt;/strong&gt; — same architecture, Postgres swap, an auth layer,
and shipping as a SaaS.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If you want to play with it locally, &lt;code&gt;pip install -e ".[dev]"&lt;/code&gt; and follow the&lt;br&gt;
README. If you want a system like this built for your team — sales-lead&lt;br&gt;
enrichment, internal ops agents, RAG-over-docs — that's exactly the work I&lt;br&gt;
take on. &lt;a href="https://dev.to/hire-me"&gt;The shape is on the hire-me page&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;And if you want the LLM-in-production patterns from this post applied to your&lt;br&gt;
existing Laravel or PrestaShop stack, the&lt;br&gt;
&lt;a href="https://dev.to/blog/5-places-to-bolt-ai-onto-laravel"&gt;5-places-to-bolt-AI&lt;/a&gt; post covers&lt;br&gt;
that ground.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://bak-dev.com/blog/career-os-architecture" rel="noopener noreferrer"&gt;bak-dev.com&lt;/a&gt;. Find more build-in-public posts at &lt;a href="https://bak-dev.com/blog" rel="noopener noreferrer"&gt;bak-dev.com/blog&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>claudesdk</category>
      <category>python</category>
      <category>aiagents</category>
      <category>careeros</category>
    </item>
    <item>
      <title>5 places to bolt an AI agent onto a PHP/Laravel app without rebuilding</title>
      <dc:creator>Akram Bakhouche</dc:creator>
      <pubDate>Thu, 28 May 2026 12:58:19 +0000</pubDate>
      <link>https://dev.to/akram_bakhouche/5-places-to-bolt-an-ai-agent-onto-a-phplaravel-app-without-rebuilding-8fb</link>
      <guid>https://dev.to/akram_bakhouche/5-places-to-bolt-an-ai-agent-onto-a-phplaravel-app-without-rebuilding-8fb</guid>
      <description>&lt;p&gt;If you run a Laravel or PrestaShop app, "add AI to it" arrived on your roadmap&lt;br&gt;
sometime in the last twelve months. Probably from a CEO who saw a Claude demo,&lt;br&gt;
or a customer-success person watching support tickets pile up.&lt;/p&gt;

&lt;p&gt;The wrong answer is to rewrite. The right answer is to find the 5 places in&lt;br&gt;
your existing stack where an LLM call collapses 20 lines of brittle code into&lt;br&gt;
2 lines of Claude SDK — and start there.&lt;/p&gt;

&lt;p&gt;These are the 5 I keep returning to in client work, ranked roughly by how&lt;br&gt;
fast they break even.&lt;/p&gt;
&lt;h2&gt;
  
  
  1. Support-inbox triage
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The use case.&lt;/strong&gt; Every incoming support email goes through Claude before a&lt;br&gt;
human sees it. The model assigns a category, severity, and language; drafts a&lt;br&gt;
suggested reply; and routes it to the right inbox. Your team opens 50 emails&lt;br&gt;
and sees 50 pre-categorized rows with draft answers, not 50 unparsed strings.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The failure mode if you do it badly.&lt;/strong&gt; You let Claude auto-send replies.&lt;br&gt;
Three months in, the model hallucinates a refund policy that doesn't exist and&lt;br&gt;
costs you a five-figure chargeback. The fix is a hard rule: &lt;strong&gt;draft yes,&lt;br&gt;
auto-send no.&lt;/strong&gt; Humans approve the draft. The savings come from time-to-first-&lt;br&gt;
draft, not from removing the human.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The integration point.&lt;/strong&gt; A queued job per inbound email.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// app/Jobs/TriageSupportEmail.php&lt;/span&gt;

&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;handle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Anthropic&lt;/span&gt; &lt;span class="nv"&gt;$claude&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$analysis&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$claude&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
        &lt;span class="s1"&gt;'model'&lt;/span&gt;    &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'claude-haiku-4-5'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'system'&lt;/span&gt;   &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;triagePrompt&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;  &lt;span class="c1"&gt;// cached, 600 tokens&lt;/span&gt;
        &lt;span class="s1"&gt;'tools'&lt;/span&gt;    &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;triageTools&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;   &lt;span class="c1"&gt;// assign_category, set_severity&lt;/span&gt;
        &lt;span class="s1"&gt;'messages'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'role'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'user'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'content'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;raw_body&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;]);&lt;/span&gt;

    &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;update&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
        &lt;span class="s1"&gt;'category'&lt;/span&gt;        &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$analysis&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'assign_category'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'name'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="s1"&gt;'severity'&lt;/span&gt;        &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$analysis&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'set_severity'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'level'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="s1"&gt;'draft_reply'&lt;/span&gt;     &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$analysis&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;content_text&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="s1"&gt;'triaged_at'&lt;/span&gt;      &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="p"&gt;]);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Cost: roughly &lt;strong&gt;$0.0003/email&lt;/strong&gt; on Haiku 4.5 with prompt caching. A team&lt;br&gt;
processing 5,000 support emails a month spends $1.50 to give every one of them&lt;br&gt;
a pre-categorized draft.&lt;/p&gt;
&lt;h2&gt;
  
  
  2. Product description generation (e-commerce)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The use case.&lt;/strong&gt; Admin panel gets a "Generate description" button next to&lt;br&gt;
the product editor. The button calls Claude with the product specs, the&lt;br&gt;
store's tone-of-voice guide, and any existing description. Claude returns 3&lt;br&gt;
variants. Editor picks one, edits if needed, saves.&lt;/p&gt;

&lt;p&gt;This is the lowest-hanging fruit I see for PrestaShop and Shopify stores.&lt;br&gt;
Catalog managers spend hours a week rewriting descriptions. This collapses to&lt;br&gt;
minutes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The failure mode if you do it badly.&lt;/strong&gt; You let Claude invent specs ("100%&lt;br&gt;
organic cotton" when you sold synthetic). The fix: pass the specs as a&lt;br&gt;
&lt;strong&gt;structured fact list&lt;/strong&gt; in the system prompt, and instruct Claude to&lt;br&gt;
&lt;strong&gt;never invent attributes outside that list&lt;/strong&gt;. Production-tested constraint&lt;br&gt;
that actually works.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The integration point.&lt;/strong&gt; A single controller action, idempotent by product&lt;br&gt;
ID + tone.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// app/Http/Controllers/Admin/DescriptionGeneratorController.php&lt;/span&gt;

&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Product&lt;/span&gt; &lt;span class="nv"&gt;$product&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;Request&lt;/span&gt; &lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;JsonResponse&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$variants&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;claude&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
        &lt;span class="s1"&gt;'model'&lt;/span&gt;    &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'claude-sonnet-4-6'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'system'&lt;/span&gt;   &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;copywriterPrompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$request&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;tone&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="s1"&gt;'messages'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;
            &lt;span class="s1"&gt;'role'&lt;/span&gt;    &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'user'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'content'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$product&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;fact_sheet&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;  &lt;span class="c1"&gt;// structured: name, attrs, materials&lt;/span&gt;
        &lt;span class="p"&gt;]],&lt;/span&gt;
        &lt;span class="s1"&gt;'max_tokens'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;600&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;]);&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;response&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
        &lt;span class="s1"&gt;'variants'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;parseVariants&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$variants&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;content_text&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt;
    &lt;span class="p"&gt;]);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The tone is the parameter that matters. Premium · friendly · technical ·&lt;br&gt;
minimalist · playful. Same product, five wildly different angles, picker UI.&lt;/p&gt;
&lt;h2&gt;
  
  
  3. Semantic-search fallback
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The use case.&lt;/strong&gt; Your existing keyword search is fine for 70% of queries.&lt;br&gt;
For the other 30% — typos, synonyms, intent-based searches — it returns&lt;br&gt;
nothing and your bounce rate goes up. You add an embedding-based fallback:&lt;br&gt;
when keyword search returns fewer than N results, the query gets embedded,&lt;br&gt;
matched against pre-computed product embeddings, and the top results are&lt;br&gt;
appended.&lt;/p&gt;

&lt;p&gt;You don't replace keyword search. You augment it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The failure mode if you do it badly.&lt;/strong&gt; You replace keyword search wholesale.&lt;br&gt;
Now exact-SKU lookups stop working because embedding similarity ranks&lt;br&gt;
"BLACK-M-FRONT-9382" lower than "black t-shirt". The fix: &lt;strong&gt;augment, never&lt;br&gt;
replace&lt;/strong&gt;. Keyword is fast, exact, and deterministic. Embeddings are slow,&lt;br&gt;
fuzzy, and probabilistic. They're complements.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The integration point.&lt;/strong&gt; A Laravel scout-driver decorator.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// app/Search/HybridSearch.php&lt;/span&gt;

&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nv"&gt;$query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nv"&gt;$threshold&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;Collection&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$keyword&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Product&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;take&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$keyword&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nb"&gt;count&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nv"&gt;$threshold&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$keyword&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// keyword had enough, fast path&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="nv"&gt;$embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$query&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;  &lt;span class="c1"&gt;// cached, batched&lt;/span&gt;
    &lt;span class="nv"&gt;$semantic&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;vectorIndex&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;similar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;take&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nv"&gt;$keyword&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nb"&gt;count&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$keyword&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;merge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$semantic&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;unique&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'id'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Cheap to run: embeddings cost about &lt;strong&gt;$0.00002/query&lt;/strong&gt;. A million queries a&lt;br&gt;
month is twenty bucks.&lt;/p&gt;
&lt;h2&gt;
  
  
  4. Daily admin summary
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The use case.&lt;/strong&gt; Every morning at 7am, the admin panel home screen loads&lt;br&gt;
with a 3-paragraph summary of the previous day's activity: orders, refunds,&lt;br&gt;
support tickets, unusual signals. Written by Claude, in your brand voice,&lt;br&gt;
referencing real numbers.&lt;/p&gt;

&lt;p&gt;This is the highest perceived value per dollar of any pattern on this list.&lt;br&gt;
The owner opens the admin, sees a coherent paragraph saying "Yesterday: 47&lt;br&gt;
orders (+8% vs Tuesday avg), 2 refund requests both for size issues on&lt;br&gt;
SKU-9382, support volume normal, one anomaly: 4 abandoned carts at the&lt;br&gt;
checkout step in the last 3 hours — worth checking the payment provider", and&lt;br&gt;
the AI bill paid for itself before they finish their coffee.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The failure mode if you do it badly.&lt;/strong&gt; You feed Claude raw database dumps&lt;br&gt;
and ask it to "find something interesting". It invents trends from noise. The&lt;br&gt;
fix: &lt;strong&gt;you do the aggregation, Claude does the narrative.&lt;/strong&gt; Pre-compute the&lt;br&gt;
stats. Pass them as a structured digest. Ask Claude to turn them into&lt;br&gt;
English, not to discover them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The integration point.&lt;/strong&gt; A scheduled task, run before 7am.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// app/Console/Commands/DailyAdminDigest.php&lt;/span&gt;

&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;handle&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$stats&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="s1"&gt;'orders'&lt;/span&gt;           &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;dailyOrders&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="s1"&gt;'refunds'&lt;/span&gt;          &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;dailyRefunds&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="s1"&gt;'support_volume'&lt;/span&gt;   &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;dailySupportVolume&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="s1"&gt;'unusual_signals'&lt;/span&gt;  &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;detectAnomalies&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="p"&gt;];&lt;/span&gt;

    &lt;span class="nv"&gt;$narrative&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;claude&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
        &lt;span class="s1"&gt;'model'&lt;/span&gt;    &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'claude-haiku-4-5'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'system'&lt;/span&gt;   &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;editorPrompt&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="s1"&gt;'messages'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="s1"&gt;'role'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'user'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'content'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;json_encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$stats&lt;/span&gt;&lt;span class="p"&gt;)]],&lt;/span&gt;
    &lt;span class="p"&gt;]);&lt;/span&gt;

    &lt;span class="nc"&gt;AdminDashboard&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;cacheTodayDigest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$narrative&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;content_text&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  5. Outbound email personalization
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The use case.&lt;/strong&gt; Your abandoned-cart and re-engagement emails currently use&lt;br&gt;
templated copy: "Hi {{first_name}}, you left items in your cart". Open rates&lt;br&gt;
are mid-single-digits.&lt;/p&gt;

&lt;p&gt;You add a Claude step: the model gets the customer's last 5 orders, the items&lt;br&gt;
they abandoned, their average order value, the season. It rewrites the email&lt;br&gt;
body to lean into the angle most likely to convert &lt;em&gt;this&lt;/em&gt; customer.&lt;/p&gt;

&lt;p&gt;Same template structure, same call-to-action button. Just the body paragraph&lt;br&gt;
is dynamic per recipient.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The failure mode if you do it badly.&lt;/strong&gt; You let Claude generate the subject&lt;br&gt;
line too — and your spam complaint rate triples because the model used&lt;br&gt;
clickbait phrases your sender reputation can't sustain. The fix: &lt;strong&gt;keep the&lt;br&gt;
subject line and the CTA pinned&lt;/strong&gt;. Let Claude work on the body only.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The integration point.&lt;/strong&gt; A queued personalizer running before the&lt;br&gt;
deliverability tool sends the email.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="c1"&gt;// app/Mail/Personalizers/AbandonedCartPersonalizer.php&lt;/span&gt;

&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;personalize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;Cart&lt;/span&gt; &lt;span class="nv"&gt;$cart&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;User&lt;/span&gt; &lt;span class="nv"&gt;$user&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="s1"&gt;'abandoned_items'&lt;/span&gt;     &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$cart&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;toFactSheet&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="s1"&gt;'recent_orders'&lt;/span&gt;       &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$user&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;recentOrders&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;toFactSheet&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="s1"&gt;'avg_order_value_eur'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$user&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;avgOrderValue&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="p"&gt;];&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;claude&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
        &lt;span class="s1"&gt;'model'&lt;/span&gt;      &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'claude-haiku-4-5'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'system'&lt;/span&gt;     &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;copywriterPrompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;template&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'abandoned_cart'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="s1"&gt;'messages'&lt;/span&gt;   &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="s1"&gt;'role'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'user'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'content'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;json_encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$context&lt;/span&gt;&lt;span class="p"&gt;)]],&lt;/span&gt;
        &lt;span class="s1"&gt;'max_tokens'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;240&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;])&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;content_text&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A/B test it against the template baseline before rolling out fully. I have&lt;br&gt;
seen lift between 18% and 40% in open-to-click conversion on abandoned-cart&lt;br&gt;
sequences, but your results depend entirely on the quality of your customer&lt;br&gt;
data.&lt;/p&gt;

&lt;h2&gt;
  
  
  What unites all five patterns
&lt;/h2&gt;

&lt;p&gt;Look at the code. None of them needed a new framework, a vector database, a&lt;br&gt;
"AI platform", or a rewrite. Each is a single class with a single Claude&lt;br&gt;
call, dropped into the existing Laravel app like any other Service.&lt;/p&gt;

&lt;p&gt;That's the actual playbook for adding AI to a production app:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Find the place where the existing code is brittle, slow, or expensive.&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Wrap the brittle bit in a queued job or controller action.&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Replace its body with one Claude SDK call.&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Add structured constraints in the system prompt&lt;/strong&gt; so the model can't
invent things outside the facts you pass it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Keep the human in the loop&lt;/strong&gt; anywhere a hallucination would cost more
than a draft.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If you have a Laravel or PrestaShop app shipping revenue and you want to add&lt;br&gt;
one of these patterns without burning down what works,&lt;br&gt;
&lt;a href="mailto:me@bak-dev.com?subject=%5BHire%5D%20AI%20feature%20retrofit"&gt;email me a brief&lt;/a&gt;.&lt;br&gt;
I do exactly this for a living: scope, build, ship, inside your codebase.&lt;br&gt;
2–4 weeks fixed-scope. The shape is documented on the&lt;br&gt;
&lt;a href="https://dev.to/hire-me"&gt;hire-me page&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Or, if you want to see what the same patterns look like across the full&lt;br&gt;
agent-system stack instead of one feature at a time, the source for&lt;br&gt;
&lt;a href="https://github.com/akrambak/career-os" rel="noopener noreferrer"&gt;Career-OS&lt;/a&gt; — the AI-agent dashboard&lt;br&gt;
I run my own job search through — is MIT-licensed on GitHub.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://bak-dev.com/blog/5-places-to-bolt-ai-onto-laravel" rel="noopener noreferrer"&gt;bak-dev.com&lt;/a&gt;. Find more build-in-public posts at &lt;a href="https://bak-dev.com/blog" rel="noopener noreferrer"&gt;bak-dev.com/blog&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>laravel</category>
      <category>php</category>
      <category>claudesdk</category>
      <category>aiagents</category>
    </item>
    <item>
      <title>[FLUTTER] Need help on flutter!</title>
      <dc:creator>Akram Bakhouche</dc:creator>
      <pubDate>Fri, 30 Jul 2021 15:21:20 +0000</pubDate>
      <link>https://dev.to/akram_bakhouche/flutter-need-help-on-flutter-39hn</link>
      <guid>https://dev.to/akram_bakhouche/flutter-need-help-on-flutter-39hn</guid>
      <description>&lt;p&gt;Hello&lt;/p&gt;

&lt;p&gt;I search a solution like a voice call app.&lt;br&gt;
I need to have a solution to launch my app and show a screen with two buttons ( Accept / Reject )&lt;/p&gt;

&lt;p&gt;The app must fire like Skype/Messenger/Signal when it receive a signal call.&lt;/p&gt;

&lt;p&gt;Best regards&lt;/p&gt;

</description>
      <category>flutter</category>
      <category>dart</category>
      <category>ios</category>
      <category>android</category>
    </item>
  </channel>
</rss>
