<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Ray</title>
    <description>The latest articles on DEV Community by Ray (@isray_notarray).</description>
    <link>https://dev.to/isray_notarray</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1125411%2Fd2fcd426-1e95-4df9-8a04-f82df08e90d3.jpeg</url>
      <title>DEV Community: Ray</title>
      <link>https://dev.to/isray_notarray</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/isray_notarray"/>
    <language>en</language>
    <item>
      <title>How Much of Your Blog Does AI Search Actually Grab? Breaking Down Claude's WebSearch and WebFetch</title>
      <dc:creator>Ray</dc:creator>
      <pubDate>Fri, 19 Jun 2026 14:23:46 +0000</pubDate>
      <link>https://dev.to/isray_notarray/how-much-of-your-blog-does-ai-search-actually-grab-breaking-down-claudes-websearch-and-webfetch-538f</link>
      <guid>https://dev.to/isray_notarray/how-much-of-your-blog-does-ai-search-actually-grab-breaking-down-claudes-websearch-and-webfetch-538f</guid>
      <description>&lt;p&gt;A while back I wrote &lt;a href="https://israynotarray.com/en/ai/2026/04/08/aeo-answer-engine-optimization-guide/" rel="noopener noreferrer"&gt;Is SEO Not Enough? Meet AEO — Getting Your Site Found by AI Search&lt;/a&gt;, and right after finishing it a question hit me: when AI does a web search, how much of my blog does it actually grab? The whole article verbatim? The first 500 characters? Or does it bail after seeing just the title? So I dug into it, and this post walks through Anthropic's official &lt;code&gt;web_search&lt;/code&gt; and &lt;code&gt;web_fetch&lt;/code&gt; tool specs, runs a quick test against my own blog, and ends with what all this concretely means for how you should write posts and copy.&lt;/p&gt;

&lt;h2&gt;
  
  
  "Search" and "fetch" are two different things
&lt;/h2&gt;

&lt;p&gt;Before going further, the one thing worth being crystal clear on: when AI runs a query, &lt;strong&gt;"search"&lt;/strong&gt; and &lt;strong&gt;"fetching the page body"&lt;/strong&gt; are not the same operation. They're two separate stages.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Stage 1 (WebSearch):&lt;/strong&gt; the AI takes your question and calls the WebSearch tool (which hits a search engine). What comes back is a list of search results — each entry has things like URL and title, but no page body.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stage 2 (WebFetch):&lt;/strong&gt; after looking at the search result list, the AI decides which entries are worth digging into, then fires a WebFetch request at each of those URLs, basically saying "give me the body of this page." That's when actual content gets pulled in.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Why doesn't it just grab the body during the search stage? Context window limits. If every search shoved 10 results' worth of full bodies in, your usable context would blow up fast — and then you'd start complaining the AI is dumb and forgets what you just asked it (because the context did overflow). So it's split into two stages: search first for a list, then decide which entries from that list to actually fetch.&lt;/p&gt;

&lt;p&gt;Once that two-stage split makes sense, the rest of this post is about what each stage actually pulls in.&lt;/p&gt;

&lt;h2&gt;
  
  
  What does WebSearch pull in?
&lt;/h2&gt;

&lt;p&gt;Going straight to Anthropic's official &lt;code&gt;web_search&lt;/code&gt; tool docs — every search result entry has only four fields:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;url&lt;/code&gt;: the page URL&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;title&lt;/code&gt;: the page title&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;page_age&lt;/code&gt;: when the page was last updated&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;encrypted_content&lt;/code&gt;: encrypted content, not for the AI to read the article — it's for multi-turn conversation citations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's it. Four fields.&lt;/p&gt;

&lt;p&gt;What the AI sees during the search stage is "URL, title, last updated" — three pieces of human-readable info. No body content at all.&lt;/p&gt;

&lt;p&gt;What if the AI cites your content? There's a cap on that too:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Each &lt;code&gt;web_search_result_location&lt;/code&gt;'s &lt;code&gt;cited_text&lt;/code&gt; is &lt;strong&gt;up to 150 characters of the cited content&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In short: at most 150 characters of quoted text. And that's just the API-level spec.&lt;/p&gt;

&lt;p&gt;Claude Code's built-in WebSearch shaves it down further. According to Mikhail Shilkov's &lt;a href="https://mikhail.io/2025/10/claude-code-web-tools/" rel="noopener noreferrer"&gt;breakdown of Claude Code's internal behavior&lt;/a&gt;, Claude Code even drops &lt;code&gt;page_age&lt;/code&gt; and &lt;code&gt;encrypted_content&lt;/code&gt;, keeping only &lt;code&gt;title&lt;/code&gt; and &lt;code&gt;url&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;So basically — at the search stage, the AI sees nothing more than &lt;strong&gt;one title and one URL&lt;/strong&gt; from your site. That's it.&lt;/p&gt;

&lt;h2&gt;
  
  
  What does WebFetch pull in?
&lt;/h2&gt;

&lt;p&gt;Now for when the body actually gets pulled in — Stage 2, WebFetch.&lt;/p&gt;

&lt;p&gt;Once the AI has the search results, if it decides to open up a few entries, it fires one WebFetch request per URL, and that's when the full body comes back. How much of it?&lt;/p&gt;

&lt;p&gt;This needs to be split into two layers, because the API and Claude Code work differently.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;&lt;br&gt;
When I say "API" here, I mean the Anthropic API's &lt;code&gt;web_fetch&lt;/code&gt; tool. "Claude Code" means the WebFetch feature built into Anthropic's own product. The two have different specs and flows.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  API-level web_fetch
&lt;/h3&gt;

&lt;p&gt;The Anthropic API's &lt;code&gt;web_fetch&lt;/code&gt; tool has a parameter called &lt;code&gt;max_content_tokens&lt;/code&gt; that developers can set themselves — though the official docs use 100,000 tokens in their examples.&lt;/p&gt;

&lt;p&gt;The docs also give a reference conversion:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Content size&lt;/th&gt;
&lt;th&gt;Estimated tokens&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Average web page 10 KB&lt;/td&gt;
&lt;td&gt;~2,500 tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Large doc page 100 KB&lt;/td&gt;
&lt;td&gt;~25,000 tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Research paper PDF 500 KB&lt;/td&gt;
&lt;td&gt;~125,000 tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;So a medium-length blog post in plain text is usually 1–2,000 tokens, way below the 100K ceiling. Truncation basically isn't a concern unless you wrote a 50,000-character monster.&lt;/p&gt;

&lt;p&gt;One thing to note: &lt;code&gt;web_fetch&lt;/code&gt;'s citation works differently from &lt;code&gt;web_search&lt;/code&gt;. It uses &lt;code&gt;start_char_index&lt;/code&gt; / &lt;code&gt;end_char_index&lt;/code&gt; to pick out a specific position in the article (although the docs don't pin down a hard character limit).&lt;/p&gt;

&lt;h3&gt;
  
  
  Claude Code's built-in WebFetch
&lt;/h3&gt;

&lt;p&gt;Claude Code's built-in WebFetch goes a different route.&lt;/p&gt;

&lt;p&gt;Per Mikhail Shilkov's breakdown, the WebFetch flow is:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Convert HTML to Markdown using the Turndown package&lt;/li&gt;
&lt;li&gt;Extract the first 100 KB of plain text&lt;/li&gt;
&lt;li&gt;Pass that 100 KB to the Haiku 3.5 model&lt;/li&gt;
&lt;li&gt;Haiku summarizes the answer based on your prompt and returns the summary to the main model&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The real kicker is step 3. The main model — the Claude model you're actually using — never sees the page's original text. It only sees the version Haiku summarized. Which means &lt;strong&gt;what your writing turns into by the time it reaches the main model is decided by how Haiku reads it, not by how much you wrote&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The citation has a limit too. The rule Mikhail extracted from Claude Code's internal prompt is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Enforce a strict 125-character maximum for quotes from any source document.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;So quotes max out at 125 characters.&lt;/p&gt;

&lt;p&gt;You're probably wondering — so how much is 100 KB of plain text? For Chinese, where each character is roughly 3 bytes, 100 KB fits 30,000+ characters; for English at 1 byte per character, that's well over 100,000 characters. No regular blog post is going to hit that ceiling. So the real concern should be "how do I get Haiku to extract the parts I want to be quoted," not "how much content am I feeding Haiku."&lt;/p&gt;

&lt;h2&gt;
  
  
  Running a real test against my own blog
&lt;/h2&gt;

&lt;p&gt;Enough theory — let's actually test it.&lt;/p&gt;

&lt;p&gt;I had Claude call the WebFetch tool against my blog's homepage (&lt;code&gt;israynotarray.com&lt;/code&gt;) and asked it to grab the body.&lt;/p&gt;

&lt;p&gt;To make this measurable, I leaned on Claude Code's Hook feature to log every stage's input and output. Here's the hook config — every WebSearch and WebFetch I/O gets logged:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"PostToolUse"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"matcher"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"WebSearch|WebFetch"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"/claude-tools/hooks/log-web-tool.sh"&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"SessionEnd"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"/claude-tools/hooks/dump-transcript.sh"&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The prompt I used:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Please use the WebFetch tool to grab the content of israynotarray.com's homepage, then tell me what you extracted from that page.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The actual input passed to WebFetch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://israynotarray.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"prompt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Tell me the main content of this homepage: site name, author bio, latest article list (titles and summaries), categories, plus any navbar or sidebar info."&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After it ran, the WebFetch result — the summary that came back — looked roughly like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"bytes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1068&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"code"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"codeText"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"OK"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"result"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"# Page Content Analysis&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;Based on the content snippet provided, I can share the following:&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;## Site Structure&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;- **Main category**: AI-related content, including Gemini and Gemini Omni products&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;## Latest Article&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;**Title**: &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;Gemini Omni Video Generation: 30 Prompt Templates for Reels, Product Demos, and Logo Reveals&lt;/span&gt;&lt;span class="se"&gt;\"\n\n&lt;/span&gt;&lt;span class="s2"&gt;**Summary**:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;This post introduces Gemini Omni video generation, released by Google at I/O in May 2026, and explains how it replaces the earlier Veo 3.1. The article covers new features like '10-second clips, native synced audio, and Chinese text rendering,' and provides 30 practical prompt templates for use cases including Instagram Reels, product demos, logo animations, B-roll, transitions, quote cards, lifestyle, and food and travel.&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;**Publish date**: 2026-06-06  &lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;**Reading time**: ~23 min&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;## Limitations&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s2"&gt;The content snippet provided does not include author bio or full navbar info."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"durationMs"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4603&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://israynotarray.com"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;See it? What the AI actually pulled from my blog is just this tiny under-1,000-character summary. A whole website obviously contains way more than this — the rest was never seen by the main model. The Haiku middle layer reads the page and only extracts what it judges relevant to my prompt. If I opened the same page in a browser, I'd see a full grid of article cards plus a sidebar — but Haiku doesn't ship the full grid back.&lt;/p&gt;

&lt;p&gt;I also tried an older post with a deliberately broken URL path, and got this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"bytes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"code"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;404&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"codeText"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Not Found"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"result"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"The server returned HTTP 404 Not Found.&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;The response body was not retrieved. If this URL requires authentication, use an authenticated tool (e.g. `gh` for GitHub, or an MCP-provided fetch tool) instead of WebFetch."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"durationMs"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;588&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://israynotarray.com/dqwdqwdqwd"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Even the content of your 404 page is invisible to the AI — WebFetch just reports the 404 and the AI has no way to see what your 404 page says. Which means if your site has path issues, you've refactored URLs, or you only have frontend routing without real pages, the AI can't pull anything.&lt;/p&gt;

&lt;p&gt;Side note — this lines up with a caveat in Claude's official docs:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The web fetch tool currently does not support websites dynamically rendered with JavaScript.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If your blog is a frontend SPA where content is entirely rendered by JavaScript at runtime, what the AI grabs might just be empty-shell HTML with no articles visible. Static generators (Hexo, Astro, Next.js in SSG mode) are relatively safe, since the build output is fully rendered HTML — the AI grabs and immediately sees content.&lt;/p&gt;

&lt;h2&gt;
  
  
  Don't forget the robots.txt layer
&lt;/h2&gt;

&lt;p&gt;There's one more important piece — whether the AI can pull your site has a major prerequisite: robots.txt.&lt;/p&gt;

&lt;p&gt;AI crawlers basically split into two types: search-style (cite and link back to your site) and training-style (eat content to feed the model, not necessarily linking back). The common mapping:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Crawler&lt;/th&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;Behavior&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Claude-SearchBot / Claude-User&lt;/td&gt;
&lt;td&gt;Search&lt;/td&gt;
&lt;td&gt;Real-time fetch when Claude answers, cites back&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ClaudeBot&lt;/td&gt;
&lt;td&gt;Training&lt;/td&gt;
&lt;td&gt;Fetches content to feed Claude training&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OAI-SearchBot / ChatGPT-User&lt;/td&gt;
&lt;td&gt;Search&lt;/td&gt;
&lt;td&gt;Real-time fetch when ChatGPT answers, cites back&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GPTBot&lt;/td&gt;
&lt;td&gt;Training&lt;/td&gt;
&lt;td&gt;Fetches content to feed GPT training&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PerplexityBot&lt;/td&gt;
&lt;td&gt;Search&lt;/td&gt;
&lt;td&gt;Used by Perplexity engine, cites back&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Google-Extended&lt;/td&gt;
&lt;td&gt;Training&lt;/td&gt;
&lt;td&gt;For Gemini training&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CCBot&lt;/td&gt;
&lt;td&gt;Training&lt;/td&gt;
&lt;td&gt;Common Crawl public dataset&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;If you want to be cited by AI but don't want your content used for training, the most common strategy is "allow search-style, block training-style."&lt;/p&gt;

&lt;p&gt;Here's a robots.txt template you can copy-paste:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Search-style AI crawlers: allow (they cite back)
User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: Claude-SearchBot
Allow: /

User-agent: Claude-User
Allow: /

User-agent: PerplexityBot
Allow: /

# Training-style AI crawlers: block (consume data without citing)
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Bytespider
Disallow: /

# Content signal: searchable but not for training, not for direct AI input
Content-Signal: ai-train=no, search=yes, ai-input=no
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For a full Agent Readiness setup to score 100, see &lt;a href="https://israynotarray.com/en/ai/2026/04/21/from-3-to-100-isitagentready-readiness-guide/" rel="noopener noreferrer"&gt;From 3 to 100! How to Get Your Site to Pass isitagentready's AI Agent Readiness Check&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  So what does this concretely mean for writing?
&lt;/h2&gt;

&lt;p&gt;Once you understand all the constraints above, there are four things worth specifically working on.&lt;/p&gt;

&lt;h3&gt;
  
  
  Titles need to stand on their own
&lt;/h3&gt;

&lt;p&gt;At the search stage, all the AI sees about your article is two fields — title and URL.&lt;/p&gt;

&lt;p&gt;If your title needs a subtitle or context to make sense, when the AI lines it up against ten other results it'll get skipped.&lt;/p&gt;

&lt;p&gt;A quick comparison:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Weaker: "Implementation Notes"&lt;/li&gt;
&lt;li&gt;Stronger: "Complete Implementation Notes: Content Negotiation for HTML-to-Markdown in a Cloudflare Worker"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The stronger version packs in "topic, tool, what it does, and article type" — the AI doesn't even need to open the page to know whether it's worth fetching.&lt;/p&gt;

&lt;h3&gt;
  
  
  Lead with the answer
&lt;/h3&gt;

&lt;p&gt;At the WebFetch stage, the Haiku middle layer reads top-down. The first 300–500 characters decide what it summarizes back. If your opening is "Before we get into X, let's recap a bit of history…", Haiku reads halfway through and discovers the intro is all background and no answer — so it just summarizes the background.&lt;/p&gt;

&lt;p&gt;The right move is to make the first sentence of every H2 a direct conclusion, then add the context after. I covered this principle in &lt;a href="https://israynotarray.com/en/ai/2026/04/08/aeo-answer-engine-optimization-guide/" rel="noopener noreferrer"&gt;Is SEO Not Enough? Meet AEO — Getting Your Site Found by AI Search&lt;/a&gt; too — worth reading alongside this.&lt;/p&gt;

&lt;h3&gt;
  
  
  Design single sentences that can be quoted standalone
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;cited_text&lt;/code&gt; is 150 characters on &lt;code&gt;web_search&lt;/code&gt; and 125 characters on Claude Code's built-in WebFetch. That means when the AI quotes you, the slot it has is one short sentence that "makes sense without context."&lt;/p&gt;

&lt;p&gt;Consciously design sentences like that. For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Weaker: "This is a bit different from the method mentioned earlier — the main difference is…" (makes no sense without context)&lt;/li&gt;
&lt;li&gt;Stronger: "llms.txt was proposed by Answer.AI co-founder Jeremy Howard in 2024, with the goal of proactively telling AI what important content a site has." (stands on its own without surrounding text)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;After writing a paragraph, pick one sentence and ask yourself: if someone who hadn't read the rest of the post saw just this sentence, would they get it? If yes, it has a shot at being quoted.&lt;/p&gt;

&lt;h3&gt;
  
  
  Use structure as Haiku's navigation markers
&lt;/h3&gt;

&lt;p&gt;H2, H3, &lt;code&gt;-&lt;/code&gt; lists, tables, &lt;code&gt;&lt;/code&gt;`&lt;code&gt;js&lt;/code&gt; code blocks — these Markdown structures are especially useful for the middleware summary layer. When Haiku reads the Markdown converted from your HTML, it treats headings as "what this section is about" indexes, lists as "main points" signals, and tables as "supporting data" units.&lt;/p&gt;

&lt;p&gt;If your whole article is pure prose paragraphs, Haiku has no markers and has to grind through it semantically — what comes out is scattered. If you have clear structure, Haiku can summarize along the markers, and the result lines up with the points you actually want quoted.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wrap-up
&lt;/h2&gt;

&lt;p&gt;So how much of your blog does AI search actually pull?&lt;/p&gt;

&lt;p&gt;The answer breaks down into three layers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Search stage:&lt;/strong&gt; title + URL only&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Body fetch stage:&lt;/strong&gt; the API default can fit your whole article, but Claude Code goes through Haiku summarization with a 100 KB cut-off&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Citation stage:&lt;/strong&gt; &lt;code&gt;web_search&lt;/code&gt; is 150 characters, Claude Code WebFetch is 125 characters&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Writing for AI search means targeting those three gates — it's not about getting the AI to memorize your entire post.&lt;/p&gt;

&lt;p&gt;If your blog hasn't set up AI bot routing yet, copy the robots.txt template above to get the basics in place — the rest is just content over time.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>claude</category>
      <category>seo</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Is AI Getting Quietly Dumber? A 24/7 Benchmark That Catches LLM Degradation</title>
      <dc:creator>Ray</dc:creator>
      <pubDate>Thu, 18 Jun 2026 03:54:54 +0000</pubDate>
      <link>https://dev.to/isray_notarray/is-ai-getting-quietly-dumber-a-247-benchmark-that-catches-llm-degradation-2g6p</link>
      <guid>https://dev.to/isray_notarray/is-ai-getting-quietly-dumber-a-247-benchmark-that-catches-llm-degradation-2g6p</guid>
      <description>&lt;p&gt;You've probably hit this before — yesterday the AI felt sharp, fixed your bug without you even asking, and threw in a few extra cleanups along the way. Then today, same kind of problem, and suddenly it refuses to touch anything you didn't explicitly point at, or starts going in circles. And you start wondering "wait, is AI getting quietly dumber?"&lt;/p&gt;

&lt;p&gt;A lot of people have this feeling but it's hard to pin down. Is the provider messing with the model? Is today's problem genuinely harder? Or is it all in your head? Without a number to point at, it just stays an urban legend.&lt;/p&gt;

&lt;p&gt;So this post is about a site called &lt;a href="https://aistupidlevel.info/" rel="noopener noreferrer"&gt;AIStupidLevel&lt;/a&gt; — the name pretty much spells it out. It runs round-the-clock tests against the major AI models and turns "is AI getting dumber" into an actual curve you can look at.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why would an AI "get dumber" in the first place?
&lt;/h2&gt;

&lt;p&gt;A reasonable pushback: doesn't the model get trained once and then frozen? A few things are actually going on behind the scenes:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Silent updates.&lt;/strong&gt; If you've worked with the API, you've probably seen version strings like &lt;code&gt;gpt-4.0-20240924&lt;/code&gt; — that trailing date hints at a specific build. But when you pick a model inside Codex or similar tools, you don't see that level of detail. Some models don't even have versioned IDs, just a generic &lt;code&gt;gpt-4&lt;/code&gt;. So you can't actually tell which build you're hitting, and a silent swap is really hard to notice.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Quantization.&lt;/strong&gt; To handle global peak traffic and save compute, providers sometimes compress the model.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Throttled compute.&lt;/strong&gt; Once usage crosses a threshold, providers may cap how much compute each user can pull, which makes outputs feel worse.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compute migration.&lt;/strong&gt; When a new model is about to launch, providers sometimes shift compute away from the older model. From my own time hitting the API, this is the one I see most — the same prompt suddenly degrades overnight, then a few days later a stronger version drops, and shortly after the old version goes back to normal.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I covered quantization more thoroughly in &lt;a href="https://israynotarray.com/en/ai/2026/04/12/llm-parameters-quantization-guide/" rel="noopener noreferrer"&gt;Want to Run an LLM Yourself? Understanding Model Parameters and Quantization So You Stop Picking the Wrong Model&lt;/a&gt; — short version, it's a compression technique that saves resources but loses some detail.&lt;/p&gt;

&lt;p&gt;So an AI model isn't actually a frozen thing. Providers can silently update it, quantize it, throttle it, or migrate compute away from it — and any of those will change what you experience. The hard part is, you usually can't tell.&lt;/p&gt;

&lt;h2&gt;
  
  
  So what is AIStupidLevel exactly?
&lt;/h2&gt;

&lt;p&gt;AIStupidLevel is a third-party benchmark platform (open source, MIT) that continuously monitors whether AI models are regressing. Think of it like a system health check, except it's checking the "health" of AI models. It runs 24/7 against &lt;strong&gt;21 production models from 7 providers&lt;/strong&gt; — OpenAI, Anthropic, Google, xAI, DeepSeek, Kimi, GLM — and turns each model's current performance into a score on a dashboard.&lt;/p&gt;

&lt;p&gt;It's not run by any AI company, which matters here. You don't want the people grading the models to also be selling them.&lt;/p&gt;

&lt;h2&gt;
  
  
  How does it actually test?
&lt;/h2&gt;

&lt;p&gt;The core idea is simple:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Fixed question bank, run repeatedly.&lt;/strong&gt; It maintains a fixed set of tasks, throws them at each model on a schedule, and logs the scores.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Same task N times.&lt;/strong&gt; Because model outputs are stochastic, it runs each task 5 times and takes the median, plus a 95% confidence interval.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Four suites on a rotation
&lt;/h3&gt;

&lt;p&gt;It doesn't just run one kind of test — it has four suites taking turns, each watching for different things:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Test Suite&lt;/th&gt;
&lt;th&gt;Frequency&lt;/th&gt;
&lt;th&gt;What it tests&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Speed (coding)&lt;/td&gt;
&lt;td&gt;Every 4 hours&lt;/td&gt;
&lt;td&gt;147 coding problems, overall coding ability&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Deep reasoning&lt;/td&gt;
&lt;td&gt;Daily&lt;/td&gt;
&lt;td&gt;5–7 turn multi-turn dialogues, checks long-conversation logic&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tool calling&lt;/td&gt;
&lt;td&gt;Daily&lt;/td&gt;
&lt;td&gt;Spins up a real Docker sandbox so the AI can actually run multi-step &lt;code&gt;execute-command&lt;/code&gt; / &lt;code&gt;read-file&lt;/code&gt; / &lt;code&gt;write-file&lt;/code&gt; flows&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Drift detection (canary)&lt;/td&gt;
&lt;td&gt;Hourly&lt;/td&gt;
&lt;td&gt;12 lightweight quick checks, first line of defense&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The hourly canary plays sentinel — if something starts looking off, it sounds the alarm. The daily deep reasoning and tool-calling runs are the heavier full-body checkup.&lt;/p&gt;

&lt;p&gt;The tool-calling suite is the one I find especially interesting. It actually spins up a Docker sandbox (think of it as an isolated mini-computer) and has the AI run real commands inside it, instead of just "verbally" claiming it can use tools. The results end up much closer to what you actually feel when using AI to write code.&lt;/p&gt;

&lt;h3&gt;
  
  
  Scoring isn't just right vs wrong
&lt;/h3&gt;

&lt;p&gt;A single coding task gets scored across 9 weighted dimensions:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;Weight&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Correctness&lt;/td&gt;
&lt;td&gt;40%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Complexity&lt;/td&gt;
&lt;td&gt;20%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Code Quality&lt;/td&gt;
&lt;td&gt;15%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Stability&lt;/td&gt;
&lt;td&gt;10%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Efficiency&lt;/td&gt;
&lt;td&gt;5%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Edge Cases&lt;/td&gt;
&lt;td&gt;3%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Debugging&lt;/td&gt;
&lt;td&gt;3%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Format&lt;/td&gt;
&lt;td&gt;2%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Safety&lt;/td&gt;
&lt;td&gt;2%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Correctness clearly dominates, but even if the answer runs, you'll still lose points if the code is a mess, misses edge cases, or spews garbage formatting.&lt;/p&gt;

&lt;h2&gt;
  
  
  How does it catch the moment a model starts getting dumber?
&lt;/h2&gt;

&lt;p&gt;Just having a score isn't enough — the score naturally bounces around because AI is non-deterministic. The real question is: is this drop a real regression, or just noise?&lt;/p&gt;

&lt;p&gt;This is where the project gets technically interesting. It uses an algorithm called &lt;strong&gt;CUSUM&lt;/strong&gt; — short for Cumulative Sum Control Chart. CUSUM didn't come from AI research; it's an old quality-control method from manufacturing. The idea is to keep accumulating the gap between observed performance and the baseline. Once the accumulated gap crosses a threshold, you call it: this isn't noise, something actually changed.&lt;/p&gt;

&lt;p&gt;On top of CUSUM, it also runs statistical significance testing (checking whether the difference is statistically meaningful, p-value below 0.05) as a second pass, to keep false alarms down.&lt;/p&gt;

&lt;p&gt;The real win: with this statistical machinery, a degradation can be detected &lt;strong&gt;within hours&lt;/strong&gt; of starting, rather than waiting until people are venting on social media that "AI got dumber." The system has 29 warning categories built in for different anomaly patterns.&lt;/p&gt;

&lt;h2&gt;
  
  
  Reading the Stupid Meter
&lt;/h2&gt;

&lt;p&gt;Each model has a live 0–100 score next to it — higher is better. Next to the score is a status tag telling you which of four states the model is in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;STABLE&lt;/strong&gt; — performing normally&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;VOLATILE&lt;/strong&gt; — jittering&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DEGRADED&lt;/strong&gt; — already worse&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RECOVERING&lt;/strong&gt; — climbing back up&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Beyond the current score, it lays out each model's historical curve, so you can compare scenarios like "this model was rock-solid last week, why is it jumping around this week."&lt;/p&gt;

&lt;p&gt;In practice it feels more like a stock-trading dashboard — you're not looking at a static ranking, you're checking "right at this moment, which model is worth using." If you're still wrestling with which AI tool to pick in the first place, my earlier post &lt;a href="https://israynotarray.com/en/ai/2026/03/26/2026-ai-coding-tools-comparison/" rel="noopener noreferrer"&gt;Which AI Coding Tool Should You Pick in 2026?&lt;/a&gt; pairs well with this one.&lt;/p&gt;

&lt;h2&gt;
  
  
  Smart Router: route around degraded models automatically
&lt;/h2&gt;

&lt;p&gt;Beyond monitoring, AIStupidLevel also ships a feature called &lt;strong&gt;Smart Router&lt;/strong&gt;, which is a pretty interesting extension of the project.&lt;/p&gt;

&lt;p&gt;It's an OpenAI-compatible API endpoint, meaning code you wrote against OpenAI barely needs to change. You drop each provider's API key into it (stored with AES-256 encryption), point your base URL at it, and it routes each request to whichever model is currently in the best shape based on the live monitoring.&lt;/p&gt;

&lt;p&gt;It offers six routing strategies — just set the &lt;code&gt;model&lt;/code&gt; field in your API call to one of these and it picks dynamically:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;auto-best&lt;/code&gt; — overall pick, whichever model has the best combined score right now&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;auto-coding&lt;/code&gt; — best at coding right now&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;auto-reasoning&lt;/code&gt; — strongest at reasoning&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;auto-creative&lt;/code&gt; — leans toward creative output&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;auto-cheapest&lt;/code&gt; — cheapest model above the quality bar&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;auto-fastest&lt;/code&gt; — fastest model above the quality bar&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you just pass &lt;code&gt;auto&lt;/code&gt;, it uses whichever strategy you've set as your default. And if you pass a specific model name (like &lt;code&gt;claude-opus-4-7&lt;/code&gt;), it pins that model directly and skips the router. So when a model quietly starts slipping and Smart Router catches it, traffic auto-routes to a sibling that's still healthy. Pretty useful if you actually want to wire AI into a product.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wrap-up
&lt;/h2&gt;

&lt;p&gt;So what is AIStupidLevel?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;An &lt;strong&gt;independent, open source&lt;/strong&gt; (MIT) third-party benchmark site monitoring 21 production AI models across 7 providers (OpenAI, Anthropic, Google, xAI, DeepSeek, Kimi, GLM), 24 hours a day&lt;/li&gt;
&lt;li&gt;Method: fixed question bank, run repeatedly — each task runs 5 times, median plus 95% confidence interval, across four test suites on a rotation&lt;/li&gt;
&lt;li&gt;Uses &lt;strong&gt;CUSUM change-point detection&lt;/strong&gt; plus statistical testing to catch a model quietly degrading within hours&lt;/li&gt;
&lt;li&gt;Ships &lt;strong&gt;Smart Router&lt;/strong&gt; that auto-routes API traffic to whichever model is in the best shape based on live monitoring&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Next time you feel like the AI suddenly got dumber, don't jump straight to blaming yourself — pop the dashboard open and you might actually find the evidence.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>openai</category>
      <category>devtools</category>
    </item>
    <item>
      <title>Claude Code Auto Memory: Stop Re-Explaining Your Preferences Every Session</title>
      <dc:creator>Ray</dc:creator>
      <pubDate>Wed, 17 Jun 2026 02:50:33 +0000</pubDate>
      <link>https://dev.to/isray_notarray/claude-code-auto-memory-stop-re-explaining-your-preferences-every-session-35ne</link>
      <guid>https://dev.to/isray_notarray/claude-code-auto-memory-stop-re-explaining-your-preferences-every-session-35ne</guid>
      <description>&lt;p&gt;If you've spent any real time with Claude Code, you know this pain: you open a fresh session and it's forgotten everything you taught it yesterday. I'd tell it over and over not to dump a pile of emojis into my writing because it looks too AI — and the next morning, fresh session, emojis everywhere again.&lt;/p&gt;

&lt;p&gt;Turns out there's a fix built for exactly this, called &lt;strong&gt;Auto Memory&lt;/strong&gt;. I'd already written about &lt;a href="https://israynotarray.com/en/ai/2026/03/10/claude-md-rules-for-ai-writing-tips/" rel="noopener noreferrer"&gt;CLAUDE.md&lt;/a&gt; for setting project rules by hand, but Claude Code's memory system is bigger than just CLAUDE.md. Auto Memory quietly records your corrections and preferences as you work, so next session you don't have to say them again.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Auto Memory needs Claude Code v2.1.59 or newer. Check with &lt;code&gt;claude --version&lt;/code&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Auto Memory vs CLAUDE.md
&lt;/h2&gt;

&lt;p&gt;A lot of people mix these two up, so let's get it straight first.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;CLAUDE.md&lt;/th&gt;
&lt;th&gt;Auto Memory&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Who writes it&lt;/td&gt;
&lt;td&gt;You&lt;/td&gt;
&lt;td&gt;Claude, automatically&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;What's in it&lt;/td&gt;
&lt;td&gt;Instructions and rules&lt;/td&gt;
&lt;td&gt;Learned lessons and patterns&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Scope&lt;/td&gt;
&lt;td&gt;Project / user / org&lt;/td&gt;
&lt;td&gt;Per working tree&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Loading&lt;/td&gt;
&lt;td&gt;Fully loaded every session&lt;/td&gt;
&lt;td&gt;First 200 lines / 25KB only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Best for&lt;/td&gt;
&lt;td&gt;Code conventions, workflow, architecture&lt;/td&gt;
&lt;td&gt;Build commands, debugging notes, preferences&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;In short: CLAUDE.md is the set of rules you write for Claude on purpose — "use 2-space indentation," "write commit messages in English." Auto Memory is what Claude picks up on its own while working with you. You don't write it; it does.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Auto Memory actually does
&lt;/h2&gt;

&lt;p&gt;Auto Memory is basically Claude taking its own notes. When you correct it, or tell it whether an approach worked, Claude decides whether that's worth remembering for later. If it is, it saves it.&lt;/p&gt;

&lt;p&gt;So when I told it:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Don't use so many emojis, it looks too AI&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;it saved that as a memory file, and every new session loads it back. Now it remembers I don't like emoji soup.&lt;/p&gt;

&lt;p&gt;It doesn't save everything, though — one-off discussions or things irrelevant to future work don't get written down.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where it's stored (this is the confusing part)
&lt;/h2&gt;

&lt;p&gt;Here's the thing that trips everyone up. Your project has a &lt;code&gt;.claude/&lt;/code&gt; folder, but Auto Memory files are &lt;strong&gt;not&lt;/strong&gt; in there. The project &lt;code&gt;.claude/&lt;/code&gt; only holds settings (&lt;code&gt;settings.json&lt;/code&gt;, &lt;code&gt;settings.local.json&lt;/code&gt;) and rules.&lt;/p&gt;

&lt;p&gt;The actual memory lives under your home directory:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/Users/ray/my-project/.claude/                              &amp;lt;- project settings, inside your repo
/Users/ray/.claude/projects/-Users-ray-my-project/memory/   &amp;lt;- Auto Memory, under your home dir
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Both paths contain &lt;code&gt;.claude&lt;/code&gt;, but one is in your project and one is in your home directory — don't mix them up. If you peek into your project's &lt;code&gt;.claude/&lt;/code&gt; and only find &lt;code&gt;settings.local.json&lt;/code&gt;, that's normal. The memory was never there.&lt;/p&gt;

&lt;p&gt;The memory path is derived from your git repo, so subdirectories and worktrees of the same repo share one memory. And it's machine-local — it does not sync to other machines or cloud environments.&lt;/p&gt;

&lt;p&gt;If you want it somewhere else, set &lt;code&gt;autoMemoryDirectory&lt;/code&gt; (Policy / Local / User level only — not project level, on purpose, so a shared repo can't redirect your memory somewhere shady):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"autoMemoryDirectory"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"~/my-custom-memory-dir"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The directory structure
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;~/.claude/projects/&amp;lt;project&amp;gt;/memory/
├── MEMORY.md          # index, loaded every session
├── debugging.md       # debugging notes
├── api-conventions.md # API design decisions
└── ...                # other topic files Claude creates
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;MEMORY.md&lt;/code&gt; is the index — Claude uses it to track which memories exist and which file each one lives in. Only the first 200 lines / 25KB of &lt;code&gt;MEMORY.md&lt;/code&gt; loads at session start, so Claude keeps it lean and pushes detail into separate topic files. Those topic files aren't loaded on startup; Claude reads them on demand. So your memory can grow without bloating context every single time.&lt;/p&gt;

&lt;p&gt;(That 200-line cap is only for &lt;code&gt;MEMORY.md&lt;/code&gt;. CLAUDE.md always loads in full — though shorter still means better adherence.)&lt;/p&gt;

&lt;h2&gt;
  
  
  How do you know it's writing memory?
&lt;/h2&gt;

&lt;p&gt;When you see "Writing memory" or "Recalled memory" in the Claude Code UI, that's it updating or reading memory files. You don't have to do anything; it handles it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Managing it
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;/memory&lt;/code&gt;&lt;/strong&gt; — the main entry point. See what's loaded, toggle Auto Memory on/off, open the memory folder, or open any file in your editor.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Just ask&lt;/strong&gt; — "remember: always use pnpm, never npm" saves a memory. "forget the rule about pnpm" updates or deletes it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Edit by hand&lt;/strong&gt; — they're plain Markdown files under &lt;code&gt;~/.claude/projects/&lt;/code&gt;. Add, edit, or delete whatever you want.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Turn it off&lt;/strong&gt; — toggle it in &lt;code&gt;/memory&lt;/code&gt;, set &lt;code&gt;"autoMemoryEnabled": false&lt;/code&gt;, or set &lt;code&gt;CLAUDE_CODE_DISABLE_AUTO_MEMORY=1&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And in case you were wondering: &lt;code&gt;/compact&lt;/code&gt; does not wipe it. The files are on disk, not in the conversation. Compacting only squeezes the chat history; &lt;code&gt;MEMORY.md&lt;/code&gt; and CLAUDE.md get re-read from disk and injected back afterward.&lt;/p&gt;

&lt;p&gt;That's the whole thing. It's a small feature, but once you stop repeating yourself every session it's hard to go back.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>claude</category>
      <category>productivity</category>
      <category>devtools</category>
    </item>
  </channel>
</rss>
