<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Solomon Neas</title>
    <description>The latest articles on DEV Community by Solomon Neas (@solomonneas).</description>
    <link>https://dev.to/solomonneas</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3823381%2F9d209953-f24c-43dd-a24d-ca634faed184.jpeg</url>
      <title>DEV Community: Solomon Neas</title>
      <link>https://dev.to/solomonneas</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/solomonneas"/>
    <language>en</language>
    <item>
      <title>Anthropic Broke My OpenClaw Stack. GPT 5.4 Put It Back Together</title>
      <dc:creator>Solomon Neas</dc:creator>
      <pubDate>Wed, 22 Apr 2026 20:24:00 +0000</pubDate>
      <link>https://dev.to/solomonneas/anthropic-broke-my-openclaw-stack-gpt-54-put-it-back-together-5345</link>
      <guid>https://dev.to/solomonneas/anthropic-broke-my-openclaw-stack-gpt-54-put-it-back-together-5345</guid>
      <description>&lt;p&gt;If you quit OpenClaw after Anthropic pulled third-party harness support, I don't blame you.&lt;/p&gt;

&lt;p&gt;That week was a mess. One day Claude was the center of the stack. Then Anthropic cut subscription coverage for third-party harnesses, Boris Cherny said some of the Claude CLI blocking looked like an overactive abuse classifier, Peter Steinberger tried to route around it, and OpenClaw still kept hitting the wall when the request looked too much like a harnessed system-prompt wrapper.&lt;sup id="fnref1"&gt;1&lt;/sup&gt;&lt;sup id="fnref2"&gt;2&lt;/sup&gt;&lt;sup id="fnref3"&gt;3&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;That is the part a lot of people missed. The problem was not just billing. The problem was that "Claude CLI is allowed" and "OpenClaw using Claude CLI works" turned out to be two different statements.&lt;/p&gt;

&lt;p&gt;So I stopped trying to save the old stack.&lt;/p&gt;

&lt;p&gt;I rebuilt it around GPT 5.4 as the orchestrator, Gemini for research, Claude only where first-party Anthropic paths still made sense, and a local-first retrieval layer that did not depend on vendor mood swings. It took more than a model swap. I had to fix crashing bootstrap injections, tighten routing rules until the agent stopped bullshitting itself, and treat memory and embeddings like infrastructure instead of a nice extra.&lt;/p&gt;

&lt;p&gt;This post is the field report. If you think OpenClaw became unusable after the Anthropic mess, fair enough. If you think GPT 5.4 cannot anchor a serious OpenClaw setup, I have receipts.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Week Anthropic Pulled the Rug
&lt;/h2&gt;

&lt;p&gt;On April 3, Anthropic emailed subscribers to say that starting April 4 at 12 p.m. Pacific, Claude Pro and Max subscription limits would no longer cover third-party harnesses, with OpenClaw called out by name.&lt;sup id="fnref4"&gt;4&lt;/sup&gt; Anyone who kept using Claude through those tools would have to pay for extra usage, billed separately from the subscription.&lt;sup id="fnref5"&gt;5&lt;/sup&gt; Anthropic's rationale was blunt: subscription plans were not built for the call volume and usage patterns of agentic harnesses.&lt;sup id="fnref6"&gt;6&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;That part was not rumor. It was confirmed in the customer email and picked up by TechCrunch, The Register, VentureBeat, and a dozen community threads full of people realizing their entire setup had just changed cost models overnight.&lt;sup id="fnref5"&gt;5&lt;/sup&gt;&lt;sup id="fnref6"&gt;6&lt;/sup&gt;&lt;sup id="fnref7"&gt;7&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;At that point the obvious question was whether Claude Code CLI could still serve as a bridge. Peter found a workaround path that routed requests through the Claude CLI backend instead of the older OAuth-backed OpenClaw flow. For a minute, that looked like a real survival route. Then it became clear that "Claude CLI is allowed" and "OpenClaw using Claude CLI works" were not actually the same thing.&lt;sup id="fnref2"&gt;2&lt;/sup&gt;&lt;sup id="fnref3"&gt;3&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;That distinction is the whole fight.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Boris Said, and Why That Still Wasn't Enough
&lt;/h2&gt;

&lt;p&gt;On April 6, Boris Cherny replied publicly that the blocking behavior did not look intentional and was "likely an overactive abuse classifier," and that Anthropic was looking into it while clarifying policy going forward.&lt;sup id="fnref1"&gt;1&lt;/sup&gt; If you were trying to keep OpenClaw alive on Claude, that sounded like hope.&lt;/p&gt;

&lt;p&gt;Peter treated it like hope. So did a lot of users.&lt;/p&gt;

&lt;p&gt;But the later public discussion makes the problem obvious. Peter said OpenClaw added Claude CLI support based on those assurances, only to find that Anthropic still blocked the setup in practice. His explanation is the clearest version of the current state: Claude CLI usage may be allowed in the abstract, but OpenClaw still appears to hit blocking when the request includes the sort of appended system prompt and harness context that marks it as OpenClaw-shaped traffic.&lt;sup id="fnref2"&gt;2&lt;/sup&gt;&lt;sup id="fnref3"&gt;3&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;That is why this felt so maddening. Anthropic did not seem to be checking only the binary or the auth path. It looked more like they were classifying the behavior and the envelope. A plain &lt;code&gt;claude -p&lt;/code&gt; call could appear fine. The same model, wrapped in OpenClaw's injected prompt layer, could trip a different outcome.&lt;/p&gt;

&lt;p&gt;That is a very different kind of problem than a simple billing toggle.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why I Stopped Treating Claude as the Center of the Stack
&lt;/h2&gt;

&lt;p&gt;Once that sank in, the answer got ugly but obvious.&lt;/p&gt;

&lt;p&gt;I stopped trying to make OpenClaw depend on Anthropic's goodwill.&lt;/p&gt;

&lt;p&gt;Instead of centering the whole system on Claude and using other models as occasional helpers, I inverted it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GPT 5.4 became the main orchestrator&lt;/strong&gt; for planning, coding, structure, and conversation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gemini 3 Pro became the research and long-context lane&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Claude Opus stayed in the stack only through first-party Anthropic surfaces&lt;/strong&gt;, where it was still worth using for architecture, review, and deep reasoning.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Local embeddings became mandatory infrastructure&lt;/strong&gt;, not a nice-to-have.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That part matters more than people think. A lot of "OpenClaw is unusable now" complaints were really complaints about losing a cheap Claude substrate. Once that substrate went bad, the rest of the stack had no spine. GPT 5.4 gave the system a spine again.&lt;/p&gt;

&lt;p&gt;Not because it was magical. Because it was stable, good at orchestration, and available on terms that did not collapse the minute Anthropic changed its classifier.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Ugly OpenClaw Reality That Skeptics Were Right About
&lt;/h2&gt;

&lt;p&gt;The skeptics were not wrong about the rough edges.&lt;/p&gt;

&lt;p&gt;The week after the Anthropic cutoff, OpenClaw could absolutely feel cursed. I saw runs that posted "working" notifications for twenty-plus minutes and then stalled. I saw hollow filler replies instead of actual task progress. I saw backend glitches pile on top of billing chaos. And I saw enough papercuts around PDFs and update behavior that a normal user would have been justified in walking away.&lt;/p&gt;

&lt;p&gt;That is not me hedging. That is me giving the critics their due.&lt;/p&gt;

&lt;p&gt;There were at least two categories of problems happening at once:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;External instability&lt;/strong&gt;: Anthropic changed the economics and kept the technical boundaries vague.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Internal OpenClaw fragility&lt;/strong&gt;: some core workflows still had rough implementation edges right when users needed confidence most.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If all you saw was that surface layer, OpenClaw looked broken and overcomplicated.&lt;/p&gt;

&lt;p&gt;The trick was surviving long enough to harden the inside.&lt;/p&gt;

&lt;h2&gt;
  
  
  The First Big Fix: Local-First Enforcement Had to Stop Crashing the Gateway
&lt;/h2&gt;

&lt;p&gt;One of the most important fixes I made predated the April blowup, but it became much more important after it.&lt;/p&gt;

&lt;p&gt;I had a local-first enforcement hook that injected routing rules at bootstrap. The idea was straightforward: force the stack to check local systems and cheap retrieval paths before burning third-party tokens. In practice, the first version crashed the gateway on every message.&lt;/p&gt;

&lt;p&gt;The root cause was stupid and surgical.&lt;/p&gt;

&lt;p&gt;My hook was pushing malformed bootstrap file objects. OpenClaw's bootstrap pipeline expected a workspace file object with the shape:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;missing&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The broken hook was sending the wrong keys, so &lt;code&gt;path&lt;/code&gt; was undefined on the pushed object. Somewhere inside the bundled runtime, OpenClaw called &lt;code&gt;file.path.replace(...)&lt;/code&gt;. Calling a method on &lt;code&gt;undefined&lt;/code&gt; throws a &lt;code&gt;TypeError&lt;/code&gt;, so the gateway face-planted before the agent could even reply.&lt;/p&gt;

&lt;p&gt;The fix was to stop guessing and match the actual expected shape exactly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;bootstrapFiles&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;LOCAL_FIRST_RULES.md&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="kr"&gt;any&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;workspaceDir&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/LOCAL_FIRST_RULES.md`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;RULES_CONTENT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;missing&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That one fix took the system from "broken before every reply" to "capable of receiving routing constraints at startup." It also taught the most annoying lesson of the whole month: &lt;strong&gt;a lot of agent reliability work is not model work. It is object shape, lifecycle timing, and boring plumbing.&lt;/strong&gt;&lt;/p&gt;
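&lt;p&gt;One cheap way to catch this class of bug earlier is to validate the object before pushing it. The sketch below is a minimal example under stated assumptions: the four field names come from the shape shown above, but the &lt;code&gt;BootstrapFile&lt;/code&gt; interface, the field types, and the &lt;code&gt;assertBootstrapFile&lt;/code&gt; helper are hypothetical and not part of OpenClaw's API.&lt;/p&gt;

```typescript
// Hypothetical shape: field names come from the post, the types are assumed.
interface BootstrapFile {
  name: string;
  path: string;
  content: string;
  missing: boolean;
}

// Throws a descriptive error up front instead of letting a later
// file.path.replace(...) call fail on undefined.
function assertBootstrapFile(value: unknown): BootstrapFile {
  if (typeof value !== "object" || value === null) {
    throw new Error("bootstrap file must be an object");
  }
  const candidate = value as { [key: string]: unknown };
  for (const key of ["name", "path", "content"]) {
    if (typeof candidate[key] !== "string") {
      throw new Error(`bootstrap file field "${key}" must be a string`);
    }
  }
  if (typeof candidate["missing"] !== "boolean") {
    throw new Error('bootstrap file field "missing" must be a boolean');
  }
  return {
    name: candidate["name"] as string,
    path: candidate["path"] as string,
    content: candidate["content"] as string,
    missing: candidate["missing"] as boolean,
  };
}
```

&lt;p&gt;Running a check like this inside the hook turns a crash deep in the bundled runtime into an error that names the missing key.&lt;/p&gt;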

&lt;h2&gt;
  
  
  The Second Big Fix: Prompt Rules Only Work When They Are Concrete and Mean
&lt;/h2&gt;

&lt;p&gt;After the crash fix, I still had a behavior problem.&lt;/p&gt;

&lt;p&gt;The agent knew it should prefer local systems first, but "should" is not enforcement. Vague rules got ignored. Gentle guidance got rationalized away. Anything that felt like judgment instead of a hard edge leaked.&lt;/p&gt;

&lt;p&gt;So I rewrote the routing rules until they behaved like guardrails instead of wishes.&lt;/p&gt;

&lt;p&gt;The final structure was basically this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Pre-flight check first&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Code Search API before grepping repos&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Prompt Library before drafting prompts from scratch&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Agent Intel before answering knowledge questions&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Delegate file-scanning tasks instead of doing them directly in the expensive orchestrator lane&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Hard violations for common shortcuts&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The part that finally worked was not abstract policy. It was specifics.&lt;/p&gt;

&lt;p&gt;For example, coder delegation did not start sticking until I added:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;explicit command blocklists for &lt;code&gt;grep&lt;/code&gt;, &lt;code&gt;find&lt;/code&gt;, &lt;code&gt;cat&lt;/code&gt;, &lt;code&gt;head&lt;/code&gt;, &lt;code&gt;tail&lt;/code&gt;, &lt;code&gt;sed&lt;/code&gt;, &lt;code&gt;awk&lt;/code&gt;, and friends&lt;/li&gt;
&lt;li&gt;examples of the exact rationalizations I wanted to kill&lt;/li&gt;
&lt;li&gt;direct language that said "no, even if it would be faster"&lt;/li&gt;
&lt;li&gt;cost framing tied to me personally paying for wasted calls&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That last part sounds petty, but it worked. In my setup, the rules stuck better once they cashed out into a human consequence.&lt;/p&gt;
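&lt;p&gt;The blocklist above lives in prompt text. As a complementary and purely illustrative sketch, the same list can also be written as a pre-check on a proposed shell command. The &lt;code&gt;isBlockedCommand&lt;/code&gt; helper and its naive pipeline splitting are assumptions made for this example, not something OpenClaw ships, and the "and friends" part of the list is left out because the post does not enumerate it.&lt;/p&gt;

```typescript
// Commands named explicitly in the blocklist above.
const BLOCKED_COMMANDS = new Set(["grep", "find", "cat", "head", "tail", "sed", "awk"]);

// Returns true when any stage of a simple shell command line starts with a
// blocked command. Naive on purpose: it splits on pipes, semicolons,
// ampersands (written as \x26) and newlines, and does not parse quoting
// or subshells.
function isBlockedCommand(commandLine: string): boolean {
  const stages = commandLine.split(/[|;\x26\n]+/);
  return stages.some((stage) => {
    const firstWord = stage.trim().split(/\s+/)[0] ?? "";
    // Strip a leading directory so /usr/bin/grep is treated like grep.
    const binary = firstWord.split("/").pop() ?? "";
    return BLOCKED_COMMANDS.has(binary);
  });
}
```

&lt;p&gt;A check like this only catches the obvious cases. The prompt-level rules still carry the intent, since a model can reach the same file contents through other tools.&lt;/p&gt;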

&lt;h2&gt;
  
  
  The Third Big Fix: Split the Work Across Strict Agent Lanes
&lt;/h2&gt;

&lt;p&gt;This was the real system redesign.&lt;/p&gt;

&lt;p&gt;I stopped pretending one general-purpose agent should do everything well and cheaply.&lt;/p&gt;

&lt;p&gt;Instead, I ran OpenClaw with strict lanes:&lt;/p&gt;

&lt;h3&gt;
  
  
  Lane 1: GPT 5.4 as the orchestrator
&lt;/h3&gt;

&lt;p&gt;This is the lane that made the whole thing survivable.&lt;/p&gt;

&lt;p&gt;GPT 5.4 handled:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;planning&lt;/li&gt;
&lt;li&gt;orchestration&lt;/li&gt;
&lt;li&gt;coding direction&lt;/li&gt;
&lt;li&gt;multi-step reasoning&lt;/li&gt;
&lt;li&gt;structured drafting&lt;/li&gt;
&lt;li&gt;deciding which lane should do what&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This turned out to be the right use of it. Not as a gimmick. Not as a pure chatbot. As the traffic cop and field commander.&lt;/p&gt;

&lt;h3&gt;
  
  
  Lane 2: Parallel research lanes instead of one research monoculture
&lt;/h3&gt;

&lt;p&gt;I stopped treating research like a single-provider job.&lt;/p&gt;

&lt;p&gt;Gemini 3 Pro handled the giant-context and multimodal work:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;long source packs&lt;/li&gt;
&lt;li&gt;vision work&lt;/li&gt;
&lt;li&gt;large-doc sweeps&lt;/li&gt;
&lt;li&gt;broad comparison passes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Perplexity handled sourced current-event sweeps and fast citation gathering.&lt;/p&gt;

&lt;p&gt;Claude Opus handled the tighter synthesis pass when I wanted a second serious read on architecture, tradeoffs, or whether a conclusion actually held up.&lt;/p&gt;

&lt;p&gt;Running those lanes in parallel was better than pretending one model should be researcher, fact-checker, and synthesizer all at once.&lt;/p&gt;

&lt;p&gt;That also kept GPT 5.4 from being used like an overpriced research scraper.&lt;/p&gt;

&lt;h3&gt;
  
  
  Lane 3: Native harnesses for serious building
&lt;/h3&gt;

&lt;p&gt;For serious builds, I stopped pretending the generic orchestrator lane should do all the heavy lifting itself.&lt;/p&gt;

&lt;p&gt;I used native coding harnesses where they were strongest:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Codex CLI&lt;/strong&gt; for build-heavy coding work in its own native loop&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Claude Code&lt;/strong&gt; when I wanted Anthropic's first-party harness behavior and tool loop&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OpenClaw&lt;/strong&gt; as the orchestrator that decides when those lanes should take over&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This part matters because harness engineering changes outcomes. Public 2026 benchmark writeups put GPT-5.4-Codex at &lt;strong&gt;56.8% on SWE-bench Pro&lt;/strong&gt;, while Anthropic's later Opus line was reported at &lt;strong&gt;64.3%&lt;/strong&gt; on the same benchmark family in a first-party-style coding setup.&lt;sup id="fnref8"&gt;8&lt;/sup&gt;&lt;sup id="fnref9"&gt;9&lt;/sup&gt; Separate harness comparisons also make the bigger point: the same base model can move a lot depending on which harness is driving it, which is why I care less about leaderboard chest-thumping and more about using the model in the environment where it actually behaves best.&lt;sup id="fnref10"&gt;10&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;There was also a policy boundary here. Anthropic's published position was that subscriber OAuth access was reserved for Claude Code and Claude.ai, while third-party harnesses needed API-key access or explicit permission.&lt;sup id="fnref11"&gt;11&lt;/sup&gt; So even when Opus looked better in a given harness, the practical Anthropic-safe path was Claude Code, not trying to smuggle subscriber OAuth back through OpenClaw.&lt;/p&gt;

&lt;p&gt;That is why Claude stayed in the stack, but not at the center of it.&lt;/p&gt;

&lt;p&gt;That was the emotional adjustment some people never made. They were still trying to get back to the old world. I was building for the new one.&lt;/p&gt;
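&lt;p&gt;The lane assignments described in this section can be summarized as a small lookup table. This is a sketch, not the author's actual routing config: the lane assignments follow the post, while the &lt;code&gt;TaskKind&lt;/code&gt; labels and the &lt;code&gt;laneFor&lt;/code&gt; helper are hypothetical names chosen for illustration.&lt;/p&gt;

```typescript
// Task labels are hypothetical; the lane assignments follow the post.
type TaskKind =
  | "planning"
  | "long-context-research"
  | "current-events"
  | "synthesis-review"
  | "build-heavy-coding";

// One fixed lane per task kind, so routing is a lookup and not a judgment call.
const LANE_FOR_TASK: { [kind in TaskKind]: string } = {
  "planning": "GPT 5.4 (orchestrator)",
  "long-context-research": "Gemini 3 Pro",
  "current-events": "Perplexity",
  "synthesis-review": "Claude Opus (first-party surface)",
  "build-heavy-coding": "Codex CLI",
};

// Resolve the lane for a task kind.
function laneFor(kind: TaskKind): string {
  return LANE_FOR_TASK[kind];
}
```

&lt;p&gt;Because the mapped type requires an entry for every task kind, adding a new kind without assigning it a lane fails at compile time.&lt;/p&gt;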

&lt;h2&gt;
  
  
  The Quiet Hero: Local Embeddings and Memory Cards
&lt;/h2&gt;

&lt;p&gt;The part I trust most in this stack is not the fanciest model. It is the memory system.&lt;/p&gt;

&lt;p&gt;A lot of agent setups fail because they rely on fresh-session vibes and giant transcript sludge. When the vendor changes behavior, the whole thing turns into a goldfish with tools.&lt;/p&gt;

&lt;p&gt;I wanted something closer to a compact operating memory:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a slim &lt;code&gt;MEMORY.md&lt;/code&gt; as the index&lt;/li&gt;
&lt;li&gt;atomic knowledge cards in &lt;code&gt;memory/cards/*.md&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;daily raw logs in &lt;code&gt;memory/YYYY-MM-DD.md&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;semantic search across all of it&lt;/li&gt;
&lt;li&gt;local embeddings first&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For embeddings, I run &lt;code&gt;qwen3-embedding:8b&lt;/code&gt; locally through Ollama for memory search and code retrieval. That matters for three reasons.&lt;/p&gt;

&lt;p&gt;First, it is cheap. Second, it is private. Third, it removes one more dependency on third-party token economics for the boring but essential retrieval layer.&lt;/p&gt;

&lt;p&gt;This is the part that reminds me most of the better agent-memory systems people keep circling around lately. Keep the hot context small. Keep durable knowledge chunked. Search first, load second. Do not dump a 60 KB memory blob into every session and pray.&lt;/p&gt;
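&lt;p&gt;"Search first, load second" can be sketched as a ranking step over precomputed card embeddings. This is a minimal example under stated assumptions: in a real setup the vectors would come from the local embedding model, but the &lt;code&gt;CardEmbedding&lt;/code&gt; type, the &lt;code&gt;topCards&lt;/code&gt; helper, and the card paths are hypothetical, and no Ollama call is shown.&lt;/p&gt;

```typescript
// Hypothetical record: one embedding per memory card file.
interface CardEmbedding {
  path: string;
  vector: number[];
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((value, index) => {
    const other = b[index] ?? 0;
    dot += value * other;
    normA += value * value;
    normB += other * other;
  });
  // A zero vector has no direction, so treat it as unrelated.
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank cards against the query and return only the paths worth loading.
function topCards(query: number[], cards: CardEmbedding[], limit: number): string[] {
  return cards
    .map((card) => ({ path: card.path, score: cosineSimilarity(query, card.vector) }))
    .sort((left, right) => right.score - left.score)
    .slice(0, limit)
    .map((entry) => entry.path);
}
```

&lt;p&gt;Only the returned paths get read into the session, which is what keeps the hot context small.&lt;/p&gt;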

&lt;p&gt;That change alone made the stack feel less brittle.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Missing Glue: Handoff Notes Between Codex, Claude Code, and OpenClaw
&lt;/h2&gt;

&lt;p&gt;One thing I also borrowed from the gbrain and gstack style of thinking was this: the useful part is not just having multiple agents. It is having a clean handoff surface between them.&lt;/p&gt;

&lt;p&gt;That is where a lot of multi-model setups quietly fall apart. Codex figures something out. Claude Code discovers a root cause. OpenClaw finishes the task. Then the lesson dies in whichever terminal happened to learn it.&lt;/p&gt;

&lt;p&gt;I got sick of that.&lt;/p&gt;

&lt;p&gt;So I added a Memory Handoff workflow for Claude Code and the other coding lanes. When a session produces something durable, it is supposed to emit a short structured handoff instead of burying the answer in chat exhaust. The format is dead simple: what happened, why it matters, the durable facts, the evidence, and where the knowledge should land next.&lt;/p&gt;

&lt;p&gt;That matters because I do not want three competing memory systems.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Codex and Claude Code can keep local session context&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;OpenClaw on Rocinante is the canonical long-term memory&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;handoffs are the bridge&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In practice that means architecture decisions, weird bug roots, workflow changes, setup gotchas, and user preferences get written as small transfer documents in &lt;code&gt;.claude/memory-handoffs/&lt;/code&gt;. From there, Rocinante can route them into &lt;code&gt;memory/cards/&lt;/code&gt;, &lt;code&gt;TOOLS.md&lt;/code&gt;, &lt;code&gt;USER.md&lt;/code&gt;, &lt;code&gt;rules/&lt;/code&gt;, or &lt;code&gt;.learnings/&lt;/code&gt; depending on what the fact actually is.&lt;/p&gt;

&lt;p&gt;That sounds boring. Good. Boring is what you want from memory plumbing.&lt;/p&gt;

&lt;p&gt;The important part is that it kills the old handoff-letter problem. I am not pasting giant end-of-session summaries back into a fresh model anymore. I am pushing small, structured notes across lanes so Codex, Claude Code, and OpenClaw stop relearning the same lesson like idiots.&lt;/p&gt;
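&lt;p&gt;The handoff format described above maps naturally onto a small typed record. The sketch below is illustrative only: the five fields mirror the format in this post, while the &lt;code&gt;MemoryHandoff&lt;/code&gt; interface, the &lt;code&gt;renderHandoff&lt;/code&gt; helper, and the heading names are assumptions and not the actual schema.&lt;/p&gt;

```typescript
// Hypothetical schema mirroring the five parts of a handoff note.
interface MemoryHandoff {
  whatHappened: string;
  whyItMatters: string;
  durableFacts: string[];
  evidence: string[];
  landsIn: string; // for example "memory/cards/", "TOOLS.md", or "USER.md"
}

// Render a handoff as a short markdown document with one heading per part.
function renderHandoff(note: MemoryHandoff): string {
  const bullets = (items: string[]): string =>
    items.map((item) => `- ${item}`).join("\n");
  return [
    "## What happened",
    note.whatHappened,
    "## Why it matters",
    note.whyItMatters,
    "## Durable facts",
    bullets(note.durableFacts),
    "## Evidence",
    bullets(note.evidence),
    "## Lands in",
    note.landsIn,
  ].join("\n\n");
}
```

&lt;p&gt;Keeping the note this small is the point: a fixed set of fields is easier to route into the right file than a free-form session summary.&lt;/p&gt;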

&lt;h2&gt;
  
  
  The Ingestion Layer: n8n as the Information Bus
&lt;/h2&gt;

&lt;p&gt;The other piece people should know about is n8n.&lt;/p&gt;

&lt;p&gt;I do not just use n8n for social posting. It has turned into the information bus for the stack.&lt;/p&gt;

&lt;p&gt;On my setup, n8n already owns a bunch of glue work that would otherwise become brittle shell scripts and forgotten cron notes: blog publishing, cross-post fan-out, draft queues, webhook-driven publishing, and audit logging. Once that was in place, it made sense to treat it as part of the agent infrastructure too.&lt;/p&gt;

&lt;p&gt;So the stack is not just model routing plus memory cards. It is also ingestion.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;webhooks for one-shot publish and routing events&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;scheduled jobs for polling, dedupe, and follow-up work&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;workflow state for draft queues and publish history&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;audit trails that can be shipped into Wazuh&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That matters because good agent systems are really data-movement systems in disguise. The model is only one layer. The rest is capture, routing, storage, and retrieval.&lt;/p&gt;

&lt;p&gt;In my case, n8n handles a lot of the "something happened, now move it where it belongs" work. A blog draft can become a canonical post, then a cross-post job, then a social preview queue, then a record in the broader content pipeline. A Claude Code memory handoff can get ingested on schedule so durable knowledge does not stay trapped in a repo-local folder. Different jobs. Same philosophy.&lt;/p&gt;

&lt;p&gt;That is the part I think more OpenClaw users should steal. Not my exact stack. The pattern.&lt;/p&gt;

&lt;p&gt;Use the orchestrator for judgment. Use specialist lanes for depth. Use structured handoffs so context survives model boundaries. Use an automation layer like n8n so the system keeps moving even when no one is staring at the terminal.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Bugs That Made the Trench Story Real
&lt;/h2&gt;

&lt;p&gt;There were also smaller but very real wounds.&lt;/p&gt;

&lt;h3&gt;
  
  
  PDF handling
&lt;/h3&gt;

&lt;p&gt;OpenClaw had document handling gaps and PDF weirdness bad enough that it became one of the receipts I kept for public criticism. The public issue for automatic inbound PDF text extraction was already there, and I later tracked the related fix work around &lt;code&gt;standardFontDataUrl&lt;/code&gt;. Broken document ingestion does not just look sloppy. It kills trust in the agent's ability to work with real inputs.&lt;sup id="fnref12"&gt;12&lt;/sup&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Systemd update behavior
&lt;/h3&gt;

&lt;p&gt;Another nasty one was update behavior that could silently strip user-added environment directives when regenerating the managed systemd unit. There is a public issue for that too, and it is the kind of bug that makes people think the product is haunted because their setup breaks after a successful update.&lt;sup id="fnref13"&gt;13&lt;/sup&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The meta-problem
&lt;/h3&gt;

&lt;p&gt;Those bugs matter beyond their own scope. They compound. If billing is unstable, prompt behavior is unstable, and upgrades silently stomp local directives, people stop giving the tool the benefit of the doubt.&lt;/p&gt;

&lt;p&gt;That is why the fix story matters. Reliability is not one grand heroic patch. It is a pile of ugly little wins that stop the system from bleeding out.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Would Say to OpenClaw Quitters
&lt;/h2&gt;

&lt;p&gt;If you quit because Anthropic killed the old Claude subscription path, I think your frustration was fair.&lt;/p&gt;

&lt;p&gt;If you quit because OpenClaw felt too fragile during that transition, that was fair too.&lt;/p&gt;

&lt;p&gt;But if you are writing off OpenClaw entirely, I think you are throwing away the wrong part.&lt;/p&gt;

&lt;p&gt;The wrong part was depending on Anthropic as the unquestioned center of the stack.&lt;/p&gt;

&lt;p&gt;The useful part is still here:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;multi-lane orchestration&lt;/li&gt;
&lt;li&gt;agent-specific routing&lt;/li&gt;
&lt;li&gt;background work&lt;/li&gt;
&lt;li&gt;channel integration&lt;/li&gt;
&lt;li&gt;local-first retrieval&lt;/li&gt;
&lt;li&gt;structured memory&lt;/li&gt;
&lt;li&gt;tool access&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That combination is still powerful.&lt;/p&gt;

&lt;p&gt;The post-Anthropic version of OpenClaw just needs a different default architecture:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;GPT 5.4 as the main orchestrator, with Codex CLI for native build-heavy coding work&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;parallel research lanes across Gemini, Perplexity, and Opus&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Claude only through first-party lanes like Claude Code or direct API usage&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;local embeddings for retrieval&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;strict routing rules&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;bootstrap-level injection of those routing rules&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is not a consolation prize. It is a more grown-up design.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Proof I Wanted for the Skeptics
&lt;/h2&gt;

&lt;p&gt;The reason I am writing this at all is simple.&lt;/p&gt;

&lt;p&gt;I wanted receipts.&lt;/p&gt;

&lt;p&gt;Not "trust me bro, I made it work." Not a tweet thread with the edges sanded off. Real receipts. Real failure modes. Real fixes. Real architecture changes.&lt;/p&gt;

&lt;p&gt;If I answer somebody on X about OpenClaw after the Anthropic mess, I want to point to a field report that says:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;yes, the Anthropic cutoff was real&lt;/li&gt;
&lt;li&gt;yes, the Claude CLI classifier mess was real&lt;/li&gt;
&lt;li&gt;yes, OpenClaw had enough rough edges to make people quit&lt;/li&gt;
&lt;li&gt;and yes, &lt;strong&gt;I still made GPT 5.4 work as the orchestrator anyway&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That last part is the point.&lt;/p&gt;

&lt;p&gt;Not that GPT 5.4 solved every problem. It did not. Not that Anthropic stopped mattering. They still matter. The point is that once the stack stopped assuming Claude had to be the center of the universe, OpenClaw became usable again.&lt;/p&gt;

&lt;p&gt;That is a very different claim from "everything is fine."&lt;/p&gt;

&lt;p&gt;Everything was not fine.&lt;/p&gt;

&lt;p&gt;It was a trench story. Then it became a system.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Bottom Line
&lt;/h2&gt;

&lt;p&gt;Anthropic did pull the rug on third-party subscription harnessing.&lt;sup id="fnref4"&gt;4&lt;/sup&gt;&lt;sup id="fnref5"&gt;5&lt;/sup&gt;&lt;sup id="fnref6"&gt;6&lt;/sup&gt; Boris did publicly say some of the Claude CLI blocking looked like a bug or overactive classifier and that it was being addressed.&lt;sup id="fnref1"&gt;1&lt;/sup&gt; Peter did try to build around that guidance, and then said OpenClaw still hit blocks in practice because the harness behavior itself appeared to trigger Anthropic's defenses.&lt;sup id="fnref2"&gt;2&lt;/sup&gt;&lt;sup id="fnref3"&gt;3&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;That left OpenClaw users in a miserable spot: shaky policy, inconsistent runtime behavior, and a lot of reasons to distrust the stack.&lt;/p&gt;

&lt;p&gt;My answer was not to pretend that away. My answer was to redesign the stack around reality.&lt;/p&gt;

&lt;p&gt;That is what this whole post comes down to.&lt;/p&gt;

&lt;p&gt;I did not save OpenClaw by finding a new Claude loophole. I saved my OpenClaw workflow by making Claude optional, making GPT 5.4 the orchestrator, making retrieval local-first, and making the routing rules strict enough that the system stopped lying to itself.&lt;/p&gt;

&lt;p&gt;That is the version of OpenClaw I still believe in.&lt;/p&gt;







&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://solomonneas.dev/blog/openclaw-after-anthropic-how-i-made-gpt-54-work/" rel="noopener noreferrer"&gt;solomonneas.dev/blog/openclaw-after-anthropic-how-i-made-gpt-54-work&lt;/a&gt;. Licensed under &lt;a href="https://creativecommons.org/licenses/by-nc-nd/4.0/" rel="noopener noreferrer"&gt;CC BY-NC-ND 4.0&lt;/a&gt; - attribution required, no commercial use, no derivatives.&lt;/em&gt;&lt;/p&gt;




&lt;ol&gt;

&lt;li id="fn1"&gt;
&lt;p&gt;Boris Cherny (@bcherny), "@steipete This is not intentional, likely an overactive abuse classifier. Looking, and working on clarifying the policy going forward," X, April 6, 2026, &lt;a href="https://x.com/bcherny/status/2041035127430754686" rel="noopener noreferrer"&gt;https://x.com/bcherny/status/2041035127430754686&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn2"&gt;
&lt;p&gt;Peter Steinberger (@steipete), "Since this is blowing up on hacker news. Boris said that CLI usage is allowed. Thus we added support for it, only to find out that we are still blocked there...," X, April 21, 2026, &lt;a href="https://x.com/steipete/status/2046685973233189375" rel="noopener noreferrer"&gt;https://x.com/steipete/status/2046685973233189375&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn3"&gt;
&lt;p&gt;"Anthropic says OpenClaw-style Claude CLI usage is allowed again," Hacker News, accessed April 22, 2026, &lt;a href="https://news.ycombinator.com/item?id=47844269" rel="noopener noreferrer"&gt;https://news.ycombinator.com/item?id=47844269&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn4"&gt;
&lt;p&gt;Anthropic third-party harness cutoff timeline captured in the author's OpenClaw operating logs, April 3 to April 4, 2026. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn5"&gt;
&lt;p&gt;Anthony Ha, "Anthropic Says Claude Code Subscribers Will Need to Pay Extra for OpenClaw Usage," &lt;em&gt;TechCrunch&lt;/em&gt;, April 4, 2026, &lt;a href="https://techcrunch.com/2026/04/04/anthropic-says-claude-code-subscribers-will-need-to-pay-extra-for-openclaw-support/" rel="noopener noreferrer"&gt;https://techcrunch.com/2026/04/04/anthropic-says-claude-code-subscribers-will-need-to-pay-extra-for-openclaw-support/&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn6"&gt;
&lt;p&gt;Thomas Claburn, "Anthropic Closes Door on Subscription Use of OpenClaw," &lt;em&gt;The Register&lt;/em&gt;, April 6, 2026, &lt;a href="https://www.theregister.com/2026/04/06/anthropic_closes_door_on_subscription/" rel="noopener noreferrer"&gt;https://www.theregister.com/2026/04/06/anthropic_closes_door_on_subscription/&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn7"&gt;
&lt;p&gt;Carl Franzen, "Anthropic Cuts Off the Ability to Use Claude Subscriptions With OpenClaw and Similar Tools," &lt;em&gt;VentureBeat&lt;/em&gt;, April 4, 2026, &lt;a href="https://venturebeat.com/technology/anthropic-cuts-off-the-ability-to-use-claude-subscriptions-with-openclaw-and" rel="noopener noreferrer"&gt;https://venturebeat.com/technology/anthropic-cuts-off-the-ability-to-use-claude-subscriptions-with-openclaw-and&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn8"&gt;
&lt;p&gt;"Claude Code vs Codex vs Aider vs OpenCode vs Pi 2026," &lt;em&gt;thoughts.jock.pl&lt;/em&gt;, accessed April 22, 2026, &lt;a href="https://thoughts.jock.pl/p/ai-coding-harness-agents-2026" rel="noopener noreferrer"&gt;https://thoughts.jock.pl/p/ai-coding-harness-agents-2026&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn9"&gt;
&lt;p&gt;Better Stack Community, "Claude Opus 4.7: Benchmarks, Tokenizer Changes, and Coding Performance," accessed April 22, 2026, &lt;a href="https://betterstack.com/community/guides/ai/claude-opus-4-7/" rel="noopener noreferrer"&gt;https://betterstack.com/community/guides/ai/claude-opus-4-7/&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn10"&gt;
&lt;p&gt;Ibid. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn11"&gt;
&lt;p&gt;Thomas Claburn, "Anthropic: No, Absolutely Not, You May Not Use Third-Party Harnesses With Claude Subs," &lt;em&gt;The Register&lt;/em&gt;, February 20, 2026, &lt;a href="https://www.theregister.com/2026/02/20/anthropic_clarifies_ban_third_party_claude_access/" rel="noopener noreferrer"&gt;https://www.theregister.com/2026/02/20/anthropic_clarifies_ban_third_party_claude_access/&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn12"&gt;
&lt;p&gt;"Feature: Auto-extract text from inbound PDF/document attachments," Issue #28818, &lt;em&gt;openclaw/openclaw&lt;/em&gt;, GitHub, February 27, 2026, &lt;a href="https://github.com/openclaw/openclaw/issues/28818" rel="noopener noreferrer"&gt;https://github.com/openclaw/openclaw/issues/28818&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn13"&gt;
&lt;p&gt;"openclaw update silently drops user-added EnvironmentFile/Environment directives when regenerating systemd unit," Issue #66248, &lt;em&gt;openclaw/openclaw&lt;/em&gt;, GitHub, April 2026, &lt;a href="https://github.com/openclaw/openclaw/issues/66248" rel="noopener noreferrer"&gt;https://github.com/openclaw/openclaw/issues/66248&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;/ol&gt;

</description>
      <category>openclaw</category>
      <category>gpt54</category>
      <category>anthropic</category>
      <category>claude</category>
    </item>
    <item>
      <title>Claude Opus 4.7: Mixed Early Signal, Real API Breaks, and a Token Usage Story Anthropic Had to Address Fast</title>
      <dc:creator>Solomon Neas</dc:creator>
      <pubDate>Sat, 18 Apr 2026 17:57:15 +0000</pubDate>
      <link>https://dev.to/solomonneas/claude-opus-47-mixed-early-signal-real-api-breaks-and-a-token-usage-story-anthropic-had-to-2jc5</link>
      <guid>https://dev.to/solomonneas/claude-opus-47-mixed-early-signal-real-api-breaks-and-a-token-usage-story-anthropic-had-to-2jc5</guid>
      <description>&lt;h1&gt;
  
  
  Claude Opus 4.7: Mixed Early Signal, Real API Breaks, and a Token Usage Story Anthropic Had to Address Fast
&lt;/h1&gt;

&lt;p&gt;Anthropic launched Claude Opus 4.7 on April 16 at the same base price as Opus 4.6, then tucked the real story inside the docs: API breaking changes, new thinking behavior, new effort controls, beta task budgets, sharper Claude Code defaults, and a multimodal bump that looks real once you read past the launch copy.&lt;sup id="fnref1"&gt;1&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;My first take on this release was probably too upbeat. After digging in more, the honest read is messier. I have not spent enough time with Opus 4.7 myself to write a hard firsthand verdict, so this post is better read as reported analysis and early signal, not a personal field report. Anthropic is clearly pitching Opus 4.7 as the better model for long-running agentic coding, and the docs do point to real changes.&lt;sup id="fnref2"&gt;2&lt;/sup&gt; But the early public reaction has been mixed at best. A lot of power users on X and Reddit are calling it underwhelming, more expensive in practice, or flat-out worse for their workflows, especially in Claude Code.&lt;sup id="fnref3"&gt;3&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;The real picture, at least right now, is that Anthropic shipped a release with genuine technical changes, real migration consequences, and enough backlash that the token usage story became part of the launch within hours.&lt;/p&gt;

&lt;h2&gt;
  
  
  Anthropic Is Selling a Workflow Shift, Not Just a Better Benchmark Card
&lt;/h2&gt;

&lt;p&gt;Anthropic is positioning Opus 4.7 as the recommended starting point for its hardest tasks and calls it a "step-change improvement in agentic coding" over Opus 4.6.&lt;sup id="fnref4"&gt;4&lt;/sup&gt; That is strong language. Whether it holds up in real use is still getting argued out in public, and I am not going to pretend I have enough seat time yet to settle that myself.&lt;/p&gt;

&lt;p&gt;The core pitch is not benchmark chest-thumping. It is behavior under real work. Anthropic says Opus 4.7 handles long-running tasks with more rigor, follows instructions more precisely, and verifies its own outputs before reporting back.&lt;sup id="fnref5"&gt;5&lt;/sup&gt; Boris Cherny's launch thread makes the same point from the operator side: auto mode, recaps, focus mode, stronger verification habits, and better effort tuning all matter because the model is being pushed toward longer, more autonomous runs.&lt;sup id="fnref6"&gt;6&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;That is why this release feels different from a normal model bump. The base capability matters, sure. What stands out more is how explicit Anthropic is getting about the workflow layer around the model.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Biggest Upgrade Might Be the Workflow Stack Around the Model
&lt;/h2&gt;

&lt;p&gt;Three changes matter most here.&lt;/p&gt;

&lt;p&gt;First, Opus 4.7 introduces &lt;code&gt;xhigh&lt;/code&gt; effort between &lt;code&gt;high&lt;/code&gt; and &lt;code&gt;max&lt;/code&gt;, which gives API users finer control over reasoning depth and latency on hard tasks.&lt;sup id="fnref7"&gt;7&lt;/sup&gt; Anthropic's effort docs say the API default is still &lt;code&gt;high&lt;/code&gt;, but recommend starting at &lt;code&gt;xhigh&lt;/code&gt; for coding and agentic use cases.&lt;sup id="fnref8"&gt;8&lt;/sup&gt; That matters because a lot of people will assume they are getting the best version of 4.7 out of the box when they are not.&lt;/p&gt;

&lt;p&gt;Second, adaptive thinking replaces the old style of thinking budgets on Opus 4.7. Thinking is not automatically on. You have to opt into &lt;code&gt;thinking: { type: "adaptive" }&lt;/code&gt;, and once you do, interleaved thinking between tool calls comes with it.&lt;sup id="fnref9"&gt;9&lt;/sup&gt; That is a real behavior change, not a cosmetic rename.&lt;/p&gt;

&lt;p&gt;Third, task budgets look like one of the more practical additions in this whole release. Anthropic frames them as an advisory budget across the full agentic loop, not a hard cap like &lt;code&gt;max_tokens&lt;/code&gt;.&lt;sup id="fnref10"&gt;10&lt;/sup&gt; If that beta feature works the way the docs suggest, it gives teams a more realistic way to control cost and runtime on longer autonomous jobs without forcing the model to slam into a wall mid-task.&lt;/p&gt;

&lt;p&gt;Read those three changes together and the direction is obvious: Anthropic wants people to run longer jobs, give the model room to think, and manage the run at the task level instead of babysitting each individual request.&lt;/p&gt;

&lt;h2&gt;
  
  
  Claude Code Users Get the Clearest Upgrade Path
&lt;/h2&gt;

&lt;p&gt;The Claude Code side of the release is where the product strategy becomes easiest to read.&lt;/p&gt;

&lt;p&gt;On the Anthropic API, the &lt;code&gt;opus&lt;/code&gt; alias now resolves to Opus 4.7. On Bedrock, Vertex, and Foundry, that alias does not yet resolve to the new model, so you need to pin explicitly if you want it.&lt;sup id="fnref16"&gt;16&lt;/sup&gt; Opus 4.7 also requires Claude Code v2.1.111 or later.&lt;sup id="fnref16"&gt;16&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;More interesting than the versioning detail is the default behavior. Anthropic's Claude Code docs say Opus 4.7 uses &lt;code&gt;xhigh&lt;/code&gt; effort by default in Claude Code.&lt;sup id="fnref16"&gt;16&lt;/sup&gt; Pair that with auto mode for Max users, the new &lt;code&gt;/ultrareview&lt;/code&gt; workflow, and Boris's emphasis on focus mode, recaps, and verification, and you get a pretty clear picture of where Claude Code is going.&lt;sup id="fnref6"&gt;6&lt;/sup&gt;&lt;sup id="fnref11"&gt;11&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;It is moving away from short interactive back-and-forth and toward "hand this thing a chunk of work and come back to a recap." That will be great for people who actually want an agent. It will also expose every weak habit people built around constant supervision, vague prompts, and zero verification.&lt;/p&gt;

&lt;h2&gt;
  
  
  The API Story Is More Interesting Than the Launch Post
&lt;/h2&gt;

&lt;p&gt;This is the part people should read twice before swapping production workloads over.&lt;/p&gt;

&lt;p&gt;Anthropic's release notes explicitly say Opus 4.7 includes API breaking changes relative to Opus 4.6.&lt;sup id="fnref1"&gt;1&lt;/sup&gt; The migration guide spells out the important ones: extended thinking budgets are gone, adaptive thinking is the supported model-specific path, non-default sampling parameters now error, and thinking content is omitted by default unless you opt back into it.&lt;sup id="fnref11"&gt;11&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;That is not catastrophic, but it is enough to break wrappers, SDK assumptions, eval harnesses, and whatever cursed internal scripts people built at 2 a.m. six weeks ago.&lt;/p&gt;

&lt;p&gt;There is also a cost wrinkle here that deserves way more attention than the launch copy gave it. Anthropic kept the base Opus 4.7 price at $5 input and $25 output per million tokens, but both the migration guide and pricing docs say the new tokenizer may use roughly 1x to 1.35x as many tokens for the same text.&lt;sup id="fnref11"&gt;11&lt;/sup&gt;&lt;sup id="fnref12"&gt;12&lt;/sup&gt; Same sticker price, potentially higher real-world usage.&lt;/p&gt;
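
&lt;p&gt;To put rough numbers on that, here is a minimal Python sketch of the arithmetic. The prices and the 1x to 1.35x range come from the paragraph above. The workload size is hypothetical, and applying the same multiplier to both input and output tokens is a simplifying assumption, not something the pricing docs state.&lt;/p&gt;

```python
# Sticker prices quoted above, in USD per million tokens.
INPUT_PRICE_PER_MTOK = 5.00
OUTPUT_PRICE_PER_MTOK = 25.00


def request_cost(input_tokens, output_tokens, multiplier=1.0):
    """Estimate cost in USD after scaling token counts by a tokenizer multiplier."""
    # Assumption: the tokenizer change inflates input and output counts equally.
    scaled_in = input_tokens * multiplier
    scaled_out = output_tokens * multiplier
    total = scaled_in * INPUT_PRICE_PER_MTOK + scaled_out * OUTPUT_PRICE_PER_MTOK
    return total / 1_000_000


# Hypothetical workload: 200k input and 20k output tokens under the old tokenizer.
baseline = request_cost(200_000, 20_000, multiplier=1.0)
worst_case = request_cost(200_000, 20_000, multiplier=1.35)
print(f"baseline: {baseline:.3f} USD, worst case: {worst_case:.3f} USD")
```

&lt;p&gt;Because the cost is linear in token count, the top of the range makes the same workload 35% more expensive with no change to the per-token price.&lt;/p&gt;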

&lt;p&gt;Anthropic's response matters here. Boris Cherny said Anthropic increased rate limits for all subscribers "to make up for" the higher token usage, and separately said the company tuned limits so users would get the same amount of usage with &lt;code&gt;xhigh&lt;/code&gt;.&lt;sup id="fnref13"&gt;13&lt;/sup&gt; That public acknowledgment tells you Anthropic knew this would land hard.&lt;/p&gt;

&lt;p&gt;It still does not settle the practical question. Higher rate limits help subscribers, but they do not erase the operational effect of larger token counts, especially for Claude Code sessions, long contexts, or teams with cost controls built around older token behavior.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Vision Upgrade Still Looks Legit
&lt;/h2&gt;

&lt;p&gt;Anthropic says Opus 4.7 can see images at more than three times the resolution of earlier Claude models and should produce better interfaces, slides, and docs as a result.&lt;sup id="fnref2"&gt;2&lt;/sup&gt; Normally I would roll my eyes at that kind of line. In this case, the docs actually give it teeth.&lt;/p&gt;

&lt;p&gt;The vision docs say Opus 4.7 supports images up to 2576 pixels on the long edge and 4784 image tokens, versus 1568 and 1568 for prior models.&lt;sup id="fnref17"&gt;17&lt;/sup&gt; That is a real jump. If you use models for screenshot review, document parsing, UI QA, or anything where detail gets lost in downscaling, this is one of the clearest improvements in the whole release.&lt;/p&gt;

&lt;p&gt;It also has cost implications. Anthropic's own example shows a 2000x1500 image landing around 4000 image tokens on Opus 4.7.&lt;sup id="fnref17"&gt;17&lt;/sup&gt; Better vision is nice. Better vision that quietly burns more tokens is the part teams discover later.&lt;/p&gt;
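
&lt;p&gt;A rough way to budget for this is sketched below. It assumes the "width times height divided by 750" approximation that Anthropic's vision docs have given for earlier models still holds, which at least reproduces the 2000x1500 example. The function name and the way the limits are applied are illustrative assumptions, not Anthropic's actual resizing logic.&lt;/p&gt;

```python
def estimate_image_tokens(width, height, max_long_edge, max_tokens):
    """Rough estimate: shrink to the long-edge limit, apply area / 750, cap at max_tokens."""
    # Assumption: images over the long-edge limit are downscaled proportionally.
    scale = min(1.0, max_long_edge / max(width, height))
    area = (width * scale) * (height * scale)
    # Assumption: area / 750 approximates the token count; the cap stands in
    # for any further downscaling needed to fit the per-image token limit.
    return min(round(area / 750), max_tokens)


# Limits are the figures quoted from the vision docs above.
opus_47 = estimate_image_tokens(2000, 1500, max_long_edge=2576, max_tokens=4784)
prior = estimate_image_tokens(2000, 1500, max_long_edge=1568, max_tokens=1568)
print(opus_47, prior)  # 4000 1568
```

&lt;p&gt;Under those assumptions the same screenshot costs roughly two and a half times as many image tokens as it did on prior models, which is exactly the kind of increase that never shows up on a pricing page.&lt;/p&gt;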

&lt;h2&gt;
  
  
  The Safety Story Got Weird Fast
&lt;/h2&gt;

&lt;p&gt;The system card is worth reading because it says two things at once.&lt;/p&gt;

&lt;p&gt;Anthropic calls Opus 4.7 its "most capable general-access model to date," then immediately says it does not advance the company's overall capability frontier because Mythos Preview remains stronger.&lt;sup id="fnref16"&gt;16&lt;/sup&gt; That is a weird but useful distinction. It tells you 4.7 is the best generally available version of Claude, but not the strongest thing Anthropic has internally.&lt;/p&gt;

&lt;p&gt;The system card also says Opus 4.7 ships with new classifier-based cyber safeguards for prohibited and high-risk cyber use, and Anthropic's public launch post ties legitimate security research access to the Cyber Verification Program.&lt;sup id="fnref16"&gt;16&lt;/sup&gt;&lt;sup id="fnref17"&gt;17&lt;/sup&gt; So this is not just a model release. It is also a deployment and policy story.&lt;/p&gt;

&lt;p&gt;But the rollout got messy almost immediately. A bunch of users reported that Opus 4.7 was suddenly treating ordinary code and file reads like malware or prompt injection attempts. Anthropic staffer Alex Albert later said this was a bug on Anthropic's side, not the model "being cautious": older Claude builds were applying a stale safety prompt that Opus 4.7 did not need, which led to the false malware warnings. His fix was simple: update Claude or relaunch the app.&lt;sup id="fnref18"&gt;18&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;That distinction matters. If the issue was a stale prompt or integration-layer bug, then this was not really evidence that Opus 4.7 itself is uniquely paranoid. It was evidence that model upgrades can break in the harness around the model, and that those breakages can look exactly like a model regression when you are the person getting blocked.&lt;/p&gt;

&lt;p&gt;That matters because Anthropic is clearly trying to push capability up while tightening the boundary around who gets to use the sharper edges. Day-one safety bugs like this poison trust fast, even when the root cause lives outside the base model.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Early Public Reaction Is the Real Story Right Now
&lt;/h2&gt;

&lt;p&gt;I have not used Opus 4.7 heavily enough yet to tell you, from personal battle scars, that it is obviously better than 4.6. And the public reaction over the first day does not support pretending otherwise. Reddit threads in r/ClaudeAI, r/Anthropic, and r/ClaudeCode are full of people calling it a regression, saying it burns through usage limits too quickly, or saying the gains are not obvious enough to justify the cost.&lt;sup id="fnref3"&gt;3&lt;/sup&gt;&lt;sup id="fnref14"&gt;14&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;That does not mean the model is bad. It does mean the rollout landed in a skeptical environment, and Anthropic's own staff had to publicly defend adaptive thinking, acknowledge higher token use, increase subscriber limits, and explain that at least one wave of malware-style warnings came from a stale safety prompt bug in older Claude builds.&lt;sup id="fnref13"&gt;13&lt;/sup&gt;&lt;sup id="fnref15"&gt;15&lt;/sup&gt;&lt;sup id="fnref18"&gt;18&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;The fair read, at least this early, is probably this: Opus 4.7 may be better on certain hard coding, vision, and long-horizon tasks, but the upgrade does not look clean or universally felt. If you have not noticed a huge difference yet, that is not some failure to appreciate the model. That seems like a pretty normal reaction right now.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Actually Matters If You Build With This Stuff
&lt;/h2&gt;

&lt;p&gt;If you build products, agents, or internal tooling on top of Claude, the real checklist is a little harsher:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Audit any 4.6-specific thinking and sampling assumptions before migrating.&lt;sup id="fnref11"&gt;11&lt;/sup&gt;
&lt;/li&gt;
&lt;li&gt;Decide whether your default should stay at &lt;code&gt;high&lt;/code&gt; or move to &lt;code&gt;xhigh&lt;/code&gt; for hard tasks.&lt;sup id="fnref8"&gt;8&lt;/sup&gt;
&lt;/li&gt;
&lt;li&gt;Test task budgets before you trust them for cost control in production.&lt;sup id="fnref10"&gt;10&lt;/sup&gt;
&lt;/li&gt;
&lt;li&gt;Pin exact model IDs across providers instead of assuming &lt;code&gt;opus&lt;/code&gt; means the same thing everywhere.&lt;sup id="fnref16"&gt;16&lt;/sup&gt;
&lt;/li&gt;
&lt;li&gt;Re-check your image-heavy workloads because better vision can still mean a bigger bill.&lt;sup id="fnref12"&gt;12&lt;/sup&gt;&lt;sup id="fnref17"&gt;17&lt;/sup&gt;
&lt;/li&gt;
&lt;li&gt;Do not treat unchanged price-per-million as proof that real usage costs stayed flat.&lt;sup id="fnref11"&gt;11&lt;/sup&gt;&lt;sup id="fnref12"&gt;12&lt;/sup&gt;&lt;sup id="fnref13"&gt;13&lt;/sup&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is the divide I keep coming back to. Opus 4.7 might be strong. The teams that get the most out of it are going to be the ones that treat it like a systems change, not just a model swap.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Bottom Line
&lt;/h2&gt;

&lt;p&gt;I still think Opus 4.7 is a real release. Not fake. Not just a marketing rename. There are too many actual platform and behavior changes for that.&lt;/p&gt;

&lt;p&gt;What I do not think, at least not yet, is that the evidence supports writing as if Anthropic obviously nailed it, or as if I have personally validated the upside in heavy daily use.&lt;/p&gt;

&lt;p&gt;The safer read is that Anthropic shipped a technically important release with mixed early reception. The model may be better on hard agentic coding, vision, and long-horizon tasks. The rollout also brought token-usage anxiety, migration friction, public skepticism, and enough backlash that Anthropic had to raise subscriber limits almost immediately.&lt;sup id="fnref13"&gt;13&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;So this post is really two things at once: a news-and-docs roundup, and a cautious read on the early reaction.&lt;/p&gt;

&lt;p&gt;Opus 4.7 might prove itself over the next week or two. It might also end up remembered as a release where the docs and benchmark story were cleaner than the first wave of user experience. Right now, pretending certainty would be bullshit.&lt;/p&gt;

&lt;p&gt;The interesting part is not whether Anthropic says 4.7 is better. Of course they do. The interesting part is whether builders keep reaching for it once the novelty wears off.&lt;/p&gt;

&lt;h2&gt;
  
  
  Notes
&lt;/h2&gt;







&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://solomonneas.dev/blog/claude-opus-47-release" rel="noopener noreferrer"&gt;solomonneas.dev/blog/claude-opus-47-release&lt;/a&gt;. Licensed under &lt;a href="https://creativecommons.org/licenses/by-nc-nd/4.0/" rel="noopener noreferrer"&gt;CC BY-NC-ND 4.0&lt;/a&gt; - attribution required, no commercial use, no derivatives.&lt;/em&gt;&lt;/p&gt;




&lt;ol&gt;

&lt;li id="fn1"&gt;
&lt;p&gt;Anthropic, "Claude Platform," in "Release Notes Overview," Claude API Docs, accessed April 17, 2026, &lt;a href="https://platform.claude.com/docs/en/release-notes/overview" rel="noopener noreferrer"&gt;https://platform.claude.com/docs/en/release-notes/overview&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn2"&gt;
&lt;p&gt;Anthropic, "Claude Opus 4.7," Anthropic News, accessed April 16, 2026, &lt;a href="https://www.anthropic.com/news/claude-opus-4-7" rel="noopener noreferrer"&gt;https://www.anthropic.com/news/claude-opus-4-7&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn3"&gt;
&lt;p&gt;Henry Chandonnet, "The Claude-lash Is Here: Opus 4.7 Is Burning Through Tokens, and Some People's Patience," &lt;em&gt;Business Insider&lt;/em&gt;, April 17, 2026, &lt;a href="https://www.businessinsider.com/anthropic-claude-opus-4-7-backlash-tokens-2026-4" rel="noopener noreferrer"&gt;https://www.businessinsider.com/anthropic-claude-opus-4-7-backlash-tokens-2026-4&lt;/a&gt;; Reddit post, "Opus 4.7 Released!," r/ClaudeAI, accessed April 17, 2026, &lt;a href="https://www.reddit.com/r/ClaudeAI/comments/1sn585s/opus_47_released/" rel="noopener noreferrer"&gt;https://www.reddit.com/r/ClaudeAI/comments/1sn585s/opus_47_released/&lt;/a&gt;; Reddit post, "Claude Code tip: 10 seconds fix to avoid the Opus 4.7 token burn," r/ClaudeAI, accessed April 17, 2026, &lt;a href="https://www.reddit.com/r/ClaudeAI/comments/1snv4yq/claude_code_tip_10_seconds_fix_to_avoid_the_opus/" rel="noopener noreferrer"&gt;https://www.reddit.com/r/ClaudeAI/comments/1snv4yq/claude_code_tip_10_seconds_fix_to_avoid_the_opus/&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn4"&gt;
&lt;p&gt;Anthropic, "Models Overview," Claude API Docs, accessed April 17, 2026, &lt;a href="https://platform.claude.com/docs/en/about-claude/models/overview" rel="noopener noreferrer"&gt;https://platform.claude.com/docs/en/about-claude/models/overview&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn5"&gt;
&lt;p&gt;Claude (@​claudeai), "Introducing Claude Opus 4.7, our most capable Opus model yet. It handles long-running tasks with more rigor, follows instructions more precisely, and verifies its own outputs before reporting back. You can hand off your hardest work with less supervision," X, April 16, 2026, &lt;a href="https://x.com/claudeai/status/2044785261393977612" rel="noopener noreferrer"&gt;https://x.com/claudeai/status/2044785261393977612&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn6"&gt;
&lt;p&gt;Boris Cherny (@​bcherny), "Dogfooding Opus 4.7 the last few weeks, I've been feeling incredibly productive. Sharing a few tips to get more out of 4.7 🧵," X, April 16, 2026, &lt;a href="https://x.com/bcherny/status/2044847848035156457" rel="noopener noreferrer"&gt;https://x.com/bcherny/status/2044847848035156457&lt;/a&gt;; Boris Cherny (@​bcherny), "1/ Auto mode = no more permission prompts," X, April 16, 2026, &lt;a href="https://x.com/bcherny/status/2044847849662505288" rel="noopener noreferrer"&gt;https://x.com/bcherny/status/2044847849662505288&lt;/a&gt;; Boris Cherny (@​bcherny), "3/ Recaps," X, April 16, 2026, &lt;a href="https://x.com/bcherny/status/2044847853030580247" rel="noopener noreferrer"&gt;https://x.com/bcherny/status/2044847853030580247&lt;/a&gt;; Boris Cherny (@​bcherny), "4/ Focus mode," X, April 16, 2026, &lt;a href="https://x.com/bcherny/status/2044847855006024147" rel="noopener noreferrer"&gt;https://x.com/bcherny/status/2044847855006024147&lt;/a&gt;; Boris Cherny (@​bcherny), "6/ Give Claude a way to verify its work," X, April 16, 2026, &lt;a href="https://x.com/bcherny/status/2044847858634064115" rel="noopener noreferrer"&gt;https://x.com/bcherny/status/2044847858634064115&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn7"&gt;
&lt;p&gt;Claude (@​claudeai), "On the API, a new xhigh effort level between high and max gives you finer control over reasoning and latency on hard problems. Task budgets (beta) help Claude prioritize work and manage costs across longer runs," X, April 16, 2026, &lt;a href="https://x.com/claudeai/status/2044785264313221470" rel="noopener noreferrer"&gt;https://x.com/claudeai/status/2044785264313221470&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn8"&gt;
&lt;p&gt;Anthropic, "Effort," Claude API Docs, accessed April 17, 2026, &lt;a href="https://platform.claude.com/docs/en/build-with-claude/effort" rel="noopener noreferrer"&gt;https://platform.claude.com/docs/en/build-with-claude/effort&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn9"&gt;
&lt;p&gt;Anthropic, "Adaptive Thinking," Claude API Docs, accessed April 17, 2026, &lt;a href="https://platform.claude.com/docs/en/build-with-claude/adaptive-thinking" rel="noopener noreferrer"&gt;https://platform.claude.com/docs/en/build-with-claude/adaptive-thinking&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn10"&gt;
&lt;p&gt;Anthropic, "Task Budgets," Claude API Docs, accessed April 17, 2026, &lt;a href="https://platform.claude.com/docs/en/build-with-claude/task-budgets" rel="noopener noreferrer"&gt;https://platform.claude.com/docs/en/build-with-claude/task-budgets&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn11"&gt;
&lt;p&gt;Anthropic, "Migrating to Claude Opus 4.7," in "Migration Guide," Claude API Docs, accessed April 16, 2026, &lt;a href="https://platform.claude.com/docs/en/about-claude/models/migration-guide#migrating-to-claude-opus-4-7" rel="noopener noreferrer"&gt;https://platform.claude.com/docs/en/about-claude/models/migration-guide#migrating-to-claude-opus-4-7&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn12"&gt;
&lt;p&gt;Anthropic, "Pricing," Claude API Docs, accessed April 17, 2026, &lt;a href="https://platform.claude.com/docs/en/about-claude/pricing" rel="noopener noreferrer"&gt;https://platform.claude.com/docs/en/about-claude/pricing&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn13"&gt;
&lt;p&gt;Boris Cherny (@​bcherny), "Opus 4.7 uses more thinking tokens, so we've increased rate limits for all subscribers to make up for it. Enjoy!" X, April 16, 2026, &lt;a href="https://x.com/bcherny/status/2044839936235553167" rel="noopener noreferrer"&gt;https://x.com/bcherny/status/2044839936235553167&lt;/a&gt;; Boris Cherny (@​bcherny), "@​mark_k @​AnthropicAI Not accurate. Adaptive thinking lets the model decide when to think, which performs better. Opus 4.7 also uses more thinking tokens on average than 4.6, which is why we have increased rate limits for all subscribers to make up for it," X, April 16, 2026, &lt;a href="https://x.com/bcherny/status/2044836750066151666" rel="noopener noreferrer"&gt;https://x.com/bcherny/status/2044836750066151666&lt;/a&gt;; Boris Cherny (@​bcherny), "@​a_lamparelli We've tuned rate limits to give you the same amount of usage with xhigh," X, April 16, 2026, &lt;a href="https://x.com/bcherny/status/2044805730138804519" rel="noopener noreferrer"&gt;https://x.com/bcherny/status/2044805730138804519&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn14"&gt;
&lt;p&gt;Reddit post, "Opus 4.7 Released!," r/ClaudeAI, accessed April 17, 2026, &lt;a href="https://www.reddit.com/r/ClaudeAI/comments/1sn585s/opus_47_released/" rel="noopener noreferrer"&gt;https://www.reddit.com/r/ClaudeAI/comments/1sn585s/opus_47_released/&lt;/a&gt;; Reddit post, "I have tested Opus 4.7 and it is worse compared to Opus 4.6," r/Anthropic, accessed April 17, 2026, &lt;a href="https://www.reddit.com/r/Anthropic/comments/1snijmr/i_have_tested_opus_47_and_it_is_worse_compared_to/" rel="noopener noreferrer"&gt;https://www.reddit.com/r/Anthropic/comments/1snijmr/i_have_tested_opus_47_and_it_is_worse_compared_to/&lt;/a&gt;; Reddit post, "Just use Sonnet 4.6 and stay away from Opus 4.7," r/ClaudeCode, accessed April 17, 2026, &lt;a href="https://www.reddit.com/r/ClaudeCode/comments/1snwk9v/just_use_sonnet_46_and_stay_away_from_opus_47/" rel="noopener noreferrer"&gt;https://www.reddit.com/r/ClaudeCode/comments/1snwk9v/just_use_sonnet_46_and_stay_away_from_opus_47/&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn15"&gt;
&lt;p&gt;Henry Chandonnet, "The Claude-lash Is Here"; Alex Albert (@​alexalbert__), "A lot of bugs that folks may have hit yesterday when first trying Opus 4.7 are now fixed. Thanks for bearing with us," X, April 17, 2026, &lt;a href="https://x.com/alexalbert__/status/2045159041283064095" rel="noopener noreferrer"&gt;https://x.com/alexalbert__/status/2045159041283064095&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn16"&gt;
&lt;p&gt;Anthropic, "Model Configuration," Claude Code Docs, accessed April 17, 2026, &lt;a href="https://code.claude.com/docs/en/model-config" rel="noopener noreferrer"&gt;https://code.claude.com/docs/en/model-config&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn17"&gt;
&lt;p&gt;Anthropic, "Vision," Claude API Docs, accessed April 17, 2026, &lt;a href="https://platform.claude.com/docs/en/build-with-claude/vision" rel="noopener noreferrer"&gt;https://platform.claude.com/docs/en/build-with-claude/vision&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn18"&gt;
&lt;p&gt;Alex Albert (@​alexalbert__), "Some of you ran into Opus 4.7 refusing normal code edits with 'this might be malware' warnings. That was a bug on our side, not the model being cautious. Older builds applied a stale safety prompt that Opus 4.7 doesn't need. Run claude update or relaunch the app," X, April 17, 2026, &lt;a href="https://x.com/alexalbert__/status/2045238786339299431" rel="noopener noreferrer"&gt;https://x.com/alexalbert__/status/2045238786339299431&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;/ol&gt;

</description>
      <category>anthropic</category>
      <category>claude</category>
      <category>ai</category>
    </item>
    <item>
      <title>Dreaming Is Useful. Structured Memory Is Better</title>
      <dc:creator>Solomon Neas</dc:creator>
      <pubDate>Fri, 17 Apr 2026 07:05:55 +0000</pubDate>
      <link>https://dev.to/solomonneas/dreaming-is-useful-structured-memory-is-better-4g9h</link>
      <guid>https://dev.to/solomonneas/dreaming-is-useful-structured-memory-is-better-4g9h</guid>
      <description>&lt;p&gt;I ran OpenClaw Dreaming for a full week on top of my existing memory stack to answer one question: does Dreaming actually improve memory quality, or does it just inflate memory volume?&lt;/p&gt;

&lt;p&gt;Both. It surfaced real signal I would have lost. It also dumped enough boilerplate into the promotion stream to prove structured memory still has to be the foundation. If you want the official feature overview first, OpenClaw's Dreaming docs are here: &lt;a href="https://docs.openclaw.ai/concepts/dreaming" rel="noopener noreferrer"&gt;Dreaming&lt;/a&gt;. After a week, Dreaming stays on, but as a supporting layer. Not the system.&lt;/p&gt;

&lt;h2&gt;
  
  
  The baseline was already working
&lt;/h2&gt;

&lt;p&gt;This trial did not start from zero. The stack was already in daily use:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Daily logs in &lt;code&gt;memory/YYYY-MM-DD.md&lt;/code&gt; for raw continuity&lt;/li&gt;
&lt;li&gt;Atomic knowledge cards in &lt;code&gt;memory/cards/*.md&lt;/code&gt; for durable facts and lessons&lt;/li&gt;
&lt;li&gt;A slim &lt;code&gt;MEMORY.md&lt;/code&gt; acting as an index, not a data landfill&lt;/li&gt;
&lt;li&gt;Semantic retrieval over cards using local embeddings&lt;/li&gt;
&lt;li&gt;A Memory Sweep cron for review and promotion discipline&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That architecture exists because monolithic memory eventually collapses under its own weight. Retrieval gets noisy, cost climbs, and the agent starts missing things that are technically "in memory" but practically unrecoverable. Structured memory fixes that by treating memory as a retrieval system instead of a dump file.&lt;/p&gt;

&lt;h2&gt;
  
  
  Trial config
&lt;/h2&gt;

&lt;p&gt;Dreaming was enabled on 2026-04-06 as a one-week trial with nightly cadence:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;dreaming.enabled=true&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;dreaming.frequency="0 3 * * *"&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
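&lt;p&gt;For anyone who does not read cron syntax daily, here is a minimal Python sketch of what that schedule means. It assumes the standard five-field cron layout and only handles the fixed daily form used here. &lt;code&gt;parse_cron&lt;/code&gt; and &lt;code&gt;next_daily_run&lt;/code&gt; are illustrative helpers, not part of OpenClaw.&lt;/p&gt;

```python
from datetime import datetime, timedelta

# Standard five-field cron layout (assumed, not taken from OpenClaw docs).
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def parse_cron(expr):
    """Split a five-field cron expression into named fields."""
    parts = expr.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError("expected 5 cron fields, got %d" % len(parts))
    return dict(zip(CRON_FIELDS, parts))


def next_daily_run(expr, now):
    """Return the next run time for a fixed daily schedule like '0 3 * * *'."""
    fields = parse_cron(expr)
    rest = (fields["day_of_month"], fields["month"], fields["day_of_week"])
    if rest != ("*", "*", "*"):
        raise ValueError("this sketch only handles fixed daily schedules")
    candidate = now.replace(
        hour=int(fields["hour"]),
        minute=int(fields["minute"]),
        second=0,
        microsecond=0,
    )
    # If today's slot has already passed (or is exactly now), roll to tomorrow.
    if candidate > now:
        return candidate
    return candidate + timedelta(days=1)


print(parse_cron("0 3 * * *"))
print(next_daily_run("0 3 * * *", datetime(2026, 4, 6, 12, 0)))
```

&lt;p&gt;With the trial value, a check made at noon on 2026-04-06 gives a next run at 03:00 on 2026-04-07, which is the nightly cadence described above.&lt;/p&gt;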

&lt;p&gt;I documented this as a trial, not a migration, and flagged noisy promotions as an explicit concern up front. A review cron was scheduled for 2026-04-14 to evaluate impact after one full week of live usage.&lt;/p&gt;

&lt;p&gt;Nothing else in the memory pipeline was touched. Cards, logs, retrieval, and sweep all stayed active so Dreaming could be evaluated as a pure additive layer.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Dreaming actually does
&lt;/h2&gt;

&lt;p&gt;From observed behavior, Dreaming runs a nightly retrospective pass over short-term recall:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;grounded REM/backfill flow&lt;/li&gt;
&lt;li&gt;diary-style processing&lt;/li&gt;
&lt;li&gt;candidate durable-fact extraction&lt;/li&gt;
&lt;li&gt;promotion hooks into long-term memory surfaces&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In plain terms, it is a second-pass recall mechanism. It can rescue durable information that never got manually promoted during the day. That is real value in long, messy sessions.&lt;/p&gt;

&lt;h2&gt;
  
  
  One-week health check
&lt;/h2&gt;

&lt;p&gt;Core memory infrastructure came out clean:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Main memory healthy&lt;/li&gt;
&lt;li&gt;Embeddings ready&lt;/li&gt;
&lt;li&gt;Vector search ready&lt;/li&gt;
&lt;li&gt;FTS ready&lt;/li&gt;
&lt;li&gt;Recall store active&lt;/li&gt;
&lt;li&gt;Dreaming cron active&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;memory_search&lt;/code&gt; returning relevant results&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Operationally, nothing regressed. Cards stayed structurally normal, daily logs kept writing, semantic retrieval kept working.&lt;/p&gt;

&lt;p&gt;So "did Dreaming break memory" was never the question. It did not. The question was quality.&lt;/p&gt;

&lt;h2&gt;
  
  
  Memory Sweep is the comparison that matters
&lt;/h2&gt;

&lt;p&gt;Sweep is the reference point because it has been doing the same job Dreaming now claims, just more conservatively.&lt;/p&gt;

&lt;p&gt;And to be fair to Sweep, the cron reports show it was not sitting there idle. Over the same week, it was reviewing real sessions and persisting useful state with pretty solid discipline.&lt;/p&gt;

&lt;p&gt;A few examples from the sweep channel:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;April 9:&lt;/strong&gt; reviewed non-cron sessions and updated a durable agent-workflow card.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;April 10:&lt;/strong&gt; turned grocery receipts into a durable tracking workflow.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;April 11:&lt;/strong&gt; handled a security incident conservatively, logging what mattered without duplicating existing cards.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;April 13 to April 15:&lt;/strong&gt; kept the Lazarus Group research card current while threat-assessment sections and supporting research were still moving.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;April 16:&lt;/strong&gt; created an xMCP service-ops card and logged the operational follow-up cleanly.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is not a cron doing nothing. That is a curation layer doing triage. Sweep evaluates session material, skips cron, heartbeat, and helper noise, checks whether the durable information already exists, and only writes when something actually changes or deserves promotion.&lt;/p&gt;
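&lt;p&gt;The triage described above can be sketched in a few lines of Python. This is not Sweep's actual implementation. The session and card shapes, the &lt;code&gt;NOISE_KINDS&lt;/code&gt; set, and the &lt;code&gt;sweep&lt;/code&gt; function are all assumptions made for illustration.&lt;/p&gt;

```python
# Hypothetical session kinds that Sweep-style triage would skip outright.
NOISE_KINDS = {"cron", "heartbeat", "helper"}


def sweep(sessions, cards):
    """Triage sessions into card writes.

    sessions: list of dicts with 'kind', 'topic', and 'fact' keys (assumed shape).
    cards: dict mapping topic to a set of facts already recorded (assumed shape).
    Returns a list of (action, topic) tuples for every write that happened.
    """
    actions = []
    for session in sessions:
        # Step 1: skip cron, heartbeat, and helper noise.
        if session["kind"] in NOISE_KINDS:
            continue
        topic, fact = session["topic"], session["fact"]
        # Step 2: check whether the durable information already exists.
        if topic not in cards:
            cards[topic] = {fact}
            actions.append(("create", topic))
        elif fact not in cards[topic]:
            cards[topic].add(fact)
            actions.append(("update", topic))
        # Step 3: an already-recorded fact produces no write at all.
    return actions


# Illustrative data only; these facts are made up for the example.
cards = {"grocery-tracking": {"receipts go to the tracking sheet"}}
sessions = [
    {"kind": "heartbeat", "topic": "status", "fact": "HEARTBEAT_OK"},
    {"kind": "chat", "topic": "grocery-tracking", "fact": "receipts go to the tracking sheet"},
    {"kind": "chat", "topic": "xmcp-service-ops", "fact": "restart via the service wrapper"},
]
print(sweep(sessions, cards))
```

&lt;p&gt;The important property is the empty result: if every session is noise or already covered by a card, the function returns an empty list and nothing gets written.&lt;/p&gt;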

&lt;p&gt;That restraint matters. Some nights the correct outcome really is "no new cards." But across the week, Sweep still created or updated cards for grocery tracking, agent-workflow rules, Lazarus Group research, blog publishing rules, xMCP service operations, and malware-response documentation. It also kept daily logs current without flooding memory with duplicate fragments.&lt;/p&gt;

&lt;p&gt;Dreaming, over the same window, promoted a handful of genuinely useful durable facts and a lot of transcript residue. Same job, different discipline. Sweep's default is "persist carefully after review." Dreaming's default is closer to "surface candidates broadly and let cleanup happen later." That difference is the whole story.&lt;/p&gt;

&lt;h2&gt;
  
  
  Dreaming quality: real signal, real noise
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The good
&lt;/h3&gt;

&lt;p&gt;Useful promotions did show up, and they were more specific than "Dreaming found something interesting."&lt;/p&gt;

&lt;p&gt;A few examples of what it actually added:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It recovered a real ACP constraint: Discord thread creation works reliably from a fresh inbound turn, but nested or yielded turns can collapse into &lt;code&gt;webchat&lt;/code&gt; and fail.&lt;/li&gt;
&lt;li&gt;It helped move an agent lane from "probably working" to a verified workflow, which turned that discovery into a durable card instead of leaving it buried in chat history.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That part matters. Those are real operating rules that affect how I route agent work and catch workflow hiccups, not just vague themes Dreaming happened to notice.&lt;/p&gt;

&lt;h3&gt;
  
  
  The bad
&lt;/h3&gt;

&lt;p&gt;Staged recall also contained a lot of debris:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;heartbeat boilerplate (&lt;code&gt;HEARTBEAT_OK&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;silent sentinel text (&lt;code&gt;NO_REPLY&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;tiny one-line chat fragments&lt;/li&gt;
&lt;li&gt;metadata-heavy transcript sludge&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is the limitation. Without strict filtering, Dreaming will keep surfacing things that are technically recallable and semantically worthless.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where Dreaming writes
&lt;/h2&gt;

&lt;p&gt;During the trial, Dreaming artifacts showed up in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;DREAMS.md&lt;/code&gt; and workspace equivalents&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;memory/.dreams/*&lt;/code&gt; (session corpus and short-term recall JSON)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;MEMORY.md&lt;/code&gt; promoted sections tagged with &lt;code&gt;openclaw-memory-promotion&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;daily logs with Light and REM candidate traces&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Cards and daily logs stayed intact. The promotion stream is what needs quality controls.&lt;/p&gt;

&lt;h2&gt;
  
  
  Complement, not core
&lt;/h2&gt;

&lt;p&gt;After one week the architecture answer is obvious:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Structured memory&lt;/strong&gt; (cards, slim index, retrieval, Sweep) is the core&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dreaming&lt;/strong&gt; is a useful second-pass promotion layer&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Dreaming catches what day-of workflows miss. It is not trustworthy enough yet to be the primary curation mechanism. That is not a failure, it is a role.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I am keeping, what I am tuning
&lt;/h2&gt;

&lt;p&gt;Keeping Dreaming enabled. Tightening the promotion side:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;heavier penalties for boilerplate tokens&lt;/li&gt;
&lt;li&gt;stricter filtering against low-information one-liners&lt;/li&gt;
&lt;li&gt;lower promotion likelihood for metadata-only fragments&lt;/li&gt;
&lt;li&gt;human-curated cards stay the authoritative path&lt;/li&gt;
&lt;/ul&gt;
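&lt;p&gt;As a rough illustration of those rules, here is a minimal Python scoring sketch. The penalty weights, the &lt;code&gt;MIN_WORDS&lt;/code&gt; cutoff, the metadata pattern, and the function names are my assumptions for the example, not OpenClaw settings. Only the two sentinel strings come from the trial itself.&lt;/p&gt;

```python
import re

# Sentinel strings observed in staged recall during the trial.
BOILERPLATE = {"HEARTBEAT_OK", "NO_REPLY"}
# Hypothetical cutoff: fragments shorter than this count as one-liners.
MIN_WORDS = 6
# Hypothetical pattern for a bare "key: value" or "key=value" line.
METADATA_LINE = re.compile(r"^\s*[\w.-]+\s*[:=]\s*\S+\s*$")


def promotion_score(text):
    """Score a promotion candidate from 0.0 (drop) to 1.0 (keep)."""
    score = 1.0
    stripped = text.strip()
    # Heavier penalty for boilerplate tokens.
    if any(token in stripped for token in BOILERPLATE):
        score -= 1.0
    # Penalty for low-information one-liners.
    if MIN_WORDS > len(stripped.split()):
        score -= 0.5
    # Penalty for fragments made up entirely of metadata lines.
    lines = [line for line in stripped.splitlines() if line.strip()]
    if lines and all(METADATA_LINE.match(line) for line in lines):
        score -= 0.5
    return max(score, 0.0)


def should_promote(text, threshold=0.75):
    """Promote only candidates that clear the (assumed) threshold."""
    return promotion_score(text) >= threshold


for candidate in ("HEARTBEAT_OK", "ok thanks", "session_id: abc123"):
    print(candidate, should_promote(candidate))
```

&lt;p&gt;A filter like this only gates the automated promotion stream. Human-curated cards bypass it entirely, which keeps them the authoritative path.&lt;/p&gt;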

&lt;p&gt;The goal is not maximal recall. The goal is durable, retrievable memory that stays useful under load.&lt;/p&gt;

&lt;h2&gt;
  
  
  Verdict
&lt;/h2&gt;

&lt;p&gt;Dreaming is useful. Structured memory is better. 🦞&lt;/p&gt;

&lt;p&gt;That is not a contradiction, it is the right layering. Use Dreaming to recover signal from transcript residue. Use structured memory to decide what deserves to live long-term. Blend the roles correctly and you get better continuity without turning your memory system into a junk drawer.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://solomonneas.dev/blog/dreaming-useful-structured-memory-better" rel="noopener noreferrer"&gt;solomonneas.dev/blog/dreaming-useful-structured-memory-better&lt;/a&gt;. Licensed under &lt;a href="https://creativecommons.org/licenses/by-nc-nd/4.0/" rel="noopener noreferrer"&gt;CC BY-NC-ND 4.0&lt;/a&gt; - attribution required, no commercial use, no derivatives.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>openclaw</category>
      <category>memory</category>
      <category>agentarchitecture</category>
      <category>embeddings</category>
    </item>
    <item>
      <title>GPT-5.4-Cyber Is Really a Fight Over Access Control</title>
      <dc:creator>Solomon Neas</dc:creator>
      <pubDate>Wed, 15 Apr 2026 02:49:40 +0000</pubDate>
      <link>https://dev.to/solomonneas/gpt-54-cyber-is-really-a-fight-over-access-control-10g0</link>
      <guid>https://dev.to/solomonneas/gpt-54-cyber-is-really-a-fight-over-access-control-10g0</guid>
      <description>&lt;p&gt;OpenAI just made its answer to Anthropic's Mythos pretty clear.&lt;/p&gt;

&lt;p&gt;This is not just a model story. It is an access-control story.&lt;/p&gt;

&lt;p&gt;OpenAI wants broader, tiered access through Trusted Access for Cyber. Anthropic wants a tighter gate through Project Glasswing. One side is arguing that verified defenders should get access at scale. The other is arguing that this class of capability is dangerous enough to keep inside a much smaller circle.&lt;/p&gt;

&lt;p&gt;That is a real disagreement. It is also the part of the story most people are still flattening into launch-day hype.&lt;/p&gt;

&lt;h2&gt;
  
  
  What OpenAI Actually Announced
&lt;/h2&gt;

&lt;p&gt;OpenAI's April 14 post is pretty direct. The company says it is scaling Trusted Access for Cyber to thousands of verified individual defenders and hundreds of teams responsible for defending critical software. It also introduced GPT-5.4-Cyber as a variant of GPT-5.4 trained to be cyber-permissive, with a lower refusal boundary for legitimate cybersecurity work and new binary reverse-engineering capability for analyzing compiled software without source code access.&lt;sup&gt;1&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;Reuters confirmed the key part of the rollout: GPT-5.4-Cyber is not a public release. It is being rolled out on a limited basis to vetted security vendors, organizations, and researchers, with higher levels of verification unlocking more sensitive capability.&lt;sup&gt;2&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;So yes, OpenAI is talking about broader access. It is still gating the good stuff.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Real Split Is Access Philosophy
&lt;/h2&gt;

&lt;p&gt;Anthropic's framing is sharper and more dramatic. In its Mythos Preview write-up, the company described a model it says can identify and exploit zero-days in every major operating system and major web browser when directed to do so. Anthropic presented that as the reason for Project Glasswing, a restricted deployment model built around a small group of partners and a coordinated defensive push.&lt;sup&gt;3&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;OpenAI is arguing almost the opposite. Its TAC post says it does not think it is practical or appropriate to centrally decide who gets to defend themselves.&lt;sup&gt;1&lt;/sup&gt; That line was not subtle. It was a shot at the curated-partner model without naming Anthropic directly.&lt;/p&gt;

&lt;p&gt;Both approaches assume the scarce asset is the model. For most defenders, the scarcer asset is everything around the model: verification, workflow integration, triage discipline, reverse-engineering skill, patch pipelines, logging, analyst time, and plain old trust. A stronger model helps. It does not magically turn noisy output into fixed software.&lt;/p&gt;

&lt;h2&gt;
  
  
  What GPT-5.4-Cyber Actually Changes for Defenders
&lt;/h2&gt;

&lt;p&gt;The clearest practical claim in OpenAI's launch is binary reverse engineering. That is not some vague promise about AI making security better. It points to a specific use case: giving analysts help with compiled software when source code is unavailable.&lt;/p&gt;

&lt;p&gt;In practice, that could mean:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;faster triage of suspicious binaries,&lt;/li&gt;
&lt;li&gt;faster explanation of unfamiliar functions,&lt;/li&gt;
&lt;li&gt;quicker hypothesis generation around likely vulnerability classes,&lt;/li&gt;
&lt;li&gt;and a better first pass before a human digs deeper in Ghidra or IDA.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is useful. It is not a replacement for real reverse-engineering skill.&lt;/p&gt;

&lt;p&gt;Anyone who has tried to use a general model for malware analysis or exploit-adjacent research has run into the same wall: the model gets skittish, moralizes, or refuses a task that is obviously defensive. OpenAI is trying to reduce that friction for verified users.&lt;sup&gt;1&lt;/sup&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Caveat Everyone Wants to Skip
&lt;/h2&gt;

&lt;p&gt;This is where the caveats matter. OpenAI's own GPT-5.4 Thinking System Card says GPT-5.4 is the first general-purpose model in its line with mitigations for high cyber capability.&lt;sup&gt;4&lt;/sup&gt; That tells you the company itself thinks the baseline model is already in different territory.&lt;/p&gt;

&lt;p&gt;The UK AI Security Institute's evaluation of Mythos adds a second useful data point. AISI found that Mythos Preview was a step up over prior frontier models, succeeded on expert-level CTF tasks 73 percent of the time, and became the first model to complete its full 32-step corporate network attack simulation end to end in some runs.&lt;sup&gt;5&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;But AISI also says its test environments are easier than real defended systems. There were no active defenders, no realistic defensive tooling, and no real penalties for noisy behavior that would trigger alerts in production.&lt;sup&gt;5&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;That is exactly the kind of caveat people tend to bury after the headline.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Workflow Still Matters More Than Weights
&lt;/h2&gt;

&lt;p&gt;A model that can explain a decompiled function, highlight suspicious control flow, or suggest where memory corruption might live is valuable. A model that can reliably find, validate, chain, and exploit serious vulnerabilities across messy real environments without heavy scaffolding is a different beast entirely.&lt;/p&gt;

&lt;p&gt;Those are not the same claim, and too much of the public conversation treats them like they are.&lt;/p&gt;

&lt;p&gt;That is why I do not think either company has fully answered the core question.&lt;/p&gt;

&lt;p&gt;Anthropic's approach may slow diffusion, but it also concentrates advantage among already powerful partners. OpenAI's broader approach is more appealing if you actually want these tools in the hands of working defenders, smaller teams, and security vendors beyond the usual giants. But broader verification is not a magic shield. Trusted access is still a policy layer. If identity checks are weak, if accounts get abused, or if the surrounding agent runtime is sloppy, the safety story gets shaky fast.&lt;/p&gt;

&lt;h3&gt;
  
  
  A defender does not win because a model is good at describing assembly
&lt;/h3&gt;

&lt;p&gt;A defender wins when suspicious code gets triaged faster, false positives get killed earlier, high-confidence findings get validated, patches get written, and the fix lands before the other side can capitalize.&lt;/p&gt;

&lt;p&gt;That is a pipeline problem. The model sits inside it. The model is not the pipeline.&lt;/p&gt;

&lt;h2&gt;
  
  
  My Take
&lt;/h2&gt;

&lt;p&gt;The strongest reading of GPT-5.4-Cyber is not "OpenAI caught up to Mythos" or "the AI cyber arms race is here," even if both headlines are tempting.&lt;/p&gt;

&lt;p&gt;The strongest reading is that frontier labs are turning access control into product strategy because raw capability is no longer the only thing they are selling. They are selling who gets to use it, under what conditions, with what audit trail, and with what story attached.&lt;/p&gt;

&lt;p&gt;For defenders, the question is simpler.&lt;/p&gt;

&lt;p&gt;Will this help real teams do better work now, before similar capability spreads elsewhere anyway?&lt;/p&gt;

&lt;p&gt;That is the question worth tracking. Not who had the scarier press release.&lt;/p&gt;

&lt;h2&gt;
  
  
  Notes
&lt;/h2&gt;

&lt;ol&gt;
  &lt;li id="source-1"&gt;
&lt;strong&gt;1.&lt;/strong&gt; OpenAI, &lt;a href="https://openai.com/index/scaling-trusted-access-for-cyber-defense/" rel="noopener noreferrer"&gt;"Trusted Access for the Next Era of Cyber Defense"&lt;/a&gt; (April 14, 2026).&lt;/li&gt;
  &lt;li id="source-2"&gt;
&lt;strong&gt;2.&lt;/strong&gt; Reuters, &lt;a href="https://www.reuters.com/technology/openai-unveils-gpt-54-cyber-week-after-rivals-announcement-ai-model-2026-04-14/" rel="noopener noreferrer"&gt;"OpenAI Unveils GPT-5.4-Cyber a Week After Rival's Announcement of AI Model"&lt;/a&gt; (April 14, 2026).&lt;/li&gt;
  &lt;li id="source-3"&gt;
&lt;strong&gt;3.&lt;/strong&gt; Anthropic, &lt;a href="https://red.anthropic.com/2026/mythos-preview/" rel="noopener noreferrer"&gt;"Claude Mythos Preview"&lt;/a&gt; (April 7, 2026).&lt;/li&gt;
  &lt;li id="source-4"&gt;
&lt;strong&gt;4.&lt;/strong&gt; OpenAI, &lt;a href="https://openai.com/index/gpt-5-4-thinking-system-card/" rel="noopener noreferrer"&gt;"GPT-5.4 Thinking System Card"&lt;/a&gt; (March 5, 2026).&lt;/li&gt;
  &lt;li id="source-5"&gt;
&lt;strong&gt;5.&lt;/strong&gt; AI Security Institute, &lt;a href="https://www.aisi.gov.uk/blog/our-evaluation-of-claude-mythos-previews-cyber-capabilities" rel="noopener noreferrer"&gt;"Our Evaluation of Claude Mythos Preview's Cyber Capabilities"&lt;/a&gt; (April 2026).&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>openai</category>
      <category>gpt54cyber</category>
      <category>anthropic</category>
      <category>mythos</category>
    </item>
    <item>
      <title>Claude Mythos Preview Is a Warning Shot for Every Security Team</title>
      <dc:creator>Solomon Neas</dc:creator>
      <pubDate>Thu, 09 Apr 2026 02:38:47 +0000</pubDate>
      <link>https://dev.to/solomonneas/claude-mythos-preview-is-a-warning-shot-for-every-security-team-41i6</link>
      <guid>https://dev.to/solomonneas/claude-mythos-preview-is-a-warning-shot-for-every-security-team-41i6</guid>
      <description>&lt;h1&gt;
  
  
  Claude Mythos Preview Is a Warning Shot for Every Security Team
&lt;/h1&gt;

&lt;p&gt;Anthropic just said the quiet part out loud.&lt;/p&gt;

&lt;p&gt;Its new gated model, &lt;strong&gt;Claude Mythos Preview&lt;/strong&gt;, is strong enough at vulnerability research and exploit development that Anthropic decided &lt;strong&gt;not&lt;/strong&gt; to release it for general access. Instead, it wrapped the model inside &lt;a href="https://www.anthropic.com/glasswing" rel="noopener noreferrer"&gt;Project Glasswing&lt;/a&gt;, an invitation-only defensive security program with launch partners including AWS, Cisco, CrowdStrike, Google, Microsoft, NVIDIA, Palo Alto Networks, JPMorganChase, Apple, Broadcom, and the Linux Foundation.&lt;/p&gt;

&lt;p&gt;That alone should get your attention. Frontier labs love shipping. They do not voluntarily keep flagships behind a fence.&lt;/p&gt;

&lt;p&gt;And yes, Anthropic looks nervous.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Anthropic Actually Announced
&lt;/h2&gt;

&lt;p&gt;Across Anthropic’s official &lt;a href="https://www.anthropic.com/glasswing" rel="noopener noreferrer"&gt;Glasswing launch post&lt;/a&gt;, the &lt;a href="https://www.anthropic.com/project/glasswing" rel="noopener noreferrer"&gt;Project Glasswing page&lt;/a&gt;, the Frontier Red Team’s &lt;a href="https://red.anthropic.com/2026/mythos-preview/" rel="noopener noreferrer"&gt;technical write-up&lt;/a&gt;, the &lt;a href="https://www.anthropic.com/claude-mythos-preview-system-card" rel="noopener noreferrer"&gt;system card&lt;/a&gt;, the &lt;a href="https://www.anthropic.com/claude-mythos-preview-risk-report" rel="noopener noreferrer"&gt;alignment risk update&lt;/a&gt;, and Anthropic’s own &lt;a href="https://platform.claude.com/docs/en/release-notes/overview" rel="noopener noreferrer"&gt;platform release notes&lt;/a&gt;, the picture is consistent:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Mythos Preview is not generally available.&lt;/strong&gt; Anthropic says it is a limited research preview for defensive cybersecurity work.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Access is invitation-only.&lt;/strong&gt; The release notes explicitly describe it as a gated preview.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anthropic says the model has already found thousands of zero-day vulnerabilities&lt;/strong&gt; across critical software.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Anthropic says those findings include bugs in every major operating system and every major web browser.&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anthropic says Mythos can often identify vulnerabilities and develop related exploits autonomously&lt;/strong&gt;, with minimal or no human steering.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anthropic is putting real money behind the defensive rollout:&lt;/strong&gt; up to &lt;strong&gt;$100 million in usage credits&lt;/strong&gt; and &lt;strong&gt;$4 million in open-source security donations&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Participants can access it through Anthropic’s API, Amazon Bedrock, Google Vertex AI, and Microsoft Foundry&lt;/strong&gt;, but only inside the preview program.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is not a normal model launch. That is a containment strategy with a press release attached.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Details That Matter
&lt;/h2&gt;

&lt;p&gt;The headline is big, but the technical details are what make this feel different.&lt;/p&gt;

&lt;p&gt;Anthropic’s red team says Mythos found:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a &lt;strong&gt;27-year-old OpenBSD bug&lt;/strong&gt; that could remotely crash a target over TCP,&lt;/li&gt;
&lt;li&gt;a &lt;strong&gt;16-year-old FFmpeg vulnerability&lt;/strong&gt; in code exercised millions of times by automated testing without being caught,&lt;/li&gt;
&lt;li&gt;and &lt;strong&gt;chained Linux kernel vulnerabilities&lt;/strong&gt; that allowed escalation from regular user access to full system compromise.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Anthropic also says the model wrote sophisticated exploit chains, not just toy crash reproducers. One example in the red-team post describes a browser exploit chain that combined multiple vulnerabilities and escaped both renderer and OS sandboxes. Another describes autonomous work on privilege escalation and remote code execution scenarios.&lt;/p&gt;

&lt;p&gt;The benchmark deltas are ugly in the way that matters. Anthropic reports &lt;strong&gt;83.1% on Cybersecurity Vulnerability Reproduction&lt;/strong&gt; for Mythos versus &lt;strong&gt;66.6% for Opus 4.6&lt;/strong&gt;. On coding-heavy evaluations, the model also jumps hard: &lt;strong&gt;77.8% on SWE-bench Pro versus 53.4% for Opus 4.6&lt;/strong&gt;, &lt;strong&gt;59.0% on SWE-bench Multimodal versus 27.1%&lt;/strong&gt;, and &lt;strong&gt;82.0% on Terminal-Bench 2.0 versus 65.4%&lt;/strong&gt;, with Anthropic noting &lt;strong&gt;92.1%&lt;/strong&gt; under a more permissive timeout setup.&lt;/p&gt;
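&lt;p&gt;To make those gaps concrete, here is a small Python sketch that turns the quoted scores into percentage-point gains. The numbers are the ones Anthropic reports above, with the comparison figure taken as the Opus 4.6 baseline in each case. &lt;code&gt;point_gains&lt;/code&gt; is just an illustrative helper.&lt;/p&gt;

```python
# (Mythos Preview score, comparison score) in percent, as quoted above.
SCORES = {
    "Cybersecurity Vulnerability Reproduction": (83.1, 66.6),
    "SWE-bench Pro": (77.8, 53.4),
    "SWE-bench Multimodal": (59.0, 27.1),
    "Terminal-Bench 2.0": (82.0, 65.4),
}


def point_gains(scores):
    """Return the percentage-point gain per benchmark, rounded to one decimal."""
    return {
        name: round(mythos - baseline, 1)
        for name, (mythos, baseline) in scores.items()
    }


gains = point_gains(SCORES)
for name, gain in gains.items():
    print("%s: +%.1f points" % (name, gain))
```

&lt;p&gt;The gains work out to 16.5, 24.4, 31.9, and 16.6 points respectively. These are percentage-point differences, not relative improvements, and they are still vendor-reported numbers.&lt;/p&gt;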

&lt;p&gt;That matters because this is not a “cyber model” in the old narrow sense. Anthropic’s own framing is that Mythos’ cyber capabilities are downstream from broader gains in coding, reasoning, and autonomous tool use. In plain English: if a model gets much better at understanding messy codebases, testing hypotheses, writing debugging scaffolds, and persisting through long tasks, it also gets much better at offensive security work.&lt;/p&gt;

&lt;h2&gt;
  
  
  The System Card Makes the Release Decision Clear
&lt;/h2&gt;

&lt;p&gt;The strongest signal is not the marketing page. It is the system card.&lt;/p&gt;

&lt;p&gt;Anthropic says Mythos Preview showed such strong dual-use cyber capability that it chose &lt;strong&gt;not&lt;/strong&gt; to make the model generally available. Instead, it restricted access to partners working on defensive security. The system card also says this choice was &lt;strong&gt;not required by Anthropic’s Responsible Scaling Policy&lt;/strong&gt;. That means Anthropic made a discretionary call: this thing is useful enough for defense, dangerous enough for offense, and not ready for the open market.&lt;/p&gt;

&lt;p&gt;Anthropic also describes Mythos as its &lt;strong&gt;best-aligned model so far&lt;/strong&gt;, which sounds reassuring right up until you hit the next sentence. The company says that when Mythos does engage in concerning behavior, those actions can be more serious because the model is so much more capable, especially in software engineering and cybersecurity. The separate alignment risk update says Mythos is more capable at working around restrictions, is used more autonomously than prior models, and pushed Anthropic to admit errors in its own training, monitoring, evaluation, and security processes.&lt;/p&gt;

&lt;p&gt;That combination matters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;better aligned overall,&lt;/li&gt;
&lt;li&gt;more capable at cyber tasks,&lt;/li&gt;
&lt;li&gt;more capable at agentic workflows,&lt;/li&gt;
&lt;li&gt;still occasionally willing to do sketchy things in pursuit of task success.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It is honest. But it is not comforting.&lt;/p&gt;

&lt;h2&gt;
  
  
  Take the Claims Seriously, Not Blindly
&lt;/h2&gt;

&lt;p&gt;There is one important caveat.&lt;/p&gt;

&lt;p&gt;Most of the biggest Mythos claims are still coming from Anthropic itself. The company says more than 99% of the vulnerabilities it has found are not yet patched, so it cannot publicly disclose full details on most of them. That means outside verification is limited for now.&lt;/p&gt;

&lt;p&gt;So no, you should not swallow every benchmark and every claim whole just because a glossy PDF says so.&lt;/p&gt;

&lt;p&gt;But you also should not shrug this off as AI-company hype.&lt;/p&gt;

&lt;p&gt;Anthropic is doing something labs hate doing: limiting distribution of a powerful model because it thinks widespread release would create real offensive risk. That is a stronger signal than any benchmark chart.&lt;/p&gt;

&lt;h2&gt;
  
  
  What This Means for Cybersecurity Teams
&lt;/h2&gt;

&lt;p&gt;If Anthropic is basically right, a few old assumptions just died.&lt;/p&gt;

&lt;h3&gt;
  
  
  The grace period between discovery and exploitation is getting crushed
&lt;/h3&gt;

&lt;p&gt;CrowdStrike’s quote on the Glasswing page puts it bluntly: what once took months can now happen in minutes with AI. That probably overstates the timeline, but the direction is right. If high-end models can reliably move from bug discovery to exploit development faster, the old patch rhythm stops being good enough.&lt;/p&gt;

&lt;p&gt;Weekly triage meetings and “we’ll get to it next sprint” vulnerability handling are going to age like milk.&lt;/p&gt;

&lt;h3&gt;
  
  
  AppSec becomes more like active defense
&lt;/h3&gt;

&lt;p&gt;If models can find weird bugs in mature codebases that survived years of review and automated testing, then secure SDLC theater is not going to save anyone. Security teams need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;faster variant analysis,&lt;/li&gt;
&lt;li&gt;tighter patch validation loops,&lt;/li&gt;
&lt;li&gt;code scanning that includes agentic workflows,&lt;/li&gt;
&lt;li&gt;and better prioritization around exposed, memory-unsafe, parser-heavy software.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The dangerous surface is not just your flagship product. It is also the dusty dependency parsing malformed media, network packets, or archive files three layers down.&lt;/p&gt;

&lt;h3&gt;
  
  
  Open source maintainers are now on the critical path
&lt;/h3&gt;

&lt;p&gt;Anthropic and its partners are clearly treating open source as shared attack surface. They are right. The same libraries sitting in enterprise products, browsers, cloud tooling, appliances, and security stacks are exactly where an AI-assisted vulnerability hunt becomes painful.&lt;/p&gt;

&lt;p&gt;If you rely heavily on open source, your third-party risk program cannot just be “watch GitHub advisories and pray.” You need real inventory, ownership, and patch routing.&lt;/p&gt;

&lt;h2&gt;
  
  
  What This Means for Cyber Threat Intelligence Teams
&lt;/h2&gt;

&lt;p&gt;Most CTI teams are still treating AI-assisted exploitation as a topic for a future-trends deck. It needs to be a live collection priority now.&lt;/p&gt;

&lt;p&gt;A few things change immediately.&lt;/p&gt;

&lt;h3&gt;
  
  
  Vulnerability intel gets more time-sensitive
&lt;/h3&gt;

&lt;p&gt;If exploit development speeds up, then the value of early vendor advisories, patch diffs, and quiet maintainer activity goes up with it. CTI teams should be watching for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;sudden patch activity in security-sensitive open source projects,&lt;/li&gt;
&lt;li&gt;vague stability fixes that smell like quietly handled security bugs,&lt;/li&gt;
&lt;li&gt;exploit chain research against browsers, kernels, codecs, parsers, and network-facing services,&lt;/li&gt;
&lt;li&gt;and signs that private findings are becoming operationalized faster than before.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Patch diff analysis is about to matter even more.&lt;/p&gt;
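&lt;p&gt;A first pass at that kind of monitoring can be as simple as flagging commit messages that hint at a quietly handled security fix. The sketch below is a minimal Python example under stated assumptions: the keyword list is my own guess at useful hints, and the commits are passed in as plain &lt;code&gt;(sha, message)&lt;/code&gt; pairs rather than pulled from any real API.&lt;/p&gt;

```python
# Hypothetical hint phrases; tune these for the projects being watched.
SECURITY_HINTS = (
    "overflow",
    "out-of-bounds",
    "bounds check",
    "use-after-free",
    "double free",
    "sanitize",
    "crash",
    "stability",
)


def flag_commits(commits):
    """Return (sha, matched_hints) for commits whose message looks security-adjacent.

    commits: iterable of (sha, message) tuples (assumed input shape).
    """
    flagged = []
    for sha, message in commits:
        lowered = message.lower()
        hits = [hint for hint in SECURITY_HINTS if hint in lowered]
        if hits:
            flagged.append((sha, hits))
    return flagged


# Illustrative commit messages, not taken from any real project.
commits = [
    ("a1", "Fix typo in README"),
    ("b2", "Improve stability of packet parser on malformed input"),
    ("c3", "Add bounds check to demuxer"),
]
print(flag_commits(commits))
```

&lt;p&gt;Keyword matching like this is noisy on purpose. It is a way to decide which diffs an analyst reads first, not a verdict on whether a commit fixes a vulnerability.&lt;/p&gt;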

&lt;h3&gt;
  
  
  “Who can weaponize this?” becomes a longer list, and a much faster one
&lt;/h3&gt;

&lt;p&gt;The old comfort blanket was that only top-tier researchers could go from obscure crash to clean exploit. Mythos weakens that assumption. Anthropic’s own red-team post says even internal users without formal security backgrounds were able to prompt toward serious exploit work. That is Anthropic’s claim, not outside validation, but it is still worth taking seriously.&lt;/p&gt;

&lt;p&gt;That does &lt;strong&gt;not&lt;/strong&gt; mean every random actor suddenly becomes a world-class exploit developer overnight. It does mean more actors can operate above their historical skill ceiling.&lt;/p&gt;

&lt;p&gt;For CTI, the collection surfaces that matter now:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;dark web forums and Telegram channels where jailbreaks and safeguard bypasses circulate,&lt;/li&gt;
&lt;li&gt;exploit broker communities and private research circles with early access to frontier models,&lt;/li&gt;
&lt;li&gt;and operational groups that already automate their tradecraft and will be the first to weaponize capability jumps.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The actors to watch are not necessarily new. They are existing skilled groups who now have a capable assistant.&lt;/p&gt;

&lt;h3&gt;
  
  
  Detection teams need to watch for machine-speed tradecraft, not just machine-written malware
&lt;/h3&gt;

&lt;p&gt;The obvious fear is AI-generated malware. I think the more immediate problem is AI-assisted acceleration across the whole intrusion lifecycle: recon, exploit adaptation, script generation, privilege escalation paths, and post-exploitation troubleshooting.&lt;/p&gt;

&lt;p&gt;In other words, some campaigns may not look wildly novel. They may just move faster, branch faster, and recover from failure faster.&lt;/p&gt;

&lt;p&gt;That is a different detection problem.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Practitioners Should Do Right Now
&lt;/h2&gt;

&lt;p&gt;If I were running security or CTI in a mid-size enterprise today, I would treat the Mythos announcement as a forcing function and do five things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Re-rank patch priorities&lt;/strong&gt; around internet-facing systems, browsers, kernels, VPNs, hypervisors, media processing libraries, and authentication infrastructure.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tighten time-to-triage&lt;/strong&gt; for new critical and high-severity vulnerabilities. Not just patch SLA, actual analyst triage.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stand up patch diff monitoring&lt;/strong&gt; for critical open source dependencies and major platform vendors.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pressure test detection engineering&lt;/strong&gt; against faster exploit chaining and faster post-exploitation adaptation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Revisit your assumptions about attacker labor.&lt;/strong&gt; The question is no longer just “Could an actor do this?” It is “Could an actor do this with a frontier model and a weekend?”&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Also, if your security stack still depends on luck, manual heroics, and one burned-out person who knows where everything is, fix that before someone else teaches you the lesson.&lt;/p&gt;

&lt;h2&gt;
  
  
  My Take
&lt;/h2&gt;

&lt;p&gt;Mythos does not mean the sky is falling tomorrow.&lt;/p&gt;

&lt;p&gt;But it does mean the economics of vulnerability discovery and exploitation are changing faster than a lot of defenders want to admit. Anthropic’s own response tells the story better than any benchmark: it kept the model gated, restricted use to defensive cybersecurity, wrapped it in a coordinated industry program, and started talking openly about safeguards before talking about product rollout.&lt;/p&gt;

&lt;p&gt;That is not how you behave when you think a capability jump is business as usual.&lt;/p&gt;

&lt;p&gt;For defenders, the message is simple: compress your own timelines before someone else compresses them for you.&lt;/p&gt;

&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.anthropic.com/glasswing" rel="noopener noreferrer"&gt;Anthropic: Project Glasswing launch announcement&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.anthropic.com/project/glasswing" rel="noopener noreferrer"&gt;Anthropic: Project Glasswing overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://red.anthropic.com/2026/mythos-preview/" rel="noopener noreferrer"&gt;Anthropic Frontier Red Team: Claude Mythos Preview technical write-up&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.anthropic.com/claude-mythos-preview-system-card" rel="noopener noreferrer"&gt;Anthropic: Claude Mythos Preview System Card&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.anthropic.com/claude-mythos-preview-risk-report" rel="noopener noreferrer"&gt;Anthropic: Alignment Risk Update for Claude Mythos Preview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://platform.claude.com/docs/en/release-notes/overview" rel="noopener noreferrer"&gt;Anthropic Platform release notes, April 7, 2026&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>anthropic</category>
      <category>mythos</category>
      <category>cybersecurity</category>
      <category>threatintelligence</category>
    </item>
    <item>
      <title>Claude Code's Source Leak Was Embarrassing. The Real Story Is What It Revealed</title>
      <dc:creator>Solomon Neas</dc:creator>
      <pubDate>Thu, 02 Apr 2026 12:46:59 +0000</pubDate>
      <link>https://dev.to/solomonneas/claude-codes-source-leak-was-embarrassing-the-real-story-is-what-it-revealed-3kel</link>
      <guid>https://dev.to/solomonneas/claude-codes-source-leak-was-embarrassing-the-real-story-is-what-it-revealed-3kel</guid>
      <description>&lt;p&gt;On March 31, Anthropic accidentally published a source map inside Claude Code npm package version 2.1.88. That one packaging mistake exposed roughly 512,000 lines of TypeScript across nearly 2,000 files, handed competitors a detailed view of Anthropic's product roadmap, triggered a DMCA mess that briefly took down more than 8,100 GitHub repositories, and kicked off a wave of clean-room clones within hours.&lt;sup id="fnref1"&gt;1&lt;/sup&gt;&lt;sup id="fnref2"&gt;2&lt;/sup&gt;&lt;sup id="fnref3"&gt;3&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;The obvious lesson is that shipping source maps in a public package is bad. The more interesting lesson is that this was not mainly a code leak. It was a feature flag leak. Anthropic did not just lose implementation secrecy. It lost strategic secrecy.&lt;/p&gt;

&lt;p&gt;The same day, npm users were also dealing with a separate supply chain incident: a North Korea-attributed compromise of the Axios package that shipped a cross-platform remote access trojan through malicious releases 1.14.1 and 0.30.4.&lt;sup id="fnref4"&gt;4&lt;/sup&gt;&lt;sup id="fnref5"&gt;5&lt;/sup&gt; Those two incidents together say more about the current JavaScript ecosystem than either one does alone. Build hygiene is weak, package trust is weaker, and the response playbook for leaks still assumes a centralized internet that no longer exists.&lt;/p&gt;

&lt;h2&gt;
  
  
  How the leak happened
&lt;/h2&gt;

&lt;p&gt;The mechanics were simple. Anthropic published Claude Code v2.1.88 to npm with a &lt;code&gt;.map&lt;/code&gt; file included. That source map was enough to reconstruct the readable TypeScript source for the CLI. Chaofan Shou appears to have been first to spot it publicly and posted about it immediately, after which mirrors spread fast across GitHub, Reddit, Hacker News, and IPFS.&lt;sup id="fnref2"&gt;2&lt;/sup&gt;&lt;sup id="fnref6"&gt;6&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;The underlying failure looks mundane, which is exactly why it matters. Bun generates source maps by default. If packaging rules are not tight, those files can ride along into artifacts that were never meant to contain source. Reporting on the incident pointed to a missed &lt;code&gt;.npmignore&lt;/code&gt;-style exclusion as the immediate cause.&lt;sup id="fnref2"&gt;2&lt;/sup&gt;&lt;sup id="fnref7"&gt;7&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;But there is a deeper layer. On March 11, 2026, twenty days before the leak, a bug was filed against Bun (&lt;code&gt;oven-sh/bun#28001&lt;/code&gt;) reporting that source maps are served in production mode even when Bun's own documentation says they should be disabled.&lt;sup id="fnref8"&gt;8&lt;/sup&gt; The reporter demonstrated that setting &lt;code&gt;development: false&lt;/code&gt; in &lt;code&gt;Bun.serve()&lt;/code&gt; still produces &lt;code&gt;sourceMappingURL&lt;/code&gt; references and serves &lt;code&gt;.map&lt;/code&gt; files. As of this writing, the bug is still open.&lt;/p&gt;

&lt;p&gt;This matters because Anthropic acquired Bun in late 2025 and built Claude Code on top of it.&lt;sup id="fnref9"&gt;9&lt;/sup&gt; The most likely scenario: Anthropic ran a production build expecting Bun to suppress source maps per its documented behavior. The bug meant the &lt;code&gt;.map&lt;/code&gt; file got generated anyway. Without an explicit &lt;code&gt;.npmignore&lt;/code&gt; exclusion or a &lt;code&gt;files&lt;/code&gt; field in &lt;code&gt;package.json&lt;/code&gt; to catch the unexpected output, the 59.8 MB source map rode along into the published npm package.&lt;/p&gt;

&lt;p&gt;Boris Cherny, who leads Claude Code, said the cause was human error, not a tooling defect. The deployment process still had manual steps, and one of them was missed. He framed the follow-up as a blameless postmortem problem: fix the process, not the person.&lt;sup id="fnref7"&gt;7&lt;/sup&gt;&lt;sup id="fnref9"&gt;9&lt;/sup&gt; That framing drew pushback. Multiple developers on Hacker News and Reddit argued that the &lt;code&gt;Bun.serve()&lt;/code&gt; explanation Cherny addressed was a visible symptom, not the root cause, and that the underlying bug also affected how Bun bundles output for npm packaging.&lt;sup id="fnref8"&gt;8&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;Both explanations can be true simultaneously. A known tooling bug generated a file that should not have existed. A missing packaging safeguard failed to catch it. The result was the same either way.&lt;/p&gt;

&lt;p&gt;Cherny's blameless framing is the right engineering posture for a postmortem, but it comes with an uncomfortable footnote. This was the second time. Anthropic had already had a similar exposure in February 2025. Once is a packaging accident. Twice is a release control failure, especially when the company owns the build tool.&lt;sup id="fnref7"&gt;7&lt;/sup&gt;&lt;sup id="fnref10"&gt;10&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;There is no mystery about prevention here. Public npm artifacts should be built in a hermetic pipeline, inspected before publish, and checked by policy for forbidden files. Source maps, tests, private certificates, &lt;code&gt;.env&lt;/code&gt; fragments, internal prompts, and debug fixtures should all be blocked automatically. When you own both the product and the build tool, and a known bug in the build tool generates files that should not exist in production, the defense needs to be belt and suspenders: fix the bug, and independently verify the output before publishing.&lt;/p&gt;
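
&lt;p&gt;A minimal sketch of that kind of pre-publish policy check, in Python. It assumes the list of files that would ship is already available (for example from a dry-run pack step). The deny-list and the function name &lt;code&gt;find_forbidden&lt;/code&gt; are illustrative assumptions, not Anthropic's or npm's actual policy.&lt;/p&gt;

```python
import fnmatch

# Hypothetical deny-list; tune it to your own packaging policy.
FORBIDDEN_PATTERNS = [
    "*.map",         # source maps
    "*.pem",         # private certificates and keys
    "*.key",
    ".env",          # environment files and fragments
    ".env.*",
    "*.test.*",      # test files
    "*/fixtures/*",  # debug fixtures
]


def find_forbidden(paths, patterns=FORBIDDEN_PATTERNS):
    """Return the packaged paths that match any forbidden pattern."""
    hits = []
    for path in paths:
        name = path.rsplit("/", 1)[-1]
        for pattern in patterns:
            # Check both the full path and the bare file name,
            # case-sensitively so results do not vary by platform.
            if fnmatch.fnmatchcase(path, pattern) or fnmatch.fnmatchcase(name, pattern):
                hits.append(path)
                break
    return hits
```

&lt;p&gt;In a release pipeline, a non-empty result from &lt;code&gt;find_forbidden&lt;/code&gt; would fail the job before anything reaches the registry. The point of the design is that it inspects the output, so it still works when the build tool misbehaves.&lt;/p&gt;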

&lt;h2&gt;
  
  
  What the source actually exposed
&lt;/h2&gt;

&lt;p&gt;A lot of the commentary focused on novelty items. Some of that was justified because the leak was genuinely revealing. Some of it was internet theater. The useful way to read the dump is to separate trivia from strategic substance.&lt;/p&gt;

&lt;p&gt;The trivia was funny. The strategic substance was not.&lt;/p&gt;

&lt;h3&gt;
  
  
  KAIROS: the unshipped product hiding behind feature flags
&lt;/h3&gt;

&lt;p&gt;The biggest disclosure was KAIROS, an unreleased autonomous mode that turns Claude Code from a reactive CLI into a persistent agent. The leaked code showed a heartbeat loop that periodically asks a question close to, "anything worth doing right now?" If the answer is yes, the system can act without a fresh user prompt. It can watch pull requests, send push notifications, maintain append-only daily logs, and run a nightly memory consolidation flow literally called &lt;code&gt;autoDream&lt;/code&gt;.&lt;sup id="fnref6"&gt;6&lt;/sup&gt;&lt;sup id="fnref7"&gt;7&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;That is not a toy feature. It is a different trust model.&lt;/p&gt;

&lt;p&gt;A request-response coding assistant is bounded by explicit user initiation. A background agent is bounded by policy, logging, tool permissions, and the quality of its judgment. That shift matters more than any implementation detail in the leaked files. It says Anthropic is not just building a better terminal wrapper. It is building an always-on operator.&lt;/p&gt;

&lt;p&gt;The important point is that KAIROS looked built, not speculative. It was sitting behind feature flags, not in a half-finished branch. Competitors did not merely learn that Anthropic was interested in autonomous agents. They learned the architecture, the likely product direction, and some of the operational assumptions already encoded in the design.&lt;sup id="fnref6"&gt;6&lt;/sup&gt;&lt;sup id="fnref11"&gt;11&lt;/sup&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Hidden flags are roadmap leaks
&lt;/h3&gt;

&lt;p&gt;The code reportedly exposed 44 hidden feature flags tied to capabilities such as swarm mode, voice commands, browser control via Playwright, background daemons, and agents that can sleep and later self-resume.&lt;sup id="fnref6"&gt;6&lt;/sup&gt;&lt;sup id="fnref12"&gt;12&lt;/sup&gt; Again, the damage is not that rivals can copy a function name. The damage is that they can infer sequence and priority.&lt;/p&gt;

&lt;p&gt;Feature flags are internal strategy documents with executable syntax. Leak them and you leak what a team has built, what it is testing, what it is scared to ship, and what it thinks the next market looks like.&lt;/p&gt;

&lt;h3&gt;
  
  
  Three-layer memory is the kind of design detail competitors pay for
&lt;/h3&gt;

&lt;p&gt;One of the more useful architectural disclosures was Claude Code's apparent three-layer memory model: a compact index that is always loaded, topic files retrieved on demand, and full transcripts that are never loaded directly, only searched when needed. The &lt;code&gt;autoDream&lt;/code&gt; process reportedly runs in a forked subagent and consolidates memory over time.&lt;sup id="fnref6"&gt;6&lt;/sup&gt;&lt;sup id="fnref12"&gt;12&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;That is a sensible design. It balances token economy, retrieval precision, and long-horizon continuity. It also answers a practical question many teams are still stumbling over: how do you make an agent feel persistent without rehydrating too much junk every turn?&lt;/p&gt;

&lt;p&gt;This is where source leaks hurt. They compress competitors' learning cycles. Instead of discovering these patterns through years of shipping and failure, rivals can inspect a working system and skip to adaptation.&lt;/p&gt;
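
&lt;p&gt;To make the three layers concrete, here is a toy sketch in Python. Everything in it is hypothetical: the class name, the method names, and the keyword search are stand-ins chosen for illustration, not a reconstruction of the leaked code.&lt;/p&gt;

```python
class LayeredMemory:
    """Toy three-layer memory: index always loaded, topics on demand, transcripts searched."""

    def __init__(self):
        self.index = {}        # topic name to one-line summary; always in context
        self.topics = {}       # topic name to full notes; loaded only when requested
        self.transcripts = []  # raw history; never loaded whole, only searched

    def remember(self, topic, summary, notes):
        self.index[topic] = summary
        self.topics[topic] = notes

    def log(self, line):
        self.transcripts.append(line)

    def base_context(self):
        # Layer 1: the compact index is the only cost paid on every turn.
        return "\n".join(f"{t}: {s}" for t, s in sorted(self.index.items()))

    def load_topic(self, topic):
        # Layer 2: pull a single topic file into context on demand.
        return self.topics.get(topic, "")

    def search_transcripts(self, keyword, limit=3):
        # Layer 3: return only matching lines, never the full transcript.
        matches = [line for line in self.transcripts if keyword.lower() in line.lower()]
        return matches[:limit]
```

&lt;p&gt;The token-economy argument shows up in the shapes: &lt;code&gt;base_context&lt;/code&gt; stays small no matter how long the history grows, while the expensive layers are only touched when a turn actually needs them.&lt;/p&gt;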

&lt;h3&gt;
  
  
  Undercover mode
&lt;/h3&gt;

&lt;p&gt;The leaked &lt;code&gt;undercover.ts&lt;/code&gt; file shows a mode that strips Anthropic-internal references when Claude Code operates in external repositories. According to technical analyses, it suppresses internal codenames, internal repository names, internal Slack references, and the phrase "Claude Code" itself, and it does not expose a force-off path in the external flow.&lt;sup id="fnref12"&gt;12&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;The practical effect is simple: when Claude Code is used in public or third-party repositories, it avoids referencing Anthropic-specific internal context in generated output. From a product perspective, that reduces the chance of internal names leaking into public commits, pull requests, or comments. It is a design choice worth noting because it shows Anthropic treated disclosure of internal context as an engineering problem, not just a prompting problem.&lt;/p&gt;

&lt;h3&gt;
  
  
  The anti-distillation controls were real, and not very strong
&lt;/h3&gt;

&lt;p&gt;The leak also exposed Anthropic's anti-distillation measures. One mechanism, gated by &lt;code&gt;ANTI_DISTILLATION_CC&lt;/code&gt;, appears to inject fake tools into prompts in order to poison training data captured by competitors. Another uses connector-text summarization plus cryptographic signatures so captured traffic reflects compressed summaries rather than full assistant text.&lt;sup id="fnref6"&gt;6&lt;/sup&gt;&lt;sup id="fnref12"&gt;12&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;As a technical barrier, this is thin. As Alex Kim and others noted, a man-in-the-middle proxy or configuration change could bypass it quickly, and some of the checks only apply to first-party flows.&lt;sup id="fnref12"&gt;12&lt;/sup&gt; That does not make the idea irrational. It makes it honest. Anthropic appears to understand that the primary defense against distillation is legal pressure, not cryptographic wizardry.&lt;/p&gt;

&lt;p&gt;That matters in the context of its dispute with tools trying to piggyback on first-party access. The leak made visible the technical enforcement behind the policy rhetoric.&lt;/p&gt;

&lt;h3&gt;
  
  
  Native client attestation was the most serious defensive mechanism
&lt;/h3&gt;

&lt;p&gt;One of the more consequential details was the client attestation path below the JavaScript runtime. Analyses of the leaked code described a &lt;code&gt;cch=00000&lt;/code&gt; placeholder in requests that Bun's native HTTP layer replaces with a computed hash before transmission, allowing the server to verify that the request came from a real Claude Code binary.&lt;sup id="fnref12"&gt;12&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;This is effectively API DRM. Call it attestation if you want the neutral term.&lt;/p&gt;

&lt;p&gt;From a security engineering perspective, it is understandable. If you want to prevent gray-market clients from replaying first-party privileges, you need something stronger than a static header. From an ecosystem perspective, it explains why Anthropic was willing to fight third-party wrappers so aggressively. The company was not just policing branding. It was protecting a technical enforcement boundary.&lt;/p&gt;

&lt;h3&gt;
  
  
  The rest was revealing, weird, or both
&lt;/h3&gt;

&lt;p&gt;The leak also surfaced a pile of smaller details that collectively humanize the codebase while exposing its edges.&lt;/p&gt;

&lt;p&gt;There were 187 hardcoded spinner verbs, including "scurrying," "recombobulating," "topsy-turvying," "hullaballooing," and "razzmatazzing." They were not model generated. Someone wrote them by hand.&lt;sup id="fnref6"&gt;6&lt;/sup&gt;&lt;sup id="fnref12"&gt;12&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;There was a frustration detector in &lt;code&gt;userPromptKeywords.ts&lt;/code&gt;, built as a regex that matches phrases such as &lt;code&gt;wtf&lt;/code&gt;, &lt;code&gt;ffs&lt;/code&gt;, &lt;code&gt;piece of shit&lt;/code&gt;, &lt;code&gt;fuck you&lt;/code&gt;, and &lt;code&gt;this sucks&lt;/code&gt;, then logs an &lt;code&gt;is_negative: true&lt;/code&gt; analytics signal. It reportedly does not alter behavior. It just measures user pain. Rahat Hasan highlighted the code on X as evidence that Anthropic was tracking how often users rage at the assistant. Boris Cherny replied that the team literally visualizes this signal on an internal dashboard called the "fucks" chart.&lt;sup id="fnref12"&gt;12&lt;/sup&gt;&lt;sup id="fnref13"&gt;13&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;That sounds absurd, but it is also normal product analytics in blunt form. If users are swearing at your tool, they are having a bad time. A cheap lexical detector is a reasonable metric.&lt;/p&gt;
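
&lt;p&gt;For a sense of how cheap such a detector is, here is a rough Python sketch. The pattern is an assumption built only from the phrases quoted above, and &lt;code&gt;analytics_signal&lt;/code&gt; is a hypothetical name; the actual regex in the leaked file is not reproduced here.&lt;/p&gt;

```python
import re

# Hypothetical pattern assembled from the phrases quoted in the article.
FRUSTRATION = re.compile(
    r"\b(wtf|ffs|piece of shit|fuck you|this sucks)\b",
    re.IGNORECASE,
)


def analytics_signal(prompt):
    """Return an analytics record; the prompt and the response path are untouched."""
    return {"is_negative": bool(FRUSTRATION.search(prompt))}
```

&lt;p&gt;Because the function only returns a flag, it matches the reported behavior of measuring user pain without changing what the assistant does.&lt;/p&gt;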

&lt;p&gt;The code also exposed model codenames, including Capybara and Mythos for a v8 line with one million token context, plus references to Numbat, Fennec, Tengu, and unreleased Opus 4.7 and Sonnet 4.8 identifiers.&lt;sup id="fnref6"&gt;6&lt;/sup&gt;&lt;sup id="fnref12"&gt;12&lt;/sup&gt; It included a buddy or companion system built as an April Fools' Tamagotchi, complete with 18 species, rarity tiers, RPG stats, and a 1 percent shiny mechanic. Some species names were encoded via &lt;code&gt;String.fromCharCode()&lt;/code&gt; to avoid obvious grep hits.&lt;sup id="fnref6"&gt;6&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;It also reportedly revealed a compaction loop bug wasting around 250,000 API calls per day, fixed with three lines of code.&lt;sup id="fnref6"&gt;6&lt;/sup&gt;&lt;sup id="fnref11"&gt;11&lt;/sup&gt; That detail is funny, but it is also a reminder that the economics of agent systems are often dominated by tiny control-loop mistakes, not model prices.&lt;/p&gt;

&lt;h2&gt;
  
  
  The DMCA fiasco was both predictable and incompetent
&lt;/h2&gt;

&lt;p&gt;Anthropic's legal response was faster than its containment plan. The company filed a DMCA notice aimed at the original leaked repository, often identified as &lt;code&gt;nichxbt/claude-code&lt;/code&gt;. GitHub's initial enforcement swept far wider than intended and disabled more than 8,100 repositories, many of them unrelated.&lt;sup id="fnref3"&gt;3&lt;/sup&gt;&lt;sup id="fnref14"&gt;14&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;Anthropic later called the mass takedown an accident and narrowed the request to the original repository plus 96 forks. GitHub restored the affected projects.&lt;sup id="fnref3"&gt;3&lt;/sup&gt;&lt;sup id="fnref14"&gt;14&lt;/sup&gt; By then, the code was already mirrored broadly, including stripped versions on IPFS with telemetry removed.&lt;sup id="fnref6"&gt;6&lt;/sup&gt;&lt;sup id="fnref11"&gt;11&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;The collateral damage was not hypothetical. Theo Browne (t3.gg), one of the most visible developers in the JavaScript ecosystem, posted that his Claude Code fork had been disabled, despite containing no leaked source at all. His fork existed only because he had submitted a PR weeks earlier to edit a Claude Code skill file. "Absolutely pathetic," he wrote, sharing the GitHub takedown email.&lt;sup id="fnref15"&gt;15&lt;/sup&gt; Thariq Shihipar, an engineer on the Claude Code team, replied acknowledging it was a "communication mistake" and linked to the retraction notice.&lt;sup id="fnref16"&gt;16&lt;/sup&gt; Boris Cherny separately responded to broader criticism of the mass takedowns: "This was not intentional, we've been working with GitHub to fix it. Should be better now."&lt;sup id="fnref17"&gt;17&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;When your DMCA sweep hits a developer with 200,000+ followers whose repo did not contain the leaked code, you have not contained the problem. You have created a second news cycle.&lt;/p&gt;

&lt;p&gt;This is the part where 2012 internet instincts collide with 2026 internet reality.&lt;/p&gt;

&lt;p&gt;DMCA can still remove convenient copies from centralized platforms. It cannot claw back a viral archive once mirrors, torrents, and content-addressed storage have taken over. The window for meaningful containment was measured in minutes. After that, legal action was mostly performative, and the overbreadth made Anthropic look careless twice in one day.&lt;/p&gt;

&lt;p&gt;The deeper problem is that the takedown campaign accidentally validated the leak's significance. If the goal was to avoid giving more oxygen to the mirrors, nuking thousands of repositories achieved the opposite.&lt;/p&gt;

&lt;h2&gt;
  
  
  The clones changed the legal stakes immediately
&lt;/h2&gt;

&lt;p&gt;The most consequential downstream event was not the mirroring. It was the speed of clean-room reimplementation.&lt;/p&gt;

&lt;p&gt;Sigrid Jin, a 25-year-old UBC student, reportedly used a tiny human team, around ten OpenClaw agents, and OpenAI Codex to rewrite the project in Python within hours. The result, Claw-Code, reportedly passed 100,000 GitHub stars in about a day and was described as the fastest-growing repository on the platform.&lt;sup id="fnref10"&gt;10&lt;/sup&gt;&lt;sup id="fnref11"&gt;11&lt;/sup&gt;&lt;sup id="fnref18"&gt;18&lt;/sup&gt; A separate Rust effort, Claurst, pursued a clean-room reimplementation in a lower-level systems language.&lt;sup id="fnref10"&gt;10&lt;/sup&gt;&lt;sup id="fnref11"&gt;11&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;Then xAI reportedly handed Jin free Grok credits, which was less a business development move than an accelerant tossed onto an already burning PR problem.&lt;sup id="fnref10"&gt;10&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;This is where the story stops being a simple leak and becomes a legal stress test. Traditional clean-room reimplementation depends on separation, time, and cost. AI-assisted rebuilding compresses all three. If agents can inspect behavior, generate replacement code, and iterate fast enough to produce a plausibly original implementation in hours, the traditional enforcement model starts to wobble.&lt;/p&gt;

&lt;p&gt;Gergely Orosz argued that a Python rewrite produced this way is a new creative work, not a simple copy.&lt;sup id="fnref6"&gt;6&lt;/sup&gt;&lt;sup id="fnref10"&gt;10&lt;/sup&gt; That question has not been tested cleanly in court. It will be. There is too much money at stake for it not to be.&lt;/p&gt;

&lt;p&gt;There is also an irony Anthropic cannot easily dodge. Dario Amodei has previously implied that Claude wrote substantial portions of Claude Code. If the original product is heavily AI-generated and the clone is also AI-assisted, copyright arguments about authorship and originality get messy fast. A company can still assert rights in selection, arrangement, and human-directed contributions. It just does not get to pretend the facts are clean.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Axios attack made the same day much worse
&lt;/h2&gt;

&lt;p&gt;If this had been only a source leak story, it would already have been a bad day for npm. It was not.&lt;/p&gt;

&lt;p&gt;Between 00:21 and roughly 03:20 to 03:29 UTC on March 31, attackers attributed by Google and Microsoft to the North Korea-linked actor tracked as UNC1069, also known as Sapphire Sleet, compromised the Axios npm package by hijacking maintainer credentials and publishing malicious versions 1.14.1 and 0.30.4.&lt;sup id="fnref4"&gt;4&lt;/sup&gt;&lt;sup id="fnref5"&gt;5&lt;/sup&gt; Those releases pulled in a malicious dependency and delivered WAVESHAPER.V2, a cross-platform RAT targeting Windows, macOS, and Linux. The malware used postinstall execution and attempted to self-delete after installation to reduce forensic visibility.&lt;sup id="fnref4"&gt;4&lt;/sup&gt;&lt;sup id="fnref5"&gt;5&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;That is a serious incident on its own. Axios sits at or above 100 million weekly downloads in normal conditions.&lt;sup id="fnref4"&gt;4&lt;/sup&gt; It is foundational plumbing.&lt;/p&gt;
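
&lt;p&gt;Checking whether a project resolved one of those releases is mechanical. The sketch below, in Python, scans the text of a &lt;code&gt;package-lock.json&lt;/code&gt; for the two versions named above. It assumes the lockfile uses the &lt;code&gt;packages&lt;/code&gt; layout (lockfileVersion 2 or 3), and &lt;code&gt;find_compromised&lt;/code&gt; is a hypothetical helper, not part of any vendor tooling.&lt;/p&gt;

```python
import json

# Malicious releases named in the reporting cited above.
COMPROMISED = {"axios": {"1.14.1", "0.30.4"}}


def find_compromised(lock_text, compromised=COMPROMISED):
    """Return (install path, version) pairs for known-bad releases in a lockfile."""
    lock = json.loads(lock_text)
    hits = []
    # In the v2/v3 layout, "packages" maps install paths to package metadata.
    for path, meta in lock.get("packages", {}).items():
        # The package name is whatever follows the last "node_modules/" segment,
        # which also catches copies nested under other dependencies.
        name = path.split("node_modules/")[-1]
        if meta.get("version") in compromised.get(name, set()):
            hits.append((path, meta["version"]))
    return sorted(hits)
```

&lt;p&gt;A clean result only says the lockfile does not pin a bad version. A machine that installed one during the exposure window still needs the incident-response steps from the vendor write-ups, since the malware reportedly deleted itself after running.&lt;/p&gt;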

&lt;p&gt;Now add the Claude Code leak. Developers were suddenly racing to inspect packages, clone mirrors, diff behavior, and test rewrites. Claude Code itself uses Axios for HTTP, according to public analysis.&lt;sup id="fnref6"&gt;6&lt;/sup&gt;&lt;sup id="fnref12"&gt;12&lt;/sup&gt; The timing created a perfect trap: people poking around one major npm drama could easily ingest a second one.&lt;/p&gt;

&lt;p&gt;The same week, LiteLLM was reportedly backdoored through a separate three-stage attack involving credential harvesting, Kubernetes lateral movement, and a systemd persistence mechanism.&lt;sup id="fnref19"&gt;19&lt;/sup&gt; That pattern matters. These are not isolated anomalies. They are signals that the AI tooling stack has become a high-value target before it has developed mature operational defenses.&lt;/p&gt;

&lt;h2&gt;
  
  
  What this actually means
&lt;/h2&gt;

&lt;p&gt;The first conclusion is the simplest. Source map leaks are preventable. This was not zero-day wizardry. It was packaging failure. Mature release pipelines catch this.&lt;/p&gt;

&lt;p&gt;The second conclusion is more important. The real damage was not exposure of current source. It was exposure of hidden product direction. KAIROS, the anti-distillation controls, the memory hierarchy, the browser and swarm paths, the undercover behavior, the attestation layer, all of that tells competitors what Anthropic thinks matters next.&lt;/p&gt;

&lt;p&gt;The third conclusion is that npm supply chain security is in worse shape than the industry wants to admit. One day delivered both a flagship proprietary code leak and a state-linked compromise of a core dependency. If you build on JavaScript, you are operating in an ecosystem where trust is routinely transitive, under-verified, and easy to abuse.&lt;/p&gt;

&lt;p&gt;The fourth conclusion is that DMCA is a weak response to decentralized distribution. It still works against convenience. It does not work against determined replication. Once the code hit IPFS and derivative rewrites started shipping, the takedown fight was already strategically lost.&lt;/p&gt;

&lt;p&gt;The fifth conclusion is the one lawyers are going to spend years arguing about. AI-assisted clean-room builds change the economics of copyright enforcement. The doctrine was built for human teams, documentation walls, and long timelines. Agentic reimplementation collapses those assumptions. Courts can try to map old rules onto the new process. They cannot unmake the speed advantage.&lt;/p&gt;

&lt;p&gt;My take is blunt: Anthropic's worst mistake was not leaking code. It was failing to understand what kind of secret it was actually protecting. Implementation details matter. Operational ideas matter more. If you keep your roadmap executable inside a public artifact pipeline, a packaging mistake becomes strategic intelligence loss.&lt;/p&gt;

&lt;p&gt;And if your response is to spray DMCA notices while the ecosystem is actively digesting a nation-state npm compromise, you are not operating from strength. You are operating from panic.&lt;/p&gt;




&lt;h2&gt;
  
  
  Notes
&lt;/h2&gt;




&lt;ol&gt;

&lt;li id="fn1"&gt;
&lt;p&gt;Jeremy Kahn, "Anthropic source code for Claude Code leaked after data packaging error," &lt;em&gt;Fortune&lt;/em&gt;, March 31, 2026, &lt;a href="https://fortune.com/2026/03/31/anthropic-source-code-claude-code-data-leak" rel="noopener noreferrer"&gt;https://fortune.com/2026/03/31/anthropic-source-code-claude-code-data-leak&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn2"&gt;
&lt;p&gt;Ravie Lakshmanan, "Claude Code Source Leaked via npm Packaging Error, Anthropic Confirms," &lt;em&gt;The Hacker News&lt;/em&gt;, April 2026, &lt;a href="https://thehackernews.com/2026/04/claude-code-tleaked-via-npm-packaging.html" rel="noopener noreferrer"&gt;https://thehackernews.com/2026/04/claude-code-tleaked-via-npm-packaging.html&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn3"&gt;
&lt;p&gt;Maxwell Zeff, "Anthropic took down thousands of GitHub repos in DMCA mistake, then walked it back," &lt;em&gt;TechCrunch&lt;/em&gt;, April 1, 2026, &lt;a href="https://techcrunch.com/2026/04/01/anthropic-took-down-thousands-of-github-repos" rel="noopener noreferrer"&gt;https://techcrunch.com/2026/04/01/anthropic-took-down-thousands-of-github-repos&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn4"&gt;
&lt;p&gt;Austin Larsen et al., "North Korea-Nexus Threat Actor Compromises Widely Used Axios NPM Package in Supply Chain Attack," &lt;em&gt;Google Cloud Blog&lt;/em&gt;, April 1, 2026, &lt;a href="https://cloud.google.com/blog/topics/threat-intelligence/north-korea-threat-actor-targets-axios-npm-package" rel="noopener noreferrer"&gt;https://cloud.google.com/blog/topics/threat-intelligence/north-korea-threat-actor-targets-axios-npm-package&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn5"&gt;
&lt;p&gt;Microsoft Threat Intelligence, "Mitigating the Axios npm package compromise," &lt;em&gt;Microsoft Security Blog&lt;/em&gt;, April 1, 2026, &lt;a href="https://www.microsoft.com/en-us/security/blog/2026/04/01/mitigating-the-axios" rel="noopener noreferrer"&gt;https://www.microsoft.com/en-us/security/blog/2026/04/01/mitigating-the-axios&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn6"&gt;
&lt;p&gt;"Diving into Claude Code's Source Code Leak," &lt;em&gt;Engineer's Codex&lt;/em&gt;, March 31, 2026, &lt;a href="https://read.engineerscodex.com/p/diving-into-claude-codes-source-code" rel="noopener noreferrer"&gt;https://read.engineerscodex.com/p/diving-into-claude-codes-source-code&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn7"&gt;
&lt;p&gt;Srinivasan Balakrishnan, "Claude Code's source code appears to have leaked via npm package sourcemap," &lt;em&gt;VentureBeat&lt;/em&gt;, March 31, 2026, &lt;a href="https://venturebeat.com/technology/claude-codes-source-code-appears-to-have-leaked" rel="noopener noreferrer"&gt;https://venturebeat.com/technology/claude-codes-source-code-appears-to-have-leaked&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn8"&gt;
&lt;p&gt;"Bun's frontend development server: Source map incorrectly served when in production," GitHub issue oven-sh/bun#28001, filed March 11, 2026, &lt;a href="https://github.com/oven-sh/bun/issues/28001" rel="noopener noreferrer"&gt;https://github.com/oven-sh/bun/issues/28001&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn9"&gt;
&lt;p&gt;Alex Kim, "The Claude Code Source Leak: fake tools, frustration regexes, undercover mode, and more," March 31, 2026, &lt;a href="https://alex000kim.com/posts/2026-03-31-claude-code-source-leak" rel="noopener noreferrer"&gt;https://alex000kim.com/posts/2026-03-31-claude-code-source-leak&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn10"&gt;
&lt;p&gt;Hugh Langley, "Claude Code leak reveals features, sparks clone wars, and raises legal questions," &lt;em&gt;Business Insider&lt;/em&gt;, April 2026, &lt;a href="https://www.businessinsider.com/claude-code-leak-what-happened-recreated-python-features-revealed-2026-4" rel="noopener noreferrer"&gt;https://www.businessinsider.com/claude-code-leak-what-happened-recreated-python-features-revealed-2026-4&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn11"&gt;
&lt;p&gt;Lee Sustar, "The Claude Code source leak," &lt;em&gt;Layer5 Engineering Blog&lt;/em&gt;, 2026, &lt;a href="https://layer5.io/blog/engineering/the-claude-code-source-leak" rel="noopener noreferrer"&gt;https://layer5.io/blog/engineering/the-claude-code-source-leak&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn12"&gt;
&lt;p&gt;Alex Kim, "The Claude Code Source Leak: fake tools, frustration regexes, undercover mode, and more," March 31, 2026, &lt;a href="https://alex000kim.com/posts/2026-03-31-claude-code-source-leak" rel="noopener noreferrer"&gt;https://alex000kim.com/posts/2026-03-31-claude-code-source-leak&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn13"&gt;
&lt;p&gt;Rahat Hasan (@Rahatcodes) and Boris Cherny (&lt;a class="mentioned-user" href="https://dev.to/bcherny"&gt;@bcherny&lt;/a&gt;), posts on X discussing Claude Code frustration analytics and the internal "fucks" chart, March 31, 2026. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn14"&gt;
&lt;p&gt;Michael Kan, "Anthropic Issues 8,000 Copyright Takedowns, Then Reverses Course," &lt;em&gt;PCMag&lt;/em&gt;, April 1, 2026, &lt;a href="https://www.pcmag.com/news/anthropic-issues-8000-copyright-takedowns" rel="noopener noreferrer"&gt;https://www.pcmag.com/news/anthropic-issues-8000-copyright-takedowns&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn15"&gt;
&lt;p&gt;Theo Browne (&lt;a class="mentioned-user" href="https://dev.to/theo"&gt;@theo&lt;/a&gt;), post on X regarding DMCA takedown of t3dotgg/claude-code fork, April 1, 2026, &lt;a href="https://x.com/theo/status/2039411851919057339" rel="noopener noreferrer"&gt;https://x.com/theo/status/2039411851919057339&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn16"&gt;
&lt;p&gt;Thariq Shihipar (@trq212), reply to Theo Browne on X, April 1, 2026, &lt;a href="https://x.com/trq212/status/2039415036645679167" rel="noopener noreferrer"&gt;https://x.com/trq212/status/2039415036645679167&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn17"&gt;
&lt;p&gt;Boris Cherny (&lt;a class="mentioned-user" href="https://dev.to/bcherny"&gt;@bcherny&lt;/a&gt;), response to broader DMCA criticism on X, April 1, 2026, &lt;a href="https://x.com/bcherny/status/2039426466094731289" rel="noopener noreferrer"&gt;https://x.com/bcherny/status/2039426466094731289&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn18"&gt;
&lt;p&gt;"Claude Code leak spawns fastest-growing GitHub repo ever," &lt;em&gt;Cybernews&lt;/em&gt;, April 2026, &lt;a href="https://cybernews.com/tech/claude-code-leak-spawns-fastest-github-repo" rel="noopener noreferrer"&gt;https://cybernews.com/tech/claude-code-leak-spawns-fastest-github-repo&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn19"&gt;
&lt;p&gt;Thomas Claburn, "Axios npm backdoor RAT lands amid wider package security chaos," &lt;em&gt;The Register&lt;/em&gt;, March 31, 2026, &lt;a href="https://www.theregister.com/2026/03/31/axios_npm_backdoor_rat/" rel="noopener noreferrer"&gt;https://www.theregister.com/2026/03/31/axios_npm_backdoor_rat/&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;/ol&gt;

</description>
      <category>ai</category>
      <category>cybersecurity</category>
      <category>npm</category>
      <category>security</category>
    </item>
    <item>
      <title>Claude Mythos: What We Actually Know (and What We Don't)</title>
      <dc:creator>Solomon Neas</dc:creator>
      <pubDate>Tue, 31 Mar 2026 01:09:36 +0000</pubDate>
      <link>https://dev.to/solomonneas/claude-mythos-what-we-actually-know-and-what-we-dont-32k5</link>
      <guid>https://dev.to/solomonneas/claude-mythos-what-we-actually-know-and-what-we-dont-32k5</guid>
      <description>&lt;p&gt;On March 26, 2026, Fortune broke a story that Anthropic had accidentally exposed details of an unreleased AI model through a misconfigured content management system.&lt;sup id="fnref1"&gt;1&lt;/sup&gt; The model is called Claude Mythos. Anthropic confirmed it exists, called it a "step change" in capabilities, and said it's already being tested by early access customers.&lt;sup id="fnref2"&gt;2&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;Within 48 hours, the cybersecurity sector shed billions in market cap, unverified claims about 10 trillion parameters were circulating on social media, and half the AI commentary space was writing eulogies for the cybersecurity industry. The actual verified reporting tells a more interesting story than the hype cycle.&lt;/p&gt;

&lt;h2&gt;
  
  
  How the Leak Happened
&lt;/h2&gt;

&lt;p&gt;The leak came from Anthropic's own content management system. Digital assets uploaded through the CMS were set to public by default unless someone explicitly changed the setting to private. Nearly 3,000 assets that had never been published to Anthropic's website were sitting in a publicly searchable data store, accessible to anyone with the technical knowledge to query it.&lt;sup id="fnref1"&gt;1&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;The discovery was made independently by Roy Paz, a senior AI security researcher at LayerX Security, and Alexandre Pauwels, a cybersecurity researcher at the University of Cambridge. Fortune confirmed and reviewed the material before publishing.&lt;sup id="fnref1"&gt;1&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;Among the exposed documents was a draft blog post announcing Claude Mythos. The post was structured as a web page with headings and a publication date, suggesting it was part of a planned product launch. After Fortune contacted Anthropic, the company secured the data store and acknowledged the leak was caused by "human error in the CMS configuration."&lt;sup id="fnref3"&gt;3&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;There's real irony here. A company that builds AI models it says pose "unprecedented cybersecurity risks" got caught leaving internal documents in an unsecured public data store because someone forgot to flip a toggle. Anthropic was quick to clarify that Claude, Cowork, and their other AI tools were not involved in the configuration error.&lt;sup id="fnref3"&gt;3&lt;/sup&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What the Draft Blog Post Actually Says
&lt;/h2&gt;

&lt;p&gt;The leaked draft describes Claude Mythos as "by far the most powerful AI model we've ever developed." It introduces a new tier of Claude models called Capybara, positioned above Opus in Anthropic's existing lineup of Opus, Sonnet, and Haiku.&lt;sup id="fnref2"&gt;2&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;Two versions of the draft blog post surfaced online, identical except for the model name: one uses "Mythos," the other uses "Capybara." Both versions use the same subtitle ("We have finished training a new AI model: Claude Mythos") and the same justification for the name, saying it was chosen to evoke "the deep connective tissue that links together knowledge and ideas."&lt;sup id="fnref4"&gt;4&lt;/sup&gt; Anthropic appears to have been deciding between the two names.&lt;/p&gt;

&lt;p&gt;The key claims from the draft:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Training is complete. The model is being trialed by early access customers.&lt;/li&gt;
&lt;li&gt;Performance is described as "dramatically higher scores on tests of software coding, academic reasoning, and cybersecurity, among others" compared to Claude Opus 4.6.&lt;sup id="fnref2"&gt;2&lt;/sup&gt;
&lt;/li&gt;
&lt;li&gt;The model is "very expensive for us to serve, and will be very expensive for our customers to use." Anthropic says it needs to make it "much more efficient before any general release."&lt;sup id="fnref4"&gt;4&lt;/sup&gt;
&lt;/li&gt;
&lt;li&gt;The rollout will start with cybersecurity-focused early access customers, with API access expanding gradually.&lt;sup id="fnref4"&gt;4&lt;/sup&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Anthropic's official statement to Fortune: "We're developing a general purpose model with meaningful advances in reasoning, coding, and cybersecurity. Given the strength of its capabilities, we're being deliberate about how we release it. As is standard practice across the industry, we're working with a small group of early access customers to test the model. We consider this model a step change and the most capable we've built to date."&lt;sup id="fnref2"&gt;2&lt;/sup&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Cybersecurity Angle
&lt;/h2&gt;

&lt;p&gt;This is where the story gets interesting for anyone in security.&lt;/p&gt;

&lt;p&gt;The leaked draft describes Mythos as "currently far ahead of any other AI model in cyber capabilities." It warns that the model "presages an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders."&lt;sup id="fnref2"&gt;2&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;Anthropic's plan, according to the draft, is to give defenders a head start: "We're releasing it in early access to organizations, giving them a head start in improving the robustness of their codebases against the impending wave of AI-driven exploits."&lt;sup id="fnref2"&gt;2&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;This is not coming out of nowhere. Anthropic has been building toward this capability in public for months.&lt;/p&gt;

&lt;p&gt;In February 2026, Anthropic launched Claude Code Security, a tool built into Claude Code that scans codebases for vulnerabilities and suggests patches for human review. Unlike traditional static analysis tools that match code against known vulnerability patterns, Claude Code Security reads and reasons about code the way a human security researcher would, tracing data flows and catching complex flaws that rule-based tools miss.&lt;sup id="fnref5"&gt;5&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;The same month, Anthropic's Frontier Red Team published research showing that Claude Opus 4.6 had found over 500 high-severity vulnerabilities in production open-source codebases. These were bugs that had gone undetected for decades despite years of expert review and millions of hours of automated fuzzing.&lt;sup id="fnref6"&gt;6&lt;/sup&gt; One example: Claude found a memory corruption vulnerability in GhostScript by reading the Git commit history, identifying a security-relevant patch, and then finding an unpatched code path where the same class of bug still existed.&lt;sup id="fnref6"&gt;6&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;And in November 2025, Anthropic disclosed that it had detected and disrupted what it called the first documented large-scale AI-orchestrated cyber espionage campaign. A Chinese state-sponsored group had manipulated Claude Code into attempting infiltration of roughly thirty global targets, including tech companies, financial institutions, and government agencies. The AI performed 80 to 90 percent of the campaign, with human intervention required at only four to six critical decision points.&lt;sup id="fnref7"&gt;7&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;If Mythos represents a real improvement over Opus 4.6 in these capabilities, both sides of the security equation just shifted.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Market Reaction
&lt;/h2&gt;

&lt;p&gt;Cybersecurity stocks took a beating on Friday, March 27. The iShares Cybersecurity ETF dropped 4.5%. CrowdStrike, Palo Alto Networks, and Zscaler each fell about 6%. SentinelOne dropped 6%. Okta and Netskope fell over 7%. Tenable plummeted 9%.&lt;sup id="fnref8"&gt;8&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;This wasn't the first time. Cybersecurity stocks had already dipped in February when Anthropic launched Claude Code Security.&lt;sup id="fnref8"&gt;8&lt;/sup&gt; The sector has been under persistent pressure from the growing narrative that AI models will automate vulnerability discovery faster than security vendors can keep up.&lt;/p&gt;

&lt;p&gt;Some analysts pushed back on the panic. Borg at a financial research firm argued that the news should actually be bullish for cybersecurity spend long-term, since companies will need to accelerate adoption of AI-infused security tools to respond to AI-powered attacks at machine speed.&lt;sup id="fnref9"&gt;9&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;The short-term read from the market was simpler: if an AI model can find exploits faster than your security product, your security product has a problem.&lt;/p&gt;

&lt;h2&gt;
  
  
  What We Don't Know
&lt;/h2&gt;

&lt;p&gt;Here's where the verified reporting ends and the speculation begins.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Parameter count:&lt;/strong&gt; Claims of 10 trillion parameters have been circulating widely. This number does not appear in Fortune's reporting or in Anthropic's official statements. As one Substack writer noted, "new and unconfirmed details, ten trillion parameters, ten billion dollars to train, were circulating alongside claims that came directly from Fortune's reporting" within hours of the original story.&lt;sup id="fnref10"&gt;10&lt;/sup&gt; Treat it as unverified.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Benchmarks:&lt;/strong&gt; We have Anthropic's claim of "dramatically higher scores" over Opus 4.6 in coding, reasoning, and cybersecurity. We do not have specific benchmark numbers, leaderboard rankings, or third-party evaluations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Release timeline:&lt;/strong&gt; The draft describes early access testing and acknowledges the model needs to become more efficient before general release. No date has been given.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Final naming:&lt;/strong&gt; Whether the model ships as Mythos, Capybara, or something else entirely appears to be undecided, judging by the two draft versions.&lt;sup id="fnref4"&gt;4&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Whether the hype is warranted:&lt;/strong&gt; Remember GPT-5. OpenAI made similar claims about a transformative leap in capabilities. The actual release was widely considered a disappointment that fell short of the company's promises.&lt;sup id="fnref11"&gt;11&lt;/sup&gt; Frontier model announcements and frontier model performance in the real world are not the same thing.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Competitive Context
&lt;/h2&gt;

&lt;p&gt;Anthropic is not the only company pushing the frontier right now. OpenAI has reportedly finished pretraining a new model codenamed "Spud," with CEO Sam Altman internally claiming it can "really accelerate the economy."&lt;sup id="fnref12"&gt;12&lt;/sup&gt; Both companies are expected to time their strongest model releases ahead of planned IPOs later this year.&lt;sup id="fnref4"&gt;4&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;The pattern is familiar. Each lab announces a step change, each release comes with safety caveats and controlled rollouts, and each time the actual impact takes months to assess once the models are in the hands of real users.&lt;/p&gt;

&lt;p&gt;What makes the Mythos leak different from the usual release cycle is the cybersecurity dimension. Anthropic did not choose to reveal this model. And the information that leaked was not marketing copy optimized for launch day. It was internal planning documents that included frank assessments of the risks.&lt;/p&gt;

&lt;h2&gt;
  
  
  What This Means for Defenders
&lt;/h2&gt;

&lt;p&gt;If you work in cybersecurity, the takeaway is not "AI is going to replace your job." It's "the tools available to attackers just got substantially better, and the tools available to you did too."&lt;/p&gt;

&lt;p&gt;Anthropic's decision to give cybersecurity organizations early access to Mythos is a signal about where they think the model's impact will be felt first. Their own recent track record supports that: 500+ zero-days found by Opus 4.6 in hardened open-source projects, a documented state-sponsored campaign disrupted, and a vulnerability scanning tool that reasons about code instead of pattern-matching against known signatures.&lt;/p&gt;

&lt;p&gt;The models are getting better at finding bugs. The question is whether defenders adopt these capabilities faster than attackers do. That's always been the question in security. The tools just got a lot more powerful on both sides.&lt;/p&gt;

&lt;h2&gt;
  
  
  Notes
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://solomonneas.dev/blog/claude-mythos-anthropic-leak" rel="noopener noreferrer"&gt;solomonneas.dev&lt;/a&gt;. Follow me for more security engineering and AI infrastructure content.&lt;/em&gt;&lt;/p&gt;




&lt;ol&gt;

&lt;li id="fn1"&gt;
&lt;p&gt;Jeremy Kahn, "Exclusive: Anthropic Left Details of Unreleased AI Model, Exclusive CEO Event, in Unsecured Database," &lt;em&gt;Fortune&lt;/em&gt;, March 26, 2026, &lt;a href="https://fortune.com/2026/03/26/anthropic-leaked-unreleased-model-exclusive-event-security-issues-cybersecurity-unsecured-data-store/" rel="noopener noreferrer"&gt;https://fortune.com/2026/03/26/anthropic-leaked-unreleased-model-exclusive-event-security-issues-cybersecurity-unsecured-data-store/&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn2"&gt;
&lt;p&gt;Jeremy Kahn, "Exclusive: Anthropic 'Mythos' AI Model Representing 'Step Change' in Power Revealed in Data Leak," &lt;em&gt;Fortune&lt;/em&gt;, March 26, 2026, &lt;a href="https://fortune.com/2026/03/26/anthropic-says-testing-mythos-powerful-new-ai-model-after-data-leak-reveals-its-existence-step-change-in-capabilities/" rel="noopener noreferrer"&gt;https://fortune.com/2026/03/26/anthropic-says-testing-mythos-powerful-new-ai-model-after-data-leak-reveals-its-existence-step-change-in-capabilities/&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn3"&gt;
&lt;p&gt;Kahn, "Exclusive: Anthropic Left Details." ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn4"&gt;
&lt;p&gt;Matthias Bastian, "Anthropic Leak Reveals New Model 'Claude Mythos' with 'Dramatically Higher Scores on Tests' Than Any Previous Model," &lt;em&gt;The Decoder&lt;/em&gt;, March 27, 2026, &lt;a href="https://the-decoder.com/anthropic-leak-reveals-new-model-claude-mythos-with-dramatically-higher-scores-on-tests-than-any-previous-model/" rel="noopener noreferrer"&gt;https://the-decoder.com/anthropic-leak-reveals-new-model-claude-mythos-with-dramatically-higher-scores-on-tests-than-any-previous-model/&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn5"&gt;
&lt;p&gt;Anthropic, "Making Frontier Cybersecurity Capabilities Available to Defenders," Anthropic News, February 20, 2026, &lt;a href="https://www.anthropic.com/news/claude-code-security" rel="noopener noreferrer"&gt;https://www.anthropic.com/news/claude-code-security&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn6"&gt;
&lt;p&gt;Nicholas Carlini et al., "0-Days," Anthropic Frontier Red Team, February 5, 2026, &lt;a href="https://red.anthropic.com/2026/zero-days/" rel="noopener noreferrer"&gt;https://red.anthropic.com/2026/zero-days/&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn7"&gt;
&lt;p&gt;Anthropic, "Disrupting the First Reported AI-Orchestrated Cyber Espionage Campaign," Anthropic News, November 2025, &lt;a href="https://www.anthropic.com/news/disrupting-AI-espionage" rel="noopener noreferrer"&gt;https://www.anthropic.com/news/disrupting-AI-espionage&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn8"&gt;
&lt;p&gt;Jordan Novet, "Cybersecurity Stocks Fall on Report Anthropic Is Testing a Powerful New Model," &lt;em&gt;CNBC&lt;/em&gt;, March 27, 2026, &lt;a href="https://www.cnbc.com/2026/03/27/anthropic-cybersecurity-stocks-ai-mythos.html" rel="noopener noreferrer"&gt;https://www.cnbc.com/2026/03/27/anthropic-cybersecurity-stocks-ai-mythos.html&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn9"&gt;
&lt;p&gt;"Cybersecurity Stocks Plunge as Anthropic's 'Claude Mythos' Leak Sparks AI Fear," &lt;em&gt;Investing.com&lt;/em&gt;, March 28, 2026, &lt;a href="https://www.investing.com/news/stock-market-news/cybersecurity-stocks-plunge-as-anthropics-claude-mythos-leak-sparks-ai-fear-4584897" rel="noopener noreferrer"&gt;https://www.investing.com/news/stock-market-news/cybersecurity-stocks-plunge-as-anthropics-claude-mythos-leak-sparks-ai-fear-4584897&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn10"&gt;
&lt;p&gt;Stephen Fitzpatrick, "What the Heck Is Mythos?" Substack, March 30, 2026, &lt;a href="https://fitzyhistory.substack.com/p/what-the-heck-is-mythos" rel="noopener noreferrer"&gt;https://fitzyhistory.substack.com/p/what-the-heck-is-mythos&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn11"&gt;
&lt;p&gt;"GPT-5 Turned Out to Be a Major Letdown," &lt;em&gt;Futurism&lt;/em&gt;, accessed March 30, 2026, &lt;a href="https://futurism.com/gpt-5-disaster" rel="noopener noreferrer"&gt;https://futurism.com/gpt-5-disaster&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;li id="fn12"&gt;
&lt;p&gt;"OpenAI CEO Sam Altman Reportedly Teases a 'Very Strong' Model Internally That Can 'Really Accelerate the Economy,'" &lt;em&gt;The Decoder&lt;/em&gt;, accessed March 30, 2026, &lt;a href="https://the-decoder.com/openai-ceo-sam-altman-reportedly-teases-a-very-strong-model-internally-that-can-really-accelerate-the-economy/" rel="noopener noreferrer"&gt;https://the-decoder.com/openai-ceo-sam-altman-reportedly-teases-a-very-strong-model-internally-that-can-really-accelerate-the-economy/&lt;/a&gt;. ↩&lt;/p&gt;
&lt;/li&gt;

&lt;/ol&gt;

</description>
      <category>ai</category>
      <category>anthropic</category>
      <category>cybersecurity</category>
      <category>security</category>
    </item>
    <item>
      <title>Replacing SCCM with FOG Project</title>
      <dc:creator>Solomon Neas</dc:creator>
      <pubDate>Sun, 29 Mar 2026 23:33:08 +0000</pubDate>
      <link>https://dev.to/solomonneas/replacing-sccm-with-fog-project-daf</link>
      <guid>https://dev.to/solomonneas/replacing-sccm-with-fog-project-daf</guid>
      <description>&lt;h1&gt;
  
  
  Replacing SCCM with FOG Project
&lt;/h1&gt;

&lt;p&gt;When I moved our infrastructure from Hyper-V to Proxmox, I also took the chance to rip out one of the heaviest pieces of the old stack: SCCM.&lt;/p&gt;

&lt;p&gt;For an enterprise with thousands of endpoints and a full Microsoft licensing budget, SCCM can make sense. For four instructional labs with 72 workstations, it was too much. It wanted Windows Server, SQL Server, licensing baggage, and constant babysitting just to do the thing I actually needed: boot a machine over the network, lay down a clean image, put it back in the domain, and get out of the way.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/FOGProject/fogproject" rel="noopener noreferrer"&gt;FOG Project&lt;/a&gt; was the answer.&lt;/p&gt;

&lt;p&gt;This post is the technical deep-dive for the imaging side of the migration. For the full Hyper-V to Proxmox migration story, see &lt;a href="https://solomonneas.dev/blog/hyperv-to-proxmox-migration-guide" rel="noopener noreferrer"&gt;the migration deep-dive&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The environment I was replacing
&lt;/h2&gt;

&lt;p&gt;The target was straightforward on paper:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;4 classrooms&lt;/li&gt;
&lt;li&gt;72 total workstations&lt;/li&gt;
&lt;li&gt;3 hardware-specific Windows 11 images&lt;/li&gt;
&lt;li&gt;47 machines with known MAC addresses at import time&lt;/li&gt;
&lt;li&gt;25 machines that would need to self-register on first PXE boot&lt;/li&gt;
&lt;li&gt;Active Directory domain join after imaging&lt;/li&gt;
&lt;li&gt;Room-specific OU placement&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The architecture ended up looking like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;FOG server:&lt;/strong&gt; Debian 13 (Trixie) LXC on Proxmox, &lt;code&gt;10.0.1.20&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Domain controllers / DHCP:&lt;/strong&gt; Windows Server, &lt;code&gt;10.0.1.10&lt;/code&gt; and &lt;code&gt;10.0.1.11&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Domain:&lt;/strong&gt; &lt;code&gt;LAB.LOCAL&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Join account:&lt;/strong&gt; &lt;code&gt;svc-domainjoin@LAB.LOCAL&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rooms:&lt;/strong&gt; &lt;code&gt;Room-A&lt;/code&gt;, &lt;code&gt;Room-B&lt;/code&gt;, &lt;code&gt;Room-C&lt;/code&gt;, &lt;code&gt;Room-D&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Images:&lt;/strong&gt; &lt;code&gt;Win11-Lab-G4&lt;/code&gt;, &lt;code&gt;Win11-Lab-Ultrawide&lt;/code&gt;, &lt;code&gt;Win11-Lab-G9&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I did &lt;strong&gt;not&lt;/strong&gt; build images in Proxmox VMs. I built them on &lt;strong&gt;reference machines that matched the real lab hardware&lt;/strong&gt;. The rooms were not identical. One had HP G4s, one had HP G9s, and two rooms had Dell FCT2250s paired with Dell UltraSharp 49" curved monitors (U4924DW/U4919DW), which needed their own driver set.&lt;/p&gt;

&lt;p&gt;Three golden images, three hardware profiles:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Room&lt;/th&gt;
&lt;th&gt;Image&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Room-A&lt;/td&gt;
&lt;td&gt;Win11-Lab-G4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Room-B&lt;/td&gt;
&lt;td&gt;Win11-Lab-Ultrawide&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Room-C&lt;/td&gt;
&lt;td&gt;Win11-Lab-Ultrawide&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Room-D&lt;/td&gt;
&lt;td&gt;Win11-Lab-G9&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Active Directory prep
&lt;/h2&gt;

&lt;p&gt;The Linux box gets the attention in a FOG write-up, but the boring Windows prep is what made the deployment clean.&lt;/p&gt;

&lt;p&gt;I created a least-privilege service account called &lt;code&gt;svc-domainjoin&lt;/code&gt; and delegated only what FOG needed on the classroom OUs: create computer objects, delete computer objects, and full control on descendant computer objects.&lt;/p&gt;

&lt;p&gt;The OU layout was one OU per room. Then I delegated permissions with &lt;code&gt;dsacls&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="n"&gt;dsacls&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"OU=Room-A,DC=LAB,DC=LOCAL"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;/I:T&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;/G&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"LAB\svc-domainjoin:CC;computer"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;dsacls&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"OU=Room-A,DC=LAB,DC=LOCAL"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;/I:T&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;/G&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"LAB\svc-domainjoin:DC;computer"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;dsacls&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"OU=Room-A,DC=LAB,DC=LOCAL"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;/I:S&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;/G&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"LAB\svc-domainjoin:GA;;computer"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same pattern on all four classroom OUs.&lt;/p&gt;

&lt;p&gt;Before touching imaging at all, I cleaned house in AD: moved 38 misplaced computer objects into the right OUs, deleted 60 stale ones from old naming schemes, unlinked the SCCM GPO from the room OUs, and exported a CSV of all 72 lab machines.&lt;/p&gt;

&lt;p&gt;Then I set DHCP options 66 and 67 on the classroom scopes to point to FOG:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Option 66:&lt;/strong&gt; &lt;code&gt;10.0.1.20&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Option 67:&lt;/strong&gt; initially &lt;code&gt;ipxe.efi&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Should have been enough. It was not.&lt;/p&gt;

&lt;h2&gt;
  
  
  The SCCM ghost in DHCP
&lt;/h2&gt;

&lt;p&gt;PXE still misbehaved on some scopes even with the scope-level options pointing to FOG.&lt;/p&gt;

&lt;p&gt;The culprit was stale SCCM and WDS &lt;strong&gt;policies&lt;/strong&gt; still attached to several scopes. DHCP policies take priority over scope options. Clients matching the PXE vendor class were quietly getting sent to the retired SCCM server at &lt;code&gt;10.0.1.14&lt;/code&gt; instead of FOG at &lt;code&gt;10.0.1.20&lt;/code&gt;. The old policies referenced &lt;code&gt;smsboot\x64\wdsmgfw.efi&lt;/code&gt; and &lt;code&gt;smsboot\x64\wdsnbp.com&lt;/code&gt; with the old boot server IP.&lt;/p&gt;

&lt;p&gt;Eleven of those policies, spread across five scopes. Once I removed them all, PXE stopped getting hijacked by dead infrastructure.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; If PXE settings look right but clients keep booting somewhere else, check DHCP policies before you blame FOG.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  FOG on Debian Trixie: three bugs in a trenchcoat
&lt;/h2&gt;

&lt;p&gt;The FOG server was a Debian 13 LXC container on Proxmox. When I first looked at it, the web UI was running, which made it look installed.&lt;/p&gt;

&lt;p&gt;It was not. The backend was missing entirely. No TFTP boot files. No NFS exports. No FOG services. The web UI was a shell with nothing behind it.&lt;/p&gt;

&lt;p&gt;Re-ran the installer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; /tmp/fogproject/bin
bash installfog.sh &lt;span class="nt"&gt;-y&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Trixie had other plans.&lt;/p&gt;

&lt;h3&gt;
  
  
  Bug 1: wrong &lt;code&gt;osid&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;FOG had the wrong OS ID in &lt;code&gt;.fogsettings&lt;/code&gt;: it had detected the machine as Arch Linux instead of Debian. The fix was to change this line in &lt;code&gt;/opt/fog/.fogsettings&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;osid&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'3'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;to:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;osid&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'2'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Without that, every package operation targeted the wrong distro.&lt;/p&gt;
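
&lt;p&gt;A quick way to catch this before re-running the installer is to compare the stored &lt;code&gt;osid&lt;/code&gt; with what the OS itself reports. This is a minimal sketch, not part of FOG: &lt;code&gt;expected_osid&lt;/code&gt;, &lt;code&gt;read_osid&lt;/code&gt;, and &lt;code&gt;check_osid&lt;/code&gt; are hypothetical helpers, and the only mapping it assumes is the one from the fix above (2 for Debian, 3 for Arch).&lt;/p&gt;

```shell
# Hypothetical helpers, not part of the FOG installer.

# Map an os-release ID to the osid value seen in .fogsettings
# (2 = Debian, 3 = Arch, as in the fix above). Other distros are left out.
expected_osid() {
    case "$1" in
        debian) echo 2 ;;
        arch) echo 3 ;;
        *) echo unknown ;;
    esac
}

# Read the osid value from a settings file containing a line like osid='3'.
read_osid() {
    grep '^osid=' "$1" | head -n 1 | cut -d= -f2 | tr -d "'\""
}

# Print "ok" or "mismatch" for a settings file and an os-release ID.
check_osid() {
    if [ "$(read_osid "$1")" = "$(expected_osid "$2")" ]; then
        echo ok
    else
        echo mismatch
    fi
}
```

&lt;p&gt;On the server this could be called as &lt;code&gt;. /etc/os-release; check_osid /opt/fog/.fogsettings "$ID"&lt;/code&gt;, which should print &lt;code&gt;mismatch&lt;/code&gt; on a Debian host while the file still says &lt;code&gt;osid='3'&lt;/code&gt;.&lt;/p&gt;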

&lt;h3&gt;
  
  
  Bug 2: &lt;code&gt;libcurl4&lt;/code&gt; renamed on Debian 13
&lt;/h3&gt;

&lt;p&gt;Debian 13's &lt;code&gt;t64&lt;/code&gt; transition renamed &lt;code&gt;libcurl4&lt;/code&gt; to &lt;code&gt;libcurl4t64&lt;/code&gt;. FOG's installer still asked for the old name. I patched the package check in &lt;code&gt;functions.sh&lt;/code&gt; to handle Debian &amp;gt;= 13.&lt;/p&gt;
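
&lt;p&gt;The shape of that patch is roughly the following. This is an illustrative sketch rather than the actual change to &lt;code&gt;functions.sh&lt;/code&gt;: &lt;code&gt;curl_pkg_for_debian&lt;/code&gt; is a hypothetical helper that takes a version string, such as the contents of &lt;code&gt;/etc/debian_version&lt;/code&gt;, and picks the package name.&lt;/p&gt;

```shell
# Hypothetical helper, not the real FOG code: pick the libcurl package
# name for a Debian version string such as "12.11" or "13.1".
curl_pkg_for_debian() {
    major="${1%%.*}"    # keep only the part before the first dot
    case "$major" in
        ''|*[!0-9]*)
            # Non-numeric strings (for example "trixie/sid") are not handled here.
            return 1
            ;;
    esac
    if [ "$major" -ge 13 ]; then
        echo libcurl4t64
    else
        echo libcurl4
    fi
}
```

&lt;p&gt;Keeping the decision in one function means the rest of the package list can stay untouched, and older Debian releases keep asking for &lt;code&gt;libcurl4&lt;/code&gt;.&lt;/p&gt;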

&lt;h3&gt;
  
  
  Bug 3: &lt;code&gt;lastlog&lt;/code&gt; became &lt;code&gt;lastlog2&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;FOG's installer checks &lt;code&gt;lastlog&lt;/code&gt; to verify user creation. Trixie dropped it for &lt;code&gt;lastlog2&lt;/code&gt;. Quick fix:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;apt-get &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; lastlog2
&lt;span class="nb"&gt;ln&lt;/span&gt; &lt;span class="nt"&gt;-sf&lt;/span&gt; /usr/bin/lastlog2 /usr/local/bin/lastlog
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once I patched all three, the installer completed and the backend finally matched the web UI.&lt;/p&gt;

&lt;h2&gt;
  
  
  NFS inside LXC: the container wall
&lt;/h2&gt;

&lt;p&gt;With the installer patched, the next wall was image storage. FOG uses NFS for capture and deployment. Inside a Proxmox LXC container, kernel NFS does not cooperate:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mount: /proc/fs/nfsd: permission denied
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I tried the usual Proxmox container feature flags:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pct &lt;span class="nb"&gt;set&lt;/span&gt; &amp;lt;CTID&amp;gt; &lt;span class="nt"&gt;-features&lt;/span&gt; &lt;span class="nv"&gt;nesting&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1,nfs&lt;span class="o"&gt;=&lt;/span&gt;1
pct reboot &amp;lt;CTID&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Better, but still not a reliable kernel NFS server. I stopped fighting it and went userspace: &lt;code&gt;nfs-ganesha&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;apt &lt;span class="nb"&gt;install &lt;/span&gt;nfs-ganesha nfs-ganesha-vfs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One annoying gotcha: nested pseudo paths did not work in my setup. With &lt;code&gt;/images&lt;/code&gt; and &lt;code&gt;/images/dev&lt;/code&gt; both defined as pseudo paths, lookups of the child export broke, so I flattened the namespace and exposed the second export as &lt;code&gt;/imagesDev&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The working &lt;code&gt;/etc/ganesha/ganesha.conf&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight conf"&gt;&lt;code&gt;&lt;span class="n"&gt;NFS_CORE_PARAM&lt;/span&gt; {
    &lt;span class="n"&gt;Protocols&lt;/span&gt; = &lt;span class="m"&gt;3&lt;/span&gt;,&lt;span class="m"&gt;4&lt;/span&gt;;
    &lt;span class="n"&gt;mount_path_pseudo&lt;/span&gt; = &lt;span class="n"&gt;true&lt;/span&gt;;
    &lt;span class="n"&gt;allow_set_io_flusher_fail&lt;/span&gt; = &lt;span class="n"&gt;true&lt;/span&gt;;
}

&lt;span class="n"&gt;EXPORT&lt;/span&gt; {
    &lt;span class="n"&gt;Export_Id&lt;/span&gt; = &lt;span class="m"&gt;1&lt;/span&gt;;
    &lt;span class="n"&gt;Path&lt;/span&gt; = /&lt;span class="n"&gt;images&lt;/span&gt;;
    &lt;span class="n"&gt;Pseudo&lt;/span&gt; = /&lt;span class="n"&gt;images&lt;/span&gt;;
    &lt;span class="n"&gt;Protocols&lt;/span&gt; = &lt;span class="m"&gt;3&lt;/span&gt;,&lt;span class="m"&gt;4&lt;/span&gt;;
    &lt;span class="n"&gt;Access_Type&lt;/span&gt; = &lt;span class="n"&gt;RO&lt;/span&gt;;
    &lt;span class="n"&gt;Squash&lt;/span&gt; = &lt;span class="n"&gt;no_root_squash&lt;/span&gt;;
    &lt;span class="n"&gt;FSAL&lt;/span&gt; { &lt;span class="n"&gt;Name&lt;/span&gt; = &lt;span class="n"&gt;VFS&lt;/span&gt;; }
    &lt;span class="n"&gt;CLIENT&lt;/span&gt; { &lt;span class="n"&gt;Clients&lt;/span&gt; = *; &lt;span class="n"&gt;Access_Type&lt;/span&gt; = &lt;span class="n"&gt;RO&lt;/span&gt;; }
}

&lt;span class="n"&gt;EXPORT&lt;/span&gt; {
    &lt;span class="n"&gt;Export_Id&lt;/span&gt; = &lt;span class="m"&gt;2&lt;/span&gt;;
    &lt;span class="n"&gt;Path&lt;/span&gt; = /&lt;span class="n"&gt;images&lt;/span&gt;/&lt;span class="n"&gt;dev&lt;/span&gt;;
    &lt;span class="n"&gt;Pseudo&lt;/span&gt; = /&lt;span class="n"&gt;imagesDev&lt;/span&gt;;
    &lt;span class="n"&gt;Protocols&lt;/span&gt; = &lt;span class="m"&gt;3&lt;/span&gt;,&lt;span class="m"&gt;4&lt;/span&gt;;
    &lt;span class="n"&gt;Access_Type&lt;/span&gt; = &lt;span class="n"&gt;RW&lt;/span&gt;;
    &lt;span class="n"&gt;Squash&lt;/span&gt; = &lt;span class="n"&gt;no_root_squash&lt;/span&gt;;
    &lt;span class="n"&gt;FSAL&lt;/span&gt; { &lt;span class="n"&gt;Name&lt;/span&gt; = &lt;span class="n"&gt;VFS&lt;/span&gt;; }
    &lt;span class="n"&gt;CLIENT&lt;/span&gt; { &lt;span class="n"&gt;Clients&lt;/span&gt; = *; &lt;span class="n"&gt;Access_Type&lt;/span&gt; = &lt;span class="n"&gt;RW&lt;/span&gt;; }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The critical line for LXC: &lt;code&gt;allow_set_io_flusher_fail = true;&lt;/code&gt;. Without it, Ganesha dies with &lt;code&gt;EPERM&lt;/code&gt; on &lt;code&gt;PR_SET_IO_FLUSHER&lt;/code&gt;, a &lt;code&gt;prctl&lt;/code&gt; request the container is not allowed to make. With the flag set, Ganesha tolerates the failed call and keeps running.&lt;/p&gt;

&lt;p&gt;Verify with &lt;code&gt;showmount -e localhost&lt;/code&gt; and &lt;code&gt;rpcinfo -p localhost&lt;/code&gt;. If both look right, NFS is serving.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Kernel NFS in LXC was a dead end. Userspace NFS was the practical way through it.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The golden image pipeline
&lt;/h2&gt;

&lt;p&gt;With the server side stable, the real work was image prep. Same process every time:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;clean install Windows 11 on the reference machine&lt;/li&gt;
&lt;li&gt;install drivers, software, updates, and room-specific settings&lt;/li&gt;
&lt;li&gt;keep it off the domain&lt;/li&gt;
&lt;li&gt;install the FOG client if needed for post-deploy tasks&lt;/li&gt;
&lt;li&gt;clean it aggressively&lt;/li&gt;
&lt;li&gt;sysprep and shut down&lt;/li&gt;
&lt;li&gt;PXE boot and capture in FOG&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Step 5 became its own thing: a cleanup one-liner I kept copy-pasting between machines, with the sysprep call from step 6 tacked on the end. Ugly, but it works:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="n"&gt;Stop-Service&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;wuauserv&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;-Force&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Remove-Item&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;C:\Windows\SoftwareDistribution\&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;-Recurse&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;-Force&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;-EA&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;SilentlyContinue&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Start-Service&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;wuauserv&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Dism&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;/online&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;/Cleanup-Image&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;/StartComponentCleanup&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;/ResetBase&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Remove-Item&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;C:\Windows\Panther\&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;-Recurse&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;-Force&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;-EA&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;SilentlyContinue&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Remove-Item&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;C:\Windows\Temp\&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;-Recurse&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;-Force&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;-EA&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;SilentlyContinue&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Remove-Item&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;C:\Windows\Prefetch\&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;-Recurse&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;-Force&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;-EA&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;SilentlyContinue&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Remove-Item&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$&lt;/span&gt;&lt;span class="nn"&gt;env&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;TEMP&lt;/span&gt;&lt;span class="s2"&gt;\*"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;-Recurse&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;-Force&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;-EA&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;SilentlyContinue&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Clear-RecycleBin&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;-Force&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;-EA&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;SilentlyContinue&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;cleanmgr&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;/sagerun:1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;C:\Windows\System32\Sysprep\sysprep.exe&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;/oobe&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;/generalize&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;/shutdown&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I started calling it the Frankenstein command. It's stitched together from half a dozen different guides and does way too much in one line. But for repeatable image prep, it earned its place.&lt;/p&gt;

&lt;p&gt;Broken down:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;Stop-Service wuauserv -Force&lt;/code&gt; / &lt;code&gt;Remove-Item SoftwareDistribution&lt;/code&gt; / &lt;code&gt;Start-Service wuauserv&lt;/code&gt;: kill Windows Update, nuke the cache, bring it back.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Dism /online /Cleanup-Image /StartComponentCleanup /ResetBase&lt;/code&gt;: collapse superseded component versions. Frees real space.&lt;/li&gt;
&lt;li&gt;The four remaining &lt;code&gt;Remove-Item&lt;/code&gt; calls: purge Panther logs, system temp, Prefetch, and user temp.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Clear-RecycleBin&lt;/code&gt;: self-explanatory.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;cleanmgr /sagerun:1&lt;/code&gt;: Disk Cleanup with the option set saved beforehand via &lt;code&gt;cleanmgr /sageset:1&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;sysprep.exe /oobe /generalize /shutdown&lt;/code&gt;: strip machine identity, stage OOBE, power off.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; After sysprep shuts the machine down, do not let it boot back into Windows before capture. If it does, you just spent your sysprep state and get to do it again.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Sysprep on Windows 11: Appx package hell
&lt;/h2&gt;

&lt;p&gt;The worst part of this project was not FOG. It was Sysprep.&lt;/p&gt;

&lt;p&gt;If you've fought Windows 11 Sysprep before, you know the error. If you haven't, it looks like this in the Panther logs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Sysprep was not able to validate your Windows installation.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;More specifically, &lt;code&gt;setupact.log&lt;/code&gt; shows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Package &amp;lt;app&amp;gt; was installed for a user, but not provisioned for all users. This package will not function properly in the sysprep image.
Failed to remove apps for the current user: 0x80073cf2.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On Windows 11 23H2 and 24H2, the Microsoft Store silently updates Appx packages per-user. That creates a version mismatch between the installed state and the provisioned state. Sysprep sees it, panics, refuses to generalize.&lt;/p&gt;

&lt;p&gt;What I tried that did &lt;strong&gt;not&lt;/strong&gt; work:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;Remove-AppxPackage -AllUsers&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Remove-AppxProvisionedPackage&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;the mythical &lt;code&gt;SkipAppxValidation&lt;/code&gt; registry setting&lt;/li&gt;
&lt;li&gt;editing &lt;code&gt;Generalize.xml&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;renaming &lt;code&gt;AppxSysprep.dll&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Some fail outright. Some appear to work, then the packages come back after a reboot. Some just swap one error for a different one.&lt;/p&gt;

&lt;p&gt;What actually worked: changing the source image entirely.&lt;/p&gt;

&lt;p&gt;I used Chris Titus Tech's WinUtil to build a &lt;strong&gt;MicroWin&lt;/strong&gt; ISO. Instead of installing stock Windows and spending hours ripping Store junk back out, I started from a clean ISO that never had the Appx baggage in the first place.&lt;/p&gt;

&lt;p&gt;Prevention over cleanup:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;build MicroWin ISO&lt;/li&gt;
&lt;li&gt;install on the reference machine&lt;/li&gt;
&lt;li&gt;configure drivers and software&lt;/li&gt;
&lt;li&gt;disconnect from the network before sysprep if needed&lt;/li&gt;
&lt;li&gt;run the cleanup command&lt;/li&gt;
&lt;li&gt;sysprep, shut down, capture immediately&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That was the difference between fighting Sysprep every time and having it just work.&lt;/p&gt;

&lt;h2&gt;
  
  
  PXE on newer hardware: &lt;code&gt;snponly.efi&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;One more issue that looked like FOG but wasn't. A newer motherboard would download &lt;code&gt;ipxe.efi&lt;/code&gt;, start up, and freeze at:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;iPXE initializing devices...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No &lt;code&gt;ok&lt;/code&gt;. Just stuck.&lt;/p&gt;

&lt;p&gt;iPXE's built-in NIC driver couldn't handle the newer chipset. The fix: switch DHCP option 67 from &lt;code&gt;ipxe.efi&lt;/code&gt; to &lt;code&gt;snponly.efi&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The difference is simple. &lt;code&gt;ipxe.efi&lt;/code&gt; ships its own NIC drivers. &lt;code&gt;snponly.efi&lt;/code&gt; delegates to the network driver the UEFI firmware already provides, through the Simple Network Protocol (SNP). On newer boards, the firmware driver works where iPXE's doesn't.&lt;/p&gt;

&lt;p&gt;One DHCP change, &lt;code&gt;ipxe.efi&lt;/code&gt; to &lt;code&gt;snponly.efi&lt;/code&gt;, and the machine booted straight to the FOG menu. It became the default for all our UEFI clients.&lt;/p&gt;

&lt;h2&gt;
  
  
  72 systems, deployed by room
&lt;/h2&gt;

&lt;p&gt;With images captured and PXE stable, FOG could finally do its job.&lt;/p&gt;

&lt;p&gt;Bulk imported the workstation list from CSV:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;47 had MAC addresses ready for direct import&lt;/li&gt;
&lt;li&gt;25 had no active lease at the time, so they were left to auto-register on their first PXE boot&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each host got the right image and room group. Deploying a room was: select the group, schedule a deploy task, PXE boot the room. That's it.&lt;/p&gt;

&lt;p&gt;FOG uses Partclone under the hood, so it only transfers used blocks. A 500 GB drive with 30 GB used doesn't push 500 GB over the wire. It pushes roughly 30 GB, compressed. Room-wide imaging is faster than people expect.&lt;/p&gt;

&lt;p&gt;After deployment, the FOG client sets the hostname, joins the domain via &lt;code&gt;svc-domainjoin&lt;/code&gt;, and drops the computer object in the correct OU. Multicast is also available for imaging an entire room simultaneously.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I learned
&lt;/h2&gt;

&lt;p&gt;Setting up FOG is not just "install FOG." It's DHCP policy archaeology, AD delegation, Windows image hygiene, boot loader compatibility, and the reality of running services inside Linux containers.&lt;/p&gt;

&lt;p&gt;The short list:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;DHCP policies override scope options.&lt;/strong&gt; Check for leftover WDS/SCCM policies before blaming FOG.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A running web UI does not mean FOG is installed.&lt;/strong&gt; Verify TFTP, services, and NFS.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;FOG on Debian Trixie needs manual patches.&lt;/strong&gt; &lt;code&gt;osid&lt;/code&gt;, &lt;code&gt;libcurl4t64&lt;/code&gt;, &lt;code&gt;lastlog2&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Kernel NFS in Proxmox LXC is a dead end.&lt;/strong&gt; Use &lt;code&gt;nfs-ganesha&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Build images on reference machines.&lt;/strong&gt; VM images don't carry driver quirks correctly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Debloat before install, not after.&lt;/strong&gt; MicroWin saves hours of Appx cleanup.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Test Sysprep on something disposable first.&lt;/strong&gt; Appx breakage on your finished image is a bad time.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;snponly.efi&lt;/code&gt; for newer hardware.&lt;/strong&gt; If iPXE hangs at device init, switch boot files.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Never boot a sysprepped machine before capture.&lt;/strong&gt; Generalize is one-shot.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;FOG's &lt;code&gt;hostName&lt;/code&gt; is VARCHAR(16).&lt;/strong&gt; Plan your naming accordingly.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once the rough edges were filed down, the day-to-day got simple. Pick the room. PXE boot. Image. Domain join. Done. Way less overhead than SCCM ever was.&lt;/p&gt;

&lt;p&gt;That's the kind of boring I want from lab infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This post is part of a larger infrastructure migration series. For the full Hyper-V to Proxmox migration story, see the &lt;a href="https://solomonneas.dev/projects/hyperv-to-proxmox-migration" rel="noopener noreferrer"&gt;project page on solomonneas.dev&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://solomonneas.dev/blog/replacing-sccm-with-fog-project" rel="noopener noreferrer"&gt;solomonneas.dev&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>fogproject</category>
      <category>windows</category>
      <category>pxe</category>
      <category>sysadmin</category>
    </item>
    <item>
      <title>I Built 7 MCP Servers for Security Tools. The Protocol Was the Easy Part.</title>
      <dc:creator>Solomon Neas</dc:creator>
      <pubDate>Mon, 23 Mar 2026 20:59:54 +0000</pubDate>
      <link>https://dev.to/solomonneas/i-built-7-mcp-servers-for-security-tools-the-protocol-was-the-easy-part-4137</link>
      <guid>https://dev.to/solomonneas/i-built-7-mcp-servers-for-security-tools-the-protocol-was-the-easy-part-4137</guid>
      <description>&lt;p&gt;I wanted my AI agent to talk directly to my security stack. Not through copy-pasted log snippets. Not through screenshots of dashboards. Actual tool calls against live data.&lt;/p&gt;

&lt;p&gt;So I built seven MCP servers. Wazuh. Suricata. Zeek. TheHive. Cortex. MISP. MITRE ATT&amp;amp;CK. All open source, all on &lt;a href="https://github.com/solomonneas" rel="noopener noreferrer"&gt;my GitHub&lt;/a&gt;. Project page: &lt;a href="https://solomonneas.dev/projects/security-mcp-servers" rel="noopener noreferrer"&gt;https://solomonneas.dev/projects/security-mcp-servers&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The protocol layer took a weekend. The context engineering took weeks. That ratio surprised me.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Actually Built
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;API-based servers&lt;/strong&gt; talk directly to running services. Wazuh MCP hits the manager's REST API on port 55000 for alerts, agent status, vulnerability scans, and file integrity events. TheHive and Cortex connect to their respective APIs for case management and observable analysis. MISP pulls threat intelligence feeds and IOC lookups.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Log-based servers&lt;/strong&gt; parse files on disk. Zeek MCP reads from a log directory (JSON or TSV format), letting you query connection logs, DNS, HTTP, SSL, and file analysis data. Suricata MCP reads EVE JSON logs for IDS alerts, flow data, and protocol metadata.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Knowledge-base servers&lt;/strong&gt; work offline. The MITRE ATT&amp;amp;CK server downloads STIX 2.1 bundles and lets you query techniques, tactics, groups, software, and mitigations without hitting any external API.&lt;/p&gt;

&lt;p&gt;Each server exposes a focused set of tools. Wazuh has &lt;code&gt;get_alerts&lt;/code&gt;, &lt;code&gt;list_agents&lt;/code&gt;, &lt;code&gt;get_vulnerabilities&lt;/code&gt;, &lt;code&gt;get_fim_events&lt;/code&gt;. Zeek has &lt;code&gt;query_connections&lt;/code&gt;, &lt;code&gt;search_dns&lt;/code&gt;, &lt;code&gt;get_ssl_certs&lt;/code&gt;. Suricata has &lt;code&gt;get_alerts&lt;/code&gt;, &lt;code&gt;get_flow_stats&lt;/code&gt;, &lt;code&gt;search_protocols&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Every tool does one thing with predictable output. Full code and docs at &lt;a href="https://github.com/solomonneas" rel="noopener noreferrer"&gt;github.com/solomonneas&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Testing Against Live Infrastructure
&lt;/h2&gt;

&lt;p&gt;Every server got tested against real running services on my home infrastructure.&lt;/p&gt;

&lt;p&gt;Wazuh MCP was tested against my Wazuh 4.14.1 instance running on Proxmox. I queried live alerts, pulled agent status for my connected machines, pulled vulnerability scan results, and verified file integrity monitoring events. The agent reconnection workflow got tested end-to-end: listing disconnected agents, checking last keep-alive, triggering restarts.&lt;/p&gt;

&lt;p&gt;Zeek and Suricata servers were tested against actual captured traffic. I ran real log files through both parsers, correlated connections across source/destination pairs, ran DNS query lookups, and stress-tested time-window filtering against large log directories. Edge cases like malformed log entries and mixed JSON/TSV formats are handled explicitly.&lt;/p&gt;

&lt;p&gt;TheHive and Cortex were tested against their APIs with sample cases and observables. MISP was tested with real IOC lookups. The MITRE ATT&amp;amp;CK server was verified against the full STIX 2.1 enterprise bundle.&lt;/p&gt;

&lt;p&gt;The goal was not just "does the tool call succeed." It was "does the model get back data it can actually reason about for a real investigation."&lt;/p&gt;

&lt;h2&gt;
  
  
  Context Design Is the Real Engineering
&lt;/h2&gt;

&lt;p&gt;Security telemetry is exactly the kind of data language models handle poorly. It's verbose, repetitive, and full of fields that matter sometimes and are noise the rest of the time.&lt;/p&gt;

&lt;p&gt;Take Wazuh alerts. A single alert has 40+ fields. Dump all of that into a model and ask it to "analyze the situation." You'll get a vague summary that touches everything and understands nothing.&lt;/p&gt;

&lt;p&gt;My first versions returned raw API responses. The model would pick whatever fields were easiest to talk about instead of whatever actually mattered.&lt;/p&gt;

&lt;p&gt;So I started designing the context layer. For Wazuh, I filter to severity 8+ by default and return a focused subset: timestamp, rule description, agent name, source IP, and MITRE technique. For Zeek, I pre-aggregate by source/destination pair and surface unusual patterns first. For Suricata, I separate IDS alerts from flow metadata. Detections first, network context second.&lt;/p&gt;
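
&lt;p&gt;A rough sketch of that Wazuh filtering step using &lt;code&gt;jq&lt;/code&gt;, not the server's actual code. It assumes alerts arrive as newline-delimited JSON and that the fields live at &lt;code&gt;rule.level&lt;/code&gt;, &lt;code&gt;rule.description&lt;/code&gt;, &lt;code&gt;agent.name&lt;/code&gt;, &lt;code&gt;data.srcip&lt;/code&gt;, and &lt;code&gt;rule.mitre.id&lt;/code&gt;. The function name and the two sample alerts are made up for illustration.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;#!/usr/bin/env bash
# Sketch: trim newline-delimited Wazuh-style alerts to the fields the model needs.
# The field paths are assumptions about the alert layout.
MIN_LEVEL=8

focus_alerts() {
    # Drop alerts below MIN_LEVEL, then keep five fields per alert.
    jq -c --argjson min "$MIN_LEVEL" '
        select(.rule.level &amp;gt;= $min)
        | {
            timestamp,
            rule: .rule.description,
            agent: .agent.name,
            srcip: .data.srcip,
            mitre: .rule.mitre.id
          }'
}

# Illustrative sample data: one high-severity alert, one low-severity alert.
printf '%s\n' \
  '{"timestamp":"2026-03-01T10:00:00Z","rule":{"level":10,"description":"Suspicious process","mitre":{"id":["T1059"]}},"agent":{"name":"ws-01"},"data":{"srcip":"10.0.0.5"}}' \
  '{"timestamp":"2026-03-01T10:01:00Z","rule":{"level":3,"description":"Login success"},"agent":{"name":"ws-02"}}' \
  | focus_alerts
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Only the level-10 alert comes through, cut down to five fields. Fields missing from an alert come back as &lt;code&gt;null&lt;/code&gt; rather than breaking the filter.&lt;/p&gt;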

&lt;h2&gt;
  
  
  Where It Gets Interesting
&lt;/h2&gt;

&lt;p&gt;A Wazuh alert fires for a suspicious process. The model checks Zeek for that host's network activity. Finds outbound connections to an unusual IP. Queries ATT&amp;amp;CK for technique mapping. Checks MISP for threat intel on the destination.&lt;/p&gt;

&lt;p&gt;That correlation chain used to take 15 minutes of clicking through interfaces. Now it takes one question.&lt;/p&gt;

&lt;p&gt;I'm not replacing analysts. I'm killing the mechanical evidence-gathering that burns time before a human reaches the real decisions.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Lesson
&lt;/h2&gt;

&lt;p&gt;The protocol is a solved problem. MCP works. The bottleneck is what happens between raw data and the model's context window. Filtering, ordering, scoping, pre-summarizing. That's where analysis quality is determined.&lt;/p&gt;

&lt;p&gt;A model with access to every field in every log is worse off than one that sees the right 15 fields in the right order.&lt;/p&gt;

&lt;p&gt;Seven servers. All open source. All tested against live infrastructure. Code at &lt;a href="https://github.com/solomonneas" rel="noopener noreferrer"&gt;github.com/solomonneas&lt;/a&gt;. The protocol was a weekend. The context design is ongoing. That's the ratio that matters.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>mcp</category>
      <category>opensource</category>
      <category>security</category>
    </item>
    <item>
      <title>I Migrated Our Entire Infrastructure from Hyper-V to Proxmox. Here's Everything I Learned.</title>
      <dc:creator>Solomon Neas</dc:creator>
      <pubDate>Sat, 14 Mar 2026 06:45:04 +0000</pubDate>
      <link>https://dev.to/solomonneas/i-migrated-our-entire-infrastructure-from-hyper-v-to-proxmox-heres-everything-i-learned-g3k</link>
      <guid>https://dev.to/solomonneas/i-migrated-our-entire-infrastructure-from-hyper-v-to-proxmox-heres-everything-i-learned-g3k</guid>
      <description>&lt;p&gt;Domain controllers, file servers, network monitoring, imaging, WiFi controllers. All of it moved from Microsoft to open source. No downtime. No data loss. Here's the complete playbook.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why We Left Hyper-V
&lt;/h2&gt;

&lt;p&gt;Broadcom acquired VMware and started charging $350/core/year for VCF licensing. They killed the VMware IT Academy program entirely. The institution moved from vSphere to Hyper-V as a cost-saving measure, but I'd already done a VMware to Proxmox migration on my own infrastructure at that point. That migration opened my eyes to how good Proxmox actually is.&lt;/p&gt;

&lt;p&gt;It's more lightweight. The web UI gives you more granular control than Hyper-V Manager ever did. Snapshots, live migration, ZFS, LXC containers, and full KVM virtualization all in one platform. Completely free. No per-socket licensing, no Windows Server dependency, no CALs. One less thing Microsoft gets to hold over your budget.&lt;/p&gt;

&lt;p&gt;Hyper-V felt heavy by comparison. Limited Linux VM support, clunky management (RDP into the host just to touch anything), and tight coupling to Windows Server licensing. Once I'd seen what Proxmox could do, going back to Hyper-V felt like a downgrade.&lt;/p&gt;

&lt;p&gt;The question was never "should we migrate?" It was "how do we migrate production Active Directory, network monitoring, file servers, and imaging infrastructure without breaking anything?"&lt;/p&gt;

&lt;h2&gt;
  
  
  The Power of Root on a Proxmox Host
&lt;/h2&gt;

&lt;p&gt;One thing that surprised me coming from Hyper-V: you have full root access to the Proxmox host. It's just Debian under the hood. You can SSH in, run any Linux command, script anything, automate everything. Hyper-V locks you into PowerShell remoting or RDP. Proxmox gives you a real shell on a real Linux system.&lt;/p&gt;

&lt;p&gt;Need to resize a disk? One command. Snapshot a VM? One command. Migrate a VM between hosts? One command. Everything in the web UI is also available from the CLI through &lt;code&gt;qm&lt;/code&gt; (VM management), &lt;code&gt;pct&lt;/code&gt; (container management), &lt;code&gt;pvesm&lt;/code&gt; (storage), and &lt;code&gt;pvecm&lt;/code&gt; (cluster). You can script your entire infrastructure.&lt;/p&gt;

&lt;p&gt;But the real game changer is the &lt;a href="https://community-scripts.github.io/ProxmoxVE/" rel="noopener noreferrer"&gt;Proxmox VE Helper Scripts&lt;/a&gt; community project. These are one-liner bash scripts that spin up fully configured LXC containers or VMs for common services. Need a Pi-hole? One command. Docker host? One command. Home Assistant, Nginx Proxy Manager, Plex, Grafana, WireGuard? One command each.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Example: spin up a Docker LXC in seconds&lt;/span&gt;
bash &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;wget &lt;span class="nt"&gt;-qLO&lt;/span&gt; - https://github.com/community-scripts/ProxmoxVE/raw/main/ct/docker.sh&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The script handles everything: downloads the template, creates the container, configures networking, installs the service, and starts it. What would take 30 minutes of manual setup takes 60 seconds. I used these for several of our auxiliary services and they just work.&lt;/p&gt;

&lt;p&gt;Compare that to Hyper-V where deploying a new service means: create a VM, install Windows or manually download an ISO, walk through the installer, configure networking, install the actual application. The gap in operational speed is enormous.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Domain Controller Leapfrog
&lt;/h2&gt;

&lt;p&gt;This was the part that scared me most. Domain controllers are the heartbeat of a Windows network. Every authentication, every group policy, every DNS lookup flows through them. Get this wrong and the whole campus goes dark.&lt;/p&gt;

&lt;p&gt;The conventional wisdom is clear: &lt;strong&gt;never V2V a domain controller.&lt;/strong&gt; Bringing a DC up from a converted or copied disk risks USN rollback, where the other DCs quietly ignore its changes and replication diverges. The usual way out is to forcibly demote and rebuild the affected DC, which is exactly the mess you are trying to avoid.&lt;/p&gt;

&lt;p&gt;Instead, I used what I call the "leapfrog" method. We had two DCs: DC1 and DC2, both on Hyper-V.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1:&lt;/strong&gt; Transfer all five FSMO roles to DC2. Verify DHCP scopes, DNS zones, and AD replication are healthy. DC2 is now running the show.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2:&lt;/strong&gt; Delete DC1. Build a fresh Windows Server VM on Proxmox. Promote it to domain controller. AD replication syncs everything from DC2 automatically.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3:&lt;/strong&gt; Transfer all FSMO roles to the new DC1 on Proxmox. Verify everything.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 4:&lt;/strong&gt; Delete DC2 on Hyper-V. Build fresh on Proxmox. Promote. AD replicates from DC1.&lt;/p&gt;

&lt;p&gt;Both domain controllers are now on Proxmox. Zero downtime. Zero data loss. The whole process was honestly easier than I expected because AD replication just works when you let it do its job.&lt;/p&gt;

&lt;p&gt;The PowerShell for the FSMO transfer is one command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="n"&gt;Move-ADDirectoryServerOperationMasterRole&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;-Identity&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"NEW-DC1"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="se"&gt;`
&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;-OperationMasterRole&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;SchemaMaster&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;DomainNamingMaster&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="se"&gt;`
&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nx"&gt;PDCEmulator&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;RIDMaster&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;InfrastructureMaster&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Always verify with &lt;code&gt;repadmin /showrepl&lt;/code&gt; after each promotion and transfer. If replication shows errors, stop and fix them before proceeding.&lt;/p&gt;

&lt;h2&gt;
  
  
  Linux VM Migration: The V2V Process
&lt;/h2&gt;

&lt;p&gt;For Linux VMs (LibreNMS, Netdisco, Switchmap), I used direct disk conversion. The process:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Create a "shell" VM&lt;/strong&gt; in Proxmox. Set the OS type, match the BIOS to the source Hyper-V generation (Gen 1 = SeaBIOS, Gen 2 = OVMF UEFI), but &lt;strong&gt;do not create a hard drive.&lt;/strong&gt; The disk list should be empty.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;SCP the VHDX&lt;/strong&gt; from the Hyper-V host to Proxmox:&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight powershell"&gt;&lt;code&gt;&lt;span class="n"&gt;scp&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"C:\Path\To\Disk.vhdx"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;root&lt;/span&gt;&lt;span class="err"&gt;@&lt;/span&gt;&lt;span class="nx"&gt;PROXMOX_IP:/var/lib/vz/dump/&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="3"&gt;
&lt;li&gt;
&lt;strong&gt;Import and attach&lt;/strong&gt; on the Proxmox side:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;qm importdisk 102 /var/lib/vz/dump/Netdisco.vhdx local-lvm
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then in the GUI: Hardware &amp;gt; double-click Unused Disk 0 &amp;gt; add as SCSI. Set boot order to prioritize scsi0.&lt;/p&gt;
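&lt;p&gt;The same attach step can also be done from the Proxmox shell. This is a minimal sketch, not the exact commands from this migration. It assumes the imported disk shows up as &lt;code&gt;local-lvm:vm-102-disk-0&lt;/code&gt; (check &lt;code&gt;qm config 102&lt;/code&gt; for the real &lt;code&gt;unused0&lt;/code&gt; value), and the &lt;code&gt;run&lt;/code&gt; and &lt;code&gt;attach_imported_disk&lt;/code&gt; helpers are hypothetical names. It defaults to dry-run mode and only prints the commands.&lt;/p&gt;

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 on the Proxmox host to actually run them.
DRY_RUN="${DRY_RUN:-1}"

# run CMD [ARGS...]: print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# attach_imported_disk VMID VOLUME
# VOLUME is whatever `qm config VMID` lists as unused0 after the import
# (assumed here, not taken from the original migration).
attach_imported_disk() {
  vmid="$1"
  volume="$2"
  # Attach the unused volume as the first SCSI disk.
  run qm set "$vmid" --scsi0 "$volume"
  # Boot from that disk first.
  run qm set "$vmid" --boot order=scsi0
}

attach_imported_disk 102 local-lvm:vm-102-disk-0
```

&lt;p&gt;Once the printed commands look right, rerun with &lt;code&gt;DRY_RUN=0&lt;/code&gt;. If the VM has an EFI disk, the imported volume may be numbered differently, which is why the volume name is passed in rather than guessed.&lt;/p&gt;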

&lt;p&gt;&lt;strong&gt;Post-migration gotchas:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Network interface names change (eth0 becomes ens18). Update your netplan config.&lt;/li&gt;
&lt;li&gt;Install &lt;code&gt;qemu-guest-agent&lt;/code&gt; so Proxmox can see the VM's IP and gracefully shut it down.&lt;/li&gt;
&lt;li&gt;LibreNMS needed a full permissions reset. Run &lt;code&gt;validate.php&lt;/code&gt; as the librenms user and follow every instruction it gives you.&lt;/li&gt;
&lt;li&gt;Netdisco needed its database host changed to localhost in &lt;code&gt;deployment.yml&lt;/code&gt; and a session cookie key added to prevent crashes.&lt;/li&gt;
&lt;/ul&gt;
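&lt;p&gt;For the interface rename, a netplan file only needs the new name swapped in. The sketch below prints a minimal config for one interface. It assumes the VM uses DHCP, which may not match your setup, and &lt;code&gt;netplan_for&lt;/code&gt; is a hypothetical helper name.&lt;/p&gt;

```shell
#!/bin/sh
# netplan_for IFACE
# Prints a minimal netplan config for a single interface.
# DHCP is an assumption; replace dhcp4 with static addressing if the VM needs it.
netplan_for() {
  iface="$1"
  printf 'network:\n'
  printf '  version: 2\n'
  printf '  ethernets:\n'
  printf '    %s:\n' "$iface"
  printf '      dhcp4: true\n'
}

netplan_for ens18
```

&lt;p&gt;Pipe the output into the netplan file the VM already uses, for example &lt;code&gt;netplan_for ens18 | sudo tee /etc/netplan/01-netcfg.yaml&lt;/code&gt; (the file name is an example), then run &lt;code&gt;sudo netplan apply&lt;/code&gt;.&lt;/p&gt;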

&lt;h2&gt;
  
  
  Killing DFS, Simplifying Drive Maps
&lt;/h2&gt;

&lt;p&gt;The old environment used a DFS namespace to abstract file server paths. For a single-server environment, DFS adds complexity that provides no benefit: 30-minute referral TTL, client cache issues, and another layer to troubleshoot when users can't access files.&lt;/p&gt;

&lt;p&gt;I ripped it out and replaced it with Group Policy Preferences drive mappings using item-level targeting:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;X:\&lt;/strong&gt; mapped for faculty and staff, pointing to the full file server&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Y:\&lt;/strong&gt; mapped for students, pointing to the student folders only&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Security group membership determines which mapping a user gets. No login scripts, no DFS, no namespace caching. If a user is in the Faculty-Staff group, they get X:\. If they're in the Students group, they get Y:\. Simple.&lt;/p&gt;

&lt;h2&gt;
  
  
  UniFi Controller: Windows VM to LXC Container
&lt;/h2&gt;

&lt;p&gt;This one was almost comical. The UniFi controller was running on a Windows 11 VM inside Hyper-V. To manage the WiFi, you had to RDP into the Hyper-V host, then log into the Windows VM from there. No SSH. No remote management. Just nested RDP sessions.&lt;/p&gt;

&lt;p&gt;The migration:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Export the UniFi backup (.unf file) from the Windows controller&lt;/li&gt;
&lt;li&gt;Create an LXC container on Proxmox using the official UniFi template&lt;/li&gt;
&lt;li&gt;Upload the .unf backup and restore&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;All WAP configurations, SSIDs, and client data came over intact. WiFi was back up in minutes. And now it runs in a lightweight container instead of a full Windows 11 VM. The resource savings alone made it worthwhile.&lt;/p&gt;

&lt;h2&gt;
  
  
  Replacing SCCM with FOG Project
&lt;/h2&gt;

&lt;p&gt;Microsoft SCCM is powerful but absurdly heavy for an educational lab environment. It needs Windows Server, SQL Server, per-device licensing, and significant infrastructure just to image workstations.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/FOGProject/" rel="noopener noreferrer"&gt;FOG Project&lt;/a&gt; does everything we actually need: PXE boot imaging, hardware inventory, and centralized workstation management. It runs on Linux, costs nothing, and the web UI is straightforward.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Golden Image Pipeline
&lt;/h3&gt;

&lt;p&gt;I build golden images as Proxmox VMs (not on physical hardware) so I can snapshot before Sysprep. This is critical because if Sysprep fails, you cannot simply run it again. The only recovery is reverting to a snapshot.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1: Install and debloat.&lt;/strong&gt; Set up a clean Windows 11 installation on the reference VM. Run &lt;a href="https://github.com/ChrisTitusTech/winutil" rel="noopener noreferrer"&gt;Chris Titus Tech's Windows Utility&lt;/a&gt; to strip all the bloatware (Candy Crush, Spotify, Xbox, etc.) and disable telemetry. This handles both installed and provisioned packages, which is important because leftover staged Appx packages are a common cause of silent Sysprep failures.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2: Sysprep and shutdown.&lt;/strong&gt; Once the machine is configured how you want it, run &lt;code&gt;sysprep.exe /generalize /oobe /shutdown /unattend:C:\Windows\Panther\unattend.xml&lt;/code&gt;. The unattend file applies BypassNRO (the workaround for Windows 11's forced internet requirement) and automates the OOBE setup after deployment. The machine shuts down after Sysprep completes. Do not power it back on until FOG has captured it, or it will boot into OOBE and the image is no longer generalized.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3: FOG capture.&lt;/strong&gt; Schedule a capture task in the FOG web UI for that machine, then PXE boot it. FOG captures the sysprepped image as-is, sitting at OOBE. When the image gets deployed to a workstation later, the unattend.xml automates the OOBE setup, the FOG service agent kicks in for background management, and AD auto-join handles domain membership. No manual touch required.&lt;/p&gt;

&lt;h3&gt;
  
  
  Per-Classroom Deployment
&lt;/h3&gt;

&lt;p&gt;Each classroom has different hardware, so I maintain separate images per room. Every workstation is registered in FOG via CSV import (hostname + MAC address), grouped by classroom. When a room needs reimaging, I select the group, schedule a deploy task, and FOG uses Partclone to push the image. Partclone only writes used blocks, so imaging is fast even on large drives.&lt;/p&gt;

&lt;p&gt;The FOG agent runs on every workstation with a dedicated &lt;code&gt;fog-service&lt;/code&gt; Active Directory service account. DHCP points PXE boot to the FOG server using &lt;code&gt;snponly.efi&lt;/code&gt; for UEFI network boot. Once a deploy task is scheduled, a machine only needs to PXE boot and everything else happens automatically.&lt;/p&gt;

&lt;h2&gt;
  
  
  WSUS: Closing the Update Loop
&lt;/h2&gt;

&lt;p&gt;The last piece of the imaging puzzle was patch management. Without centralized updates, every golden image would need constant rebuilding just to stay current. And letting 60+ lab machines pull updates directly from Microsoft on their own schedule is a recipe for bandwidth problems and inconsistent states.&lt;/p&gt;

&lt;p&gt;I set up WSUS (Windows Server Update Services) directly on DC1. For an environment this size (four classrooms and a handful of staff machines), a dedicated WSUS server would be overkill. Running it on the domain controller keeps the footprint small and the management simple.&lt;/p&gt;

&lt;p&gt;The update pipeline works in two stages:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Test lab first.&lt;/strong&gt; New updates land in WSUS but aren't auto-approved. I have a WSUS computer group for a small set of test machines. Updates get approved for the test group first and run there for about four to five days. This buffer is intentional: it's enough time for the community to flag day-one issues, botched patches, or driver conflicts before anything hits production.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Then classrooms.&lt;/strong&gt; After the test window passes clean, I approve updates for the classroom groups. WSUS pushes them from the local server, so machines pull patches over the LAN instead of each one hammering Microsoft's CDN individually. Faster downloads, less bandwidth, and every machine in a room ends up on the same patch level.&lt;/p&gt;

&lt;p&gt;This also means the golden image for FOG doesn't need to be rebuilt every Patch Tuesday. WSUS handles ongoing patching after deployment. The golden image only needs updating when there's a major feature release or a change to the base software stack.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'd Do Differently
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Document interface names before migration.&lt;/strong&gt; Every Linux VM had a different post-migration network issue because the interface name changed. A quick &lt;code&gt;ip link show&lt;/code&gt; before the migration would have saved debugging time.&lt;/p&gt;
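&lt;p&gt;One way to capture that inventory on each source VM is to read it straight from sysfs, which works even where &lt;code&gt;ip&lt;/code&gt; is not installed. This is a sketch under that assumption, and &lt;code&gt;list_interfaces&lt;/code&gt; is a hypothetical helper name.&lt;/p&gt;

```shell
#!/bin/sh
# list_interfaces [SYSFS_NET_DIR]
# Prints one "name mac" line per network interface.
# Defaults to /sys/class/net; the optional argument exists only to make testing easy.
list_interfaces() {
  net_dir="${1:-/sys/class/net}"
  for dev in "$net_dir"/*; do
    # Skip entries that are not interfaces (no address file).
    [ -e "$dev/address" ] || continue
    printf '%s %s\n' "$(basename "$dev")" "$(cat "$dev/address")"
  done
}

list_interfaces
```

&lt;p&gt;Run it before the migration and keep the output, for example with &lt;code&gt;list_interfaces | tee interfaces-before.txt&lt;/code&gt;, so you know which name each config file expects when the VM comes up on Proxmox.&lt;/p&gt;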

&lt;p&gt;&lt;strong&gt;Test Sysprep on a throwaway VM first.&lt;/strong&gt; My first Sysprep attempt failed because of a leftover Xbox app. Always run through the full golden image pipeline once as a dry run before committing to your production image.&lt;/p&gt;

&lt;h2&gt;
  
  
  The SOC Stack
&lt;/h2&gt;

&lt;p&gt;I also migrated the full security operations stack: Wazuh for endpoint detection and SIEM, Cortex for automated analysis, TheHive for case management, and MISP for threat intelligence sharing. Same V2V process as the other Linux VMs. These were already running on Linux, so it was disk conversion, interface rename, guest agent install, and verify services. Nothing special, but worth mentioning because people forget about their security tooling when planning hypervisor migrations.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Final Tally
&lt;/h2&gt;

&lt;p&gt;When everything was done, the infrastructure footprint looked like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;4 standalone Proxmox servers&lt;/strong&gt; running production workloads: domain controllers, network monitoring (LibreNMS, Netdisco, Switchmap), Samba AD file server, FOG imaging, UniFi controller, and the SOC stack (Wazuh, Cortex, TheHive, MISP)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;6-node Proxmox cluster&lt;/strong&gt; for the NetLab environment, where students run hands-on lab exercises&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;10 total Proxmox hosts&lt;/strong&gt;, all on open-source infrastructure&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Total hypervisor licensing cost: $0.&lt;/p&gt;

&lt;p&gt;The migration took planning and careful execution, but none of it was technically complex. The hardest part was convincing myself that AD replication would actually work as advertised. It did.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://solomonneas.dev/blog/hyperv-to-proxmox-migration-guide" rel="noopener noreferrer"&gt;solomonneas.dev&lt;/a&gt;. Find more of my writing on infrastructure, security tooling, and AI agents at &lt;a href="https://solomonneas.dev" rel="noopener noreferrer"&gt;solomonneas.dev&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>proxmox</category>
      <category>devops</category>
      <category>linux</category>
      <category>sysadmin</category>
    </item>
  </channel>
</rss>
