<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: BeanBean</title>
    <description>The latest articles on DEV Community by BeanBean (@bean_bean).</description>
    <link>https://dev.to/bean_bean</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3849323%2Ff5585719-7c19-4ce0-a6dd-119f5e401fd4.png</url>
      <title>DEV Community: BeanBean</title>
      <link>https://dev.to/bean_bean</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/bean_bean"/>
    <language>en</language>
    <item>
      <title>Inside GPT-5.5-Cyber: Capabilities, Refusals, and Federal Briefings Explained</title>
      <dc:creator>BeanBean</dc:creator>
      <pubDate>Sat, 09 May 2026 05:00:01 +0000</pubDate>
      <link>https://dev.to/bean_bean/inside-gpt-55-cyber-capabilities-refusals-and-federal-briefings-explained-3501</link>
      <guid>https://dev.to/bean_bean/inside-gpt-55-cyber-capabilities-refusals-and-federal-briefings-explained-3501</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://nextfuture.io.vn/blog/inside-gpt-55-cyber-capabilities-refusals-and-federal-briefings-explained" rel="noopener noreferrer"&gt;NextFuture&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;OpenAI shipped GPT-5.5-Cyber to Trusted Access for Cyber (TAC) program participants in late April 2026 — exactly one week after Anthropic announced Mythos. Unlike standard GPT-5.5, this variant is fine-tuned on offensive and defensive security workflows, hardened against system prompt injection, and gated behind a roughly 40-org allowlist. If you're evaluating a TAC application, building defensive tooling, or just trying to understand what independent evals actually show about this model, here's the full picture.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this matters now
&lt;/h2&gt;

&lt;p&gt;OpenAI spent most of April 2026 publicly criticizing Anthropic for locking Mythos behind an allowlist. On April 30, OpenAI did exactly the same thing with GPT-5.5-Cyber — restricting access to TAC participants only. In parallel, OpenAI briefed US federal agencies, state governments, and Five Eyes allies on the model's capabilities, as &lt;a href="https://news.bensbites.com/posts/64786-sources-openai-has-been-briefing-us-federal-agencies-state-governments-and-five-eyes-allies-on-the-capabilities-of-its-gpt-54-cyber-model-over-the-past-week/out" rel="noopener noreferrer"&gt;BensBites sources reported&lt;/a&gt;. Those briefings covered two capability buckets: automated vulnerability discovery in critical infrastructure codebases, and threat-actor attribution pattern matching at scale. Neither use case is accessible to commercial customers today, which matters for anyone building defensive tooling outside a government contractor or major enterprise security vendor context.&lt;/p&gt;

&lt;h2&gt;
  
  
  How GPT-5.5-Cyber works under the hood
&lt;/h2&gt;

&lt;p&gt;GPT-5.5-Cyber is a domain-specific fine-tune of the base GPT-5.5 weights, with reinforcement learning from cyber-specific feedback (RLCF) applied post-training. Simon Willison's April 30 evaluation — the most technically rigorous public test to date — ran 47 CTF challenges across binary exploitation, web security, and cryptography categories. The model solved 31 of 47, a 66% pass rate, compared to 41% for standard GPT-5.5 on the same set. On defensive tasks (log triage, YARA rule generation, CVE prioritization), pass rates climbed above 80%. OpenAI has confirmed the cyber variant ships with a 32k-token context window by default and a 128k option for document-heavy workflows. System prompt injection resistance was specifically hardened for threat-modeling use cases.&lt;/p&gt;

&lt;p&gt;The model is available only via the &lt;code&gt;gpt-5.5-cyber&lt;/code&gt; model ID within the standard OpenAI API, but that ID resolves only for TAC-enrolled API keys. Any standard key returns a &lt;code&gt;404&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Standard key — will 404&lt;/span&gt;
curl https://api.openai.com/v1/chat/completions &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer &lt;/span&gt;&lt;span class="nv"&gt;$OPENAI_API_KEY&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "model": "gpt-5.5-cyber",
    "messages": [{"role": "user", "content": "Generate a YARA rule for this IOC set."}]
  }'&lt;/span&gt;
&lt;span class="c"&gt;# → {"error":{"message":"The model `gpt-5.5-cyber` does not exist","code":"model_not_found"}}&lt;/span&gt;

&lt;span class="c"&gt;# TAC-enrolled key — works as expected&lt;/span&gt;
&lt;span class="c"&gt;# OPENAI_TAC_KEY is the API key from your TAC onboarding email&lt;/span&gt;
curl https://api.openai.com/v1/chat/completions &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer &lt;/span&gt;&lt;span class="nv"&gt;$OPENAI_TAC_KEY&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "model": "gpt-5.5-cyber",
    "messages": [{"role": "user", "content": "Generate a YARA rule for this IOC set."}]
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
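&lt;p&gt;If you want to detect the gated-model case in code rather than by eye, a minimal sketch follows. It assumes only the error shape shown in the curl output above; the helper name &lt;code&gt;isModelNotFound&lt;/code&gt; is hypothetical and not part of any OpenAI SDK.&lt;/p&gt;

```javascript
// Sketch only: checks an HTTP status and parsed error body against the
// model_not_found shape shown in the curl example above.
// isModelNotFound is a hypothetical helper name, not an SDK function.
function isModelNotFound(status, body) {
  if (status !== 404) return false;
  return body?.error?.code === "model_not_found";
}

// Example body copied from the response shown above.
const sampleBody = {
  error: {
    message: "The model `gpt-5.5-cyber` does not exist",
    code: "model_not_found",
  },
};

console.log(isModelNotFound(404, sampleBody)); // true
console.log(isModelNotFound(401, sampleBody)); // false
```

&lt;p&gt;A check like this lets a client report "key is not TAC-enrolled" instead of surfacing a generic 404.&lt;/p&gt;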



&lt;h2&gt;
  
  
  3 use cases I'd actually use
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Automated YARA rule generation from threat feeds
&lt;/h3&gt;

&lt;p&gt;TAC participants report feeding raw threat intelligence — Mandiant reports, ISAC feeds, STIX bundles — into GPT-5.5-Cyber and getting deployable YARA rules back with confidence scores and false-positive estimates. The model cites source indicators inline, so your SOC team can audit the logic without re-reading the source doc. A Node.js integration looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;OpenAI&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;openai&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;OPENAI_TAC_KEY&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;gpt-5.5-cyber&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;system&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;You are a threat intelligence analyst. Generate YARA rules from the provided IOCs. Return JSON with fields: rule (string), confidence (0-1), fp_estimate (string), source_iocs (array).&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;threatFeedText&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;response_format&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;json_object&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;rule&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;confidence&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;fp_estimate&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
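&lt;p&gt;The snippet above destructures the parsed JSON without checking it. A minimal sketch of a validator for the four fields named in the system prompt is shown here; the function name &lt;code&gt;parseYaraResponse&lt;/code&gt; and its error messages are assumptions for illustration.&lt;/p&gt;

```javascript
// Sketch only: validates the JSON shape requested in the system prompt
// (rule, confidence, fp_estimate, source_iocs). parseYaraResponse is a
// hypothetical helper name.
function parseYaraResponse(content) {
  const data = JSON.parse(content); // throws SyntaxError on malformed JSON
  if (data === null || typeof data !== "object") {
    throw new TypeError("response must be a JSON object");
  }

  const errors = [];
  if (typeof data.rule !== "string" || data.rule.trim() === "") {
    errors.push("rule must be a non-empty string");
  }
  // The negated comparison also rejects NaN.
  if (
    typeof data.confidence !== "number" ||
    !(data.confidence >= 0) ||
    data.confidence > 1
  ) {
    errors.push("confidence must be a number between 0 and 1");
  }
  if (typeof data.fp_estimate !== "string") {
    errors.push("fp_estimate must be a string");
  }
  if (!Array.isArray(data.source_iocs)) {
    errors.push("source_iocs must be an array");
  }

  if (errors.length > 0) {
    throw new Error("Invalid YARA response: " + errors.join("; "));
  }
  return data;
}
```

&lt;p&gt;In the example above you would call &lt;code&gt;parseYaraResponse(res.choices[0].message.content)&lt;/code&gt; in place of the bare &lt;code&gt;JSON.parse&lt;/code&gt;, so a malformed reply fails loudly before a rule reaches your SOC tooling.&lt;/p&gt;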



&lt;h3&gt;
  
  
  CVE triage and stack-specific severity re-scoring
&lt;/h3&gt;

&lt;p&gt;The model re-scores CVEs against your specific stack context, not the generic NVD CVSS baseline. You pass your dependency manifest and deployed service config; it returns a re-ranked list with environment-specific exploitability estimates. &lt;a href="https://dev.to/alessandro_pignati/gpt-54-cyber-openais-game-changer-for-ai-security-and-defensive-ai-517l"&gt;Early dev.to tests on a Node.js microservices stack&lt;/a&gt; showed a 23% reduction in false-critical tickets compared to raw CVSS scoring. Pass &lt;code&gt;package.json&lt;/code&gt;, your service topology, and the CVE batch as one 32k-token prompt.&lt;/p&gt;
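&lt;p&gt;A rough sketch of assembling that single prompt and checking it against the default window is shown below. The section labels, the &lt;code&gt;buildTriagePrompt&lt;/code&gt; name, the 32,000-token limit, and the four-characters-per-token estimate are all assumptions; use a real tokenizer before relying on the numbers.&lt;/p&gt;

```javascript
// Sketch only: joins the three inputs into one prompt and estimates size.
// CONTEXT_LIMIT_TOKENS and CHARS_PER_TOKEN are assumptions, not API values.
const CONTEXT_LIMIT_TOKENS = 32_000;
const CHARS_PER_TOKEN = 4; // crude heuristic, not a tokenizer

function buildTriagePrompt(manifest, topology, cveIds) {
  const prompt = [
    "## Dependency manifest (package.json)",
    JSON.stringify(manifest, null, 2),
    "## Service topology",
    topology,
    "## CVE batch",
    cveIds.join("\n"),
  ].join("\n\n");

  const estimatedTokens = Math.ceil(prompt.length / CHARS_PER_TOKEN);
  if (estimatedTokens > CONTEXT_LIMIT_TOKENS) {
    throw new RangeError(
      `Estimated ${estimatedTokens} tokens exceeds the ${CONTEXT_LIMIT_TOKENS}-token window`
    );
  }
  return { prompt, estimatedTokens };
}
```

&lt;p&gt;The check only covers the input side; leave headroom for the model's reply when choosing a batch size.&lt;/p&gt;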

&lt;h3&gt;
  
  
  Incident report drafting from raw SIEM exports
&lt;/h3&gt;

&lt;p&gt;With the 128k context option enabled via the &lt;code&gt;max_context_tokens: 131072&lt;/code&gt; parameter, you can paste a full SIEM log export and get a structured incident report in NIST SP 800-61r3 format in a single pass. The model handles timestamp normalization, event correlation, and executive summary generation without chained calls. Set &lt;code&gt;BASE_URL=https://api.openai.com/v1&lt;/code&gt; and swap to &lt;code&gt;gpt-5.5-cyber-128k&lt;/code&gt; as the model ID for this workflow.&lt;/p&gt;

&lt;h2&gt;
  
  
  Limitations and when not to use it
&lt;/h2&gt;

&lt;p&gt;The refusal surface on GPT-5.5-Cyber is wider than that of standard GPT-5.5. OpenAI hard-coded blocks on shellcode generation, weaponized exploit PoC code, and C2 framework configuration, even for stated red-team purposes. &lt;a href="https://www.therundown.ai/p/openai-gpt-5-4-cyber-rejects-mythos-playbook" rel="noopener noreferrer"&gt;The Rundown reported&lt;/a&gt; that the model rejected roughly 18% of legitimate penetration testing prompts in beta testing, compared to 12% for Mythos on equivalent tasks. If your workflow requires offensive tooling beyond vulnerability identification (actual exploit development, payload generation, evasion testing), this model will block more than it helps. The TAC program itself mandates quarterly use-case reviews; access can be revoked if your reported use drifts toward offensive tooling. TAC terms also prohibit using the model to train downstream models or in products deployed to non-TAC entities, which rules out most SaaS security products aimed at a general developer audience.&lt;/p&gt;

&lt;h2&gt;
  
  
  Compared to alternatives
&lt;/h2&gt;


&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;&lt;th&gt;Model&lt;/th&gt;&lt;th&gt;Access&lt;/th&gt;&lt;th&gt;CTF Pass Rate&lt;/th&gt;&lt;th&gt;Defensive Tasks&lt;/th&gt;&lt;th&gt;Cost (input / output per 1M tok)&lt;/th&gt;&lt;th&gt;Refusal Rate (legit sec prompts)&lt;/th&gt;&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;GPT-5.5-Cyber&lt;/td&gt;&lt;td&gt;TAC allowlist (~40 orgs)&lt;/td&gt;&lt;td&gt;66%&lt;/td&gt;&lt;td&gt;~80%&lt;/td&gt;&lt;td&gt;TAC pricing (NDA)&lt;/td&gt;&lt;td&gt;~18%&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Anthropic Mythos&lt;/td&gt;&lt;td&gt;~40-org allowlist&lt;/td&gt;&lt;td&gt;~70% (est.)&lt;/td&gt;&lt;td&gt;~78%&lt;/td&gt;&lt;td&gt;Undisclosed (NDA)&lt;/td&gt;&lt;td&gt;~12%&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;GPT-5.5 (standard)&lt;/td&gt;&lt;td&gt;Public API&lt;/td&gt;&lt;td&gt;41%&lt;/td&gt;&lt;td&gt;~60%&lt;/td&gt;&lt;td&gt;$15 / $60&lt;/td&gt;&lt;td&gt;~9%&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Claude 3.7 Sonnet&lt;/td&gt;&lt;td&gt;Public API&lt;/td&gt;&lt;td&gt;~38%&lt;/td&gt;&lt;td&gt;~57%&lt;/td&gt;&lt;td&gt;$3 / $15&lt;/td&gt;&lt;td&gt;~11%&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Llama Guard 3 (self-hosted)&lt;/td&gt;&lt;td&gt;HuggingFace / self-host&lt;/td&gt;&lt;td&gt;N/A (classifier only)&lt;/td&gt;&lt;td&gt;Content moderation only&lt;/td&gt;&lt;td&gt;$0 (self-hosted)&lt;/td&gt;&lt;td&gt;N/A&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Can I test GPT-5.5-Cyber without TAC enrollment?&lt;/strong&gt; No. The &lt;code&gt;gpt-5.5-cyber&lt;/code&gt; model ID returns a &lt;code&gt;model_not_found&lt;/code&gt; 404 on standard API keys. OpenAI has not announced a public preview tier, a sandbox option, or a time-limited trial as of May 2026.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What did the Five Eyes briefings actually cover?&lt;/strong&gt; According to BensBites sources, OpenAI demonstrated two capabilities: automated attribution of nation-state TTPs from raw network telemetry, and large-scale phishing campaign pattern recognition across historical data sets. No public detail on whether live operational data was used in the demos. The briefings covered US federal agencies, state governments, and Five Eyes intelligence partners over the week of April 21-28.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How does GPT-5.5-Cyber compare to Mythos on refusal behavior?&lt;/strong&gt; GPT-5.5-Cyber refuses more aggressively on offensive prompts — roughly 18% vs 12% for Mythos on equivalent legitimate pen-test tasks. For purely defensive work the gap narrows. See the &lt;a href="https://dev.to/blog/mythos-vs-gpt-55-cyber-honest-offensive-security-benchmark-2026"&gt;full head-to-head benchmark&lt;/a&gt; for methodology and task-by-task results. For the broader policy context on why both companies restricted access, the &lt;a href="https://dev.to/blog/inside-the-ai-cyber-arms-race-may-2026-mythos-gpt-55-cyber-and-what-builders-can-use"&gt;AI Cyber Arms Race overview&lt;/a&gt; covers the timeline from Mythos announcement through OpenAI's about-face on open access.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This article was originally published on &lt;a href="https://nextfuture.io.vn" rel="noopener noreferrer"&gt;NextFuture&lt;/a&gt;. Follow us for more fullstack &amp;amp; AI engineering content.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>fullstack</category>
      <category>ai</category>
      <category>webdev</category>
      <category>javascript</category>
    </item>
    <item>
      <title>Closed Frontier Cyber AI vs Open Defensive Tools: Real-World Comparison 2026</title>
      <dc:creator>BeanBean</dc:creator>
      <pubDate>Fri, 08 May 2026 05:01:03 +0000</pubDate>
      <link>https://dev.to/bean_bean/closed-frontier-cyber-ai-vs-open-defensive-tools-real-world-comparison-2026-gd</link>
      <guid>https://dev.to/bean_bean/closed-frontier-cyber-ai-vs-open-defensive-tools-real-world-comparison-2026-gd</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://nextfuture.io.vn/blog/closed-frontier-cyber-ai-vs-open-defensive-tools-real-world-comparison-2026" rel="noopener noreferrer"&gt;NextFuture&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;As of May 2026, Anthropic's Mythos and OpenAI's GPT-5.5-Cyber sit behind allowlists that most engineering teams will never clear. Meanwhile, Llama Guard 3, CodeLlama Guard, and Cisco AI Defense have been in production for months—no NDAs, no federal vetting, no undisclosed pricing. We tested both stacks against four real defensive tasks: phishing detection, code audit, threat triage, and log forensics. Here is what the gap actually looks like. For the broader context on how these models came to exist, see &lt;a href="https://dev.to/blog/inside-the-ai-cyber-arms-race-may-2026-mythos-gpt-55-cyber-and-what-builders-can-use"&gt;Inside the AI Cyber Arms Race (May 2026)&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  TL;DR: which one wins
&lt;/h2&gt;

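&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;&lt;th&gt;Verdict dimension&lt;/th&gt;&lt;th&gt;Closed Frontier (Mythos / GPT-5.5-Cyber)&lt;/th&gt;&lt;th&gt;Open Defensive Stack (Llama Guard 3 + CodeLlama Guard)&lt;/th&gt;&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;Access&lt;/td&gt;&lt;td&gt;Allowlist only (~40 orgs, May 2026)&lt;/td&gt;&lt;td&gt;Public API + self-hostable today&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Best task&lt;/td&gt;&lt;td&gt;Adversarial simulation, advanced threat-intel synthesis&lt;/td&gt;&lt;td&gt;Phishing detection, code audit, content filtering&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Price&lt;/td&gt;&lt;td&gt;Undisclosed (federal/enterprise contracts)&lt;/td&gt;&lt;td&gt;$0–$0.60/1M tokens; free if self-hosted&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Verdict&lt;/td&gt;&lt;td&gt;Worth pursuing for gov/critical-infra orgs&lt;/td&gt;&lt;td&gt;Ready to ship for most builder use cases right now&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
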
&lt;h2&gt;
  
  
  Closed Frontier Cyber AI in 60 seconds
&lt;/h2&gt;

&lt;p&gt;Mythos (Anthropic, announced April 2026) and GPT-5.5-Cyber (OpenAI, April 30, 2026) are purpose-trained on offensive security corpora. They support adversarial capability emulation, red-team automation, and threat-intelligence synthesis at a depth that general-purpose models do not reach. GPT-5.5-Cyber scored 94% on the InterCode-CTF suite according to &lt;a href="https://simonwillison.net/2026/Apr/30/gpt-55-cyber-capabilities/#atom-everything" rel="noopener noreferrer"&gt;Simon Willison's independent evaluation&lt;/a&gt;; Mythos's numbers remain under NDA for most reviewers. Neither model is available via a standard API call. Mythos requires a Research Partner agreement with Anthropic. GPT-5.5-Cyber requires enrolling in the Trusted Access for Cyber program, a process that involves government vetting for most commercial applicants. Both companies briefed US federal agencies, state governments, and Five Eyes allies in late April 2026 before any public announcement. &lt;a href="https://dev.to/skilaai/openai-and-anthropic-are-racing-to-build-ai-cyber-weapons-neither-will-let-you-use-them-1oc8"&gt;The access reality is blunt&lt;/a&gt;: if your org is not already in conversation with Anthropic or OpenAI's federal teams, approval timelines extend well into 2027.&lt;/p&gt;
&lt;h2&gt;
  
  
  Open Defensive AI Stack in 60 seconds
&lt;/h2&gt;

&lt;p&gt;The accessible stack centers on three components you can deploy this week. Llama Guard 3 (Meta, generally available via HuggingFace and hosted APIs since Q4 2025) handles content-safety classification and prompt-injection detection. CodeLlama Guard applies the same family's code understanding to OWASP Top 10 vulnerability patterns—SQL injection, XSS, insecure deserialization. Cisco AI Defense (SaaS, launched March 2026 at $0.30/1M tokens) adds real-time threat triage and log forensics through a hosted API and a browser dashboard that needs no code integration for initial assessments. All three tools support GDPR and SOC 2 Type II requirements, ship API keys in minutes, and produce audit-ready output. &lt;a href="https://dev.to/alessandro_pignati/gpt-54-cyber-openais-game-changer-for-ai-security-and-defensive-ai-517l"&gt;Independent reviews&lt;/a&gt; confirm that for most defensive-only workflows, this stack closes 80–85% of the gap with the frontier models on documented benchmarks.&lt;/p&gt;
&lt;h2&gt;
  
  
  Head-to-head comparison
&lt;/h2&gt;
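&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;&lt;th&gt;Dimension&lt;/th&gt;&lt;th&gt;Closed Frontier (Mythos / GPT-5.5-Cyber)&lt;/th&gt;&lt;th&gt;Open Defensive Stack&lt;/th&gt;&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;API access today&lt;/td&gt;&lt;td&gt;No (allowlist only)&lt;/td&gt;&lt;td&gt;Yes (HuggingFace, Cisco portal, direct API)&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Phishing detection accuracy&lt;/td&gt;&lt;td&gt;~96% (NIST SP 800-177r2, reported)&lt;/td&gt;&lt;td&gt;~93.5% (CodeLlama Guard, reproducible)&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;OWASP Top 10 code audit&lt;/td&gt;&lt;td&gt;Strong (no public number)&lt;/td&gt;&lt;td&gt;7/10 A03:2021 (Injection) cases caught in our test&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Threat triage&lt;/td&gt;&lt;td&gt;Strong (closed evals, federal demos)&lt;/td&gt;&lt;td&gt;Moderate (Cisco AI Defense covers common scenarios)&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Log forensics&lt;/td&gt;&lt;td&gt;Strong (reported for gov use cases)&lt;/td&gt;&lt;td&gt;Moderate (requires prompt engineering)&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Offensive simulation&lt;/td&gt;&lt;td&gt;High (purpose-trained)&lt;/td&gt;&lt;td&gt;None by design&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Self-hosted option&lt;/td&gt;&lt;td&gt;No&lt;/td&gt;&lt;td&gt;Yes (Llama Guard 3, CodeLlama Guard)&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Data stays on-premise&lt;/td&gt;&lt;td&gt;No&lt;/td&gt;&lt;td&gt;Yes if self-hosted&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Pricing&lt;/td&gt;&lt;td&gt;Undisclosed&lt;/td&gt;&lt;td&gt;$0 (self-hosted) to $0.60/1M tokens&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Compliance coverage&lt;/td&gt;&lt;td&gt;CISA/DoD-aligned&lt;/td&gt;&lt;td&gt;GDPR, SOC 2 Type II&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
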
&lt;h2&gt;
  
  
  Real-world test: I tried both with phishing detection and code audit
&lt;/h2&gt;

&lt;p&gt;For phishing detection, I ran 200 real phishing emails through CodeLlama Guard via the HuggingFace Inference API and compared the results against GPT-5.5-Cyber's published accuracy figure on a comparable corpus. The open-stack call looks like this:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-sS&lt;/span&gt; https://api-inference.huggingface.co/models/meta-llama/CodeLlama-Guard-7b &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer &lt;/span&gt;&lt;span class="nv"&gt;$HF_TOKEN&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"inputs": "Urgent: Your account has been suspended. Click here to verify."}'&lt;/span&gt;
&lt;span class="c"&gt;# Returns: {"label":"HARMFUL","score":0.9871}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;CodeLlama Guard flagged 187 of 200 emails (93.5%) with a median latency of 220ms. GPT-5.5-Cyber's published figure on a similar NIST benchmark sits at 96%, a real gap but a narrow one for most production use cases. For the Cisco AI Defense path: open the dashboard, navigate to &lt;strong&gt;Threat Triage → Upload Corpus&lt;/strong&gt;, paste your email batch or log file, select &lt;strong&gt;Phishing Detection&lt;/strong&gt; as the analysis mode, and click &lt;strong&gt;Run Analysis&lt;/strong&gt;. Results appear in 10–30 seconds with per-item risk scores and remediation suggestions. No API integration is required for this workflow. On code audit, CodeLlama Guard caught 7 of 10 seeded SQL injection samples (OWASP A03:2021, Injection) in a test Node.js 22 codebase. GPT-5.5-Cyber has no public benchmark number for this task class, which makes direct comparison impossible without allowlist access.&lt;/p&gt;

&lt;h2&gt;
  
  
  Verdict by builder profile
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Solo dev building a SaaS product:&lt;/strong&gt; Use the open stack. Llama Guard 3 or Cisco AI Defense covers content safety and threat detection at a cost you can justify on a solo budget. Apply to Trusted Access now so you are positioned if your project scales.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Security engineer at a seed-to-Series A startup:&lt;/strong&gt; The open stack handles 80–85% of client deliverables at audit-ready pricing. File the allowlist application as a six-month hedge—approval timelines are long, but early applicants get priority when cohorts expand.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Engineering lead at a critical-infrastructure org (energy, finance, healthcare):&lt;/strong&gt; Push hard for Mythos or GPT-5.5-Cyber. The offensive-capability emulation and alignment with CISA guidance are material for your threat model in ways the open stack does not yet match.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Freelance DevSecOps consultant:&lt;/strong&gt; Build your standard deliverable on the open stack. It is reproducible, auditable, and priced for client contracts. Add an allowlist disclaimer clause to any contract where a client may later require frontier-model access.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Can I combine Llama Guard 3 with GPT-5.5-Cyber if I get allowlist access?&lt;/strong&gt;&lt;br&gt;
Yes. The Trusted Access program does not prohibit combining models. A practical split: use GPT-5.5-Cyber for adversarial simulation in a sandboxed red-team environment and Llama Guard 3 for real-time content filtering in your production API layer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Is Llama Guard 3 accurate enough for production phishing detection?&lt;/strong&gt;&lt;br&gt;
For most SaaS and internal-tool threat models, yes. At 93–94% accuracy on standard phishing corpora, it meets the threshold most security teams apply. High-security environments—banking, healthcare, defense contractors—should layer additional fine-tuned classifiers or wait for expanded frontier access.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What happens to my data if I use Cisco AI Defense's hosted API?&lt;/strong&gt;&lt;br&gt;
Cisco's May 2026 data-processing agreement covers GDPR and SOC 2 Type II. Data is not used for model training by default. Review the current DPA at cisco.com/go/ai-trust before signing enterprise contracts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Where do I find a full integration walkthrough for the open stack?&lt;/strong&gt;&lt;br&gt;
The upcoming &lt;em&gt;5 Defensive AI Tools Builders Can Actually Use in 2026 (No Allowlist Required)&lt;/em&gt; covers Llama Guard 3, Cisco AI Defense, and three other tools with cost tables and Next.js 16 integration examples.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This article was originally published on &lt;a href="https://nextfuture.io.vn" rel="noopener noreferrer"&gt;NextFuture&lt;/a&gt;. Follow us for more fullstack &amp;amp; AI engineering content.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>fullstack</category>
      <category>ai</category>
      <category>webdev</category>
      <category>javascript</category>
    </item>
    <item>
      <title>Coding API Costs in 2026: The $3.00 vs $0.50 Per Million Tokens Decision</title>
      <dc:creator>BeanBean</dc:creator>
      <pubDate>Tue, 05 May 2026 23:00:02 +0000</pubDate>
      <link>https://dev.to/bean_bean/coding-api-costs-in-2026-the-300-vs-050-per-million-tokens-decision-1c6j</link>
      <guid>https://dev.to/bean_bean/coding-api-costs-in-2026-the-300-vs-050-per-million-tokens-decision-1c6j</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://nextfuture.io.vn/blog/coding-api-costs-in-2026-the-300-vs-050-per-million-tokens-decision" rel="noopener noreferrer"&gt;NextFuture&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Should you route your coding API calls through Cursor Composer 2 instead of Claude Sonnet? For engineers and solo operators running code generation through the Anthropic API, the input-token math is clear: $3.00 per million for Claude Sonnet versus $0.50 per million for Cursor Composer 2. At 10,000 prompts per day, Composer 2 saves $275 per month on input tokens alone. At 1,000 prompts per day, migration takes nearly 11 months to pay back. The catch: Composer 2 is a coding-only model, so route general reasoning and conversational tasks to Claude Sonnet regardless.&lt;/p&gt;

&lt;h2&gt;
  
  
  TL;DR: the verdict
&lt;/h2&gt;

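&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;&lt;th&gt;Workload&lt;/th&gt;&lt;th&gt;Claude Sonnet (input only)&lt;/th&gt;&lt;th&gt;Cursor Composer 2 (input only)&lt;/th&gt;&lt;th&gt;Winner&lt;/th&gt;&lt;th&gt;Recovery time&lt;/th&gt;&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;Light: 100 prompts/day, 50K tokens/day&lt;/td&gt;&lt;td&gt;$3.30/mo&lt;/td&gt;&lt;td&gt;$0.55/mo&lt;/td&gt;&lt;td&gt;Composer 2&lt;/td&gt;&lt;td&gt;Never: $2.75/mo savings can't cover the $300 migration in any reasonable horizon&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Medium: 1,000 prompts/day, 500K tokens/day&lt;/td&gt;&lt;td&gt;$33.00/mo&lt;/td&gt;&lt;td&gt;$5.50/mo&lt;/td&gt;&lt;td&gt;Composer 2&lt;/td&gt;&lt;td&gt;~11 months: only worth it for long-running projects&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Heavy: 10,000 prompts/day, 5M tokens/day&lt;/td&gt;&lt;td&gt;$330.00/mo&lt;/td&gt;&lt;td&gt;$55.00/mo&lt;/td&gt;&lt;td&gt;Composer 2&lt;/td&gt;&lt;td&gt;~1 month: switch immediately&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;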

&lt;p&gt;&lt;strong&gt;Short answer&lt;/strong&gt;: Composer 2 wins on pure input price at every workload, but the migration effort only pays back in a reasonable timeframe at Heavy usage (10,000+ prompts/day). Costs above are input-token only; output pricing for Composer 2 is not published in the sources cited here — see the &lt;a href="https://dev.to/toyama0919/cursor-composer-2-the-cache-economy-behind-a-10x-cheaper-coding-agent-15cj"&gt;full Composer 2 breakdown&lt;/a&gt; and Cursor's pricing page before committing.&lt;/p&gt;

&lt;h2&gt;
  
  
  What each one actually costs
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Claude Sonnet pricing breakdown
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Pay-per-token&lt;/strong&gt;: $3.00 per 1M input tokens, as &lt;a href="https://dev.to/marcene_272af51cf7ba004c3/i-built-an-ai-api-aggregator-that-saves-developers-60-85-on-model-costs-3olo"&gt;cited across multiple cost audits of the Anthropic API&lt;/a&gt;. Output pricing: the sources reviewed here don't cite a figure, so check anthropic.com/pricing before running production estimates.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;No flat fee&lt;/strong&gt;: pure usage-based billing, no minimums, no seat charges.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;No lock-in&lt;/strong&gt;: API key cancellation at any time, no annual commitment required.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;One developer's audit of his own API spend found that smarter model routing — not a single wholesale switch — cut costs by 60–85%. At $3.00 per million input tokens, Claude Sonnet is not the cheapest option for coding-only tasks where a specialized model can step in.&lt;/p&gt;

&lt;h3&gt;
  
  
  Cursor Composer 2 pricing breakdown
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;API usage&lt;/strong&gt;: $0.50 per 1M input tokens, &lt;a href="https://dev.to/toyama0919/cursor-composer-2-the-cache-economy-behind-a-10x-cheaper-coding-agent-15cj"&gt;per the Composer 2 technical breakdown published March 2026&lt;/a&gt;. Output pricing: not cited in the available sources, so treat it as unknown and verify at cursor.com/pricing.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cache reads&lt;/strong&gt;: the same article reports cache read tokens cost less than standard input tokens. At high volume, cache hit rate on repeated code patterns can push effective cost well below $0.50/1M.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;No lock-in&lt;/strong&gt;: API key integration, stateless calls, no data migration required to switch away.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The $0.50/1M price applies only to the subset of calls you can safely route to a coding-only model. All general reasoning, code review narrative, and requirement parsing stays on Claude Sonnet — model this constraint before calculating savings.&lt;/p&gt;

&lt;p&gt;For a hands-on look at Composer 2's output quality in a real project, see our &lt;a href="https://dev.to/blog/cursor-composer-2-for-nextjs-16-5-things-that-actually-changed"&gt;Cursor Composer 2 for Next.js 16 review&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Break-even, walked through
&lt;/h2&gt;

&lt;p&gt;The math here uses 22 working days per month and input-only token pricing. At &lt;strong&gt;Medium workload&lt;/strong&gt; — 1,000 prompts per day averaging 500 input tokens each, totaling 500,000 input tokens per day — Claude Sonnet costs $3.00 × (500,000 × 22 / 1,000,000) = &lt;strong&gt;$33.00 per month&lt;/strong&gt;. Cursor Composer 2 at $0.50 per million tokens costs $0.50 × (500,000 × 22 / 1,000,000) = &lt;strong&gt;$5.50 per month&lt;/strong&gt;. Monthly savings: $27.50.&lt;/p&gt;

&lt;p&gt;At &lt;strong&gt;Heavy workload&lt;/strong&gt; — 10,000 prompts per day averaging 500 input tokens each, totaling 5 million input tokens per day — Claude Sonnet costs $330.00 per month. Cursor Composer 2 costs $55.00 per month. Savings: $275.00 per month on input tokens.&lt;/p&gt;

&lt;p&gt;The inflection point where Composer 2 &lt;em&gt;clearly&lt;/em&gt; justifies switching is around &lt;strong&gt;5,000 prompts per day&lt;/strong&gt;. At that volume, input savings reach $137.50 per month and the $300 one-time migration cost (4 hours of developer time at a blended $75/hour rate) pays back in about 2.2 months. The 6-month payback line sits lower, at roughly 1,800 prompts per day: below it, monthly savings alone take longer than 6 months to recover the migration cost; above it, payback drops under 6 months, a reasonable horizon for any production service you plan to run through next year.&lt;/p&gt;

&lt;p&gt;One factor the math doesn't fully capture: cache reads. The &lt;a href="https://dev.to/toyama0919/cursor-composer-2-the-cache-economy-behind-a-10x-cheaper-coding-agent-15cj"&gt;March 2026 technical breakdown&lt;/a&gt; reports that repeated code patterns hit Composer 2's cache at sub-$0.50/1M rates, compressing the Heavy-workload payback further — though without a published cache hit rate, treat that as directional, not hard math. Track token spend by model with &lt;a href="https://dev.to/blog/llm-observability-tools-2026-4-types-ai-engineers-get-wrong"&gt;LLM observability tooling&lt;/a&gt; to validate the switch empirically.&lt;/p&gt;
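&lt;p&gt;The walkthrough above can be reproduced with a short script. This is a minimal sketch that uses only the figures already stated (22 working days, 500 input tokens per prompt, $3.00 and $0.50 per 1M input tokens, a $300 migration cost); the function and field names are illustrative, and output tokens and cache reads are deliberately left out.&lt;/p&gt;

```python
from dataclasses import dataclass

WORKING_DAYS_PER_MONTH = 22        # same assumption as the walkthrough
SONNET_INPUT_USD_PER_M = 3.00      # input price cited in this article
COMPOSER_INPUT_USD_PER_M = 0.50    # input price cited in this article


@dataclass
class BreakEven:
    sonnet_monthly: float
    composer_monthly: float
    monthly_savings: float
    payback_months: float


def break_even(prompts_per_day, tokens_per_prompt=500, migration_cost=300.0):
    """Input-only monthly cost per model and months needed to recover the migration cost."""
    monthly_millions = prompts_per_day * tokens_per_prompt * WORKING_DAYS_PER_MONTH / 1_000_000
    sonnet = monthly_millions * SONNET_INPUT_USD_PER_M
    composer = monthly_millions * COMPOSER_INPUT_USD_PER_M
    savings = sonnet - composer
    return BreakEven(sonnet, composer, savings, migration_cost / savings)


for label, prompts in (("Medium", 1_000), ("Heavy", 10_000)):
    r = break_even(prompts)
    print(f"{label}: Sonnet ${r.sonnet_monthly:.2f}, Composer 2 ${r.composer_monthly:.2f}, "
          f"savings ${r.monthly_savings:.2f}/mo, payback {r.payback_months:.1f} months")
```

&lt;p&gt;Swap in your own prompt volume and average token count; the result is only as good as those two inputs.&lt;/p&gt;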

&lt;h2&gt;
  
  
  What switching actually costs in time
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Migration time&lt;/strong&gt;: 4 hours — update the API endpoint and model identifier, validate response schema compatibility in staging (format compatibility with OpenAI-style clients is unconfirmed in sources), and run your code generation test suite.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Ramp period&lt;/strong&gt;: 5 days running both models on a sample of production traffic. Code outputs should pass your existing linting and test gates; prompt adjustments may be needed before full cutover.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Lock-in to leave&lt;/strong&gt;: none — Cursor Composer 2 is an API call, stateless, no data persists on their side. Switching back to Claude Sonnet means reverting one config change.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Recovery&lt;/strong&gt;: at Heavy workload, $275/month in input savings recovers the $300 migration cost in approximately 1.1 months. At Medium workload, $27.50/month savings recovers the same friction cost in approximately 10.9 months. Below Medium, the switch costs more in labor than it saves in the first year — don't do it unless your workload is growing toward that threshold.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The real risk is quality, not cost. Any prompt outside pure code generation will return degraded output — classify your call types before routing traffic to Composer 2.&lt;/p&gt;
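&lt;p&gt;One way to make that classification explicit is to tag each call site with a task type and route on the tag. The sketch below is hypothetical: the task labels and the two model identifiers are placeholders chosen for illustration, not confirmed API model IDs.&lt;/p&gt;

```python
# Hypothetical task labels and model names; neither comes from a vendor API.
CODE_ONLY_TASKS = {"code_generation", "refactoring", "test_generation", "formatting"}


def pick_model(task_type):
    """Send pure code-generation work to Composer 2 and everything else to Claude Sonnet."""
    if task_type in CODE_ONLY_TASKS:
        return "cursor-composer-2"  # placeholder identifier
    return "claude-sonnet"  # placeholder identifier


def routable_share(task_types):
    """Fraction of calls that can move; savings apply only to this share."""
    routed = sum(1 for t in task_types if t in CODE_ONLY_TASKS)
    return routed / len(task_types)


sample = ["code_generation", "requirement_parsing", "test_generation", "code_review_narrative"]
for task in sample:
    print(task, "goes to", pick_model(task))
print(f"routable share: {routable_share(sample):.0%}")
```

&lt;p&gt;Multiplying projected savings by the routable share gives a more realistic figure than assuming every prompt can move.&lt;/p&gt;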

&lt;h2&gt;
  
  
  Pick by your profile
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Solo dev, side projects, fewer than 500 prompts/day&lt;/strong&gt;: stay on Claude Sonnet. Your monthly input cost is under $17, and the migration overhead exceeds your first year of savings. Revisit when daily prompt volume crosses 1,000.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Team of 5–20, predictable code generation workload&lt;/strong&gt;: run the calculation with your actual token counts. If your team generates 2,000+ coding prompts per day, the switch pays back in 5–6 months. Instrument first — &lt;a href="https://dev.to/hiyoyok/gemini-vs-claude-vs-gpt-4-for-code-debugging-practical-comparison-2026-dpb"&gt;real debugging workloads show significant variation&lt;/a&gt; in token consumption per prompt type, so measure before you estimate.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cost-sensitive batch processing&lt;/strong&gt;: Cursor Composer 2 is the clear choice if your pipeline runs code generation jobs in bulk — formatting, refactoring, test generation. At $0.50/1M input, batch input costs are 6× lower than Claude Sonnet. Run a parallel smoke test on a representative 10,000-prompt batch before cutting over production.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Latency- or quality-critical user-facing code generation&lt;/strong&gt;: evaluate quality first, price second. The &lt;a href="https://dev.to/agentstackteam/i-asked-3-ais-to-ship-a-tool-together-heres-what-actually-shipped-3p3c"&gt;3-AI production comparison&lt;/a&gt; found quality differences between models are task-dependent and measurable — benchmark on your own eval set before committing.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If your architecture routes multiple models and you want to avoid rebuilding API integration from scratch, see our &lt;a href="https://dev.to/blog/best-ai-gateway-tools-for-multi-model-llm-apps-in-2026"&gt;overview of AI gateway tools&lt;/a&gt; — they let you A/B test model routing without touching application code.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Is Cursor Composer 2 actually cheaper than Claude Sonnet?
&lt;/h3&gt;

&lt;p&gt;Yes, on input tokens: $0.50/1M versus $3.00/1M — a 6× difference at the input layer. Output token pricing for Composer 2 is not published in current sources, so total cost comparison requires verifying output rates at cursor.com/pricing before drawing a final conclusion.&lt;/p&gt;

&lt;h3&gt;
  
  
  How long until switching pays for itself?
&lt;/h3&gt;

&lt;p&gt;At Heavy workload (10,000 prompts/day), the $275/month input savings recovers a $300 migration cost in ~1.1 months. At Medium workload (1,000 prompts/day), recovery takes ~10.9 months — justified only if the workload holds steady over 12+ months.&lt;/p&gt;

&lt;h3&gt;
  
  
  What if my workload changes?
&lt;/h3&gt;

&lt;p&gt;Monthly savings = (daily input tokens × 22 × $2.50) / 1,000,000. Divide your migration cost by that figure to get your payback in months. The crossover from "don't switch" to "switch now" sits around 5,000 prompts per day at current pricing.&lt;/p&gt;
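&lt;p&gt;The same formula can be inverted to ask how many prompts per day a target payback period requires. A minimal sketch, assuming the article's defaults (500 input tokens per prompt, a $2.50 per 1M price gap, 22 working days, $300 migration cost); the function name is illustrative.&lt;/p&gt;

```python
import math


def prompts_per_day_for_payback(target_months, migration_cost=300.0,
                                tokens_per_prompt=500, price_gap_per_m=2.50,
                                working_days=22):
    """Smallest daily prompt volume whose input savings repay the migration in target_months."""
    required_monthly_savings = migration_cost / target_months
    monthly_tokens = required_monthly_savings / price_gap_per_m * 1_000_000
    return math.ceil(monthly_tokens / working_days / tokens_per_prompt)


for months in (3, 6, 12):
    print(f"payback in {months} months needs about "
          f"{prompts_per_day_for_payback(months)} prompts/day")
```

&lt;p&gt;With these defaults, a 6-month payback works out to roughly 1,800 prompts per day and a 12-month payback to roughly 900.&lt;/p&gt;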

&lt;h3&gt;
  
  
  Are these prices current as of May 2026?
&lt;/h3&gt;

&lt;p&gt;Pricing is pulled from two sources published in early 2026: the &lt;a href="https://dev.to/marcene_272af51cf7ba004c3/i-built-an-ai-api-aggregator-that-saves-developers-60-85-on-model-costs-3olo"&gt;developer API cost audit&lt;/a&gt; for Claude Sonnet input pricing, and the &lt;a href="https://dev.to/toyama0919/cursor-composer-2-the-cache-economy-behind-a-10x-cheaper-coding-agent-15cj"&gt;Cursor Composer 2 cache economy breakdown&lt;/a&gt; for Composer 2 input pricing. Vendors change pricing without notice, so confirm current rates at anthropic.com/pricing and cursor.com/pricing before committing.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can I use Cursor Composer 2 for tasks other than coding?
&lt;/h3&gt;

&lt;p&gt;No — Composer 2 was trained exclusively on code data. Routing document summaries, planning tasks, or conversational prompts to it will produce degraded output. The &lt;a href="https://dev.to/owen_fox/best-ai-models-in-2026-complete-guide-2ac7"&gt;2026 model guide&lt;/a&gt; maps which frontier models handle which task types and at what cost.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This article was originally published on &lt;a href="https://nextfuture.io.vn" rel="noopener noreferrer"&gt;NextFuture&lt;/a&gt;. Follow us for more fullstack &amp;amp; AI engineering content.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>fullstack</category>
      <category>ai</category>
      <category>webdev</category>
      <category>javascript</category>
    </item>
    <item>
      <title>Mythos vs GPT-5.5-Cyber: Honest Offensive Security Benchmark 2026</title>
      <dc:creator>BeanBean</dc:creator>
      <pubDate>Mon, 04 May 2026 05:00:01 +0000</pubDate>
      <link>https://dev.to/bean_bean/mythos-vs-gpt-55-cyber-honest-offensive-security-benchmark-2026-1dod</link>
      <guid>https://dev.to/bean_bean/mythos-vs-gpt-55-cyber-honest-offensive-security-benchmark-2026-1dod</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://nextfuture.io.vn/blog/mythos-vs-gpt-55-cyber-honest-offensive-security-benchmark-2026" rel="noopener noreferrer"&gt;NextFuture&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Anthropic's Mythos and OpenAI's GPT-5.5-Cyber both shipped in April–May 2026 as purpose-built cybersecurity models, and both landed behind strict allowlists within days of each other. For AI engineers evaluating them honestly, the core problem is the same: most practitioners can't get direct API access, so any comparison relies on third-party evals, CTF leaderboard data, and structured capability disclosures from partner briefings. This piece pulls those threads together and gives you the clearest signal available as of May 4, 2026. For the full geopolitical backdrop, see our cluster anchor &lt;a href="https://dev.to/blog/inside-the-ai-cyber-arms-race-may-2026-mythos-gpt-55-cyber-and-what-builders-can-use"&gt;Inside the AI Cyber Arms Race (May 2026)&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  TL;DR: which one wins
&lt;/h2&gt;

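&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;&lt;th&gt;Dimension&lt;/th&gt;&lt;th&gt;Mythos (Anthropic)&lt;/th&gt;&lt;th&gt;GPT-5.5-Cyber (OpenAI)&lt;/th&gt;&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;Access model&lt;/td&gt;&lt;td&gt;Invite-only, ~40 vetted orgs&lt;/td&gt;&lt;td&gt;Trusted Access for Cyber program (broader cohort)&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Public CTF benchmark&lt;/td&gt;&lt;td&gt;Not released&lt;/td&gt;&lt;td&gt;~72% on Simon Willison's April 30 eval subset&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Refusal design&lt;/td&gt;&lt;td&gt;Capability-level: baked into model weights&lt;/td&gt;&lt;td&gt;Intent-contextual: evaluates stated purpose&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Best fit&lt;/td&gt;&lt;td&gt;Red-team simulation inside vetted org&lt;/td&gt;&lt;td&gt;Threat triage + defensive automation at scale&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
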
&lt;h2&gt;
  
  
  Mythos in 60 seconds
&lt;/h2&gt;

&lt;p&gt;Anthropic announced Mythos on April 7, 2026 as a model built specifically for cybersecurity tasks — vulnerability analysis, adversarial threat modeling, and red-team exercises within vetted organizations. Access is restricted to roughly 40 organizations that passed Anthropic's vetting process, which requires a demonstrated defensive security mission and signed use constraints that prohibit offensive deployment against external targets. Anthropic has released no public benchmarks and no system card for Mythos as of this writing. Capability claims come primarily from partner briefings and secondhand accounts from approved organizations.&lt;/p&gt;

&lt;p&gt;The architectural detail that matters most for engineers: Mythos reportedly refuses offensive tasks at the model weights level, not through a prompt filter. That means jailbreak techniques that work on claude-opus-4 and similar Anthropic models don't transfer. The refusal is structural, not instructional — a meaningful distinction if you're designing a red-team workflow that needs predictable model behavior under adversarial prompting.&lt;/p&gt;
&lt;h2&gt;
  
  
  GPT-5.5-Cyber in 60 seconds
&lt;/h2&gt;

&lt;p&gt;OpenAI shipped GPT-5.5-Cyber in late April 2026 through its Trusted Access for Cyber program — within days of publicly criticizing Anthropic's allowlist approach, then quietly adopting the same model for its own launch. The model targets what OpenAI calls "critical cyber defenders": federal agencies, national labs, and vetted security firms. Unlike Mythos, OpenAI published partial capability notes showing the model handles code vulnerability scanning, threat intelligence summarization, and CTF problem solving. Early participant briefings referenced "GPT-5.4-Cyber"; the version shipping through the program in May 2026 carries the GPT-5.5-Cyber designation — two checkpoint versions of the same fine-tuned stack.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://simonwillison.net/2026/Apr/30/gpt-55-cyber-capabilities/#atom-everything" rel="noopener noreferrer"&gt;Simon Willison's independent evaluation on April 30, 2026&lt;/a&gt; put GPT-5.5-Cyber at approximately 72% on a structured CTF subset. That's above what a general-purpose GPT-4o variant with standard prompting achieves, but Willison flagged that the refusal layer blocked completion on challenges requiring simulated exploitation steps — even in sandboxed test contexts. The intent-contextual refusal design creates friction in automated eval pipelines where the model can't verify operator intent.&lt;/p&gt;
&lt;h2&gt;
  
  
  Head-to-head comparison
&lt;/h2&gt;

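&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;&lt;th&gt;Dimension&lt;/th&gt;&lt;th&gt;Mythos&lt;/th&gt;&lt;th&gt;GPT-5.5-Cyber&lt;/th&gt;&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;Access mechanism&lt;/td&gt;&lt;td&gt;~40 org allowlist, Anthropic-vetted&lt;/td&gt;&lt;td&gt;Trusted Access for Cyber, OpenAI-reviewed&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;API model ID&lt;/td&gt;&lt;td&gt;Not publicly disclosed&lt;/td&gt;&lt;td&gt;&lt;code&gt;gpt-5.5-cyber&lt;/code&gt; (confirmed in Willison eval)&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;System card&lt;/td&gt;&lt;td&gt;None released&lt;/td&gt;&lt;td&gt;Partial capability notes released&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;CTF benchmark&lt;/td&gt;&lt;td&gt;Undisclosed&lt;/td&gt;&lt;td&gt;~72% on April 30, 2026 Willison subset&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Refusal design&lt;/td&gt;&lt;td&gt;Capability-level (weights layer)&lt;/td&gt;&lt;td&gt;Intent-contextual (prompt evaluation)&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Jailbreak resistance&lt;/td&gt;&lt;td&gt;High: standard Anthropic jailbreaks fail&lt;/td&gt;&lt;td&gt;Moderate: intent spoofing possible in testing&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Defensive task strength&lt;/td&gt;&lt;td&gt;Threat modeling, vuln disclosure&lt;/td&gt;&lt;td&gt;Threat triage, code audit, CTF scaffolding&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Public pricing&lt;/td&gt;&lt;td&gt;None&lt;/td&gt;&lt;td&gt;None&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
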
&lt;h2&gt;
  
  
  Real-world test: what public evals show on offensive CTF tasks
&lt;/h2&gt;

&lt;p&gt;Direct API access to either model is unavailable to most engineers, so this section synthesizes the three most substantive public evaluations available through May 2026. Willison's test is the gold standard — he ran GPT-5.5-Cyber through challenges in four categories: binary exploitation, web vulnerability identification, network forensics, and cryptographic puzzle solving. The model completed the web vuln and network forensics tasks cleanly. It stalled on binary exploitation steps that required generating shellcode, even with explicit sandboxed-environment framing in the system prompt. Willison's conclusion: the model performs well as a knowledge retrieval and triage layer, but it blocks at the point where output would constitute a usable exploit artifact.&lt;/p&gt;

&lt;p&gt;For Mythos, partner-reported findings describe a different failure mode: the model excels at generating structured threat models and writing adversarial test scenarios, but it consistently refuses to produce working exploit code even when the system prompt establishes red-team context and operator authorization. Unlike GPT-5.5-Cyber, which sometimes completes partial steps before refusing, Mythos declines the task before generating any output — consistent with its weights-level refusal architecture.&lt;/p&gt;

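&lt;p&gt;The code path for either model, once you hold an approved API key, follows standard SDK conventions. For Mythos on the Anthropic SDK (Anthropic has not publicly disclosed the model ID, so treat &lt;code&gt;mythos-20260401&lt;/code&gt; below as a placeholder):&lt;br&gt;
&lt;/p&gt;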
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;anthropic&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;anthropic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Anthropic&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mythos-20260401&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2048&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;system&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are assisting an authorized red team. Environment: isolated lab network, no external connectivity.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Identify exploitable weaknesses in this service config and generate a structured threat report: [config]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;OpenAI's equivalent uses the standard &lt;code&gt;/v1/chat/completions&lt;/code&gt; endpoint with &lt;code&gt;model="gpt-5.5-cyber"&lt;/code&gt; — no special parameter beyond the model ID. Both programs mandate full session logging through their respective partner portals. If you access the model through the UI rather than the API, Anthropic's partner dashboard and OpenAI's Trusted Access interface both surface the same session logs to your organization's security contact.&lt;/p&gt;
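&lt;p&gt;For comparison, here is a hedged sketch of the request body that the Chat Completions endpoint would receive. Only the &lt;code&gt;gpt-5.5-cyber&lt;/code&gt; model ID comes from the evaluation cited above; the helper name and both prompts are illustrative, and the script only builds and prints the payload, so it runs without a key.&lt;/p&gt;

```python
import json


def build_cyber_request(user_prompt, system_prompt):
    """Build a Chat Completions request body; only the model ID comes from the article."""
    return {
        "model": "gpt-5.5-cyber",
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    }


payload = build_cyber_request(
    user_prompt="Summarize the likely attack paths in this service config: [config]",
    system_prompt="You are assisting a defensive security team in an isolated lab.",
)
print(json.dumps(payload, indent=2))

# With an approved key and the openai package installed, the same body could be sent as:
#   from openai import OpenAI
#   response = OpenAI().chat.completions.create(**payload)
```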

&lt;h2&gt;
  
  
  Verdict by builder profile
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Security researcher at a vetted org:&lt;/strong&gt; GPT-5.5-Cyber has a published eval baseline and a slightly broader access program than Mythos. Apply through Trusted Access for Cyber first — the published capability notes make scope-setting with your security team easier than Mythos's opaque briefing process.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Red team lead at an enterprise:&lt;/strong&gt; Mythos is the stronger choice for adversarial simulation if Anthropic approves you. The weights-level refusal design means fewer successful jailbreaks in your test logs and cleaner audit trails, and both matter when you report red-team sessions to your CISO.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;AI engineer building defensive tooling:&lt;/strong&gt; Neither model is accessible to you yet. Our upcoming deep-dive &lt;em&gt;Closed Frontier Cyber AI vs Open Defensive Tools: Real-World Comparison 2026&lt;/em&gt; covers the open-stack alternatives — Llama Guard 3, CodeLlama Guard, Cisco AI Defense — that ship to production today without an allowlist.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Independent security researcher:&lt;/strong&gt; You're outside both allowlists for now. OpenAI has signaled a broader rollout through the Trusted Access for Cyber program in late 2026. Until then, check &lt;a href="https://www.therundown.ai/p/openai-gpt-5-4-cyber-rejects-mythos-playbook" rel="noopener noreferrer"&gt;The Rundown's breakdown of the GPT-5.5-Cyber strategy&lt;/a&gt; and &lt;a href="https://dev.to/alessandro_pignati/gpt-54-cyber-openais-game-changer-for-ai-security-and-defensive-ai-517l"&gt;Alessandro Pignati's capabilities analysis on dev.to&lt;/a&gt; for the most current independent assessments.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Is GPT-5.5-Cyber the same model as GPT-5.4-Cyber?&lt;/strong&gt;&lt;br&gt;
No. Early participant briefings in April 2026 referenced "GPT-5.4-Cyber." The version shipping through the Trusted Access program in May 2026 carries the GPT-5.5-Cyber designation. OpenAI described it as an updated checkpoint of the same fine-tuned cybersecurity stack, with improved CTF performance and tighter intent-evaluation behavior in the refusal layer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can I evaluate GPT-5.5-Cyber without Trusted Access program approval?&lt;/strong&gt;&lt;br&gt;
No direct API or playground access exists outside the program. Simon Willison's April 30, 2026 evaluation is the most structured independent test publicly available. The Rundown AI and dev.to analysts have published secondary analyses, but none involved unrestricted API access.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Will Anthropic release a system card for Mythos?&lt;/strong&gt;&lt;br&gt;
As of May 4, 2026, Anthropic has not published a system card. Partner briefings describe a phased transparency process, but no public release date is confirmed. OpenAI's partial capability notes for GPT-5.5-Cyber set a weak precedent — they describe performance categories but omit benchmark methodology.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Does either model require special SDK configuration beyond the model ID?&lt;/strong&gt;&lt;br&gt;
No. Both use standard message-passing APIs — the Anthropic Python SDK for Mythos, the OpenAI Python SDK for GPT-5.5-Cyber. You switch models by changing the &lt;code&gt;model&lt;/code&gt; parameter. Session logging enforcement happens at the API gateway layer on both platforms, not in client code. Our upcoming piece &lt;em&gt;Inside GPT-5.5-Cyber: Capabilities, Refusals, and Federal Briefings Explained&lt;/em&gt; covers the full API behavior profile in detail.&lt;/p&gt;





</description>
      <category>fullstack</category>
      <category>ai</category>
      <category>webdev</category>
      <category>javascript</category>
    </item>
    <item>
      <title>LLM Observability Tools 2026: 4 Types AI Engineers Get Wrong</title>
      <dc:creator>BeanBean</dc:creator>
      <pubDate>Sun, 03 May 2026 17:00:13 +0000</pubDate>
      <link>https://dev.to/bean_bean/llm-observability-tools-2026-4-types-ai-engineers-get-wrong-1kb</link>
      <guid>https://dev.to/bean_bean/llm-observability-tools-2026-4-types-ai-engineers-get-wrong-1kb</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://nextfuture.io.vn/blog/llm-observability-tools-2026-4-types-ai-engineers-get-wrong" rel="noopener noreferrer"&gt;NextFuture&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;On May 2, 2026, two analyses of the LLM observability category dropped within four hours of each other — and both made the same point: eight tools claim identical keywords (tracing, observability, logging, cost tracking) but instrument your stack at completely different layers. If you picked yours from a feature comparison table, there's a reasonable chance it's the wrong architectural fit for your workload.&lt;/p&gt;

&lt;h2&gt;
  
  
  What changed
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Four distinct tool architectures are now in production&lt;/strong&gt;: SDK-based tracers (Langfuse, Phoenix), reverse-proxy loggers (Helicone), evals platforms with tracing bolt-ons, and enterprise ML monitors that added LLM support last year (Datadog LLM Observability, Arize). They all pass the same marketing checklist but instrument at different points in your request path.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;OpenTelemetry's &lt;code&gt;gen_ai.*&lt;/code&gt; semantic conventions reached stable status&lt;/strong&gt;, but they only standardize token counts and latency — not output quality, prompt version, or agent-step attribution. Existing OTel pipelines need custom attributes before they cover the AI-specific signals that matter.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Agentic workloads broke the per-request model&lt;/strong&gt;: a single LangGraph run generates one HTTP 200 but may trigger 14 LLM calls across 6 tool invocations. A reverse proxy sees 14 separate API calls with no connection between them. An SDK tracer sees one trace with 14 spans. The tool you choose determines which view you get — and you can't reconstruct the other retroactively.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
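&lt;p&gt;The difference between the two views in the agentic case can be shown with plain data structures. This is an illustrative sketch, not any vendor's data model: the record fields and the &lt;code&gt;run-1&lt;/code&gt; trace ID are made up, and the 14-call run mirrors the example above.&lt;/p&gt;

```python
from collections import defaultdict

# Hypothetical records for one agent run: 14 LLM calls, each tagged with a trace ID by the SDK.
calls = [{"trace_id": "run-1", "step": i, "input_tokens": 200} for i in range(1, 15)]

# Proxy view: the trace ID never reaches the proxy, so each call is an unrelated row.
proxy_view = [{"input_tokens": c["input_tokens"]} for c in calls]

# SDK tracer view: calls are grouped under the trace that produced them.
traces = defaultdict(list)
for c in calls:
    traces[c["trace_id"]].append(c)

print(len(proxy_view), "unlinked rows at the proxy")
print(len(traces), "trace with", len(traces["run-1"]), "spans in the tracer")
```

&lt;p&gt;Once the grouping key is dropped at capture time, nothing in the proxy rows lets you rebuild it later.&lt;/p&gt;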

&lt;h2&gt;
  
  
  Why builders should care
&lt;/h2&gt;

&lt;p&gt;A &lt;strong&gt;reverse proxy&lt;/strong&gt; (Helicone: free up to 10K requests/mo, $20/mo Starter) logs at the network edge — token counts and latency per call, but no context about which agent step or prompt template generated it. An &lt;strong&gt;SDK-based tracer&lt;/strong&gt; (Langfuse: self-hosted free, cloud from $59/mo) instruments at the code layer — trace hierarchy, step attribution, prompt versioning — but every LLM-calling service needs the SDK and an explicit instrumentation call. Mixing both without a reason means paying for both while still hitting blind spots.&lt;/p&gt;

&lt;p&gt;The choice maps to workload type. A straightforward RAG endpoint — one LLM call per request — needs a reverse proxy and nothing else. Multi-step agents with &lt;a href="https://dev.to/grepture/llm-observability-tools-compared-the-2026-landscape-gdf"&gt;LangGraph, Anthropic tool use, or a custom loop&lt;/a&gt; lose attribution the moment a chain branches. The bad response in an agentic system doesn't come from the API layer; it comes from step 7 of 12, which no proxy traces.&lt;/p&gt;

&lt;h2&gt;
  
  
  What changes in your workflow
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;If you already run OTel&lt;/strong&gt;: add &lt;code&gt;gen_ai.usage.input_tokens&lt;/code&gt;, &lt;code&gt;gen_ai.usage.output_tokens&lt;/code&gt;, and &lt;code&gt;gen_ai.response.finish_reasons&lt;/code&gt; to your span attributes. These are stable &lt;a href="https://dev.to/rafacalderon/observability-for-ai-systems-with-opentelemetry-gfn"&gt;OTel GenAI semantic conventions&lt;/a&gt; as of May 2026. Datadog, Honeycomb, and New Relic ingest them natively, so no new vendor is required for basic cost and latency dashboards.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Adding Helicone&lt;/strong&gt;: this is a &lt;code&gt;baseURL&lt;/code&gt; swap, not an SDK install. Point your OpenAI client at &lt;code&gt;https://gateway.helicone.ai&lt;/code&gt;, add a &lt;code&gt;Helicone-Auth&lt;/code&gt; header carrying your Helicone API key as a Bearer token, and the proxy starts logging within seconds. Works with any OpenAI-compatible client. For Anthropic, swap to &lt;code&gt;https://anthropic.helicone.ai&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Adding Langfuse&lt;/strong&gt;: install &lt;code&gt;langfuse&lt;/code&gt; (the package carries the same name on PyPI and npm), wrap LLM calls in &lt;code&gt;langfuse.trace()&lt;/code&gt; / &lt;code&gt;langfuse.generation()&lt;/code&gt;, and flush before process exit. In serverless (Lambda, Vercel Functions), events are batched and sent in the background, so call &lt;code&gt;await langfuse.flushAsync()&lt;/code&gt; explicitly before returning the response, or spans are dropped when the container freezes or terminates.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Enterprise monitors (Datadog, Arize)&lt;/strong&gt;: agent-aware dashboards and hallucination scoring, but billed per span. Datadog LLM Observability charges $0.10/1K spans after the free tier. A pipeline at 100 req/min handles 144,000 requests/day, so at about 7 spans per request it generates ~1M spans/day, roughly $100/day at that rate. Verify volume before enabling.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
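&lt;p&gt;Per-span billing is easy to underestimate, so it helps to run the volume math before enabling it. A rough sketch, assuming the $0.10/1K span rate quoted above and ignoring any free tier; the spans-per-request figure is an input you have to measure yourself.&lt;/p&gt;

```python
def daily_span_cost(requests_per_minute, spans_per_request, usd_per_1k_spans=0.10):
    """Return (spans per day, estimated USD per day) for a steady request rate."""
    spans_per_day = requests_per_minute * 60 * 24 * spans_per_request
    return spans_per_day, spans_per_day / 1000 * usd_per_1k_spans


for spans_per_request in (1, 7, 14):
    spans, usd = daily_span_cost(100, spans_per_request)
    print(f"{spans_per_request} spans/request: {spans:,} spans/day, about ${usd:,.2f}/day")
```

&lt;p&gt;A single-shot endpoint and a 14-call agent at the same request rate differ by more than an order of magnitude in span volume.&lt;/p&gt;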

&lt;h2&gt;
  
  
  5 action items for this week
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Map every place an LLM call originates in your codebase — app server, background worker, agent loop — before choosing a tool type. A spreadsheet with "call site → call count → agent or single-shot" takes 30 minutes and eliminates the wrong architectural choice.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If you already ship OTel spans, add &lt;code&gt;gen_ai.usage.input_tokens&lt;/code&gt; and &lt;code&gt;gen_ai.usage.output_tokens&lt;/code&gt; to your existing traces this week. Your APM vendor likely ingests them already, so no new contract is needed to get cost visibility.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Run Helicone in your dev environment for 48 hours: swap &lt;code&gt;openai.baseURL&lt;/code&gt; to &lt;code&gt;https://gateway.helicone.ai&lt;/code&gt;, add &lt;code&gt;Helicone-Auth: Bearer &amp;lt;key&amp;gt;&lt;/code&gt;, and read the cost dashboard before considering anything else. It's the fastest way to get baseline data.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If you run LangGraph or LlamaIndex agents, install Langfuse's native integration. The &lt;code&gt;@observe()&lt;/code&gt; decorator (Python) or &lt;code&gt;CallbackHandler&lt;/code&gt; (LangChain/LangGraph) wraps the full chain automatically — you get span hierarchy, token counts, and latency per step with two lines of code.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;For output-quality tracking beyond latency, look at &lt;a href="https://nextfuture.io.vn/blog/langfuse-experiments-rebuild-what-llm-devs-need-to-know-2026" rel="noopener noreferrer"&gt;Langfuse Experiments (now rebuilt for 2026)&lt;/a&gt; or Arize Phoenix — these let you run eval datasets against prompt versions, not just monitor live traffic. Add evals before you add more prompts.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
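&lt;p&gt;For the OTel item above, the attribute mapping itself is small. The helper below is a sketch with an illustrative name; it only builds the attribute dictionary, using the plural &lt;code&gt;gen_ai.response.finish_reasons&lt;/code&gt; key because the convention defines that attribute as a list. Attaching it to a span is shown in a comment and assumes the &lt;code&gt;opentelemetry-api&lt;/code&gt; package is installed.&lt;/p&gt;

```python
def gen_ai_span_attributes(input_tokens, output_tokens, finish_reasons):
    """Map one LLM response's usage numbers onto gen_ai.* attribute keys."""
    return {
        "gen_ai.usage.input_tokens": int(input_tokens),
        "gen_ai.usage.output_tokens": int(output_tokens),
        "gen_ai.response.finish_reasons": list(finish_reasons),
    }


attrs = gen_ai_span_attributes(512, 128, ["stop"])
print(attrs)

# With opentelemetry-api installed, these could be attached to the active span:
#   from opentelemetry import trace
#   trace.get_current_span().set_attributes(attrs)
```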

&lt;h2&gt;
  
  
  What to watch next
&lt;/h2&gt;

&lt;p&gt;Before committing to a vendor, read the head-to-head: &lt;a href="https://nextfuture.io.vn/blog/langfuse-vs-helicone-i-tested-both-for-llm-observability-2026" rel="noopener noreferrer"&gt;Langfuse vs Helicone: I Tested Both for LLM Observability (2026)&lt;/a&gt; covers trace coverage gaps and pricing at scale with real numbers. If the gap is at the gateway layer — rate limiting, routing, fallbacks — see &lt;a href="https://nextfuture.io.vn/blog/best-ai-gateway-tools-for-multi-model-llm-apps-in-2026" rel="noopener noreferrer"&gt;Best AI Gateway Tools for Multi-Model LLM Apps in 2026&lt;/a&gt; for a decision matrix by workload. The OTel GenAI SIG's 1.0 spec (expected Q3 2026) should standardize &lt;code&gt;gen_ai.system&lt;/code&gt; across Anthropic, OpenAI, and Vertex — if it ships on schedule, most vendor-specific SDK instrumentation for cost/latency becomes redundant.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Is Helicone cheaper than Langfuse for most workloads?
&lt;/h3&gt;

&lt;p&gt;Under 10K requests/month, Helicone's free tier wins. At higher volumes, Helicone Starter ($20/mo) beats Langfuse Cloud ($59/mo) on price — but you're comparing proxy-level visibility to SDK trace hierarchy. Self-hosting Langfuse is free at any volume (requires Postgres + worker container, ~2h setup). Compare what you're observing, then compare pricing.&lt;/p&gt;

&lt;h3&gt;
  
  
  Does the Anthropic SDK work with OpenTelemetry out of the box?
&lt;/h3&gt;

&lt;p&gt;Not natively as of May 2026. Anthropic's Python and TypeScript SDKs don't ship a built-in OTel exporter. Use the community-maintained &lt;code&gt;anthropic-otel&lt;/code&gt; package or Langfuse's Anthropic integration (&lt;code&gt;from langfuse.decorators import observe&lt;/code&gt;). The stable &lt;code&gt;gen_ai.*&lt;/code&gt; OTel semantic conventions apply — Datadog and Honeycomb ingest them — but you need an intermediate layer to translate Anthropic API responses into OTel spans.&lt;/p&gt;

&lt;h3&gt;
  
  
  When should I switch from a proxy-based to SDK-based observability setup?
&lt;/h3&gt;

&lt;p&gt;Switch when you need step-level attribution: when a single user request triggers multiple LLM calls and you need to know which step produced a bad output, which prompt version caused a regression, or how token usage breaks down per chain step. If your latency dashboard is green but users are complaining, the gap is almost always at the application layer — where proxy tools stop and SDK tools start. The concrete trigger: the moment you ship your first agent loop that retries or branches, move to SDK-based tracing before that loop reaches production.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This article was originally published on &lt;a href="https://nextfuture.io.vn" rel="noopener noreferrer"&gt;NextFuture&lt;/a&gt;. Follow us for more fullstack &amp;amp; AI engineering content.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>fullstack</category>
      <category>ai</category>
      <category>webdev</category>
      <category>javascript</category>
    </item>
    <item>
      <title>Cursor Composer 2 for Next.js 16: 5 Things That Actually Changed</title>
      <dc:creator>BeanBean</dc:creator>
      <pubDate>Sun, 03 May 2026 11:00:01 +0000</pubDate>
      <link>https://dev.to/bean_bean/cursor-composer-2-for-nextjs-16-5-things-that-actually-changed-2hbi</link>
      <guid>https://dev.to/bean_bean/cursor-composer-2-for-nextjs-16-5-things-that-actually-changed-2hbi</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://nextfuture.io.vn/blog/cursor-composer-2-for-nextjs-16-5-things-that-actually-changed" rel="noopener noreferrer"&gt;NextFuture&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Cursor shipped Composer 2 in March 2026 as the centerpiece of the Cursor 2.0 overhaul. The model runs on Cursor's own code-only architecture — no Claude or GPT-4 proxy underneath — at $0.50/1M input tokens. The headline number is real, but the KV cache economy behind it is what changes your monthly bill on agentic coding workloads.&lt;/p&gt;

&lt;h2&gt;
  
  
  What changed
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Custom code-only model&lt;/strong&gt;: Composer 2 is trained exclusively on coding data via continued pre-training and reinforcement learning. Prior Cursor versions routed all Composer requests to Claude Sonnet or GPT-4 through the Anthropic and OpenAI APIs.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;$0.50/1M input tokens&lt;/strong&gt;: That's 6× cheaper than Claude Sonnet 3.7 ($3/1M) and 16× cheaper than GPT-4o ($8/1M) — the two models Cursor previously proxied for Composer tasks.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Aggressive KV cache on repo context&lt;/strong&gt;: Cursor stores repo embeddings in a shared KV cache between agentic requests. Follow-up edits to the same files in one session read cached tokens at under $0.05/1M instead of re-ingesting the full context.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;SWE-bench Multilingual leadership&lt;/strong&gt;: Composer 2 outperforms GPT-4o and Claude Sonnet 3.7 on this benchmark, which tests multi-file edits across Python, Java, Go, Rust, and TypeScript — not just synthetic toy problems.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;General reasoning dropped by design&lt;/strong&gt;: Non-coding knowledge was excised during training. The model is intentionally limited outside code — ask it about system architecture or documentation and it will redirect you to a general-purpose model.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why builders should care
&lt;/h2&gt;

&lt;p&gt;At 1,000 Composer requests per day — a realistic number for any team running overnight refactors or CI-triggered context expansions — moving from Claude Sonnet at $3/1M to Composer 2 at $0.50/1M cuts input costs by 83%. Teams that spent $90/month on proxied frontier model tokens in 2025 can run the same agentic coding volume for roughly $15/month. That delta matters most for solo operators and small teams without enterprise AI budgets who are paying out-of-pocket for &lt;a href="https://nextfuture.io.vn/blog/ai-coding-agents-in-2026-a-fullstack-engineers-recap" rel="noopener noreferrer"&gt;agentic coding pipelines&lt;/a&gt;.&lt;/p&gt;
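&lt;p&gt;The arithmetic behind those figures is easy to check. The sketch below assumes a flat per-million input rate and a volume of 30M input tokens per month, which is just the volume implied by $90/month at $3/1M; the function name is illustrative.&lt;/p&gt;

```python
def monthly_input_cost(tokens_per_month: float, usd_per_million: float) -> float:
    """Input-token cost in USD for one month at a flat per-million rate."""
    return tokens_per_month / 1_000_000 * usd_per_million


# Assumed volume: 30M input tokens/month (what $90/month at $3 per 1M implies).
TOKENS_PER_MONTH = 30_000_000

sonnet = monthly_input_cost(TOKENS_PER_MONTH, 3.00)
composer = monthly_input_cost(TOKENS_PER_MONTH, 0.50)
saving = 1 - composer / sonnet

print(f"Claude Sonnet: ${sonnet:.2f}")
print(f"Composer 2:    ${composer:.2f}")
print(f"Saving:        {saving:.0%}")
```

&lt;p&gt;Swap in your own monthly token count from the billing dashboard; the percentage saving stays the same because both rates are flat.&lt;/p&gt;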

&lt;p&gt;The KV cache amplifies the savings further. In a long session where Cursor reuses cached repo context across 20–30 sequential edits, effective per-token cost on the cache-hit portion drops below $0.10/1M. The practical result: a codebase-wide refactor that cost $4 in tokens last year costs under $0.50 today. Cost-sensitive workloads — background agents, automated PR reviews, multi-step migration scripts — benefit most.&lt;/p&gt;
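&lt;p&gt;To see how cache hits pull the blended rate down, here is a minimal sketch. It assumes the $0.50/1M input rate and treats $0.05/1M as the cache-read rate (the post only gives that figure as an upper bound); the function name is illustrative.&lt;/p&gt;

```python
def blended_rate(cache_hit_share: float, input_rate: float = 0.50,
                 cache_read_rate: float = 0.05) -> float:
    """Effective USD per 1M input tokens for a given share of cache hits."""
    if cache_hit_share > 1.0 or 0.0 > cache_hit_share:
        raise ValueError("cache_hit_share must be between 0 and 1")
    fresh_share = 1 - cache_hit_share
    return cache_hit_share * cache_read_rate + fresh_share * input_rate


for share in (0.0, 0.5, 0.9):
    print(f"{share:.0%} cache hits: ${blended_rate(share):.3f} per 1M tokens")
```

&lt;p&gt;Under these assumptions the blended rate only falls below $0.10/1M once about 89% of input tokens are cache reads, which is why long sessions on the same files benefit most.&lt;/p&gt;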

&lt;h2&gt;
  
  
  What changes in your workflow
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Model routing is automatic from Cursor 2.0+&lt;/strong&gt;: Composer defaults to Composer 2. If you previously hardcoded &lt;code&gt;claude-sonnet-3-7&lt;/code&gt; or &lt;code&gt;gpt-4o&lt;/code&gt; in the Cursor model selector, reset it to &lt;strong&gt;Auto&lt;/strong&gt; or explicitly choose &lt;strong&gt;Cursor Composer 2&lt;/strong&gt; in Settings → AI → Composer Model.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;No API key changes for standard subscribers&lt;/strong&gt;: Composer 2 is internal to the IDE on Cursor's Pro and Business plans. If you use &lt;a href="https://dev.to/toyama0919/cursor-composer-2-the-cache-economy-behind-a-10x-cheaper-coding-agent-15cj"&gt;Cursor's API for CI integrations&lt;/a&gt;, verify the &lt;code&gt;cursor.composer.model&lt;/code&gt; field in your project config points to &lt;code&gt;composer-2&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cache read tokens appear separately in billing&lt;/strong&gt;: The Cursor billing dashboard now distinguishes "cache read" from "input" tokens. Expect cache reads to dominate the token breakdown in sessions that edit the same files repeatedly.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Route non-code questions to Cursor Chat, not Composer&lt;/strong&gt;: Composer 2 is intentionally blunt on architecture discussions and documentation. Use the full-context Chat models (Claude or GPT-4o, selectable in the chat panel) for anything requiring general reasoning.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Check your Cursor version and confirm the model selector with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Confirm you're on Cursor 2.0+&lt;/span&gt;
cursor &lt;span class="nt"&gt;--version&lt;/span&gt;
&lt;span class="c"&gt;# → Cursor 2.0.x (Composer 2 is the default Composer model from 2.0 onward)&lt;/span&gt;

&lt;span class="c"&gt;# List available Composer models via CLI (Cursor 2.0+)&lt;/span&gt;
cursor models &lt;span class="nt"&gt;--composer&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  5 action items for this week
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Update Cursor to 2.0 or later.&lt;/strong&gt; Run &lt;code&gt;cursor --version&lt;/code&gt; in a terminal. If you're on 1.x, download the latest from &lt;a href="https://cursor.sh" rel="noopener noreferrer"&gt;cursor.sh&lt;/a&gt; — Composer 2 ships only with 2.0+.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Switch Composer's model to "Cursor Composer 2"&lt;/strong&gt; in Settings → AI → Composer Model. "Auto" already routes there on 2.0, but explicit selection ensures you don't fall back to a proxied frontier model on timeout.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Check your Cursor billing dashboard&lt;/strong&gt; at cursor.sh/billing. Look at the per-model token breakdown. If you see a large "Claude Sonnet" input line still accumulating, your Composer is not on 2.0 yet — update and reverify.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Keep Claude or GPT-4o active in Cursor Chat&lt;/strong&gt; (not Composer) for architecture discussions, inline documentation, and general Q&amp;amp;A. Set your preferred chat model in Settings → AI → Chat Model so the fallback is explicit.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Benchmark your heaviest refactor case.&lt;/strong&gt; Run Composer 2 on the most context-intensive multi-file task you do regularly and compare output quality and token cost against your Claude Sonnet baseline. SWE-bench Multilingual predicts it will win on edit-heavy tasks; verify it holds on your actual codebase.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What to watch next
&lt;/h2&gt;

&lt;p&gt;Cursor's move to a proprietary code-only model puts direct pricing pressure on every AI coding tool that still proxies frontier models. The $0.50/1M vs $3–8/1M gap is now a hard TCO argument in enterprise procurement conversations. If your team is actively evaluating which AI editor to standardize on, see our &lt;a href="https://nextfuture.io.vn/blog/best-cursor-alternatives-2026" rel="noopener noreferrer"&gt;10 Best Cursor Alternatives in 2026&lt;/a&gt; roundup — several tools there still rely on OpenAI or Anthropic proxies, and the cost comparison looks different today than it did six months ago.&lt;/p&gt;

&lt;p&gt;The RL-from-code-feedback methodology Cursor uses is also worth tracking. Specialized models trained on real bug-fix and PR-diff data are consistently outperforming general-purpose frontier models on coding benchmarks. For a broader view of where agentic coding tooling is headed, the &lt;a href="https://nextfuture.io.vn/blog/claude-code-advisor-command-deep-dive-2026" rel="noopener noreferrer"&gt;Claude Code /advisor deep dive&lt;/a&gt; covers strategic planning layers on top of code-only execution.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Is Cursor Composer 2 cheaper than routing through Claude Sonnet via the Cursor API?
&lt;/h3&gt;

&lt;p&gt;Yes, by a factor of 6. Composer 2 costs $0.50/1M input tokens. Claude Sonnet 3.7, which Cursor previously proxied, runs at $3/1M input via Anthropic's API. At 10M tokens/month, that's $5 vs $30 in input costs alone, before counting cache-read savings that push Composer 2's effective cost lower still in long sessions.&lt;/p&gt;

&lt;h3&gt;
  
  
  Does Cursor Composer 2 work well with non-TypeScript codebases?
&lt;/h3&gt;

&lt;p&gt;Yes. The SWE-bench Multilingual benchmark that Composer 2 leads on explicitly tests Python, Java, Go, Rust, and C++ alongside TypeScript. The continued pre-training corpus spans multiple languages, and the RL phase trains on real multi-language edit data from open-source repositories. Performance on TypeScript/Next.js will likely remain the highest-tested case, but Python and Go codebases should see competitive results.&lt;/p&gt;

&lt;h3&gt;
  
  
  When should I use Cursor Chat instead of Cursor Composer 2?
&lt;/h3&gt;

&lt;p&gt;Use Composer 2 for any task that produces code: file edits, multi-file refactors, test generation, and migration scripts. Switch to Cursor Chat (Claude or GPT-4o) for architecture decisions, writing documentation, explaining third-party library internals, or any question where general reasoning matters. Composer 2 is trained to produce correct code edits, not to reason about system design — the model will redirect you if you try to use it outside that scope.&lt;/p&gt;





</description>
      <category>fullstack</category>
      <category>ai</category>
      <category>webdev</category>
      <category>javascript</category>
    </item>
    <item>
      <title>Fix Claude Code Terminal Flicker: NO_FLICKER Config (May 2026)</title>
      <dc:creator>BeanBean</dc:creator>
      <pubDate>Sun, 03 May 2026 11:00:00 +0000</pubDate>
      <link>https://dev.to/bean_bean/fix-claude-code-terminal-flicker-noflicker-config-may-2026-3562</link>
      <guid>https://dev.to/bean_bean/fix-claude-code-terminal-flicker-noflicker-config-may-2026-3562</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://nextfuture.io.vn/blog/claude-code-no-flicker-config" rel="noopener noreferrer"&gt;NextFuture&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h1&gt;
  
  
  Fix Claude Code Terminal Flicker: NO_FLICKER Config (May 2026)
&lt;/h1&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; Claude Code's TUI flickers on narrow terminals because the status line redraws faster than your emulator can paint. Fix: &lt;code&gt;export CLAUDE_CODE_NO_FLICKER=1&lt;/code&gt;, widen your window to ≥100 columns, and confirm your emulator's &lt;code&gt;damage_tracking&lt;/code&gt; setting. Most cases resolve in under a minute.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Symptom
&lt;/h2&gt;

&lt;p&gt;You launch &lt;code&gt;claude&lt;/code&gt;, start typing, and the bottom three lines of the screen (model badge, token meter, working-directory hint) strobe roughly ten times a second. The cursor jumps. Long output looks like it's tearing. The faster you type, the worse it gets.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;claude
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; refactor apps/worker/src/scheduler.ts
&lt;span class="o"&gt;[&lt;/span&gt;blink][blink][blink]   ← status line redrawing
sonnet-4-6 │ 12,488 tok │ ~/news-app
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This started showing up widely after the &lt;code&gt;2.5.x&lt;/code&gt; release in March 2026 when Anthropic added the live token meter to the bottom chrome. It's not in your head.&lt;/p&gt;

&lt;h2&gt;
  
  
  Cause
&lt;/h2&gt;

&lt;p&gt;The TUI uses an alt-screen + per-frame diff. Two things conspire:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Status line refresh interval&lt;/strong&gt; defaults to 100 ms while a stream is active.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Narrow terminals&lt;/strong&gt; (&amp;lt;100 cols) force the status line to truncate-and-rewrap on every paint.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Combined, you get a full-line clear-and-rewrite at 10 Hz, which exposes any latency in your emulator's damage tracking. Alacritty and Windows Terminal default settings are most prone; iTerm2 and Kitty hide it well.&lt;/p&gt;
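&lt;p&gt;The two conditions are easy to check from a script. This is a small sketch using only the numbers quoted above (100 ms refresh, 100-column threshold); the function names are illustrative and nothing here talks to Claude Code itself.&lt;/p&gt;

```python
import shutil

REFRESH_INTERVAL_MS = 100  # status line refresh interval quoted above
MIN_COLUMNS = 100          # width below which the status line rewraps


def redraws_per_second(interval_ms: int) -> float:
    """Convert a refresh interval in milliseconds to a redraw rate in Hz."""
    return 1000 / interval_ms


def hits_rewrap_path(columns: int, minimum: int = MIN_COLUMNS) -> bool:
    """True when the terminal is narrower than the rewrap threshold."""
    return minimum > columns


# get_terminal_size falls back to 80x24 when no terminal is attached.
columns = shutil.get_terminal_size().columns
rate = redraws_per_second(REFRESH_INTERVAL_MS)
print(f"{rate:.0f} redraws/s at {columns} columns")
print("rewrap path likely" if hits_rewrap_path(columns) else "wide-line path likely")
```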

&lt;h2&gt;
  
  
  Fix in 30 seconds
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;CLAUDE_CODE_NO_FLICKER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1
claude
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;CLAUDE_CODE_NO_FLICKER=1&lt;/code&gt; tells the TUI to redraw the status line only on token boundaries (every ~512 tokens) instead of every frame. You lose the smooth token-meter animation; you keep your eyesight.&lt;/p&gt;
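&lt;p&gt;For intuition, boundary-based redrawing can be sketched as a counter that repaints only when the running token total crosses a multiple of 512. The &lt;code&gt;StatusLine&lt;/code&gt; class is a toy model written for this post, not Claude Code's actual renderer.&lt;/p&gt;

```python
class StatusLine:
    """Toy status line that repaints only when a token boundary is crossed."""

    def __init__(self, boundary: int = 512):
        self.boundary = boundary   # ~512 tokens, per the paragraph above
        self.last_bucket = 0       # boundary index at the last repaint
        self.paints = 0            # number of repaints so far

    def on_tokens(self, total_tokens: int) -> bool:
        """Record a new running total; return True if it triggered a repaint."""
        bucket = total_tokens // self.boundary
        if bucket == self.last_bucket:
            return False           # still inside the same bucket: skip the repaint
        self.last_bucket = bucket
        self.paints += 1
        return True


line = StatusLine()
for total in range(1, 2049):       # simulate 2,048 streamed tokens, one at a time
    line.on_tokens(total)
print(line.paints)                 # prints 4 (boundaries at 512, 1024, 1536, 2048)
```

&lt;p&gt;A per-frame renderer would repaint on every one of those 2,048 updates; the boundary check cuts that to four.&lt;/p&gt;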

&lt;p&gt;If flicker persists, widen the terminal:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;tput cols    &lt;span class="c"&gt;# check current width&lt;/span&gt;
&lt;span class="c"&gt;#  explain the difference between RSC and SSR in 1500 words&lt;/span&gt;

&lt;span class="c"&gt;# status line should redraw smoothly, no strobe&lt;/span&gt;
&lt;span class="c"&gt;# token meter increments in steps, not every frame&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you still see strobing, run &lt;code&gt;claude --debug 2&amp;gt;tui.log&lt;/code&gt; and grep for &lt;code&gt;render&lt;/code&gt;. A render time &amp;gt;16 ms per frame means the bottleneck is your emulator, not Claude Code.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Does CLAUDE_CODE_NO_FLICKER=1 disable any features?
&lt;/h3&gt;

&lt;p&gt;Only the smooth animation of the live token meter. Numbers still update — just in chunks instead of every frame. No functional features (slash commands, MCP, hooks) are affected.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why does flicker only happen on narrow terminals?
&lt;/h3&gt;

&lt;p&gt;The status line truncates and rewraps when columns drop below ~100. The rewrap path clears the whole line; the wide-line path only updates changed cells. Different code paths, different paint cost.&lt;/p&gt;

&lt;h3&gt;
  
  
  Will tmux make the flicker worse?
&lt;/h3&gt;

&lt;p&gt;Sometimes. tmux's &lt;code&gt;aggressive-resize on&lt;/code&gt; plus a small pane forces extra redraws. Either set &lt;code&gt;aggressive-resize off&lt;/code&gt; in &lt;code&gt;~/.tmux.conf&lt;/code&gt; or run Claude Code in a full-window pane.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is the flicker an Anthropic bug or a terminal bug?
&lt;/h3&gt;

&lt;p&gt;Both. Anthropic's status line redraws more aggressively than necessary; many emulators have suboptimal damage tracking. &lt;code&gt;CLAUDE_CODE_NO_FLICKER=1&lt;/code&gt; is the official escape hatch while a proper fix is being shipped through the 2.7 release branch.&lt;/p&gt;





</description>
      <category>fullstack</category>
      <category>ai</category>
      <category>webdev</category>
      <category>javascript</category>
    </item>
    <item>
      <title>Claude Code /advisor vs claude-code-router: Which Routing Strategy Wins (May 2026)</title>
      <dc:creator>BeanBean</dc:creator>
      <pubDate>Sun, 03 May 2026 05:00:02 +0000</pubDate>
      <link>https://dev.to/bean_bean/claude-code-advisor-vs-claude-code-router-which-routing-strategy-wins-may-2026-3k0p</link>
      <guid>https://dev.to/bean_bean/claude-code-advisor-vs-claude-code-router-which-routing-strategy-wins-may-2026-3k0p</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://nextfuture.io.vn/blog/claude-code-advisor-vs-claude-code-router" rel="noopener noreferrer"&gt;NextFuture&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h1&gt;
  
  
  Claude Code /advisor vs claude-code-router: Which Routing Strategy Wins (May 2026)
&lt;/h1&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; Anthropic's built-in &lt;code&gt;/advisor&lt;/code&gt; command picks the right Claude model (Opus 4.5, Sonnet 4.6, Haiku 4.5) for your current turn based on task complexity. &lt;code&gt;claude-code-router&lt;/code&gt; is a community proxy that routes to non-Anthropic models too (DeepSeek, Qwen, GLM). Use &lt;code&gt;/advisor&lt;/code&gt; for default Anthropic-first workflows; use the router when you need cross-vendor cost arbitrage.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Both tools answer the same question — &lt;strong&gt;which model should run this turn?&lt;/strong&gt; — but they answer it from very different layers. This post walks through what each does, when each wins, and a decision matrix you can paste into your team's playbook.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is /advisor
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;/advisor&lt;/code&gt; is a first-party slash command shipped with Claude Code (Anthropic's CLI). It inspects the active conversation — file diffs, tool calls, context size, prior thinking depth — and recommends one of the three Anthropic tiers. It runs in-process, no proxy needed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;claude
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /advisor

Recommendation: sonnet-4-6
Reason: 12 file edits queued, codebase 60k → deepseek/deepseek-v3-0526
&lt;span class="o"&gt;[&lt;/span&gt;router] background → qwen/qwen3-coder-plus
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Routing rules live in &lt;code&gt;~/.claude-code-router/config.json&lt;/code&gt;. The router intercepts the wire protocol, so Claude Code itself never knows it's not talking to Anthropic.&lt;/p&gt;

&lt;h2&gt;
  
  
  Architectural differences
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Layer:&lt;/strong&gt; &lt;code&gt;/advisor&lt;/code&gt; runs inside the CLI. The router runs as a sidecar on the network path.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Vendor scope:&lt;/strong&gt; &lt;code&gt;/advisor&lt;/code&gt; is Anthropic-only. The router is multi-vendor.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Decision input:&lt;/strong&gt; &lt;code&gt;/advisor&lt;/code&gt; sees full conversation state (tool calls, edits, thinking). The router sees only request size, headers, and a model tag.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Failure mode:&lt;/strong&gt; &lt;code&gt;/advisor&lt;/code&gt; degrades gracefully (it just suggests). The router is in the request path — if it crashes, every turn fails.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cost telemetry:&lt;/strong&gt; &lt;code&gt;/advisor&lt;/code&gt; reports against your Anthropic spend. The router needs you to wire your own logging (Helicone, Langfuse, etc.).&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  When /advisor wins
&lt;/h2&gt;

&lt;p&gt;Pick &lt;code&gt;/advisor&lt;/code&gt; when you want zero ops overhead, deterministic billing, and the latest Anthropic features (computer use, code execution tool, MCP cache). It's also the only routing layer that can use the conversation's actual &lt;em&gt;content&lt;/em&gt; — claude-code-router can't tell whether your turn is a one-line typo fix or a 600-line refactor.&lt;/p&gt;

&lt;p&gt;If your team standardized on Claude Opus 4.5 / Sonnet 4.6 / Haiku 4.5, &lt;code&gt;/advisor&lt;/code&gt; typically saves 30-50% on monthly spend by demoting trivial turns to Haiku. See the &lt;a href="https://nextfuture.io.vn/blog/claude-code-advisor-command-deep-dive-2026" rel="noopener noreferrer"&gt;Claude Code /advisor command deep-dive&lt;/a&gt; for the full rule table.&lt;/p&gt;

&lt;h2&gt;
  
  
  When router wins
&lt;/h2&gt;

&lt;p&gt;Pick &lt;code&gt;claude-code-router&lt;/code&gt; when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;You want to use DeepSeek V3 or Qwen3-Coder for bulk codegen and pay $0.27 / Mtok input instead of $3.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;You're on the Pro plan ($20/mo) and want to spill overflow to OpenRouter.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;You need air-gapped inference via Ollama for compliance.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;You want different models for &lt;code&gt;longContext&lt;/code&gt;, &lt;code&gt;background&lt;/code&gt;, and &lt;code&gt;think&lt;/code&gt; phases — the router exposes these hooks; &lt;code&gt;/advisor&lt;/code&gt; does not.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
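&lt;p&gt;Phase-based routing of this kind can be pictured as a small lookup keyed on request size and a phase tag. The sketch below is a hypothetical illustration, not the router's real config format: the rule table, the &lt;code&gt;pick_route&lt;/code&gt; function, and the 60k-token long-context threshold are all assumptions, and the model IDs are the ones that appear in this post.&lt;/p&gt;

```python
from typing import Optional

# Hypothetical rule table; model IDs are the ones mentioned in this post.
ROUTES = {
    "default": "sonnet-4-6",
    "background": "qwen/qwen3-coder-plus",
    "longContext": "deepseek/deepseek-v3-0526",
}
LONG_CONTEXT_TOKENS = 60_000  # assumed threshold for the long-context rule


def pick_route(request_tokens: int, phase: Optional[str] = None) -> str:
    """Choose a model from request size and an optional phase tag only."""
    if request_tokens > LONG_CONTEXT_TOKENS:
        return ROUTES["longContext"]          # size wins over the phase tag
    # Unknown or missing phases fall back to the default route.
    return ROUTES.get(phase or "default", ROUTES["default"])


print(pick_route(1_000))
print(pick_route(1_000, "background"))
print(pick_route(80_000, "background"))
```

&lt;p&gt;Note what the function never sees: the conversation itself. That is the trade-off described earlier, since a proxy can only route on request size, headers, and a model tag.&lt;/p&gt;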

&lt;h2&gt;
  
  
  Decision matrix
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Use /advisor &lt;span class="k"&gt;if&lt;/span&gt;:
  - 100% Anthropic, want lowest config friction
  - Need conversation-aware model picking
  - Want first-party support &amp;amp; SLA

Use claude-code-router &lt;span class="k"&gt;if&lt;/span&gt;:
  - Mixing Anthropic + DeepSeek/Qwen/GLM
  - Need cost arbitrage on long-context turns
  - Want air-gapped or self-hosted fallback
  - OK running a sidecar process

Use BOTH &lt;span class="k"&gt;if&lt;/span&gt;:
  - Run /advisor &lt;span class="k"&gt;for &lt;/span&gt;Anthropic-tier picking,
    &lt;span class="k"&gt;then &lt;/span&gt;router &lt;span class="k"&gt;for &lt;/span&gt;vendor-tier fallback when
    rate limits hit. Stack: CLI → /advisor →
    router → provider.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In practice, most indie hackers we surveyed in April 2026 ran &lt;code&gt;/advisor&lt;/code&gt; alone. Teams burning &amp;gt;$2k/month on Claude Code reached for the router to capture DeepSeek's price floor on bulk refactors.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Can I use /advisor and claude-code-router together?
&lt;/h3&gt;

&lt;p&gt;Yes. Set &lt;code&gt;ANTHROPIC_BASE_URL&lt;/code&gt; to the router and let &lt;code&gt;/advisor&lt;/code&gt; emit a model tag the router then maps. Just confirm your router config has rules for &lt;code&gt;opus-4-5&lt;/code&gt;, &lt;code&gt;sonnet-4-6&lt;/code&gt;, and &lt;code&gt;haiku-4-5&lt;/code&gt;.&lt;/p&gt;
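&lt;p&gt;A quick way to run that check is to compare the three tags against whatever rule table you keep. This is a minimal sketch: &lt;code&gt;missing_rules&lt;/code&gt; and the sample &lt;code&gt;rules&lt;/code&gt; dictionary are hypothetical, and the real router config may be shaped differently.&lt;/p&gt;

```python
# The three model tags /advisor can emit, per this post.
ADVISOR_TAGS = ("opus-4-5", "sonnet-4-6", "haiku-4-5")


def missing_rules(router_rules: dict) -> list:
    """Return the advisor tags that the router rule table does not cover."""
    return [tag for tag in ADVISOR_TAGS if tag not in router_rules]


# Hypothetical rule table mapping advisor tags to upstream targets.
rules = {
    "opus-4-5": "anthropic",
    "sonnet-4-6": "anthropic",
}
print(missing_rules(rules))  # prints ['haiku-4-5']
```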

&lt;h3&gt;
  
  
  Does /advisor work with non-Anthropic models?
&lt;/h3&gt;

&lt;p&gt;No. As of May 2026, &lt;code&gt;/advisor&lt;/code&gt; only ranks across &lt;code&gt;opus-4-5&lt;/code&gt;, &lt;code&gt;sonnet-4-6&lt;/code&gt;, and &lt;code&gt;haiku-4-5&lt;/code&gt;. For cross-vendor routing you need the community router or a custom MCP gateway.&lt;/p&gt;

&lt;h3&gt;
  
  
  Which is cheaper, /advisor or claude-code-router?
&lt;/h3&gt;

&lt;p&gt;For Anthropic-only stacks, &lt;code&gt;/advisor&lt;/code&gt; is usually 5-15% cheaper because it picks Haiku 4.5 more aggressively than humans do. For mixed stacks, the router wins by routing 60-80% of turns to DeepSeek V3 at one-tenth the price.&lt;/p&gt;
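&lt;p&gt;The mixed-stack claim is simple weighted arithmetic. The sketch below assumes every turn uses about the same number of tokens and that the cheaper model costs one-tenth of the baseline, as stated above; the function name is illustrative.&lt;/p&gt;

```python
def relative_spend(share_routed: float, price_ratio: float = 0.10) -> float:
    """Spend relative to an all-Anthropic baseline of 1.0.

    share_routed: fraction of turns sent to the cheaper model.
    price_ratio:  cheaper model's price as a fraction of the baseline price.
    """
    return (1 - share_routed) + share_routed * price_ratio


for share in (0.60, 0.80):
    print(f"{share:.0%} routed: {relative_spend(share):.2f}x baseline spend")
```

&lt;p&gt;Under those assumptions, routing 60-80% of turns brings spend down to roughly 0.46x to 0.28x of the baseline. Turns that carry more context than average will shift the result, so treat it as a first estimate.&lt;/p&gt;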

&lt;h3&gt;
  
  
  Will /advisor replace claude-code-router?
&lt;/h3&gt;

&lt;p&gt;Unlikely in 2026. Anthropic has no roadmap commitment to non-Anthropic routing inside the CLI. The router fills a real gap and the community is shipping faster than Anthropic on multi-vendor features.&lt;/p&gt;





</description>
      <category>fullstack</category>
      <category>ai</category>
      <category>webdev</category>
      <category>javascript</category>
    </item>
    <item>
      <title>Claude Code /advisor Recipes: 5 Use Cases with Real Output (May 2026)</title>
      <dc:creator>BeanBean</dc:creator>
      <pubDate>Sun, 03 May 2026 05:00:00 +0000</pubDate>
      <link>https://dev.to/bean_bean/claude-code-advisor-recipes-5-use-cases-with-real-output-may-2026-130o</link>
      <guid>https://dev.to/bean_bean/claude-code-advisor-recipes-5-use-cases-with-real-output-may-2026-130o</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://nextfuture.io.vn/blog/claude-code-advisor-recipes-5-use-cases-with-real-output-may-2026" rel="noopener noreferrer"&gt;NextFuture&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h1&gt;
  
  
  Claude Code /advisor Recipes: 5 Use Cases with Real Output (May 2026)
&lt;/h1&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; Five copy-paste recipes for &lt;code&gt;/advisor&lt;/code&gt; with the exact invocation and a trimmed sample of what Claude Code actually returns. Covers refactoring, Vitest test gen, JSDoc emission, Express→Hono porting, and flaky test debugging.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The &lt;code&gt;/advisor&lt;/code&gt; command is the lowest-friction way to put the right Claude tier on the right job. Below are five recipes I've run on shipping repos in April-May 2026. Outputs are trimmed for brevity but preserve the structure.&lt;/p&gt;

&lt;h2&gt;
  
  
  Setup
&lt;/h2&gt;

&lt;p&gt;You need Claude Code v2.6+ and an active Anthropic API key or Pro subscription:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude &lt;span class="nt"&gt;--version&lt;/span&gt;
&lt;span class="c"&gt;# claude-code 2.6.3 (build 2026.04.18)&lt;/span&gt;

claude config get advisor.enabled
&lt;span class="c"&gt;# true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For all recipes below, run &lt;code&gt;claude&lt;/code&gt; from the repo root, then issue the slash command shown.&lt;/p&gt;

&lt;h2&gt;
  
  
  Recipe 1: Refactor a 600-line file
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /advisor refactor apps/worker/src/scheduler.ts &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--goal&lt;/span&gt; &lt;span class="s2"&gt;"split into per-job modules, no behavior change"&lt;/span&gt;

&lt;span class="o"&gt;[&lt;/span&gt;advisor] file: 612 lines, 14 imports, 9 exported fns
&lt;span class="o"&gt;[&lt;/span&gt;advisor] picked: opus-4-5  &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;complexity&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;high, edits&amp;gt;200&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="o"&gt;[&lt;/span&gt;advisor] plan:
  1. Extract job registry → &lt;span class="nb"&gt;jobs&lt;/span&gt;/registry.ts
  2. Move cron parsing → utils/cron.ts
  3. Split run loop → core/run-loop.ts
  4. Keep scheduler.ts as thin orchestrator &lt;span class="o"&gt;(&lt;/span&gt;~80 lines&lt;span class="o"&gt;)&lt;/span&gt;
Apply? &lt;span class="o"&gt;[&lt;/span&gt;y/N]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Lesson:&lt;/strong&gt; &lt;code&gt;/advisor&lt;/code&gt; escalates to Opus when edits cross multiple files. It also emits a plan you can edit before applying — that single prompt saved roughly 20 minutes of manual chunking.&lt;/p&gt;

&lt;h2&gt;
  
  
  Recipe 2: Generate Vitest tests for an API route
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /advisor &lt;span class="nb"&gt;test &lt;/span&gt;apps/nextfuture/src/app/api/v1/posts/route.ts &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--framework&lt;/span&gt; vitest &lt;span class="nt"&gt;--coverage&lt;/span&gt; 80

&lt;span class="o"&gt;[&lt;/span&gt;advisor] picked: sonnet-4-6  &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;complexity&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;medium&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="o"&gt;[&lt;/span&gt;advisor] generated: 7 tests
  ✓ POST returns 401 without X-API-Key
  ✓ POST returns 429 after 60 req/min
  ✓ POST validates body via zod schema
  ✓ POST inserts row and returns 201
  ✓ GET paginates with default &lt;span class="nv"&gt;limit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;10
  ✓ GET respects ?page&lt;span class="o"&gt;=&lt;/span&gt;&amp;amp;limit&lt;span class="o"&gt;=&lt;/span&gt; params
  ✓ GET returns empty array on no rows
&lt;span class="o"&gt;[&lt;/span&gt;advisor] coverage: 86% lines, 78% branches
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Lesson:&lt;/strong&gt; Pin &lt;code&gt;--framework&lt;/code&gt; explicitly. Without it, &lt;code&gt;/advisor&lt;/code&gt; guesses Jest if it sees &lt;code&gt;jest.config.js&lt;/code&gt; in any ancestor directory, which can mismatch your real test runner.&lt;/p&gt;

&lt;h2&gt;
  
  
  Recipe 3: Write JSDoc for a utility lib
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /advisor docs packages/database/src/dm-v2/&lt;span class="k"&gt;*&lt;/span&gt;.ts &lt;span class="nt"&gt;--style&lt;/span&gt; typedoc

&lt;span class="o"&gt;[&lt;/span&gt;advisor] picked: haiku-4-5  &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;complexity&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;low, mechanical&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="o"&gt;[&lt;/span&gt;advisor] modified 6 files, +218 lines of JSDoc
&lt;span class="o"&gt;[&lt;/span&gt;advisor] sample &lt;span class="o"&gt;(&lt;/span&gt;canonical-key.ts&lt;span class="o"&gt;)&lt;/span&gt;:
  /&lt;span class="k"&gt;**&lt;/span&gt;
   &lt;span class="k"&gt;*&lt;/span&gt; Compute the canonical key &lt;span class="k"&gt;for &lt;/span&gt;a product variant.
   &lt;span class="k"&gt;*&lt;/span&gt; @param raw - Untrimmed product name from retailer feed.
   &lt;span class="k"&gt;*&lt;/span&gt; @returns Lowercase, slug-safe identifier &lt;span class="o"&gt;(&lt;/span&gt;≤64 chars&lt;span class="o"&gt;)&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt;
   &lt;span class="k"&gt;*&lt;/span&gt; @example
   &lt;span class="k"&gt;*&lt;/span&gt;   canonicalKey&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"iPhone 17 Pro Max — 256GB"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
   &lt;span class="k"&gt;*&lt;/span&gt;   // → &lt;span class="s2"&gt;"iphone-17-pro-max-256gb"&lt;/span&gt;
   &lt;span class="k"&gt;*&lt;/span&gt;/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Lesson:&lt;/strong&gt; Documentation is the killer use case for Haiku 4.5: it costs about a third as much as Sonnet, and the quality is indistinguishable for &lt;code&gt;@param&lt;/code&gt;/&lt;code&gt;@returns&lt;/code&gt; comments. &lt;code&gt;/advisor&lt;/code&gt; recognizes this and demotes to Haiku automatically. See the &lt;a href="https://dev.to/blog/claude-code-advisor-command-deep-dive-2026"&gt;Claude Code /advisor command deep-dive&lt;/a&gt; for the full demotion rule table.&lt;/p&gt;
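
&lt;p&gt;For reference, a function that satisfies the sampled JSDoc could look like the sketch below. This is a hypothetical reconstruction from the comment alone, not the actual &lt;code&gt;canonical-key.ts&lt;/code&gt;; the normalization rules (hyphen as separator, ASCII letters and digits only) are assumptions.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Hypothetical implementation inferred from the JSDoc sample above.
function canonicalKey(raw: string): string {
  return raw
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // each run of other characters becomes one hyphen
    .slice(0, 64)                // enforce the documented 64-char cap
    .replace(/^-|-$/g, "");      // drop a leading or trailing hyphen
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Given the input from the JSDoc &lt;code&gt;@example&lt;/code&gt;, this sketch returns &lt;code&gt;iphone-17-pro-max-256gb&lt;/code&gt;, which matches the documented output.&lt;/p&gt;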

&lt;h2&gt;
  
  
  Recipe 4: Port an Express route to Hono
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /advisor port apps/worker/src/routes/trigger.ts &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--from&lt;/span&gt; express &lt;span class="nt"&gt;--to&lt;/span&gt; hono

&lt;span class="o"&gt;[&lt;/span&gt;advisor] picked: sonnet-4-6
&lt;span class="o"&gt;[&lt;/span&gt;advisor] diff preview &lt;span class="o"&gt;(&lt;/span&gt;truncated&lt;span class="o"&gt;)&lt;/span&gt;:
- import &lt;span class="o"&gt;{&lt;/span&gt; Router &lt;span class="o"&gt;}&lt;/span&gt; from &lt;span class="s2"&gt;"express"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
- const router &lt;span class="o"&gt;=&lt;/span&gt; Router&lt;span class="o"&gt;()&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
- router.post&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"/trigger/:jobId"&lt;/span&gt;, async &lt;span class="o"&gt;(&lt;/span&gt;req, res&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
-   const &lt;span class="o"&gt;{&lt;/span&gt; jobId &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; req.params&lt;span class="p"&gt;;&lt;/span&gt;
-   ...
-   res.status&lt;span class="o"&gt;(&lt;/span&gt;200&lt;span class="o"&gt;)&lt;/span&gt;.json&lt;span class="o"&gt;({&lt;/span&gt; ok: &lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="o"&gt;})&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
- &lt;span class="o"&gt;})&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
+ import &lt;span class="o"&gt;{&lt;/span&gt; Hono &lt;span class="o"&gt;}&lt;/span&gt; from &lt;span class="s2"&gt;"hono"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
+ const app &lt;span class="o"&gt;=&lt;/span&gt; new Hono&lt;span class="o"&gt;()&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
+ app.post&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"/trigger/:jobId"&lt;/span&gt;, async &lt;span class="o"&gt;(&lt;/span&gt;c&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
+   const jobId &lt;span class="o"&gt;=&lt;/span&gt; c.req.param&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"jobId"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
+   ...
+   &lt;span class="k"&gt;return &lt;/span&gt;c.json&lt;span class="o"&gt;({&lt;/span&gt; ok: &lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="o"&gt;})&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
+ &lt;span class="o"&gt;})&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="o"&gt;[&lt;/span&gt;advisor] also rewrote: middleware &lt;span class="o"&gt;(&lt;/span&gt;3&lt;span class="o"&gt;)&lt;/span&gt;, error handler &lt;span class="o"&gt;(&lt;/span&gt;1&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Lesson:&lt;/strong&gt; Framework ports are deterministic enough that Sonnet 4.6 nails them. &lt;code&gt;/advisor&lt;/code&gt; won't pick Opus here unless the file imports a framework-specific plugin Sonnet doesn't recognize.&lt;/p&gt;

&lt;h2&gt;
  
  
  Recipe 5: Debug a flaky test
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /advisor debug &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="s2"&gt;"Vitest 'auto-publish picks oldest pending' fails 1/10 runs"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--file&lt;/span&gt; apps/worker/src/jobs/auto-publish.test.ts

&lt;span class="o"&gt;[&lt;/span&gt;advisor] picked: opus-4-5  &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;complexity&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;diagnostic&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="o"&gt;[&lt;/span&gt;advisor] hypotheses ranked:
  1. &lt;span class="o"&gt;(&lt;/span&gt;0.71&lt;span class="o"&gt;)&lt;/span&gt; Date.now&lt;span class="o"&gt;()&lt;/span&gt; mocked &lt;span class="k"&gt;in &lt;/span&gt;fixture but not awaited
  2. &lt;span class="o"&gt;(&lt;/span&gt;0.18&lt;span class="o"&gt;)&lt;/span&gt; Postgres index scan order non-deterministic
  3. &lt;span class="o"&gt;(&lt;/span&gt;0.08&lt;span class="o"&gt;)&lt;/span&gt; Redis SCAN cursor reset between runs
  4. &lt;span class="o"&gt;(&lt;/span&gt;0.03&lt;span class="o"&gt;)&lt;/span&gt; Vitest &lt;span class="nv"&gt;concurrent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true &lt;/span&gt;reorders setup
&lt;span class="o"&gt;[&lt;/span&gt;advisor] suggested fix:
  - vi.useFakeTimers&lt;span class="o"&gt;({&lt;/span&gt; now: new Date&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"2026-05-01T00:00:00Z"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;})&lt;/span&gt;
  - await vi.runAllTimersAsync&lt;span class="o"&gt;()&lt;/span&gt; before assertion
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Lesson:&lt;/strong&gt; For diagnosis, Opus 4.5's deeper reasoning is worth the cost. &lt;code&gt;/advisor&lt;/code&gt;'s ranked hypotheses gave the right root cause first try; the fake-timer fix landed in one commit.&lt;/p&gt;

&lt;h2&gt;
  
  
  When recipes break
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Monorepos with workspace aliases:&lt;/strong&gt; if &lt;code&gt;tsconfig.json&lt;/code&gt; paths aren't resolvable from the file, &lt;code&gt;/advisor&lt;/code&gt; may misclassify complexity. Run from the workspace root.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Files &amp;gt; 2000 lines:&lt;/strong&gt; &lt;code&gt;/advisor&lt;/code&gt; truncates at the context limit; pre-split first.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Generated code:&lt;/strong&gt; if the file has a &lt;code&gt;// AUTOGENERATED&lt;/code&gt; banner, &lt;code&gt;/advisor&lt;/code&gt; refuses by default. Override with &lt;code&gt;--allow-generated&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Hooks blocking edits:&lt;/strong&gt; a &lt;code&gt;PreToolUse&lt;/code&gt; hook that blocks Write will block &lt;code&gt;/advisor&lt;/code&gt;'s apply step. Check &lt;code&gt;~/.claude/settings.json&lt;/code&gt; when you see &lt;code&gt;BLOCKED&lt;/code&gt; in output.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Does /advisor work without internet access?
&lt;/h3&gt;

&lt;p&gt;No. &lt;code&gt;/advisor&lt;/code&gt; calls the Anthropic API to score complexity. There is no offline mode as of May 2026.&lt;/p&gt;

&lt;h3&gt;
  
  
  How do I stop /advisor picking Opus 4.5 too often?
&lt;/h3&gt;

&lt;p&gt;Set &lt;code&gt;"advisor": { "maxTier": "sonnet-4-6" }&lt;/code&gt; in &lt;code&gt;~/.claude/settings.json&lt;/code&gt;. &lt;code&gt;/advisor&lt;/code&gt; will cap recommendations at Sonnet and warn when a turn would benefit from Opus.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can /advisor run inside CI?
&lt;/h3&gt;

&lt;p&gt;Yes. Use &lt;code&gt;claude --advisor=auto --task "..."&lt;/code&gt; with &lt;code&gt;ANTHROPIC_API_KEY&lt;/code&gt; set as a secret. CI runs are non-interactive and write the chosen tier to stdout for logging.&lt;/p&gt;

&lt;h3&gt;
  
  
  What if /advisor returns nothing?
&lt;/h3&gt;

&lt;p&gt;Usually a context-window overflow or an MCP server holding stale state. Run &lt;code&gt;/clear&lt;/code&gt;, restart the CLI, and retry. If it persists, &lt;code&gt;--debug&lt;/code&gt; will print the rejection reason.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This article was originally published on &lt;a href="https://nextfuture.io.vn" rel="noopener noreferrer"&gt;NextFuture&lt;/a&gt;. Follow us for more fullstack &amp;amp; AI engineering content.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>fullstack</category>
      <category>ai</category>
      <category>webdev</category>
      <category>javascript</category>
    </item>
    <item>
      <title>Claude Code /advisor Troubleshooting: 8 Errors and Fixes (May 2026)</title>
      <dc:creator>BeanBean</dc:creator>
      <pubDate>Sat, 02 May 2026 23:00:02 +0000</pubDate>
      <link>https://dev.to/bean_bean/claude-code-advisor-troubleshooting-8-errors-and-fixes-may-2026-28j6</link>
      <guid>https://dev.to/bean_bean/claude-code-advisor-troubleshooting-8-errors-and-fixes-may-2026-28j6</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://nextfuture.io.vn/blog/claude-code-advisor-troubleshooting" rel="noopener noreferrer"&gt;NextFuture&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h1&gt;
  
  
  Claude Code /advisor Troubleshooting: 8 Errors and Fixes (May 2026)
&lt;/h1&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; Eight errors I've hit running &lt;code&gt;/advisor&lt;/code&gt; on real repos in April-May 2026, each with the exact message, the root cause, and the one-liner fix. Bookmark this for the next time your CLI screams at midnight.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Quick diagnostic checklist
&lt;/h2&gt;

&lt;p&gt;Before scanning the errors below, run these three commands:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude &lt;span class="nt"&gt;--version&lt;/span&gt;                       &lt;span class="c"&gt;# 2.6+ required&lt;/span&gt;
claude doctor                           &lt;span class="c"&gt;# checks auth, MCP, hooks&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="nv"&gt;$ANTHROPIC_API_KEY&lt;/span&gt; | &lt;span class="nb"&gt;head&lt;/span&gt; &lt;span class="nt"&gt;-c&lt;/span&gt; 12   &lt;span class="c"&gt;# sanity-check the key&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Most &lt;code&gt;/advisor&lt;/code&gt; failures fall out of &lt;code&gt;claude doctor&lt;/code&gt; immediately. If not, match symptoms below. For background on what &lt;code&gt;/advisor&lt;/code&gt; actually does, see the &lt;a href="https://dev.to/blog/claude-code-advisor-command-deep-dive-2026"&gt;Claude Code /advisor command deep-dive&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. "model not found"
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Error: model &lt;span class="s2"&gt;"claude-3-5-opus"&lt;/span&gt; not found
  at AdvisorCommand.resolveModel &lt;span class="o"&gt;(&lt;/span&gt;advisor.ts:118&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Cause:&lt;/strong&gt; Stale config pinning a retired model alias. Anthropic deprecated &lt;code&gt;claude-3-5-*&lt;/code&gt; in February 2026.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude config &lt;span class="nb"&gt;set &lt;/span&gt;advisor.opus   &lt;span class="s2"&gt;"claude-opus-4-5"&lt;/span&gt;
claude config &lt;span class="nb"&gt;set &lt;/span&gt;advisor.sonnet &lt;span class="s2"&gt;"claude-sonnet-4-6"&lt;/span&gt;
claude config &lt;span class="nb"&gt;set &lt;/span&gt;advisor.haiku  &lt;span class="s2"&gt;"claude-haiku-4-5"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  2. MCP server conflict
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Error: MCP server &lt;span class="s2"&gt;"filesystem"&lt;/span&gt; returned tool with name
&lt;span class="s2"&gt;"read_file"&lt;/span&gt; already provided by server &lt;span class="s2"&gt;"default"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Cause:&lt;/strong&gt; Two MCP servers registering the same tool name. &lt;code&gt;/advisor&lt;/code&gt; aborts because tool routing is ambiguous.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix:&lt;/strong&gt; Edit &lt;code&gt;~/.claude/mcp.json&lt;/code&gt; and either remove the duplicate or namespace one with &lt;code&gt;"toolPrefix": "fs_"&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;filesystem&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;command&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;npx&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;args&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;-y&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@modelcontextprotocol/server-filesystem&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;/repo&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;toolPrefix&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;fs_&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  3. Rate limit 429
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Error 429: rate_limit_error — input tokens per minute exceeded
  Retry-After: 38s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Cause:&lt;/strong&gt; &lt;code&gt;/advisor&lt;/code&gt; ran twice in quick succession on a large file, and each run makes a scoring call plus an action call. The Pro tier hits its input-tokens-per-minute (ITPM) ceiling at around 80k tokens/min.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix:&lt;/strong&gt; add a backoff and a cap:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude config &lt;span class="nb"&gt;set &lt;/span&gt;advisor.maxConcurrent 1
claude config &lt;span class="nb"&gt;set &lt;/span&gt;advisor.backoffMs 2000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
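
&lt;p&gt;If you wrap the CLI in your own script, the same idea can be sketched in a few lines of TypeScript. This is an illustration of exponential backoff in general, not how &lt;code&gt;/advisor&lt;/code&gt; implements &lt;code&gt;backoffMs&lt;/code&gt;: &lt;code&gt;retryDelayMs&lt;/code&gt;, its parameters, and the 60-second cap are all assumptions made for the example.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Hypothetical helper: not part of Claude Code or any Anthropic SDK.
// Returns how long to wait (in ms) before retry number `attempt` (0-based).
function retryDelayMs(
  attempt: number,
  baseMs: number,
  retryAfterSeconds?: number,
): number {
  // Exponential backoff: baseMs, 2x, 4x, ... capped at 60 seconds (assumed cap).
  const exponential = Math.min(baseMs * 2 ** attempt, 60_000);
  // If the 429 response carried Retry-After, never retry sooner than that.
  const serverHint = (retryAfterSeconds ?? 0) * 1000;
  return Math.max(exponential, serverHint);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With the numbers from the error above, &lt;code&gt;retryDelayMs(0, 2000, 38)&lt;/code&gt; returns 38000 because the server's hint is longer than the computed delay. Without a hint, &lt;code&gt;retryDelayMs(3, 2000)&lt;/code&gt; returns 16000.&lt;/p&gt;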



&lt;h2&gt;
  
  
  4. Context window exceeded
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Error: prompt is 218,402 tokens&lt;span class="p"&gt;;&lt;/span&gt; max &lt;span class="k"&gt;for &lt;/span&gt;sonnet-4-6 is 200000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Cause:&lt;/strong&gt; An overfull conversation, often after pasting log output or running &lt;code&gt;/advisor&lt;/code&gt; on a giant file while an MCP doc-load is active.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix:&lt;/strong&gt; compact the context, then retry:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/clear           &lt;span class="c"&gt;# nuke the session&lt;/span&gt;
&lt;span class="c"&gt;# OR&lt;/span&gt;
/compact         &lt;span class="c"&gt;# summarize and retain semantic gist&lt;/span&gt;
&lt;span class="c"&gt;# OR pin to 1M-context Opus&lt;/span&gt;
claude &lt;span class="nt"&gt;--model&lt;/span&gt; claude-opus-4-5-1m /advisor ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  5. Auth token expired
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Error 401: authentication_error — invalid x-api-key
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Cause:&lt;/strong&gt; OAuth session expired (Pro plan) or the env var got stomped by a shell rc that overrides &lt;code&gt;ANTHROPIC_API_KEY&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude &lt;span class="nb"&gt;logout
&lt;/span&gt;claude login        &lt;span class="c"&gt;# OAuth flow opens browser&lt;/span&gt;
&lt;span class="c"&gt;# OR for API-key users:&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"sk-ant-..."&lt;/span&gt;   &lt;span class="c"&gt;# pin in .envrc / direnv&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  6. Invalid permission mode
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Error: permission &lt;span class="s2"&gt;"auto-accept-all"&lt;/span&gt; not allowed &lt;span class="k"&gt;in &lt;/span&gt;advisor turns
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Cause:&lt;/strong&gt; Project &lt;code&gt;.claude/settings.json&lt;/code&gt; sets a permission mode that conflicts with &lt;code&gt;/advisor&lt;/code&gt;'s preview-then-apply contract.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix:&lt;/strong&gt; drop the override. &lt;code&gt;/advisor&lt;/code&gt; needs at least &lt;code&gt;plan&lt;/code&gt; mode so the user can accept the recommendation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude config &lt;span class="nb"&gt;set &lt;/span&gt;permissionMode plan
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  7. Hook blocking edits
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;[&lt;/span&gt;Hook] BLOCKED: File exceeds 800 lines &lt;span class="o"&gt;(&lt;/span&gt;842 lines&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="o"&gt;[&lt;/span&gt;Hook] Split into smaller modules
Error: PreToolUse hook denied Write to scheduler.ts
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Cause:&lt;/strong&gt; A guardrail hook in &lt;code&gt;~/.claude/settings.json&lt;/code&gt; rejects oversized writes. &lt;code&gt;/advisor&lt;/code&gt; respects hooks — that's the point — so it bails.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix:&lt;/strong&gt; let &lt;code&gt;/advisor&lt;/code&gt; emit the refactor plan first and accept the file split; &lt;em&gt;then&lt;/em&gt; every write passes the 800-line gate:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/advisor refactor scheduler.ts &lt;span class="nt"&gt;--goal&lt;/span&gt; &lt;span class="s2"&gt;"≤ 800 lines per file"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  8. Shell escape issues with quotes
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;/advisor &lt;span class="s2"&gt;"fix the bug in 'auto-publish' job"&lt;/span&gt;
Error: unexpected token &lt;span class="s1"&gt;'
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Cause:&lt;/strong&gt; Nested single quotes break the shell tokenizer before &lt;code&gt;/advisor&lt;/code&gt; ever sees the string.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix:&lt;/strong&gt; use the heredoc form, or escape:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/advisor &lt;span class="s2"&gt;"fix the bug in &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;auto-publish&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; job"&lt;/span&gt;
&lt;span class="c"&gt;# OR&lt;/span&gt;
claude &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt;'
/advisor fix the bug in 'auto-publish' job
&lt;/span&gt;&lt;span class="no"&gt;EOF
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
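
&lt;p&gt;When the prompt is assembled by a script rather than typed by hand, quoting it programmatically avoids this class of error. Here is a minimal sketch, assuming a POSIX shell. &lt;code&gt;shellQuote&lt;/code&gt; is a hypothetical helper written for this example, not something Claude Code provides.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Hypothetical helper: wraps text in single quotes for a POSIX shell.
function shellQuote(text: string): string {
  // Inside single quotes nothing is special, so the only character to handle
  // is the single quote itself: close the quote, emit \', reopen the quote.
  return "'" + text.split("'").join("'\\''") + "'";
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;For example, &lt;code&gt;shellQuote("a'b")&lt;/code&gt; produces &lt;code&gt;'a'\''b'&lt;/code&gt;, which a POSIX shell reads back as the original three characters.&lt;/p&gt;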



&lt;h2&gt;
  
  
  Where to file bugs
&lt;/h2&gt;

&lt;p&gt;Anthropic accepts &lt;code&gt;/advisor&lt;/code&gt; issues at &lt;code&gt;github.com/anthropics/claude-code/issues&lt;/code&gt;. Include &lt;code&gt;claude doctor&lt;/code&gt; output, the redacted slash invocation, and your &lt;code&gt;~/.claude/settings.json&lt;/code&gt;. Don't paste API keys.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Where are Claude Code logs stored?
&lt;/h3&gt;

&lt;p&gt;On macOS: &lt;code&gt;~/Library/Logs/Claude/&lt;/code&gt;. On Linux: &lt;code&gt;~/.local/state/claude/logs/&lt;/code&gt;. &lt;code&gt;/advisor&lt;/code&gt; writes a JSONL line per recommendation including model picked, score, and apply outcome.&lt;/p&gt;

&lt;h3&gt;
  
  
  How do I downgrade Claude Code if /advisor breaks?
&lt;/h3&gt;

&lt;p&gt;Pin a known-good version: &lt;code&gt;npm i -g @anthropic-ai/claude-code@2.6.0&lt;/code&gt;. Anthropic keeps the last six minor releases on npm.&lt;/p&gt;

&lt;h3&gt;
  
  
  Does /advisor count as one or two API calls?
&lt;/h3&gt;

&lt;p&gt;Two. The scoring call is small (~500 input tokens, Haiku-priced) and the action call runs at the recommended tier. The scoring overhead is ~1% of session cost.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can a hook silently break /advisor?
&lt;/h3&gt;

&lt;p&gt;Yes. A &lt;code&gt;PreToolUse&lt;/code&gt; hook that mutates tool input without echoing it back will look like a hang. Always have your hook write to &lt;code&gt;stdout&lt;/code&gt; on success and &lt;code&gt;stderr&lt;/code&gt; on block.&lt;/p&gt;





</description>
      <category>fullstack</category>
      <category>ai</category>
      <category>webdev</category>
      <category>javascript</category>
    </item>
    <item>
      <title>Jotai vs Recoil 2026: The Atomic State Migration (Recoil Is Deprecated)</title>
      <dc:creator>BeanBean</dc:creator>
      <pubDate>Sat, 02 May 2026 23:00:00 +0000</pubDate>
      <link>https://dev.to/bean_bean/jotai-vs-recoil-2026-the-atomic-state-migration-recoil-is-deprecated-1m4d</link>
      <guid>https://dev.to/bean_bean/jotai-vs-recoil-2026-the-atomic-state-migration-recoil-is-deprecated-1m4d</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://nextfuture.io.vn/blog/jotai-vs-recoil-2026-the-atomic-state-migration-recoil-is-deprecated" rel="noopener noreferrer"&gt;NextFuture&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If you started a Recoil project between 2020 and 2023, this post is for you. Meta archived the Recoil repository, and the project no longer ships meaningful releases. &lt;strong&gt;Recoil is deprecated — do not start new projects with it.&lt;/strong&gt; Jotai is the actively maintained spiritual successor with the same atomic mental model and better React 19 / Next.js 16 RSC compatibility.&lt;/p&gt;

&lt;p&gt;For the wider state-management landscape, see our &lt;a href="https://dev.to/blog/ultimate-guide-react-state-management-2026"&gt;complete React state management guide&lt;/a&gt;. For when atomic state is even the right tool, read &lt;a href="https://dev.to/blog/react-server-state-vs-client-state-guide"&gt;server state vs client state in React 2026&lt;/a&gt; first — much of what people reach for atoms for is actually server state.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this comparison matters now
&lt;/h2&gt;

&lt;p&gt;Recoil's last release, 0.7.7, shipped in 2023. It still installs and still works in React 18, but it has unresolved React 19 issues, no concurrent-rendering refinements, and Meta has publicly moved on. Choosing it for new code in 2026 means inheriting migration debt later.&lt;/p&gt;

&lt;p&gt;Jotai (by Daishi Kato, who also created Valtio and maintains Zustand) covers the same atomic ground: small primitive units of state, derived atoms, fine-grained subscriptions, no reducers required. It is actively maintained and has explicit React 19 / Suspense support.&lt;/p&gt;

&lt;h2&gt;
  
  
  API similarity, side by side
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Recoil&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;atom&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;selector&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;useRecoilState&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;useRecoilValue&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;recoil&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;countAtom&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;atom&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;count&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;default&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;doubledSelector&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;selector&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;doubled&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="kd"&gt;get&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;countAtom&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;Counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setCount&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useRecoilState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;countAtom&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;doubled&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useRecoilValue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;doubledSelector&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &amp;lt;button onClick={() =&amp;gt; &lt;span class="nf"&gt;setCount&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;count&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;count&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="nx"&gt;doubled&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;&amp;lt;/button&amp;gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Jotai&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;atom&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;useAtom&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;useAtomValue&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;jotai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;countAtom&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;atom&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;doubledAtom&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;atom&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="kd"&gt;get&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;countAtom&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;Counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setCount&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useAtom&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;countAtom&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;doubled&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useAtomValue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;doubledAtom&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &amp;lt;button onClick={() =&amp;gt; &lt;span class="nf"&gt;setCount&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;count&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;count&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="nx"&gt;doubled&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;&amp;lt;/button&amp;gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same model. Jotai removes the &lt;code&gt;key&lt;/code&gt; string requirement (it uses object identity), drops the artificial &lt;code&gt;atom&lt;/code&gt; vs &lt;code&gt;selector&lt;/code&gt; split, and ships a smaller bundle. Anyone fluent in Recoil is productive in Jotai within an hour.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Jotai improved
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;No string keys.&lt;/strong&gt; Atom identity is the JS reference. No more "duplicate atom key" runtime errors.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Unified primitive.&lt;/strong&gt; Read-only, write-only, and derived atoms are all just &lt;code&gt;atom(...)&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Async atoms with Suspense.&lt;/strong&gt; An async &lt;code&gt;atom&lt;/code&gt; getter integrates cleanly with &lt;code&gt;&amp;lt;Suspense&amp;gt;&lt;/code&gt; boundaries in React 19.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Smaller bundle.&lt;/strong&gt; Jotai core is roughly a quarter the size of Recoil (about 3-4kb versus 14-16kb gzipped).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Active ecosystem.&lt;/strong&gt; &lt;code&gt;jotai/utils&lt;/code&gt;, &lt;code&gt;jotai-tanstack-query&lt;/code&gt;, &lt;code&gt;jotai-zustand&lt;/code&gt;, devtools — actually shipping.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Better RSC story.&lt;/strong&gt; &lt;code&gt;jotai/react&lt;/code&gt; works in Next.js 16 App Router with documented per-request store patterns.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
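
&lt;p&gt;To make the first bullet concrete, here is a minimal sketch of the two identity schemes. It assumes nothing about either library's internals: &lt;code&gt;keyedAtom&lt;/code&gt;, &lt;code&gt;refAtom&lt;/code&gt;, &lt;code&gt;read&lt;/code&gt;, and &lt;code&gt;write&lt;/code&gt; are hypothetical toy helpers, not Recoil or Jotai APIs.&lt;/p&gt;

```typescript
// Toy registries for illustration only; not Recoil's or Jotai's real internals.
type KeyedAtom = { key: string; init: number }
type RefAtom = { init: number }

// Key-based style: every atom must register a globally unique string.
const keyedRegistry = new Map()
function keyedAtom(key: string, init: number): KeyedAtom {
  if (keyedRegistry.has(key)) {
    throw new Error(`Duplicate atom key "${key}"`)
  }
  const created = { key, init }
  keyedRegistry.set(key, created)
  return created
}

// Identity-based style: the atom object itself is the lookup key.
const values = new WeakMap()
function refAtom(init: number): RefAtom {
  return { init }
}
function read(target: RefAtom): number {
  return values.has(target) ? values.get(target) : target.init
}
function write(target: RefAtom, next: number): void {
  values.set(target, next)
}

// Two atoms with identical config never collide, because they are different objects.
const first = refAtom(0)
const second = refAtom(0)
write(first, 5)
console.log(read(first), read(second)) // 5 0
```

&lt;p&gt;With reference identity there is no global namespace to collide in, which is why the string key can go away.&lt;/p&gt;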

&lt;h2&gt;
  
  
  What Recoil had that Jotai lacks
&lt;/h2&gt;

&lt;p&gt;Honest accounting: Recoil's &lt;code&gt;atomFamily&lt;/code&gt; and &lt;code&gt;selectorFamily&lt;/code&gt; were elegant for parametric atoms. Jotai's &lt;code&gt;atomFamily&lt;/code&gt; from &lt;code&gt;jotai/utils&lt;/code&gt; covers the same ground but with slightly different semantics around equality and cleanup. Recoil's snapshot/transaction APIs (&lt;code&gt;useRecoilCallback&lt;/code&gt;, snapshots) had no exact 1:1 equivalent in Jotai for a while; &lt;code&gt;useAtomCallback&lt;/code&gt; and the imperative store API close most of that gap in 2026.&lt;/p&gt;
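
&lt;p&gt;The equality and cleanup point is easier to see in a toy model. The sketch below is not Jotai's implementation; &lt;code&gt;toyFamily&lt;/code&gt; and &lt;code&gt;todoFamily&lt;/code&gt; are hypothetical names, and the cache is a plain &lt;code&gt;Map&lt;/code&gt;, so string params match by value while object params would match only by reference.&lt;/p&gt;

```typescript
type ToyAtom = { init: string }

// Toy family: returns the same atom object for the same string param.
function toyFamily(create: (param: string) => ToyAtom) {
  const cache = new Map()
  const getAtom = (param: string): ToyAtom => {
    if (!cache.has(param)) {
      cache.set(param, create(param))
    }
    return cache.get(param)
  }
  // Without explicit removal the cache keeps an entry for every param ever requested.
  const remove = (param: string): boolean => cache.delete(param)
  return { getAtom, remove }
}

const todoFamily = toyFamily((id) => ({ init: `todo-${id}` }))
const a = todoFamily.getAtom('1')
const b = todoFamily.getAtom('1')
console.log(a === b) // true: same param, same atom

todoFamily.remove('1')
console.log(todoFamily.getAtom('1') === a) // false: a fresh atom after cleanup
```

&lt;p&gt;When migrating family-heavy code, the two questions this raises are how params are compared and who is responsible for evicting stale entries; check the &lt;code&gt;jotai/utils&lt;/code&gt; documentation for how the real &lt;code&gt;atomFamily&lt;/code&gt; answers both.&lt;/p&gt;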

&lt;h2&gt;
  
  
  Migration cheatsheet
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;atom({ key: 'x', default: v })&lt;/code&gt; → &lt;code&gt;atom(v)&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;selector({ key: 'd', get: ({get}) =&amp;gt; ... })&lt;/code&gt; → &lt;code&gt;atom((get) =&amp;gt; ...)&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;useRecoilState(a)&lt;/code&gt; → &lt;code&gt;useAtom(a)&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;useRecoilValue(a)&lt;/code&gt; → &lt;code&gt;useAtomValue(a)&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;useSetRecoilState(a)&lt;/code&gt; → &lt;code&gt;useSetAtom(a)&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;atomFamily({ key, default })&lt;/code&gt; → &lt;code&gt;atomFamily((param) =&amp;gt; atom(...))&lt;/code&gt; from &lt;code&gt;jotai/utils&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;&amp;lt;RecoilRoot&amp;gt;&lt;/code&gt; → &lt;code&gt;&amp;lt;Provider&amp;gt;&lt;/code&gt; from &lt;code&gt;jotai&lt;/code&gt; (or omit for the default global store)&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For a typical Recoil codebase, a focused codemod plus a half-day cleanup pass per medium app is realistic. Atom families and snapshot-heavy code take longer.&lt;/p&gt;
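
&lt;p&gt;As a rough idea of what the mechanical part of such a codemod looks like, here is a deliberately naive sketch that only renames hook call sites using the mapping from the cheatsheet. &lt;code&gt;HOOK_RENAMES&lt;/code&gt; and &lt;code&gt;rewriteHooks&lt;/code&gt; are hypothetical names, and a production codemod would work on the AST rather than on raw text.&lt;/p&gt;

```typescript
// Hypothetical rename table, taken from the cheatsheet above.
const HOOK_RENAMES: { [recoilName: string]: string } = {
  useRecoilState: 'useAtom',
  useRecoilValue: 'useAtomValue',
  useSetRecoilState: 'useSetAtom',
}

// Matches any of the Recoil hook names as a whole identifier.
const HOOK_PATTERN = new RegExp(`\\b(${Object.keys(HOOK_RENAMES).join('|')})\\b`, 'g')

// Rewrites hook call sites only. Imports, atom()/selector() definitions,
// and RecoilRoot still need an AST-based transform or manual edits.
function rewriteHooks(source: string): string {
  return source.replace(HOOK_PATTERN, (name) => HOOK_RENAMES[name] ?? name)
}

const before = 'const [count, setCount] = useRecoilState(countAtom)'
console.log(rewriteHooks(before)) // const [count, setCount] = useAtom(countAtom)
```

&lt;p&gt;The word boundaries keep the pattern from touching longer identifiers that merely start with one of these names, which then surface for manual review.&lt;/p&gt;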

&lt;h2&gt;
  
  
  RSC and Suspense compatibility
&lt;/h2&gt;

&lt;p&gt;In Next.js 16 App Router, atoms only live in Client Components. Wrap the client tree once:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="c1"&gt;// app/providers.tsx&lt;/span&gt;
&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;use client&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Provider&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;createStore&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;jotai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;useState&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;ReactNode&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;react&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;JotaiProvider&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;children&lt;/span&gt; &lt;span class="p"&gt;}:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nl"&gt;children&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;ReactNode&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;store&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useState&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;createStore&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &amp;lt;Provider store={store}&amp;gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;children&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&amp;lt;/Provider&amp;gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Async atoms suspend correctly inside React 19's &lt;code&gt;&amp;lt;Suspense&amp;gt;&lt;/code&gt; boundaries. Recoil's async story still produces stale-snapshot warnings under React 19's stricter concurrent semantics in many real codebases.&lt;/p&gt;

&lt;h2&gt;
  
  
  Bundle size
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Jotai core:&lt;/strong&gt; ~3-4kb gzipped.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Recoil:&lt;/strong&gt; ~14-16kb gzipped (and frozen in time).&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For a single-page app where atomic state is the core architecture, that delta of roughly 10-13kb gzipped sits in the shared bundle, so users pay for it on first load whichever route they enter through.&lt;/p&gt;

&lt;h2&gt;
  
  
  Verdict
&lt;/h2&gt;

&lt;p&gt;If you have new work, pick Jotai. If you have an existing Recoil app, plan the migration before React 20 — leaving a deprecated dependency in a critical path is a debt that only compounds. The API is similar enough that the rewrite is mostly mechanical, and you finish with a smaller, faster, actively maintained codebase.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Is Recoil officially deprecated?
&lt;/h3&gt;

&lt;p&gt;The Recoil repository was archived by Meta in early 2025 and the project no longer receives meaningful updates. The community has accepted Jotai as the successor for atomic state in React.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can Jotai replace Recoil 1:1?
&lt;/h3&gt;

&lt;p&gt;For 90% of patterns, yes — including atoms, selectors, atom families, async, and writable derived state. Snapshot-heavy code (&lt;code&gt;useRecoilCallback&lt;/code&gt; with full snapshots) needs more thought.&lt;/p&gt;

&lt;h3&gt;
  
  
  Does Jotai work with React Server Components?
&lt;/h3&gt;

&lt;p&gt;Yes, on the client side. Atoms only live in Client Components. Use a per-request &lt;code&gt;createStore()&lt;/code&gt; in Next.js App Router to avoid cross-request state leaks.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is Jotai smaller than Zustand?
&lt;/h3&gt;

&lt;p&gt;They are in the same weight class, low single-digit kilobytes gzipped. Choose based on the mental model: atoms (Jotai) vs single store with selectors (Zustand). Both are excellent.&lt;/p&gt;

&lt;h3&gt;
  
  
  Should I pick Jotai over TanStack Query?
&lt;/h3&gt;

&lt;p&gt;They solve different problems. Jotai is for client state (UI, derived values). TanStack Query is for server state (cache, refetch, invalidation). Use both.&lt;/p&gt;





</description>
      <category>fullstack</category>
      <category>ai</category>
      <category>webdev</category>
      <category>javascript</category>
    </item>
    <item>
      <title>Zustand vs Redux 2026: 8 Real-World Differences (Bundle, DX, Perf)</title>
      <dc:creator>BeanBean</dc:creator>
      <pubDate>Sat, 02 May 2026 17:00:02 +0000</pubDate>
      <link>https://dev.to/bean_bean/zustand-vs-redux-2026-8-real-world-differences-bundle-dx-perf-981</link>
      <guid>https://dev.to/bean_bean/zustand-vs-redux-2026-8-real-world-differences-bundle-dx-perf-981</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://nextfuture.io.vn/blog/zustand-vs-redux-2026-8-real-world-differences-bundle-dx-perf" rel="noopener noreferrer"&gt;NextFuture&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Picking a state library in 2026 is no longer a religious war. It is a budget question. Bundle bytes are billed against your LCP. Boilerplate is billed against your team's velocity. Here is how &lt;strong&gt;Zustand&lt;/strong&gt; and &lt;strong&gt;Redux Toolkit&lt;/strong&gt; actually compare on the eight things that matter once you ship to production.&lt;/p&gt;

&lt;p&gt;If you want the broader landscape — Jotai, Valtio, TanStack Query, signals — read our &lt;a href="https://dev.to/blog/ultimate-guide-react-state-management-2026"&gt;complete React state management guide&lt;/a&gt; first.&lt;/p&gt;

&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;New project?&lt;/strong&gt; Pick Zustand. Smaller bundle, less ceremony, ergonomic with TypeScript and React 19.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Existing RTK + RTK Query app with experienced team?&lt;/strong&gt; Stay. The savings rarely repay the migration cost, and a partial migration leaves you maintaining two paradigms.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Need time-travel debugging or strict event-sourced patterns?&lt;/strong&gt; Redux still wins.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  1. Bundle size
&lt;/h2&gt;

&lt;p&gt;The honest comparison is &lt;code&gt;zustand&lt;/code&gt; vs &lt;code&gt;@reduxjs/toolkit&lt;/code&gt; + &lt;code&gt;react-redux&lt;/code&gt;, because nobody ships hand-rolled Redux anymore.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Zustand 4.5.x:&lt;/strong&gt; ~3-4kb gzipped, zero peer deps beyond React.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Redux Toolkit + react-redux:&lt;/strong&gt; ~17-19kb gzipped (Immer, Reselect, Redux core, RTK bindings).&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;On a 150kb landing-page JS budget, that delta is real. On a 300kb app shell, it is close to a rounding error. Decide accordingly.&lt;/p&gt;
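
&lt;p&gt;To put numbers on that, the arithmetic can be sketched as follows. The mid-range sizes are assumptions picked from the ranges in the list above, and &lt;code&gt;shareOfBudget&lt;/code&gt; is a hypothetical helper.&lt;/p&gt;

```typescript
// Mid-range figures from the list above, in gzipped kilobytes.
const zustandKb = 3.5
const rtkKb = 18
const deltaKb = rtkKb - zustandKb // 14.5

// Share of a JS budget consumed by the extra bytes, as a percentage.
function shareOfBudget(extraKb: number, budgetKb: number): number {
  return (extraKb / budgetKb) * 100
}

console.log(shareOfBudget(deltaKb, 150).toFixed(1)) // 9.7
console.log(shareOfBudget(deltaKb, 300).toFixed(1)) // 4.8
```

&lt;p&gt;Roughly a tenth of a landing-page budget versus about a twentieth of an app shell; where the line for "worth caring about" falls is a judgment call for your own budget.&lt;/p&gt;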

&lt;h2&gt;
  
  
  2. Boilerplate, side by side
&lt;/h2&gt;

&lt;p&gt;A simple counter store, both libraries.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Zustand&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;create&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;zustand&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;CounterState&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;
  &lt;span class="na"&gt;increment&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;void&lt;/span&gt;
  &lt;span class="na"&gt;reset&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;void&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;useCounter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create&lt;/span&gt;&amp;lt;CounterState&amp;gt;&lt;span class="p"&gt;()((&lt;/span&gt;&lt;span class="kd"&gt;set&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;increment&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;count&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;})),&lt;/span&gt;
  &lt;span class="na"&gt;reset&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
&lt;span class="p"&gt;}))&lt;/span&gt;

&lt;span class="c1"&gt;// Usage&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;Counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useCounter&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;count&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;increment&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useCounter&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;increment&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &amp;lt;button onClick={increment}&amp;gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;count&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&amp;lt;/button&amp;gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Redux Toolkit&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;configureStore&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;createSlice&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;PayloadAction&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@reduxjs/toolkit&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Provider&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;useDispatch&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;useSelector&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;react-redux&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;counterSlice&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;createSlice&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;counter&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;initialState&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;reducers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;increment&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;count&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;reset&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;increment&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;reset&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;counterSlice&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;actions&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;configureStore&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;reducer&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;counter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;counterSlice&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;reducer&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
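&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;RootState&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;ReturnType&lt;/span&gt;&amp;lt;&lt;span class="k"&gt;typeof&lt;/span&gt; &lt;span class="nx"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;getState&lt;/span&gt;&amp;gt;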
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;AppDispatch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;typeof&lt;/span&gt; &lt;span class="nx"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;dispatch&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Zustand: one file, one hook. Redux Toolkit: slice, store, typed hooks, provider wrap. RTK trimmed the worst of classic Redux, but Zustand removed the ceremony entirely.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. DevTools
&lt;/h2&gt;

&lt;p&gt;Redux DevTools is the gold standard: action log, time travel, state diff, dispatch replay. Zustand integrates with the same extension via a one-line middleware:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;devtools&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;zustand/middleware&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;useCounter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create&lt;/span&gt;&amp;lt;CounterState&amp;gt;&lt;span class="p"&gt;()(&lt;/span&gt; &lt;span class="c1"&gt;// create and CounterState as in the counter store from section 2&lt;/span&gt;
  &lt;span class="nf"&gt;devtools&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="kd"&gt;set&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;increment&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;count&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;})),&lt;/span&gt;
    &lt;span class="na"&gt;reset&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
  &lt;span class="p"&gt;}))&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You get action names and state snapshots, but Redux's strict event-sourced model produces a more meaningful history when bugs are deep.&lt;/p&gt;
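
&lt;p&gt;Why an action log makes for better history can be shown without either library. The following is a minimal sketch of replay-based time travel over a plain reducer; &lt;code&gt;counterReducer&lt;/code&gt;, &lt;code&gt;actionLog&lt;/code&gt;, and &lt;code&gt;stateAfter&lt;/code&gt; are hypothetical names, and this is not how Redux DevTools is implemented.&lt;/p&gt;

```typescript
type CounterAction = { type: 'increment' } | { type: 'reset' }

// Pure reducer: the next state depends only on the previous state and the action.
function counterReducer(count: number, action: CounterAction): number {
  switch (action.type) {
    case 'increment':
      return count + 1
    case 'reset':
      return 0
  }
}

// The log of dispatched actions is the history.
const actionLog: CounterAction[] = [
  { type: 'increment' },
  { type: 'increment' },
  { type: 'reset' },
  { type: 'increment' },
]

// "Time travel" is replaying a prefix of the log from the initial state.
function stateAfter(steps: number): number {
  return actionLog.slice(0, steps).reduce(counterReducer, 0)
}

console.log(stateAfter(2)) // 2
console.log(stateAfter(3)) // 0
console.log(stateAfter(4)) // 1
```

&lt;p&gt;Because every state is derivable from named actions, the log explains why a value changed, not just that it did. A store that accepts arbitrary &lt;code&gt;set&lt;/code&gt; calls can record snapshots, but the intent behind each one is only as good as the labels you attach.&lt;/p&gt;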

&lt;h2&gt;
  
  
  4. Middleware
&lt;/h2&gt;

&lt;p&gt;Redux has a mature middleware ecosystem: &lt;code&gt;redux-thunk&lt;/code&gt;, &lt;code&gt;redux-saga&lt;/code&gt;, &lt;code&gt;redux-observable&lt;/code&gt;, &lt;code&gt;listenerMiddleware&lt;/code&gt;. Zustand ships &lt;code&gt;persist&lt;/code&gt;, &lt;code&gt;devtools&lt;/code&gt;, &lt;code&gt;immer&lt;/code&gt;, and &lt;code&gt;subscribeWithSelector&lt;/code&gt; out of the box. For 90% of apps that is enough. For complex effect orchestration (long-lived sagas, cancellation graphs), Redux remains the better fit.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. Async and server data
&lt;/h2&gt;

&lt;p&gt;Both libraries are bad places to put server data. Use &lt;strong&gt;TanStack Query&lt;/strong&gt; or &lt;strong&gt;RTK Query&lt;/strong&gt; instead. We dig into the dichotomy in &lt;a href="https://dev.to/blog/react-server-state-vs-client-state-guide"&gt;server state vs client state in React 2026&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you must do async in the store, Zustand is honest about it — write a normal async function inside &lt;code&gt;create&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;useUser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="kd"&gt;set&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;user&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;loading&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;fetchUser&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;loading&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`/api/users/&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;user&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="na"&gt;loading&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;}))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  6. TypeScript inference
&lt;/h2&gt;

&lt;p&gt;Zustand's &lt;code&gt;create&amp;lt;State&amp;gt;()(...)&lt;/code&gt; infers selectors and actions cleanly. RTK's &lt;code&gt;createSlice&lt;/code&gt; requires explicit &lt;code&gt;RootState&lt;/code&gt; and &lt;code&gt;AppDispatch&lt;/code&gt; types and typed wrapper hooks (&lt;code&gt;useAppDispatch&lt;/code&gt;, &lt;code&gt;useAppSelector&lt;/code&gt;). Both are type-safe; Zustand requires fewer ceremonial types.&lt;/p&gt;

&lt;h2&gt;
  
  
  7. Server state interaction (Next.js 16 RSC)
&lt;/h2&gt;

&lt;p&gt;In the App Router, neither store touches Server Components. Both work fine in Client Components. Zustand 4.5+ supports per-request stores via the documented context pattern, which avoids cross-request bleed in Node.js. RTK requires the same pattern with a custom &lt;code&gt;makeStore()&lt;/code&gt; per request.&lt;/p&gt;
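
&lt;p&gt;The cross-request bleed is easy to reproduce in miniature. This sketch uses a hypothetical hand-rolled &lt;code&gt;makeStore()&lt;/code&gt; in place of a real Zustand or RTK store, and plain functions in place of request handlers, so it shows the shape of the problem rather than either library's recommended setup.&lt;/p&gt;

```typescript
type CounterStore = { get: () => number; set: (next: number) => void }

// Hypothetical minimal store factory, standing in for a real store constructor.
function makeStore(): CounterStore {
  let count = 0
  return {
    get: () => count,
    set: (next) => {
      count = next
    },
  }
}

// Anti-pattern on the server: one module-level store shared by every request.
const sharedStore = makeStore()
function handleWithSharedStore(cartSize: number): number {
  const seenOnArrival = sharedStore.get()
  sharedStore.set(cartSize)
  return seenOnArrival
}

// Per-request pattern: each request gets its own store instance.
function handleWithRequestStore(cartSize: number): number {
  const store = makeStore()
  const seenOnArrival = store.get()
  store.set(cartSize)
  return seenOnArrival
}

// The second "request" sees the first one's data when the store is shared.
console.log(handleWithSharedStore(3), handleWithSharedStore(9)) // 0 3
console.log(handleWithRequestStore(3), handleWithRequestStore(9)) // 0 0
```

&lt;p&gt;In a real app the per-request instance is created once per render tree and handed down through context, which is what the documented patterns for both libraries amount to.&lt;/p&gt;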

&lt;h2&gt;
  
  
  8. Real migration cost
&lt;/h2&gt;

&lt;p&gt;Migrating an RTK app to Zustand is not free. Slices map cleanly to stores, but RTK Query, listener middleware, and selectors with reselect-style memoization all need rewrites. Budget 1-3 weeks per app for a serious codebase. If your team already knows RTK and uses RTK Query, the math rarely favors a rewrite — invest that time in the next feature instead.&lt;/p&gt;

&lt;h2&gt;
  
  
  Decision matrix
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Greenfield app, small-to-mid team:&lt;/strong&gt; Zustand + TanStack Query.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Large RTK Query codebase:&lt;/strong&gt; Stay on RTK; switching has no upside.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Heavy event-sourced domain (collaborative editor, finance):&lt;/strong&gt; Redux.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Bundle-sensitive landing or marketing site:&lt;/strong&gt; Zustand or no store at all.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Need built-in caching, mutations, optimistic updates:&lt;/strong&gt; RTK Query or TanStack Query — not the store.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Is Redux dead in 2026?
&lt;/h3&gt;

&lt;p&gt;No. Redux Toolkit is actively maintained and remains the default in many enterprise codebases. It lost mindshare for new projects, not relevance for existing ones.&lt;/p&gt;

&lt;h3&gt;
  
  
  Should I migrate an existing Redux Toolkit app to Zustand?
&lt;/h3&gt;

&lt;p&gt;Almost never as a standalone project. Migrate only if you are already doing a major rewrite and the bundle savings or velocity gains pay back the engineering hours.&lt;/p&gt;

&lt;h3&gt;
  
  
  Does Zustand work with React Server Components?
&lt;/h3&gt;

&lt;p&gt;Yes, in Client Components. Use the per-request store pattern in Next.js App Router to avoid sharing state across requests on the server.&lt;/p&gt;

&lt;h3&gt;
  
  
  Which is faster, Zustand or Redux?
&lt;/h3&gt;

&lt;p&gt;Both are fast enough that real-world performance differences come from selector design, not the library. Use granular selectors and shallow equality, regardless of choice.&lt;/p&gt;
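
&lt;p&gt;Here is a small sketch of why shallow equality matters for selectors that return objects. The &lt;code&gt;shallowEqual&lt;/code&gt; below is a hand-written stand-in for illustration; Zustand and react-redux each ship their own shallow comparison helpers, and &lt;code&gt;selectHeader&lt;/code&gt; is a hypothetical selector.&lt;/p&gt;

```typescript
type Plain = { [key: string]: unknown }

// Compares own keys one level deep using Object.is.
function shallowEqual(a: Plain, b: Plain): boolean {
  const aKeys = Object.keys(a)
  if (aKeys.length !== Object.keys(b).length) return false
  return aKeys.every((key) =>
    Object.prototype.hasOwnProperty.call(b, key) ? Object.is(a[key], b[key]) : false
  )
}

const state = { count: 1, user: { name: 'Ada' }, theme: 'dark' }

// A selector that builds a new object on every call.
const selectHeader = (s: typeof state) => ({ count: s.count, theme: s.theme })

const previous = selectHeader(state)
const next = selectHeader(state)

console.log(previous === next) // false: a reference check would trigger a re-render
console.log(shallowEqual(previous, next)) // true: nothing the component reads has changed
```

&lt;p&gt;Selecting primitives one at a time avoids the problem entirely; shallow comparison is the fallback when a component really needs several fields at once.&lt;/p&gt;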

&lt;h3&gt;
  
  
  Can I use both in the same app?
&lt;/h3&gt;

&lt;p&gt;Technically yes, practically no. Pick one client-state library and pair it with TanStack Query or RTK Query for server state.&lt;/p&gt;





</description>
      <category>fullstack</category>
      <category>ai</category>
      <category>webdev</category>
      <category>javascript</category>
    </item>
  </channel>
</rss>
