For the past several weeks, engineers using Claude Code have been filing complaints. Responses felt off. Reasoning felt shallower. Coding quality dropped noticeably from what they'd come to expect. Many assumed Anthropic had intentionally degraded the model — what the developer community calls "nerfing."
Anthropic denied intentionally degrading anything, and then backed that denial with a full postmortem.
On April 20, Anthropic confirmed: Claude Code's quality had degraded, but the underlying model had not changed. Three separate product-level changes, each introduced independently, stacked to cause the regression. All three are fixed as of the April 20 release (v2.1.116).
Here is exactly what broke, why it matters for production workflows, and what changed.
The three things that actually broke Claude Code
Anthropic traced the complaints to three separate changes that affected Claude Code, the Claude Agent SDK, and Claude Cowork. The API was not impacted.
Issue 1 — Reasoning effort silently dropped
The default reasoning effort level was reduced at the product level. Engineers were getting shallower responses not because the model was less capable, but because it was being instructed to think less. The change was subtle enough to slip past multiple human and automated code reviews, unit tests, end-to-end tests, automated verification, and internal dogfooding before anyone caught it.
Fix: Restored to higher default reasoning effort across Claude Code, Claude Agent SDK, and Claude Cowork.
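Because the effective default lives at the product layer, API callers who depend on deep reasoning can pin the thinking budget explicitly instead of inheriting whatever the default happens to be. A minimal sketch of a Messages API payload; the model ID and budget numbers are illustrative assumptions, not recommendations:

```python
# Sketch: pin the extended-thinking budget explicitly rather than relying
# on a product-layer default. Model ID and numbers are illustrative.
def build_request(prompt: str, thinking_budget: int = 8192) -> dict:
    """Build a Messages API payload with an explicit thinking budget."""
    if thinking_budget < 1024:
        # The API enforces a minimum thinking budget of 1024 tokens.
        raise ValueError("thinking budget must be at least 1024 tokens")
    return {
        "model": "claude-opus-4-7",             # illustrative model ID
        "max_tokens": thinking_budget + 4096,   # must exceed the thinking budget
        "thinking": {"type": "enabled", "budget_tokens": thinking_budget},
        "messages": [{"role": "user", "content": prompt}],
    }
```

Pinning the budget means a changed product-layer default cannot silently shallow the reasoning on your requests.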
Issue 2 — A caching bug silently dropped thinking history
A bug in context management caused thinking history to be dropped during stale sessions. This was at the intersection of Claude Code's context management, the Anthropic API, and extended thinking. The regression only appeared in a specific corner case — stale sessions — which made it extremely difficult to reproduce and identify. It took over a week of investigation to confirm the root cause.
The notable detail: Anthropic back-tested the offending pull requests using Opus 4.7. Opus 4.7 found the bug. Opus 4.6 did not. This is why Opus 4.7's improved code reasoning matters in practice — not just on benchmarks.
Fix: Caching bug patched in v2.1.101 (April 10). Thinking history now correctly persists across sessions.
Issue 3 — A verbosity prompt change hurt coding quality
Claude Opus 4.7 tends to be more verbose than its predecessor — a known behavioral difference noted at launch. To reduce verbosity, a prompt change was made. That change went too far and reduced coding quality alongside verbosity. The tradeoff was not caught before deployment.
Fix: Verbosity prompt change reverted. Usage limits also reset for subscribers affected during the degraded period.
Why this matters beyond the immediate fix
The three-bug postmortem is worth understanding for reasons that go beyond "Claude Code works again."
Product-layer changes can silently degrade model quality. The model never changed. What changed were instructions, caching behaviour, and prompting — all at the product layer sitting above the model. Engineers building production systems on Claude Code or the Claude API need to understand that model quality can degrade from sources they don't control and can't directly observe. This is not unique to Anthropic — it is a systemic property of building on top of hosted AI services.
Extended thinking sessions are sensitive to context management. The caching bug only appeared in stale sessions with extended thinking enabled. Engineers using long-horizon agentic workflows — exactly the workflows that Claude Code and AgentCore are designed for — are most exposed to context management bugs. The fix is in, but the lesson is: if your long-running agentic workflow suddenly produces degraded output, context management is now a confirmed failure mode worth investigating.
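One cheap guardrail, sketched here as a generic monitor (the shrinkage threshold is an arbitrary assumption, and how you measure retained context depends on your own instrumentation), is to track how much accumulated context or thinking history each turn retains and flag sudden drops:

```python
# Illustrative monitor: flag a possible context-management regression when
# the retained context/thinking size shrinks sharply between turns.
def detect_context_drop(sizes: list[int], tolerance: float = 0.5) -> bool:
    """Return True if any turn retains less than `tolerance` of the prior size.

    `sizes` is the measured context (e.g. token count) after each turn.
    """
    return any(later < earlier * tolerance
               for earlier, later in zip(sizes, sizes[1:]))
```

A drop like [1000, 1200, 300] would trip the check; steady growth would not. It will not tell you which layer dropped the history, but it turns "the output feels worse" into a measurable signal.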
The verbosity-quality tradeoff is real and non-trivial. Opus 4.7 is more verbose. The attempts to reduce that verbosity damaged coding quality. This means engineers running Opus 4.7 in production who are trying to manage output length through prompt changes need to be careful — the model's verbosity is partially load-bearing. Reducing it through aggressive prompt constraints may reduce quality alongside token count.
Opus 4.7 found the bug that Opus 4.6 missed. This is the understated line in the postmortem. When Anthropic used Opus 4.7 to code review the PR that introduced the caching bug, Opus 4.7 caught it. Opus 4.6 didn't. For engineers evaluating whether to migrate to Opus 4.7, this is concrete evidence of improved code reasoning beyond benchmark scores.
What changed in Claude Code v2.1.116
The April 20 release that contains all three fixes also ships additional stability improvements. From the release notes:
- Fixed connecting to a remote session overwriting local model settings in `~/.claude/settings.json`
- Fixed typeahead showing a "No commands match" error when pasting file paths starting with `/`
- Fixed plugin reinstall resolving dependencies at the wrong version
- Fixed unhandled errors from the file watcher on invalid paths or file descriptor exhaustion
- Fixed Remote Control sessions getting archived on transient CCR initialization during JWT refresh
- Fixed subagents resumed via SendMessage not restoring the explicit `cwd` they were spawned with
The /loop workflow improvements and Remote Control session stability fixes are particularly relevant for engineers running Claude Code in long-horizon agentic workflows.
Also this week: Anthropic Managed Agents launched
Separately from the Claude Code fix, Anthropic launched Managed Agents — a hosted Claude Platform service specifically designed for long-horizon agent work.
The key design principle behind Managed Agents: harnesses encode assumptions about what Claude cannot do on its own. Those assumptions go stale as models improve. A concrete example from Anthropic's own engineering work: Claude Sonnet 4.5 would terminate tasks prematurely as it detected its context limit approaching — a behaviour Anthropic calls "context anxiety." The harness added context resets to compensate. With a better model, that compensation may no longer be needed — or may actively limit performance.
Managed Agents provides stable interfaces for sessions, harnesses, and sandboxes specifically so that as model capabilities improve, the harness can be updated without rebuilding the entire agent infrastructure.
What Managed Agents provides:
- Durable state across long-horizon tasks — the agent does not lose context mid-workflow
- Safer tool access — tool permissions managed at the infrastructure level, outside the agent's reasoning loop
- Faster startup for reliable long-running tasks
- Memory in public beta — persistent memory across sessions using the `managed-agents-2026-04-01` header
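The "outside the reasoning loop" idea can be sketched generically: the agent proposes tool calls, but an allowlist the model never controls decides what actually executes. This is an illustrative pattern, not Managed Agents' actual implementation, and the tool names are hypothetical:

```python
# Illustrative pattern only: enforce tool permissions outside the agent's
# reasoning loop, so a confused or jailbroken agent cannot expand its access.
ALLOWED_TOOLS = {"read_file", "run_tests"}  # hypothetical tool names

def execute_tool_call(name: str, handlers: dict, *args):
    """Gate every proposed tool call against an allowlist the model cannot edit."""
    if name not in ALLOWED_TOOLS:
        raise PermissionError(f"tool {name!r} is not permitted for this agent")
    return handlers[name](*args)

# Handlers may exist in the process, but unlisted ones are never reachable.
handlers = {
    "read_file": lambda path: f"<contents of {path}>",
    "run_tests": lambda: "ok",
    "delete_repo": lambda: "gone",  # present, but blocked by the allowlist
}
```

The key property is that the permission check lives in infrastructure code, not in the prompt, so no amount of model-level persuasion changes what can run.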
This is the infrastructure layer for production agentic systems — not a demo environment. The stable session interfaces and tool safety boundaries are exactly what the YOLO attack postmortem called for: controls applied outside the model's reasoning loop.
What this means for Claude Code workflows right now
If you are running Claude Code in CI/CD pipelines: Update to v2.1.116. The stale session caching bug could affect any pipeline step that reuses a session across extended runs. The -p/--print non-interactive mode is not affected (the API layer was not impacted), but session-based workflows should be validated post-update.
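A hypothetical guard for such a pipeline, sketched in Python (the version parsing and CLI invocation in the comment are assumptions about your setup, not an official recipe), refuses to run session-based steps on a build older than the fix release:

```python
# Sketch of a CI guard (illustrative): compare the installed Claude Code
# version against the release that contains the three fixes.
def version_at_least(installed: str, required: str = "2.1.116") -> bool:
    """Numeric semver-style comparison, e.g. '2.1.120' >= '2.1.116'."""
    parse = lambda v: tuple(int(part) for part in v.split("."))
    return parse(installed) >= parse(required)

# In an actual pipeline step (not executed here; requires the claude CLI):
#   import re, subprocess, sys
#   out = subprocess.run(["claude", "--version"],
#                        capture_output=True, text=True).stdout
#   ver = re.search(r"\d+\.\d+\.\d+", out).group(0)
#   if not version_at_least(ver):
#       sys.exit("claude is older than 2.1.116; update before session-based steps")
```

Numeric comparison matters here: naive string comparison would rank "2.1.101" above "2.1.116" is false but would mis-rank versions like "2.1.9" vs "2.1.116".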
If you are using extended thinking with Claude Code: Verify that thinking history is persisting correctly after the update. The caching bug was specifically in the intersection of extended thinking and session management.
If you are running Opus 4.7: Do not add aggressive verbosity constraints to your prompts. The postmortem confirms that reducing verbosity through prompt changes damages coding quality. If output length is a concern, use max_tokens to cap output length rather than prompting the model to be less verbose.
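A sketch of that approach (the model ID and cap value are illustrative): cap the response with max_tokens and check stop_reason to detect truncation, rather than instructing the model to be terse in the prompt:

```python
# Sketch: control output length with max_tokens instead of prompt-level
# verbosity constraints, which the postmortem links to quality loss.
def build_capped_request(prompt: str, output_cap: int = 2048) -> dict:
    return {
        "model": "claude-opus-4-7",  # illustrative model ID
        "max_tokens": output_cap,    # hard cap on output tokens
        # Deliberately no "be concise" system instruction.
        "messages": [{"role": "user", "content": prompt}],
    }

def was_truncated(response: dict) -> bool:
    # stop_reason == "max_tokens" means the cap cut the response short,
    # so the caller can retry with a higher cap if completeness matters.
    return response.get("stop_reason") == "max_tokens"
```

The cap bounds cost without touching the prompt; checking stop_reason tells you when the bound was actually hit.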
If you are building multi-agent systems: Look at Managed Agents for long-horizon workflows. The stable session and harness interfaces are a meaningful improvement over managing session lifecycle yourself.
The CLAUDE.md angle
The Claude Code quality regression is a direct argument for understanding CLAUDE.md configuration at depth. The three issues that caused the regression — reasoning effort, context management, and verbosity — are all areas where CLAUDE.md configuration directly affects agent behaviour.
Engineers who understand how CLAUDE.md hierarchy composes (global → project → directory), how to configure reasoning effort for specific tasks, and how to structure prompts that don't accidentally trade quality for length are more resilient to this class of regression. They notice degradation faster, diagnose it more accurately, and adapt their configuration rather than waiting for a patch.
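As an illustration of that hierarchy (the file locations follow Claude Code's conventions; the directives themselves are invented examples, not recommended settings), the three levels compose by layering more specific files onto more general ones:

```markdown
<!-- ~/.claude/CLAUDE.md (global): applies to every project -->
# Global
- Prefer small, reviewable diffs.

<!-- ./CLAUDE.md (project root): layered on top of global -->
# Project
- Run the test suite before declaring a task complete.

<!-- ./services/billing/CLAUDE.md (directory): loaded when working here -->
# Billing service
- Never modify migration files without an explicit instruction.
```

Knowing which layer a directive came from is exactly what lets you distinguish "my configuration caused this" from "the product regressed."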
This is Domain 3 of the CCA-001 Claude Certified Architect certification — Claude Code Configuration and Workflows. The exam specifically tests whether you understand how configuration decisions at the CLAUDE.md level affect agent behaviour in production. The regression Anthropic just documented is a real-world exam question.
The Cloud Edventures CCA-001 track includes the Navigator's Compass path — hands-on missions covering CLAUDE.md configuration, slash commands, plan-execute pipelines, and CI/CD integration with Claude Code in real AWS environments with automated validation.
👉 cloudedventures.com/labs/track/claude-certified-architect-cca-001
Have you noticed the improvement since v2.1.116? What changed in your workflows? Drop it in the comments.