As of May 2026, more than 80% of the code Anthropic ships is written by Claude, not by its human engineers. The company disclosed the figure in an essay called When AI builds itself, with coverage from Tom's Hardware and VentureBeat.
Key facts
- What: Anthropic says more than 80 percent of the code it ships is now written by its own model, Claude, and the more interesting numbers are about judgment.
- When: 2026-06-23
- Primary source: read the source
Two years ago this share sat in the low single digits. The shift accelerated after Anthropic released Claude Code, a tool that lets the model read an entire codebase, make changes, run tests, and fix what breaks without human help. The human role has flipped: engineers used to author the code while the machine assisted; now the machine authors the code and engineers review, approve, reject, and steer. Anthropic reports its typical engineer ships roughly eight times as much code per quarter as a few years ago — not because people type faster, but because they spend their day reviewing the model's output instead of writing from scratch.
Think of it as a newsroom where a tireless junior writer drafts every article and senior editors only sign off. Volume goes way up. But the 80% figure is less impressive than it sounds: a draft that a human must check, fix, and approve is not the same as a writer you can leave unsupervised. Most of those lines still pass through a person. On its own, this number measures effort the machine saves, not work it can be trusted to do without oversight.
The results buried deeper in the essay matter more, because they concern taste rather than volume. Anthropic ran a recurring test where the model chooses the best next step in a research project, then compared its choices against its own scientists. Late last year the model was roughly a coin flip against the humans. By spring 2026, an unreleased internal model was picking the better direction clearly more often than its own researchers. Choosing what to work on next was supposed to be the part that stayed human longest. That is the part that moved.
An even sharper demonstration: Anthropic handed its own agents an unsolved problem in AI safety and let them work it start to finish with no human in the loop. An earlier version closed only a small slice of the gap to human experts. The spring model closed almost all of it. Anthropic frames this not as a stunt but as evidence that the missing ingredient — which it calls judgment — is filling in.
This matters beyond one company's results because it is the clearest first-party signal yet that frontier labs believe a feedback loop is forming: AI helps build better AI, which then helps build better AI again. Anthropic tracks how long a task an AI can handle before a human has to step in. A couple of years ago that was a few minutes of work. By early 2026 it had stretched to a full workday. Independent researchers have measured the same trend on a steady curve, in a widely cited study on how long the tasks AI can finish keep getting longer. If that line keeps bending the way it has, the gap between an assistant and a colleague keeps shrinking.
The honest caveat is a big one. Almost every dramatic figure in the essay comes from an unreleased internal model no outsider can test. A company claiming with its own measurements that its own product is becoming powerful enough to be concerning is exactly the kind of assertion that deserves outside verification before anyone treats it as settled fact. It can be sincere and self-serving at the same time. Anthropic itself adds the line skeptics will want to remember: it says plainly that this is not full self-improvement yet, and that such a future is not inevitable. The volume number is real and checkable. The judgment numbers are the interesting ones, and they are still taking the company's word for it. For the longer arc this fits into, see our earlier story on the model that could rewrite itself but held back, and our primer on what recursive self-improvement actually means.
Originally published on Ground Truth, where every claim is checked against the primary source.
Top comments (0)