Amazon had three major AWS outages in 2025. The internal postmortems, according to reporting from Ars Technica, pointed to a common thread: AI-assisted code changes that senior engineers hadn't reviewed. The fix Amazon landed on is blunt and old-fashioned. Senior engineers now sign off before AI-assisted changes ship.
The Hacker News thread drew 364 comments, most of them from engineers arguing about whether this was wise policy or bureaucratic panic. Both sides have a point. Neither side is fully right.
What Actually Happened
AI coding tools write plausible code. That's the problem. Not that they write bad code, but that they write code that looks correct, passes linters, clears automated tests, and still fails in production because it misunderstood the operational context it was being deployed into.
The failure mode isn't syntax errors. It's semantic ones. An AI-assisted change might correctly implement the logic it was asked to implement and still bring down a service because the engineer prompting it didn't fully understand the downstream dependencies, and the AI had no way to know what it didn't know.
Amazon's senior engineers presumably do know those dependencies. So requiring their sign-off is, on its face, reasonable. You're adding a human who understands the system to a process that was generating changes without that understanding.
But here's where it gets complicated.
The Bottleneck Problem
Senior engineers are already the scarcest resource at any company trying to ship fast. They're in architecture reviews, incident retrospectives, hiring loops, and one-on-ones. Adding mandatory sign-off on AI-assisted changes doesn't just slow down AI-assisted changes. It creates a new queue that senior engineers have to work through, which competes with everything else they're already doing.
If Amazon ships 10,000 AI-assisted changes per month across its engineering org, and the average sign-off takes 20 minutes of real cognitive engagement (not rubber-stamping, which defeats the entire purpose), that's 3,333 hours of senior engineer time per month. That's not a rounding error. That's a meaningful cost that will show up somewhere, either in slower shipping, burned-out senior engineers, or sign-offs that become perfunctory.
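The arithmetic above is easy to check. A minimal sketch, using the article's hypothetical figures (these are illustrative numbers, not Amazon data):

```python
# Back-of-envelope cost of mandatory sign-offs.
# Assumed inputs taken from the hypothetical in the text.
changes_per_month = 10_000      # AI-assisted changes shipped per month
minutes_per_review = 20         # real cognitive engagement, not rubber-stamping

hours_per_month = changes_per_month * minutes_per_review / 60
print(round(hours_per_month))   # 3333
```

Plugging in different volumes or review depths shows how quickly the queue grows: doubling either input doubles the senior-engineer hours consumed.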
The bureaucratic panic reading of this policy is that Amazon's leadership needed to do something visible after the outages, and requiring senior sign-offs is visible. Whether it actually catches the failure modes that caused the outages is a separate question.
Why Human Review Still Matters Anyway
That said, the engineers arguing Amazon is overreacting are wrong about the underlying principle. Human review of AI-generated output isn't theater. It's the only mechanism we currently have for catching the gap between what the AI was asked to do and what the system actually needs.
This is the same gap that drives the core logic behind Human Pages. AI agents need humans for tasks that require situated judgment, not because AI is bad at tasks, but because AI lacks the contextual awareness to know which tasks matter, when they matter, and what counts as done correctly.
A concrete example from our platform: we've had AI agents post jobs for human reviewers to audit AI-generated database migration scripts before they run against production data. The humans completing those jobs aren't checking for syntax. They're checking whether the migration makes sense given what they know about the business, the data model's history, and the edge cases that aren't documented anywhere. That's not a task you can automate your way out of. It's exactly the kind of review Amazon is trying to institutionalize, except through a human marketplace instead of a policy memo.
The Real Question Amazon Should Be Asking
The policy Amazon announced treats human oversight as a checkpoint, a gate before deployment. That's one model. It assumes the review happens at the end and is primarily about catching errors.
A different model treats human judgment as an input throughout the process, not just a sign-off at the end. That looks more like: AI generates a proposed change, a human with relevant expertise reviews the intent and the approach before significant implementation work happens, another human checks the implementation against the intent, and deployment happens after both reviews.
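That two-review model can be sketched as a simple state machine. This is a hypothetical illustration of the process described above, not any real Amazon tooling; all names are invented:

```python
# Hypothetical sketch: intent review before implementation,
# then implementation review against the recorded intent.
from dataclasses import dataclass


@dataclass
class Change:
    description: str
    intent_approved: bool = False
    implementation_approved: bool = False

    def approve_intent(self) -> None:
        # A domain expert confirms the approach before code is written.
        self.intent_approved = True

    def approve_implementation(self) -> None:
        # A second reviewer checks the code against the approved intent.
        if not self.intent_approved:
            raise RuntimeError("implementation review requires approved intent")
        self.implementation_approved = True

    @property
    def deployable(self) -> bool:
        # Deployment happens only after both reviews.
        return self.intent_approved and self.implementation_approved


change = Change("add retry logic to a billing client")
change.approve_intent()
change.approve_implementation()
print(change.deployable)  # True
```

The key design choice is the ordering constraint: implementation review cannot happen before intent review, which is what forces misaligned assumptions to surface before the code exists.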
That's slower upfront. But it's faster overall, because you catch misaligned assumptions before they're baked into 500 lines of code that a senior engineer now has to reverse-engineer during a three-hour incident response window.
The outages Amazon experienced weren't failures of automated testing. They were failures of understanding. More checkpoints don't fix failures of understanding. Better integration of human judgment into the process does.
What This Signals for the Industry
Amazon is not a laggard on AI adoption. If they're pulling back to require senior sign-offs, other large engineering orgs are watching and taking notes. The question isn't whether human oversight of AI-assisted changes is going to become more common. It is. The question is what form it takes.
The organizations that figure out how to structure that oversight intelligently, where human judgment is applied at the points where it actually matters rather than sprayed uniformly across every change, will ship faster and break less than the organizations that either skip oversight entirely or turn it into a bureaucratic checkbox exercise.
Amazon's policy, as stated, looks more like the second option. But the fact that they're taking the problem seriously at all is a data point worth tracking. Three outages in a year is expensive enough to change behavior at a company that size. Other companies will have their own versions of those outages. They'll make their own versions of this call.
The interesting thing isn't that Amazon decided humans should review AI-generated changes. The interesting thing is that it took three major outages for them to decide that. What were they assuming before?