Exploring how AI-assisted development might shift code quality standards from human readability to machine optimization, and what that means for the future of software development.
What does good code look like today?
Ask any senior developer and you’ll hear familiar answers: readable, maintainable, well-documented, built on established patterns, with meaningful variable names and a clear separation of concerns. We’ve built entire movements around these principles: Clean Code, SOLID, DRY. Every language has evolved its own human-centric standards: Pythonic code, idiomatic Go, Ruby’s principle of least surprise. We conduct code reviews to enforce these standards.
But here’s a question worth exploring: What would good code look like if AI became reliable enough to write, test, debug, and validate most code on its own?
This is purely theoretical. An educated guess based on watching how AI-generated code is evolving. I’m not an expert predicting the future. Just someone exploring what might happen if current trends continue and AI becomes reliable enough.
The Reliability Threshold
The key idea behind this theory: AI reaches a point where it can write code, debug it, find root causes, and fix issues without human help. Not 60% reliable. Not 80% reliable. But reliable enough that humans shift from writing and reading code themselves to validating what the AI produces.
This would be like how senior engineers review junior engineers’ work today. You check the logic and conclusions instead of doing everything from scratch. If AI shows its work (execution traces, state at each step, evidence for its findings), validation becomes more about reviewing the reasoning than reading and understanding implementation details.
We’re not there yet. But if we get there, the implications are worth thinking through.
A Different Set of Priorities
Imagine a future where AI writes the code, reviews it, debugs it, and troubleshoots issues. Humans validate the AI’s work, check that behavior matches expectations, and verify outputs are correct.
In that world, what defines “good” code could change.
The things that might matter most:
- Performance
- Efficiency
- Behavioral correctness
- Output validity
- Token efficiency
That last point needs explanation. AI systems process code as tokens, and API costs scale with token count. If AI is writing, reviewing, and analyzing code constantly (implementation, security scanning, performance checks, testing, incident response), token costs add up fast. A codebase that’s 30% smaller through shorter variable names and compact syntax could save real money at scale. When AI generates and processes code thousands of times per day but humans review it occasionally, optimizing for the frequent reader and writer makes sense.
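To make that concrete, here is a rough, purely illustrative comparison. The two functions below are hypothetical, and the token counter is a crude regex proxy rather than a real BPE tokenizer, but the relative difference is the point: the compact version does the same work with far fewer tokens for an AI to read and write on every pass.

```python
import re

# Two hypothetical versions of the same function: one written for human
# readers, one compacted for machine consumption. Both are illustrative only.
READABLE_VERSION = '''
def calculate_order_total(order_items, tax_rate, shipping_cost):
    """Return the order total including tax and a flat shipping cost."""
    subtotal = sum(item.price * item.quantity for item in order_items)
    tax_amount = subtotal * tax_rate
    return subtotal + tax_amount + shipping_cost
'''

COMPACT_VERSION = '''
def tot(i, t, s):
    st = sum(x.price * x.quantity for x in i)
    return st + st * t + s
'''

def rough_token_count(source: str) -> int:
    # Crude proxy: count word chunks and individual symbols. Real tokenizers
    # (BPE-based) will give different numbers, but the relative gap is what
    # matters for the cost argument above.
    return len(re.findall(r"\w+|[^\w\s]", source))

readable = rough_token_count(READABLE_VERSION)
compact = rough_token_count(COMPACT_VERSION)
print(f"readable version: ~{readable} tokens")
print(f"compact version:  ~{compact} tokens")
print(f"rough savings per AI read/write pass: {1 - compact / readable:.0%}")
```

Multiply a gap like that across thousands of automated reads, reviews, and rewrites per day, and the savings stop being a rounding error.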
Why We Care About Readability Today
We care about code readability, style guides, and clean code principles because we expect humans to write, read, understand, and modify that code.
But if AI writing and reviewing its own code became standard practice, and that AI was reliable enough to trust, would human-centered readability standards still make sense? Code readability has always been a means to an end. It helps human understanding. If humans are neither the primary author nor the primary reader, the optimization target shifts.
This isn’t new. We already make similar tradeoffs:
- Nobody reads minified JavaScript. We trust source maps.
- Nobody reads compiled binaries. We trust debuggers.
- Nobody reads database internals. We trust EXPLAIN plans.
- Nobody hand-edits LLVM IR. We trust compilers.
Code could become another intermediate representation. The source of truth becomes business requirements, test suites, and AI reasoning chains. Code itself becomes what binaries are today. Something we trust the toolchain to produce correctly.
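If code really does become an intermediate representation, the human-facing artifact shifts to something like the test below. This is a minimal sketch with invented names: the implementation body stands in for whatever machine-optimized code an AI might produce, and the test is the part a human actually reads, because it states the business rule as executable behavior.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Item:
    price: float
    quantity: int

def order_total(items: List[Item], tax_rate: float, shipping: float) -> float:
    # Stand-in for whatever machine-optimized implementation an AI produces;
    # in the scenario above, humans would rarely read this body directly.
    st = sum(i.price * i.quantity for i in items)
    return st + st * tax_rate + shipping

def test_order_total_matches_business_rule() -> None:
    # The human-facing artifact: the business rule stated as executable
    # behavior (subtotal, plus tax on the subtotal, plus flat shipping).
    items = [Item(price=10.0, quantity=2), Item(price=5.0, quantity=1)]
    assert order_total(items, tax_rate=0.10, shipping=3.0) == 30.5

if __name__ == "__main__":
    test_order_total_matches_business_rule()
    print("contract holds")
```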
Beyond Code Review: What Else Changes?
If AI handles both writing and maintaining code, and human readability becomes less important, the implications go beyond just code style.
We’re already seeing early signs in AI systems like Claude Code, OpenCode, OpenAI Codex, GitHub Copilot CLI, Gemini CLI, and others that use the Model Context Protocol (MCP) for tool access. These systems can write code, read logs, trace execution, run tests, and suggest fixes on their own.
In fact, this future might be closer than many realize. Power users of these coding agents are already running armies of autonomous AI systems doing exactly what this article describes: writing code, debugging issues, running tests, and iterating on solutions with minimal human intervention. What seems theoretical to some is already the daily workflow for others.
Picture a production incident at 3am: The code was written by AI. Now AI traces execution paths, connects logs, generates hypotheses, identifies root cause, and implements fixes. Faster than a human team could assemble.
The human role: Validate the evidence. Review the root cause analysis. Confirm the fix doesn’t break business logic. Maintain the AI systems themselves, making sure they stay observable and their reasoning stays transparent.
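What might “showing its work” look like in practice? Here is one hypothetical shape for the bundle a debugging agent could hand a human reviewer; every field and value below is invented for illustration, not the output of any real tool.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Evidence:
    source: str     # e.g. "log", "trace", "test-run"
    reference: str  # where a human (or a second AI) can find it
    claim: str      # the one-line finding this evidence supports

@dataclass
class IncidentReport:
    symptom: str
    hypothesis: str
    evidence: List[Evidence] = field(default_factory=list)
    proposed_fix: str = ""
    blast_radius: str = ""                     # what the fix could affect
    validation_steps: List[str] = field(default_factory=list)

report = IncidentReport(
    symptom="checkout API returning 500s since 02:47 UTC",
    hypothesis="retry loop exhausts the DB connection pool under burst traffic",
    evidence=[
        Evidence("log", "checkout/pool.log (hypothetical)", "pool exhausted at 02:46"),
        Evidence("trace", "trace id abc123 (hypothetical)", "retries issued with no backoff"),
    ],
    proposed_fix="add exponential backoff and cap retries at 3",
    blast_radius="checkout service only; no schema or API changes",
    validation_steps=["replay burst traffic in staging", "confirm pool usage stays under 80%"],
)

# The reviewer's job: check that each claim is supported by its cited
# evidence and that the validation steps would actually catch a bad fix.
for e in report.evidence:
    print(f"[{e.source}] {e.claim} -> {e.reference}")
```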
The Inevitability of AI Dependency
Some might argue this creates dangerous dependency on AI systems. But that framing might miss what’s already happening.
We’re already building AI into our development workflows:
- Codebases structured around specific AI tools
- Documentation written for AI consumption (.cursorrules, CLAUDE.md files)
- Workflows that assume AI availability for testing, refactoring, debugging
In 5-10 years, developing without AI assistance might be like suggesting we go back to punch cards. The dependency isn’t optional. It becomes the baseline. The real questions become: Which AI systems do we depend on? Can we switch between them? Can we verify they’re working correctly?
Regulations Will Evolve Too
It’s easy to assume regulations would prevent this shift. But regulations adapt to technology changes.
Historical precedent:
- Aviation moved from “humans must understand every component” to accepting fly-by-wire systems with software doing the work
- Medical devices went from purely mechanical to software-controlled with verification frameworks
- Financial systems moved from human auditors checking every transaction to automated monitoring with sample checks
Future regulatory frameworks might require:
- AI-generated code to be accompanied by verifiable reasoning chains
- Audit logs of AI decision-making processes
- Critical path code validated by independent AI systems from different providers
- Human verification of AI findings on representative samples with statistical confidence
Not “humans must read all code,” but “humans must verify the verification system works.”
Think about compliance itself. Why would regulatory oversight stay purely human when everything else is AI-assisted? AI could constantly monitor codebases for security holes, privacy violations, and accessibility issues. It could generate compliance reports with evidence and flag potential violations before deployment. Human regulators could review AI-generated summaries and dig deeper only into the flagged issues.
This might actually increase regulatory coverage. Human regulators can’t review every line of code at every company. AI could, with human oversight on methods and findings.
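Here is a minimal sketch of what “human verification of AI findings on representative samples with statistical confidence” could look like in practice. The Wilson score interval is a standard formula; the sample size, error count, and 5% threshold below are invented for illustration.

```python
import math

def wilson_upper_bound(errors: int, sample_size: int, z: float = 1.96) -> float:
    """Upper end of the Wilson score interval for an observed error rate.

    Standard formula; z = 1.96 corresponds to roughly 95% confidence.
    """
    if sample_size == 0:
        return 1.0
    p_hat = errors / sample_size
    denom = 1 + z * z / sample_size
    center = (p_hat + z * z / (2 * sample_size)) / denom
    half = z * math.sqrt(
        p_hat * (1 - p_hat) / sample_size + z * z / (4 * sample_size ** 2)
    ) / denom
    return min(1.0, center + half)

# Hypothetical numbers: humans deeply reviewed 200 randomly sampled
# AI-approved changes and found 3 real problems. The 5% threshold is
# likewise invented, standing in for whatever a regulator might require.
bound = wilson_upper_bound(errors=3, sample_size=200)
print(f"95% upper bound on the true error rate: {bound:.2%}")
print("within threshold" if bound < 0.05 else "sample more or fix the pipeline")
```

The verification system itself stays auditable: a human can check the sampling method and the math without reading every line the AI produced.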
The Economic Pressure
Here’s where theory meets observable reality: if AI can handle writing code, reviewing it, debugging it, and checking compliance, the economic calculation changes fundamentally.
Today, optimizing code for human readability makes sense because humans write and maintain most code. But if AI writes and analyzes code continuously while humans review it occasionally, the economics flip. When AI generates and processes code thousands of times more frequently than humans read it, optimizing for machine efficiency over human comprehension starts to make business sense.
At big enough scale, more compact code means real cost savings. Not just in direct API costs, but in processing speed. Shorter code means faster analysis, which means faster deployment, which means competitive advantage.
The Risks (And How to Manage Them)
The most concerning scenario isn’t the end state. It’s the transition period. Imagine AI that’s 95% reliable. Good enough that teams start depending on it. Not reliable enough to fully trust. You’d end up with code humans can’t easily understand, AI that’s wrong 1 in 20 times, and no one able to catch the errors.
If this transition happens, the real risks are more subtle than simple AI dependency:
Oligopoly control: If 2-3 providers dominate, they control the means of production for all software. Pricing power, feature control, and policy decisions become centralized in ways even cloud providers don’t achieve. This makes multi-provider strategies essential. Organizations will need to maintain relationships with multiple AI providers, similar to multi-cloud strategies today. The ability to switch or run parallel systems becomes a competitive requirement, not a nice-to-have.
Systemic correlation: If everyone uses the same AI models, we get correlated failures. A bug in the model produces bugs across millions of codebases at once. A security flaw in AI reasoning becomes a universal vulnerability. Critical systems will need multi-model verification, requiring validation from AI systems built on different architectures, trained on different data, from different providers. Like how aircraft use redundant systems from different manufacturers. When safety matters, you don’t rely on a single point of failure.
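A minimal sketch of that multi-model idea, assuming hypothetical reviewer stubs rather than real AI providers: a change only ships when enough independent reviewers agree, so a single model’s blind spot is less likely to slip through.

```python
from typing import Callable, Dict

# A reviewer takes a proposed change and returns approve (True) or reject
# (False). In practice each reviewer would be an independent AI system from
# a different provider; these are deliberately simple stubs for illustration.
Reviewer = Callable[[str], bool]

def approves_if_tests_present(change: str) -> bool:
    return "test" in change.lower()

def approves_if_small(change: str) -> bool:
    return len(change.splitlines()) < 200

def always_rejects(change: str) -> bool:
    return False  # a dissenting model makes disagreement visible

def consensus_verify(change: str, reviewers: Dict[str, Reviewer],
                     required_approvals: int) -> bool:
    """Ship only when enough independent reviewers agree."""
    verdicts = {name: review(change) for name, review in reviewers.items()}
    for name, ok in verdicts.items():
        print(f"{name}: {'approve' if ok else 'reject'}")
    return sum(verdicts.values()) >= required_approvals

change = "def add(a, b):\n    return a + b\n\ndef test_add():\n    assert add(1, 2) == 3\n"
reviewers: Dict[str, Reviewer] = {
    "provider-a": approves_if_tests_present,
    "provider-b": approves_if_small,
    "provider-c": always_rejects,
}
print("ship" if consensus_verify(change, reviewers, required_approvals=2) else "hold")
```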
Knowledge concentration: As AI handles more routine work, the broad human ability to deeply understand code could shrink. When AI fails or produces novel bugs, organizations need experts who can debug from first principles. The danger is underinvesting in this specialized expertise because “AI can handle it.” Like GPS navigation. It works great until it doesn’t, and then you’re lost if you never learned to read maps.
But here’s the thing: SMEs (Subject Matter Experts) with deep programming knowledge don’t become less valuable. They become more valuable. Their role shifts from writing all code to validating AI reasoning, handling edge cases, maintaining AI systems, and serving as escalation when AI fails. Organizations that maintain this expertise gain competitive advantage. The skill doesn’t disappear. It concentrates and becomes more strategic.
Verification capability gap: If humans lose the ability to meaningfully verify AI reasoning, trust becomes impossible to check and errors compound undetected. This is where the industry needs to pivot value, effort, and expertise toward verification rather than implementation. The critical skill becomes “can you effectively validate AI reasoning and evidence?” rather than “can you write this code from scratch?”
This is a different skillset but not a lesser one. It requires deep understanding of correctness, edge cases, and system behavior. The developers who master verification in an AI-native world will be as valuable as the architects and senior engineers of today. Organizations that invest in building these verification capabilities during the transition will be better positioned than those that assume “AI can handle it.”
Where This Might Lead
If this theory plays out, we might see a gradual evolution in how we think about code quality:
Short term (now): AI assists most developers with writing and debugging routine code; power users are already running autonomous agent workflows. Humans maintain deep understanding and do final reviews.
Medium term (5-10 years): AI writes and debugs most code, humans verify reasoning and outcomes through AI-provided evidence, code readability starts to matter less.
Long term (15+ years): AI-written machine-optimized code becomes standard for many areas, humans verify through abstraction layers (requirements, tests, reasoning chains), regulations adapt to AI-native development.
This could create a split:
Human-readable code for systems where human understanding stays critical: regulated industries with strict oversight, safety-critical systems where failures have serious consequences, long-lived systems needing institutional knowledge.
Machine-optimized code for systems where behavior and performance matter more than readability: internal tools, high-iteration low-stakes systems, performance-critical systems where optimization benefits outweigh readability costs.
The qualities we optimize for wouldn’t disappear. They’d redistribute based on who the primary reader of the code is.
When Might This Become Real?
The shift becomes possible when trusting an AI’s explanation plus its cited evidence becomes more reliable than trusting your own ability to read and understand the code.
For routine work, we might be approaching that threshold. For high-stakes systems, we’re probably far from it. The path forward likely requires:
- AI systems that can reliably explain and justify their reasoning with verifiable evidence
- Standards for switching between AI providers
- Verification frameworks proving AI reliability at statistical confidence levels
- Regulatory changes creating accountability without requiring human understanding of every line
- Keeping human baseline knowledge during the transition to catch AI failures
The Bottom Line
Maybe I’m wrong. Maybe human understanding will always stay central. Maybe the verification overhead will exceed the efficiency gains. Maybe regulations will require human-readable code regardless of what the technology can do. Maybe AI reliability will plateau before reaching the threshold this theory requires.
But even if this specific scenario doesn’t happen, the basic question stays worth exploring: When we say “good code,” who are we optimizing for? And if that audience changes, shouldn’t our definition of quality change with it?
The key requirement is reliability. Without AI systems that can consistently and verifiably write, debug, and troubleshoot code on their own, this stays theoretical. But if that threshold is reached, the economic pressure and practical advantages might drive this shift faster than we expect.
We’re not there yet. But the direction suggests it’s worth thinking through the implications now, while we still have the knowledge and ability to build the right accountability systems for whatever comes next.