DEV Community

Harry Floyd

Originally published at harryfloyd.substack.com

What Proves You Can Think?

AI did not just make output cheap. It broke the old contract between effort, competence, and trust.

For developers this is not abstract. When anyone can generate a clean PR, a plausible code review, a working API endpoint, or a competent-looking architecture diagram in seconds, the artefact stops proving what it used to prove. A good solution no longer implies someone wrestled with the problem.

The question underneath the productivity debate is harder: if the work no longer proves I can think, what does?

The old proof contract

Every institution runs on proof contracts. A school asks for essays and exams. A company asks for CVs, interviews, and work samples. A market asks for traction and retention.

None of these signals were ever pure. The CV was always a marketing document. The interview was always distorted by nerves and charm. The portfolio could hide how much help the person had. But they worked well enough because polished surfaces were costly to produce. Cost created friction. Friction created signal.

AI attacks the "costly to produce" part. It compresses the cost of appearing competent. That is enough to break the systems that relied on that cost as a proxy for competence.

The move

When output gets cheap, output quality becomes the opening bid, not the final proof. The important question moves upward:

What does this output prove about the person, team, or system behind it?

Sometimes the answer is: not much. A clean PR may prove someone had access to a good model and enough taste not to paste the first result. A strong CV may prove the candidate knows how hiring filters work.

The useful response is not to ban AI. It is to stop treating AI-polished output as the proof object. The next proof system asks what happened before, during, and after the artefact.

Five questions that separate judgement from output

These work on code, PRs, architecture decisions, interview responses, and your own work.

1. What problem was chosen, and what easier problem was rejected?

This is the first proof of thought. Bad work often starts with accepting the first fluent frame. Good work usually contains a buried refusal: someone saw the tempting version and did not take it. In a codebase, this looks like choosing the harder but more maintainable abstraction instead of the one-liner that will break in six months.

2. What tradeoff was made under constraint?

Intelligence becomes visible at the boundary. Anyone can claim they value quality, speed, safety, and maintainability. Real judgement appears when not all of them can be maximised at once. The developer who can explain why they chose correctness over latency for this specific endpoint, and what evidence would make them reverse that choice, is showing something the output alone cannot.

3. What did you check that the output itself could not prove?

This is the verification question. It separates people who use AI as a generator from people who use AI inside a judgement loop. The generated code can compile. That is not verification. Verification is the external thing that makes the claim answerable: the edge case test, the production data check, the integration test that proves it works with the real system.
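
To make the gap between "it runs" and "it was verified" concrete, here is a minimal sketch in Python. Everything in it is hypothetical and invented for illustration: the helper names `average_latency_naive` and `average_latency`, the sample values, and the choice of an empty input as the edge case. The naive version stands in for a plausible generated helper that looks correct on the happy path.

```python
from typing import Optional, Sequence


def average_latency_naive(samples: Sequence[float]) -> float:
    """Hypothetical generated helper: runs fine on the happy path."""
    return sum(samples) / len(samples)


def average_latency(samples: Sequence[float]) -> Optional[float]:
    """Revised helper: an empty window yields None instead of crashing."""
    if not samples:
        return None
    return sum(samples) / len(samples)


def naive_survives_empty_window() -> bool:
    """Edge-case check: does the naive helper cope with zero samples?"""
    try:
        average_latency_naive([])
    except ZeroDivisionError:
        return False
    return True


if __name__ == "__main__":
    # Happy path: both versions agree, so the output alone looks correct.
    assert average_latency_naive([10.0, 20.0]) == average_latency([10.0, 20.0]) == 15.0
    # Edge case: only the external check exposes the difference.
    assert naive_survives_empty_window() is False
    assert average_latency([]) is None
    print("edge-case checks passed")
```

Both functions return the same answer on ordinary input. The difference only shows up because someone asked what happens when there are no samples, which is a question the generated output could not answer about itself.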

4. What changed after feedback or contact with reality?

Revision is underrated because it is less glamorous than creation. But in an AI world, revision becomes a higher-status signal. The first surface is cheap. The changed surface after a code review, a production incident, or a colleague pointing out the flaw is where more truth appears.

5. Who owns the consequence if this is wrong?

Accountability is the signal machines cannot carry. A model can produce. A person must decide what they are willing to stand behind. The developer who says "I shipped this, I own the pager duty for it, I will be awake if it breaks" is operating in a different category from the one who lets the AI output speak for itself.

What this changes in practice

Hiring: Stop asking only for work samples. Give candidates a plausible AI-generated solution and ask what is wrong, what would break in production, and what they would check before deploying it.

Code review: Stop treating clean diffs as sufficient evidence. Ask what was not generated. Ask which tradeoff the author is defending. Ask what verification would prove the solution wrong.

Your own work: Stop trying to prove value only through polished output. Keep the polish, but attach judgement to it. Show the problem you chose. Show the tradeoff. Show the verification. Show the revision. Show what you will own.
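
One way to attach judgement to a piece of work is to write it down in a fixed shape next to the artefact. The sketch below is an assumption about how that might look, not an established tool: the `JudgementNote` class, its field names, and the example answers are all hypothetical, with one field for each of the five things to show.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class JudgementNote:
    """Hypothetical note attached to a PR: one field per thing to show."""
    problem_chosen: str = ""
    tradeoff: str = ""
    verification: str = ""
    revision: str = ""
    owner: str = ""


def missing_answers(note: JudgementNote) -> List[str]:
    """Return the names of fields left blank (whitespace counts as blank)."""
    return [f.name for f in fields(note) if not getattr(note, f.name).strip()]


if __name__ == "__main__":
    # Illustrative answers only; none of these come from a real project.
    note = JudgementNote(
        problem_chosen="Fix retry storms in the client, not just raise the timeout",
        tradeoff="Correctness over latency on this endpoint",
        verification="Integration test against a staging replica",
    )
    print(missing_answers(note))  # prints ['revision', 'owner']
```

The check is deliberately crude. It cannot tell a good answer from a bad one. It only makes it visible when the polish shipped without the judgement attached.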


This piece was originally published on The Durability Curve, a newsletter about what lasts when the surface gets cheap. Read the full article for the deeper argument, including the research on algorithmic anxiety that frames why this matters beyond engineering.
