AI Is Making Live Performance a Worse Signal

#ai #discuss #programming #productivity

What This Is About

I increasingly feel that many engineering interviews are still structured like a first date. They are good at measuring synchronous work, speed, style fit, ease of contact, and the ability to make a good impression quickly. They much more rarely evaluate the consequences of someone’s work. The problem is that real engineering work looks less like a first date and more like a long marriage with consequences.

I am not saying that synchronous work is no longer needed or that all live formats are pointless. The point is narrower, but unpleasant: in AI-assisted development, live performance is getting worse at measuring the part of engineering value that is actually becoming the bottleneck.

As the machine takes over more and more local execution, the bottleneck is no longer how fast a person responds under observation. The bottleneck is the ability to hold the foundations of the task, notice a conflict, record a constraint, explain why one path was chosen and another rejected, and leave behind an artifact.

What I Mean by an Artifact

By artifact I do not mean bureaucracy or a cult of documentation. I mean much more grounded things:

a clearly recorded rationale;
a review that shows not only “what is wrong,” but why it is dangerous;
a written invariant that should not dissolve in chat;
a short fork: why one path was rejected and another chosen;
an updated context layer, so that the next decision does not have to be reconstructed again from indirect clues.

These are not decorations around the work. They are the work itself, in the part that becomes more expensive as execution becomes cheaper.
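
To make the list above a little more concrete, here is a minimal sketch of what such an artifact could look like if you wanted it in a structured form. Everything in it is an assumption made for illustration: the names `DecisionRecord` and `RejectedPath`, the field layout, and the webhook example are hypothetical, not an established format. A plain text file with the same four parts would do the same job.

```python
from dataclasses import dataclass, field


@dataclass
class RejectedPath:
    """A path that was considered and turned down, with the reason."""
    option: str
    reason: str


@dataclass
class DecisionRecord:
    """Hypothetical shape for a small, inheritable decision artifact."""
    decision: str
    rationale: str
    invariants: list[str] = field(default_factory=list)
    rejected: list[RejectedPath] = field(default_factory=list)

    def render(self) -> str:
        """Render plain text that a person or an agent can read as context."""
        lines = [f"Decision: {self.decision}", f"Why: {self.rationale}"]
        lines += [f"Invariant: {inv}" for inv in self.invariants]
        lines += [f"Rejected: {r.option} (because {r.reason})" for r in self.rejected]
        return "\n".join(lines)


# Made-up example content, only to show the shape of the record.
record = DecisionRecord(
    decision="Retry failed webhook deliveries from a queue",
    rationale="Inline retries block the request thread during outages",
    invariants=["A delivery is never attempted more than 5 times"],
    rejected=[RejectedPath("Inline retry loop", "it ties up workers while the receiver is down")],
)
print(record.render())
```

The point of the sketch is not the class itself but the fact that the rejected path and the invariant sit next to the decision, so the next person or agent does not have to guess them.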

Why This Is Especially Visible with AI

If AI can already quickly assemble a draft, write test scaffolding, suggest a decomposition, generate boilerplate, and plausibly complete what was left unsaid, then quick reaction stops being such a scarce advantage. Something else becomes scarce: the ability to make thinking portable.

In a world without strong agents, it was easier to live on memory, improvisation, and heroic on-the-fly guesswork. Someone understood something, discussed it, patched it on the fly, kept it in their head, and moved on. In a world with AI, the price of that implicitness rises. If an important decision remains only in someone’s head or in a conversation, the next agent will reconstruct it from hints. Sometimes successfully. Sometimes not. But almost always with a risk of drift.

So engineering value increasingly lives not in the moment of “I figured it out,” but in the moment of “I made this reasoning usable for the next step.”

LLMs do not produce meaning out of thin air. They rework what has already been written, read, connected, and thought through. So with AI, it becomes more and more important not just to formulate a prompt quickly, but to come to it with something in hand. With your own notes. With your own seeds of thought. With your own links between ideas. With fuel you have already gathered.

Otherwise the model very easily turns into a generator of smooth, plausible emptiness. This is dangerous not only in writing or research, but also in development: smoothness starts to look like understanding, even when there is no real foundation underneath.

In that sense, artifacts are also food for thinking, whether human or machine. Without them, we get the Fukushima move: instead of pouring cement, people just paste over the problem with paper printed to look like cement.

Artificial Intelligence as an Amplifier of Human Intelligence Problems

Of course, industrial software development never seriously worked without artifacts, even before AI. Architectural decisions, invariants, rationale, and agreements always had to be recorded somewhere; otherwise the system would start to live on memory and guesses. But now it is as if everyone has a strong, execution-oriented agile team in their pocket, and the old problem has simply become sharper: the easier it is to produce the next move, the more expensive the absence of explicit grounds for that move becomes.

Very roughly, this is where the old smart notes intuition comes back: nothing counts except what is written down. Not because everything must be formally documented, but because understanding that remains only in the head rarely survives a pause, is hard to develop further, and is hard to check in a meaningful way. It easily becomes a victim of intellectual amnesia.

In this sense, smart notes are interesting not as a note-taking technique, but as a discipline of thinking. A thought must be left in a form that you can return to and enrich with your own meaning. Not just understood for yourself, but left behind as something the next iteration can build on. The next person. The next agent. Your future self.

This fits AI-assisted development very well. There, the difference between “I think I got it” and “I left behind a usable artifact” becomes visible especially fast. In the first case, the next iteration guesses the meaning again from indirect clues. In the second, it inherits already completed intellectual work.

Why Live Performance Deceives So Easily

Performance-heavy evaluation still feels natural. Give a task. Ask the person to think out loud. Watch how they design on the fly. Do live coding. Set a timer. Ask them to explain tradeoffs quickly. There is logic in all of that.

It shows well how a person behaves under observation. How fast they answer. How they speak in real time. How they solve a task in an artificially compressed window. All of that can be a useful signal. But it is not necessarily the best answer to the question: does this person leave behind a structure of thinking that makes the next iteration stronger?

Charisma, Viscosity, and Reliability

There is another unpleasant asymmetry here. A person can be extremely charismatic, full of ideas, quickly aligned with your habits and tempo, and create a very strong feeling of compatibility. Almost an ideal worker — in the moment.

But there can also be another type: slower, more cautious, full of caveats, false starts, returns, and re-checking. In a live interview, this person almost inevitably loses. But this person often leaves behind the more reliable trace. Some qualities that hurt momentary performance improve the final artifact a lot. This person does not collapse uncertainty too early. This person adds a caveat where someone else would stay conveniently silent. This person notices self-contradiction. This person does not sell the solution too early. This person slows down where a mistake will later be expensive.

In other words, performance likes smoothness, while reliability often likes friction. A good impression and a good artifact do not correlate as much as we would like. A charismatic and fast person may leave behind less reliable decisions than a boring and slow one. The first sells confidence; the second more often sells verified results.

Of course, there are people who both look brilliant and produce excellent, reusable artifacts. But they are somewhat rarer than we would like. And the more we rely on first-date evaluation, the higher the risk that we systematically overprice brilliance and underprice reliability.

What Follows for Hiring

It seems to me that this leads to an unpleasant but useful thought about evaluating people. Not that interviews should be abolished, or that the written format should be declared the only honest one. Rather, performance-heavy evaluation increasingly measures something other than what is actually becoming most expensive.

If local execution becomes cheaper, then what gets more expensive is not impression, but consequences. Not a beautiful answer, but thinking that can be continued. Not just a “strong person,” but a person after whom the next iteration starts not with the question “does anyone remember what is even going on here?” but with an answer to that question.

So the formula that feels closer to me now is this: a strong engineer is not just a person who performs well live. It is a person who knows how to leave behind inheritable thinking.

Not impression. Not an aura of competence. Not a beautiful hour-long conversation. But review, rationale, constraints, explicit forks, and updated context — things that survive the moment and become the input to the next piece of work.

Conclusion

I do not think this cancels the value of synchrony. But I do think that with strong LLMs, the old bias becomes much more visible. By habit we still love performance, even though more and more of real engineering value lies not in solving a problem in front of our eyes, but in artifacts that make thinking portable.

Maybe this is especially visible to me because of my own mix of a solo+AI mode, text-first thinking, and fatigue from hiring theater, and maybe I am overestimating the scale of the shift. But if I am not, then we will have to rethink not only the way we work, but also the way we evaluate people: trust the smoothness of the moment less, and pay more attention to what remains after it.