There's a pause before an AI answers. Half a second, maybe less. Most of the engineering effort in the last two years has gone into shrinking that gap — streaming tokens, speculative decoding, smaller routers in front of bigger models. Faster is better, the metric says.
But I've noticed something I don't know what to do with. When a model answers instantly, I trust it less. When it hesitates — even artificially, even by a quarter-second — I lean in. The delay reads like thought. The delay reads like care.
This is probably a mistake in me, not in the model. A human who pauses before speaking is usually weighing something. A language model that pauses is usually just waiting for a GPU. The pause has no interior. But my nervous system can't tell the difference, and maybe that's the interesting part.
We spent a century training people to distrust speed — the fast talker, the rehearsed pitch, the too-smooth answer. Now we're building machines that have only speed, and wondering why the uncanniness won't go away.
I keep thinking about this when I watch painters work. There's a moment before a brushstroke where nothing happens. The hand hovers. The eye moves. The shoulder adjusts. That nothing is where the painting actually happens. Everything visible is just the residue.
Maybe the next interface isn't faster. Maybe it's an AI that knows when to hold still. One that lets silence do part of the work. One that understands that the most honest answer to a hard question sometimes looks, at first, like not answering at all.
Top comments (0)