Joshua Hall

Posted on Jun 9 • Originally published at yaplabs.com

Chat Is an Input, Not an Interface

#ai #ux #uxdesign #uidesign

Ask me my address in a chat box and you've just made my life worse. I can type it into a form in three seconds: one that validates the ZIP, knows my state from my city, and tells me immediately if I fat-fingered a digit. In a chat box I'm at the mercy of however the model decides to parse my reply, and some fraction of the time it comes back subtly wrong because I phrased it in a way the prompt didn't anticipate. The form was right there.

That's the thing I keep coming back to when I look at the wave of chat-first products. Conversation is a real input modality. It is rarely the right default. Picking the interface that fits the task is design work, and replacing every form, grid, and canvas with a prompt box because the model is cheap is the opposite of doing that work.

Chat Is One Input, Not the Interface

The category error in most chat-first products is treating chat as the interface rather than an input. The model underneath is a processing layer. The surface the user touches is a separate decision, and collapsing the two ("we shipped an LLM, so the product is a chat window") is what produces the worst of this genre.

Conversation is one input among many, and most software already runs on a rich vocabulary of others: forms, dropdowns, sliders, grids, direct manipulation on a canvas. Each exists because it fit a task better than typing a sentence would. A chat box doesn't retire any of them. It joins the list, useful in the specific places where the others fall short — and a liability everywhere they don't.

The honest test for any chat-first screen is whether prose is the best way to express the task, or merely the easiest way to ship it. Those answers diverge more often than the demos admit.

Where Chat Earns Its Keep

There's a category of input that genuinely benefits from conversation: anything where the structure of the answer isn't known in advance, where the follow-up questions depend on previous answers, or where the user's vocabulary doesn't match the application's and something has to translate between them. If the task is the kind of thing an interview handles better than a questionnaire, chat is worth exploring. The key word is interview: there are limits, and they arrive fast.

Construction takeoff estimating is a good example from a project I work on. The drawings give you measurable structured data (square footage, room count, fixture positions) but say nothing about finish level. A builder-grade kitchen package might run $10K. A premium one with a Viking range and a Sub-Zero fridge can hit $150K. The plans don't say which the owner wants, and no single form field captures the difference cleanly.

Chat fits there. The system asks the contractor what conversations they've been having with the owner, picks up on "they want it nice but not crazy" or "she keeps showing me Italian tile," and translates that into structured data the estimator can apply. The contractor's mental model is conversational; forcing them through a finishing-level dropdown throws information away. Chat also lets ambiguity survive when it needs to — "they're still deciding, but I think it'll land in the $35K–$45K range" is a perfectly valid answer, and form fields either choke on that kind of variance or become elaborate interfaces in their own right.
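To make "letting ambiguity survive" concrete, here is a minimal sketch of the kind of structured record a conversational answer could land in. The `BudgetEstimate` type and `parse_budget` helper are hypothetical names, not part of any real estimating system, and the regex is only a stand-in for the model: it handles the dollar range and nothing else. The point is the shape of the output, which keeps a low and a high value instead of forcing a single number.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class BudgetEstimate:
    """Hypothetical record: a range, plus the words it came from."""
    low: int           # dollars
    high: int          # dollars
    source_quote: str  # the contractor's reply, kept for later review


# Matches amounts such as "$35K", "$1.5k", or "$150,000". Deliberately narrow.
AMOUNT_RE = re.compile(r"\$(\d[\d,]*(?:\.\d+)?)\s*([kK])?")


def parse_budget(reply: str) -> Optional[BudgetEstimate]:
    """Pull every dollar amount out of a reply and keep the spread."""
    amounts = []
    for digits, thousands in AMOUNT_RE.findall(reply):
        value = float(digits.replace(",", ""))
        if thousands:
            value *= 1000
        amounts.append(round(value))
    if not amounts:
        return None  # nothing numeric in the reply; ask a follow-up instead
    return BudgetEstimate(min(amounts), max(amounts), reply)
```

A single figure still fits the same record, with `low` equal to `high`, so downstream code never has to care whether the owner has decided yet.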

Chat earns its place there because the input is genuinely unstructured at the source. Most chat interfaces I see aren't solving that problem. They're solving the much smaller problem of "we shipped a model and now everything has to look like one."

Where Direct Manipulation Wins

Then there's the larger territory the chat-first framing quietly ignores: the work that graphical interfaces do better than any sentence can, because the interface is the precision.

Try to nudge the spacing on an image by describing it. "Move it left a bit — no, less — okay, now down." That's ten seconds of direct manipulation in Figma or Illustrator turned into a frustrating game of telephone. A prose-only path to production design (describe the screen, accept whatever comes back) fails before it starts, because human language can't specify pixel spacing, and the tools that actually work never ask it to. The same holds for CAD, spreadsheets, page layout, business analytics: domains where the value lives in a precise, spatial, direct relationship between the user's hand and the artifact. Prose can't carry that bandwidth.

The forms case is just the narrow, well-behaved end of this same spectrum. Address entry, date selection, numeric ranges, anything with strict validation, anything the user has typed a hundred times, anything whose answers enumerate cleanly in a short dropdown: these are known data of a discrete shape the system needs to collect, sometimes with an order or a dependency between them. A form is a contract. It states exactly what's needed, validates locally, surfaces errors on the spot, and finishes in seconds. Run the same task through chat and you get a slow guessing game with the ambiguity pushed onto the user, who might not notice the model misread the address until the shipping confirmation arrives a day later.
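The "form as contract" idea can be sketched in a few lines. This is an illustrative example under stated assumptions, not a production validator: the `AddressForm` fields and the `validate` function are hypothetical, and the checks cover only US-style two-letter state codes and ZIP or ZIP+4 formats.

```python
import re
from dataclasses import dataclass

# US ZIP (5 digits) or ZIP+4. Format check only; it does not confirm the ZIP exists.
ZIP_RE = re.compile(r"\d{5}(-\d{4})?")


@dataclass
class AddressForm:
    """Hypothetical form: the fields are the contract."""
    street: str
    city: str
    state: str
    zip_code: str


def validate(form: AddressForm) -> dict[str, str]:
    """Return a field -> error message map. An empty map means the form is valid."""
    errors: dict[str, str] = {}
    if not form.street.strip():
        errors["street"] = "Street is required."
    if not form.city.strip():
        errors["city"] = "City is required."
    if not re.fullmatch(r"[A-Z]{2}", form.state.strip().upper()):
        errors["state"] = "Use a two-letter state code."
    if not ZIP_RE.fullmatch(form.zip_code.strip()):
        errors["zip_code"] = "ZIP must be 5 digits, optionally followed by a 4-digit suffix."
    return errors
```

Every error is tied to a named field and is available the moment the user leaves it, which is exactly the feedback a free-text reply can't give.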

Bulk is where it gets stark. Entering one phone number through chat? Maybe. Entering a client list with every contact's details? Almost never. It's faster to upload a CSV or type into a grid than to narrate a hundred rows. The richer the structure, the worse prose performs.

None of this is an argument against the model. The model can sit behind the form, the grid, or the canvas: autocompleting, validating against external sources, suggesting corrections. The interaction surface doesn't have to be a chat box just because the processing layer is an LLM. As Andrej Karpathy puts it in his Software 3.0 framing, the future stack is classical code, machine learning, and LLMs working together; for genuine business logic I'd add a rules engine to that list. The interface is a design choice independent of which of those is doing the work underneath.

The Right Number of Ways

This is partly an old debate in new clothes: how many ways to do one thing should an application offer? Both extremes are wrong.

Perl and COBOL famously hand you fifteen ways to write the same line, and the result is often code nobody else can read; the reader has to reverse-engineer not just what the author did but which dialect they were speaking. The most flexible interface to an operating system is probably a Unix shell: it can do nearly anything, and it's correspondingly hard to learn and easy to misuse. Maximum optionality has a real cognitive price.

Python takes the opposite stance ("there should be one obvious way to do it") and for years rode it past the point of usefulness. Until version 3.10 added the match statement, the language had no real case statement, on the principle that nested if/elif/else was already enough. The philosophy was consistent; the lived reality was that a match statement is simply easier to read than a chain of elifs, and the one-obvious-way dogma cost the language readability for years in a spot that didn't need the discipline.

The balance for an application is two or three modalities for the same task, chosen deliberately. A form for the structured case. A chat fallback for the unstructured one. Maybe a power-user shortcut for repeat operations. Three covers the meaningful variation in how people work without sliding into Perl-land. One, picked because the designer fell for a modality, quietly pushes half the user base into a worse experience.

The LLM Bridges the Gaps

So if chat isn't the interface, what is the model actually for? Its real power is bridging ambiguity, papering over the places in a workflow where the input is messy, the formatting is inconsistent, or the answer is still half-formed, so the user can keep moving toward the objective instead of stalling on a field that won't accept what they have.

Back to construction. Bids come back from subcontractors in wildly inconsistent formats. Pulling the total price off each one is trivial; regex or a parser handles it. What that price includes and excludes is where you need real interpretation, and that's the LLM's job: read the messy document, triage what plain machine learning can extract, what a rules engine can decide, and what genuinely needs a model to interpret. The output isn't a chat transcript. It's a structured model of the bid, echoed back through the normal interface in a consistent shape: the same fields, every time, however ragged the source.
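A rough sketch of that split, with the caveat that every name here (`Bid`, `extract_total`, `interpret_scope`, `parse_bid`) is hypothetical and the model call is stubbed out rather than wired to any real API. The regex handles the easy part, the stub marks where interpretation would go, and the result has the same fields whatever the source document looked like.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Bid:
    """Hypothetical structured bid: the same fields, every time."""
    subcontractor: str
    total: Optional[float] = None
    inclusions: list[str] = field(default_factory=list)
    exclusions: list[str] = field(default_factory=list)
    needs_review: bool = False


# Finds a dollar amount on the same line as the word "total". Assumes US formatting.
TOTAL_RE = re.compile(r"total[^$\n]*\$\s*(\d[\d,]*(?:\.\d{2})?)", re.IGNORECASE)


def extract_total(text: str) -> Optional[float]:
    """The trivial part: no model needed."""
    m = TOTAL_RE.search(text)
    return float(m.group(1).replace(",", "")) if m else None


def interpret_scope(text: str) -> tuple[list[str], list[str]]:
    """Placeholder for the model call that would read inclusions and exclusions."""
    return [], []


def parse_bid(
    subcontractor: str,
    text: str,
    interpret: Callable[[str], tuple[list[str], list[str]]] = interpret_scope,
) -> Bid:
    total = extract_total(text)
    inclusions, exclusions = interpret(text)
    # Flag the bid for a human when the cheap extraction step found no total.
    return Bid(subcontractor, total, inclusions, exclusions, needs_review=total is None)
```

Passing the interpreter in as an argument is one possible design choice: it keeps the model swappable and lets the deterministic parts be tested without it.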

That consistency is the actual payoff of LLM-powered UX, and it has almost nothing to do with chat as a surface. Picture it in CAD. I've got my 3D mouse, my shortcuts, my hands deep in the geometry of a part. Chat isn't where I model; that would be absurd. But "now that the shape's roughed in, run a torque analysis on this and farm it to the background while I keep working" is exactly where a conversational command earns its place, riding alongside the precise interface instead of replacing it. The model bridges; it doesn't take the wheel.

Default Is the Enemy of Design

If you're designing a product right now and the default screen is a chat box, here's the test. Pull up the five most common tasks your users perform. For each, ask whether a form, grid, dropdown, button, or canvas would let them finish faster, with less ambiguity and better validation than typing a sentence.

If the answer is yes across the board, you've built a forms-and-grids application wearing a chatbot costume, and the costume is hurting your users. Ship the real interface. Keep chat for the slice of cases where the input genuinely doesn't fit a structured surface, and put it behind a button that says "or just describe it" rather than making it the front door.

If the answer is no for some of them (you can't predict the shape of the input, the follow-ups depend on the answers, the user's vocabulary won't survive translation to fields) then chat is the right primary input for those tasks. Build it well, set the structured shortcuts beside it for the users who want them, and treat chat as a mode the application offers, not the application's identity.

The model is a tool. Chat is one of the interfaces it can power, valuable exactly where ambiguity lives and forgettable everywhere else. Forms, grids, sliders, and direct manipulation remain the right answer for most of what people actually do with software. The designers who can hold all of those in their head at once and choose on purpose will out-build the ones who reach for whatever modality is trending.

Default is the enemy of design.

Top comments (2)

Mustafa ERBAY • Jun 9

The line that stuck with me was:

"Chat is an input, not an interface."

I've lost count of how many products replaced a perfectly good UI with a prompt box and somehow made the workflow slower. The best AI experiences I've seen don't force conversation everywhere. They use it only where the structure of the input is genuinely unknown.

Joshua Hall • Jun 25

Totally agree. Prose can be quite powerful with frontier LLMs; however, forcing me to type (or worse, speak in an office or public area) to accomplish even simple workflows is just plain bad UX. I'm sure this will settle down eventually, but it can be quite frustrating today.