khalid

Treating the LLM Like an Unreliable Dependency

And why your frontend will be better for it.


When I started building the frontend for Second Sight — an AI-powered strategic foresight platform — I made the same assumption most developers make the first time they integrate an LLM: I assumed the output would be consistent.

It isn't. And that assumption cost me several debugging sessions before I reframed the whole problem.

Here's what I learned.


The setup

Second Sight guides users through a six-stage strategic analysis that includes classifying forces, building scenario axes, writing full scenarios, stress-testing them in a "wind tunnel," and exporting a final report. Each stage is a separate LLM call, and several of them take three to six minutes to process.

That means a single completed session can represent more than ten minutes of non-deterministic computation. A user who hits a dead-end error at stage five doesn't just lose time — they lose trust in the product entirely.


The mistake I was making

I was treating the model like a reliable API. Call it, get structured JSON back, render it. Simple.

The problem is that LLMs aren't reliable APIs. They're probabilistic systems with a tendency to:

  • Return JSON wrapped in a markdown code block one run and raw JSON the next
  • Use single quotes instead of double quotes in what looks like a JSON object
  • Return output that structurally belongs to a different stage of the workflow
  • Time out under certain input lengths with no warning

None of these are bugs you can fix. They're the nature of the dependency. The question is whether your frontend is designed around that reality — or just hoping it won't happen in production.
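To make the first two failure modes concrete, here is a minimal sketch of a lenient JSON reader. The names (`parseModelJson`, `stripCodeFence`, `ParseResult`) are hypothetical and not taken from the Second Sight codebase, and the single-quote fallback is deliberately naive: it is only tried after strict parsing fails, and it will still reject text with apostrophes inside values.

```typescript
// Result type so callers branch on `ok` instead of catching exceptions.
type ParseResult =
  | { ok: true; value: unknown }
  | { ok: false; reason: string };

// Three backticks, built programmatically so this example stays fence-safe.
const FENCE = "`".repeat(3);
// Matches a whole response wrapped in a fence, with an optional language tag.
const FENCED_BLOCK = new RegExp(`^${FENCE}[a-zA-Z]*\\s*([\\s\\S]*?)${FENCE}$`);

// Hypothetical helper: unwrap a markdown code fence if the model added one.
function stripCodeFence(raw: string): string {
  const trimmed = raw.trim();
  const match = FENCED_BLOCK.exec(trimmed);
  return match ? match[1].trim() : trimmed;
}

// Try strict JSON first, then a naive single-quote repair as a last resort.
function parseModelJson(raw: string): ParseResult {
  const candidate = stripCodeFence(raw);
  const attempts = [candidate, candidate.replace(/'/g, '"')];
  for (const text of attempts) {
    try {
      return { ok: true, value: JSON.parse(text) };
    } catch {
      // Not valid JSON in this form; fall through to the next attempt.
    }
  }
  return { ok: false, reason: "Response was not parseable as JSON" };
}
```

Returning a result object rather than throwing means the caller can hand an `ok: false` straight to whatever failure handling the UI has, instead of wrapping every stage in its own try/catch.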


The reframe: treat it like an external service that lies sometimes

Once I stopped expecting consistency and started designing for inconsistency, the solutions became obvious.

1. Shape-tolerant parsing

For each stage, I wrote a parser that accepts multiple valid shapes and normalises them to a single internal type. The wind-tunnel result, for example, can arrive as either an option-major 2D array or a scenario-major object array. Both are valid — the model just chooses. The normaliser handles both transparently. The component downstream never knows the difference.
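As a rough illustration of what such a normaliser could look like, the sketch below accepts two assumed shapes and emits one flat list of cells. The exact field names (`scenario`, `ratings`), the string ratings, and the idea of passing option and scenario labels in from earlier stages are all assumptions made for the example; the real wind-tunnel payload is not shown in this post.

```typescript
// Single internal type that downstream components consume.
type Cell = { option: string; scenario: string; rating: string };

// Assumed shape A: grid[optionIndex][scenarioIndex] holds a rating.
type OptionMajor = string[][];
// Assumed shape B: one object per scenario, with ratings keyed by option name.
type ScenarioMajor = { scenario: string; ratings: Record<string, unknown> }[];

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function isOptionMajor(v: unknown): v is OptionMajor {
  return Array.isArray(v) &&
    v.every((row) => Array.isArray(row) && row.every((c) => typeof c === "string"));
}

function isScenarioMajor(v: unknown): v is ScenarioMajor {
  return Array.isArray(v) &&
    v.every((item) =>
      isRecord(item) && typeof item.scenario === "string" && isRecord(item.ratings));
}

// Returns cells in option-major order, or null if neither shape fits.
function normaliseWindTunnel(
  raw: unknown,
  options: string[],
  scenarios: string[],
): Cell[] | null {
  let ratingAt: (o: number, s: number) => unknown;
  if (isOptionMajor(raw)) {
    const grid = raw;
    ratingAt = (o, s) => grid[o]?.[s];
  } else if (isScenarioMajor(raw)) {
    const byScenario = new Map<string, Record<string, unknown>>();
    for (const item of raw) byScenario.set(item.scenario, item.ratings);
    ratingAt = (o, s) => byScenario.get(scenarios[s])?.[options[o]];
  } else {
    return null;
  }

  const cells: Cell[] = [];
  for (let o = 0; o < options.length; o++) {
    for (let s = 0; s < scenarios.length; s++) {
      const rating = ratingAt(o, s);
      if (typeof rating !== "string") return null; // missing or malformed cell
      cells.push({ option: options[o], scenario: scenarios[s], rating });
    }
  }
  return cells;
}
```

Because both branches feed the same loop, the output order is identical whichever shape arrived, and a `null` return gives the caller a single place to raise a failure.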

2. Typed failure detection

Instead of a single generic error state, I built three specific failure types:

  • Alignment failure — the output doesn't reconcile with the user's selected axes
  • Processing failure — timeout or unparseable response
  • Structural failure — the model returned a previous stage's output

Each one surfaces a different recovery modal with a clear explanation and the right next action. A user who sees "the model returned results from an earlier step — click here to retry from this point" can recover. A user who sees "something went wrong" cannot.
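One way to model the three failure types is a discriminated union, with a function that maps each variant to the copy and action its modal needs. This is a sketch under assumptions: the field names, the `Recovery` type, and the action identifiers are invented for illustration, and the messages only paraphrase the ones described above.

```typescript
// One variant per failure type, each carrying only the context its modal needs.
type StageFailure =
  | { kind: "alignment"; unmatchedAxes: string[] }
  | { kind: "processing"; cause: "timeout" | "unparseable" }
  | { kind: "structural"; expectedStage: number; receivedStage: number };

// What the recovery modal shows and which next action it offers.
type Recovery = {
  message: string;
  action: "reselect-axes" | "retry-stage";
};

function recoveryFor(failure: StageFailure): Recovery {
  switch (failure.kind) {
    case "alignment":
      return {
        message:
          `The results do not match your selected axes ` +
          `(${failure.unmatchedAxes.join(", ")}). Review the axes and run this step again.`,
        action: "reselect-axes",
      };
    case "processing":
      return {
        message:
          failure.cause === "timeout"
            ? "The model took too long to respond. Retry this step."
            : "The model's response could not be read. Retry this step.",
        action: "retry-stage",
      };
    case "structural":
      return {
        message: "The model returned results from an earlier step. Retry from this point.",
        action: "retry-stage",
      };
    default: {
      // Compile-time check: adding a new variant without handling it fails here.
      const unreachable: never = failure;
      return unreachable;
    }
  }
}
```

The `never` check in the default branch is the main design benefit: if a fourth failure type is added later, the compiler flags every place that has not yet decided how to recover from it.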

3. Durable session state

The entire workshop — current step, every intermediate artifact, conversation history — is persisted on the client. A refresh, a closed tab, an accidental navigation: none of it loses the session. This isn't just UX polish. It's essential when each step costs real latency and real money.
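The persistence idea can be sketched without any library, assuming a storage object with the same `getItem`/`setItem` methods as the browser's `localStorage`. The `WorkshopSession` shape, the storage key, and the in-memory storage used for demonstration are all hypothetical; the actual store layout is not described in this post.

```typescript
// Minimal subset of the Web Storage API, so the sketch also runs outside a browser.
interface KeyValueStorage {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical session shape: current step, per-stage artifacts, chat history.
type WorkshopSession = {
  version: 1;
  currentStep: number;
  artifacts: Record<string, unknown>;
  history: { role: "user" | "assistant"; content: string }[];
};

const SESSION_KEY = "workshop-session"; // hypothetical storage key

// Call after every state change so a refresh never loses more than one update.
function saveSession(storage: KeyValueStorage, session: WorkshopSession): void {
  storage.setItem(SESSION_KEY, JSON.stringify(session));
}

// Returns null for missing, corrupt, or incompatible data instead of throwing.
function loadSession(storage: KeyValueStorage): WorkshopSession | null {
  const raw = storage.getItem(SESSION_KEY);
  if (raw === null) return null;
  try {
    const parsed = JSON.parse(raw);
    const looksValid =
      typeof parsed === "object" && parsed !== null &&
      parsed.version === 1 &&
      typeof parsed.currentStep === "number" &&
      typeof parsed.artifacts === "object" && parsed.artifacts !== null &&
      Array.isArray(parsed.history);
    return looksValid ? (parsed as WorkshopSession) : null;
  } catch {
    return null; // stored value was not valid JSON
  }
}

// In-memory stand-in for localStorage, used here for demonstration only.
function createMemoryStorage(): KeyValueStorage {
  const data = new Map<string, string>();
  return {
    getItem: (key) => data.get(key) ?? null,
    setItem: (key, value) => { data.set(key, value); },
  };
}
```

In a browser, `window.localStorage` satisfies this interface directly. The `version` field is there so a future change to the session shape can be detected and discarded or migrated rather than crashing on load.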


The result

Failures stopped being dead ends. They became typed, explainable, recoverable states — and the product felt stable even when the model wasn't.

The client moved from market-validating the MVP to committing to full-scale development. I don't think that happens without this reliability layer.


The takeaway

If you're building a product on top of an LLM, the frontend's job isn't just to display AI output. It's to make the product feel reliable even when the AI isn't.

Treat the model like an external service that lies sometimes. Parse tolerantly. Fail specifically. Persist aggressively.

Most of the time, your users will never know the model misbehaved. When they do, they'll know exactly how to recover. That's the point.


Built with Next.js, TypeScript, Zustand, and Anthropic Claude.
Second Sight is live at secondsight.tech.
