AI can make your product feel elevated. It can also make it feel unreliable, expensive, and risky. So the real question isn’t “Should I add AI?” but rather:
“Can I add AI without breaking my UX or my budget?”
Building with LLMs is not like integrating a normal API.
You’re dealing with:
- Non-determinism: the same prompt can yield different outputs
- Tokens: every request/response has a budget that drives cost + latency
- New attack surfaces: prompt injection, data leakage
- Privacy constraints: you may be sending sensitive user text to a third party
And there’s a mindset trap too:
- AI isn’t your product (most of the time). It’s just a tool from your larger engineering toolbox that can elevate your stack if integrated properly
- LLMs aren’t truth machines. If you treat them like one, your UX will suffer. The models can be prone to hallucinations and need to be monitored with guardrails in place.
- Overfitting your UX to AI often makes the product worse, not better, because of the added complexity
I have been leveraging LLMs over the last few months, and here are some of my key learnings:
Start with mindset, not models.
Before I write a single prompt, I ask myself:
- What user pain does this solve today?
- What’s the “manual” version of this flow? And how does AI improve the experience, without taking control away?
- A rule of thumb for my app: I only add AI when I can justify it through discovery and feedback, and I usually layer it on top of an existing non-AI flow first. That gives me a baseline UX that still works when the AI is slow, wrong, or unavailable.
Technical discovery is time well spent.
For larger features, I stretch technical discovery across a few days, exploring different use cases. It’s a habit that helps me gain the user’s perspective.
What I actually did during my last discovery session:
- Wrote acceptance criteria (what “good” looks like)
- Sketched the visual journey in a tool like Excalidraw
- Decided where users need control (accept/reject AI-generated outputs)
- Defined the scope for faster iterations and early feedback
Treat every AI feature like a pipeline, not a simple function call.
When implementing AI features, I think in terms of a sequential pipeline (sketched in code below):
- Normalize input
- Sanitize and redact (privacy-first)
- Schema validation and assertions
- Trigger logic (Threshold)
- Generation logic (prompt + tool choices)
- Post-processing (format, structure, safety)
- Delivery (UI controls, logging, persistence)
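To make that concrete, here is a minimal TypeScript sketch of the idea. Every name in it (shouldTrigger, generate, postProcess, and so on) is a hypothetical stand-in for your own stages, not code from my app or any library:

// A minimal sketch of the pipeline idea: each stage is a small, testable function.

type PipelineResult =
  | { status: 'generated'; output: string }
  | { status: 'skipped'; reason: string };

function shouldTrigger(input: string): boolean {
  // Trigger logic (threshold): only call the model when it is worth it.
  return input.length >= 40;
}

async function generate(input: string): Promise<string> {
  // Generation logic: in a real pipeline this calls your LLM provider
  // with a tight system prompt and a few few-shot examples.
  return `Suggested next step for: ${input.slice(0, 40)}`;
}

function postProcess(raw: string): string {
  // Post-processing: format, structure, safety checks.
  return raw.trim();
}

async function runAiFeaturePipeline(rawInput: string): Promise<PipelineResult> {
  // 1. Normalize input.
  const normalized = rawInput.trim().replace(/\s+/g, ' ');

  // 2. Sanitize and redact (privacy-first), e.g. strip email addresses.
  const sanitized = normalized.replace(/[\w.+-]+@[\w-]+\.[\w.]+/g, '[redacted-email]');

  // 3. Validate the input before spending tokens on it.
  if (sanitized.length === 0) {
    return { status: 'skipped', reason: 'empty input' };
  }

  // 4. Trigger logic (threshold).
  if (!shouldTrigger(sanitized)) {
    return { status: 'skipped', reason: 'below trigger threshold' };
  }

  // 5-6. Generate, then post-process.
  const output = postProcess(await generate(sanitized));

  // 7. Delivery (UI controls, logging, persistence) happens in the caller.
  return { status: 'generated', output };
}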
This pipeline approach solves two problems at once:
- It tames non-deterministic outputs with constraints and checks
- It makes the system observable, so potential risks get uncovered instead of slipping through
How I control output quality:
- Strict assertions where possible (format, JSON shape, required fields)
- Tight system prompts + a few high-quality examples of input/output pairs (few-shot)
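For the first point, a strict assertion can be as simple as a parser that refuses anything off-contract. The AiTaskSuggestion shape below is made up for illustration:

// Hypothetical expected shape for an AI-generated task suggestion.
interface AiTaskSuggestion {
  title: string;
  description: string;
}

// Returns null when the model's output doesn't match the contract,
// so malformed generations never reach the UI.
function assertAiTaskSuggestion(raw: string): AiTaskSuggestion | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return null; // not even valid JSON
  }

  if (typeof parsed !== 'object' || parsed === null) return null;
  const { title, description } = parsed as Record<string, unknown>;

  if (typeof title !== 'string' || title.length === 0) return null;
  if (typeof description !== 'string') return null;

  return { title, description }; // anything else gets logged and falls back
}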
Assume latency and cost will spike. Then design against it.
If your AI feature becomes popular, you will feel it in both your bill and your latency.
What I put in place early:
- Feature-level AI usage logging (so I know what’s expensive)
// `completion` is the response object returned by the OpenAI chat completions call
await aiUsageLogger({
  userID: userID,
  type: 'AI_PROMPT',
  model: 'gpt-4o-mini',
  inputTokens: completion.usage?.prompt_tokens ?? 0,
  inputCachedTokens: completion.usage?.prompt_tokens_details?.cached_tokens ?? 0,
  outputTokens: completion.usage?.completion_tokens ?? 0,
});
- Rate limiting + hard daily caps (avoid surprise bills)
- Deduping (event table + hash keys)
- Similarity checks to avoid “same output, different words” fatigue
- Feature flags to ship safely and roll back fast
- Breadcrumbs + Alerts in Sentry across the whole pipeline for visibility. When the feature fails, I want to know where and why in seconds.
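As an illustration of the deduping idea, here is a simplified sketch that hashes (user, feature, normalized input) into a key. The real version checks an event table in the database; the in-memory set below is just a stand-in:

import { createHash } from 'node:crypto';

// Stand-in for the event table: in production this lookup hits the database.
const seenPromptHashes = new Set<string>();

function promptHashKey(userId: string, feature: string, normalizedInput: string): string {
  return createHash('sha256')
    .update(`${userId}:${feature}:${normalizedInput}`)
    .digest('hex');
}

function shouldGenerate(userId: string, feature: string, normalizedInput: string): boolean {
  const key = promptHashKey(userId, feature, normalizedInput);
  if (seenPromptHashes.has(key)) {
    return false; // same user, same feature, same input: skip the model call
  }
  seenPromptHashes.add(key);
  return true;
}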
Evaluate like an engineer, not like a researcher.
Early on, I don’t start with “LLM-as-a-judge.” It adds complexity fast.
My default is simpler:
- Manual review of recent traces
- Bottom-up error analysis: group failures, count patterns, fix the top ones
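A tiny sketch of what that counting step can look like; the trace shape here is made up for illustration:

// Hypothetical failed-trace record: however you log failures, the analysis is the same.
interface FailedTrace {
  id: string;
  failureTag: string; // e.g. 'invalid-json', 'off-topic', 'too-long'
}

// Group failures by tag and sort so the most common pattern gets fixed first.
function topFailurePatterns(traces: FailedTrace[]): Array<[string, number]> {
  const counts = new Map<string, number>();
  for (const trace of traces) {
    counts.set(trace.failureTag, (counts.get(trace.failureTag) ?? 0) + 1);
  }
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}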
Don’t aim for perfection from the start; just keep tinkering.
Recent examples of AI features I implemented for my app.
- AI-generated prompts
- AI-generated tasks
Each feature has its own pipeline. That separation gives clear ownership and keeps changes localized. I also created reusable utility functions to be shared across these pipelines.
lib/
├─ ai/
│ ├─ AiQuickPrompts/
│ │ ├─ assert.ts
│ │ ├─ generate.ts
│ │ └─ threshold.ts
│ ├─ AiTasks/
│ │ ├─ assert.ts
│ │ ├─ generate.ts
│ │ └─ threshold.ts
│ ├─ utils/
│ │ ├─ fuzzy.ts
│ │ ├─ normalize.ts
│ │ └─ sanitize.ts
│ └─ types.ts
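The shared utilities are deliberately small. As an illustration, something like utils/normalize.ts could contain a helper along these lines (a sketch, not the actual file):

// Illustrative sketch of a shared helper like lib/ai/utils/normalize.ts.
// The point is that every pipeline funnels user text through the same
// small, well-tested functions before anything else happens.
export function normalizeUserText(input: string): string {
  return input
    .normalize('NFC')     // consistent unicode form
    .replace(/\s+/g, ' ') // collapse whitespace
    .trim();
}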
One UX decision made a big difference: users can accept or reject the AI output.
That does two things:
- It keeps the user in control (mindset)
- It gives me a clean signal of usefulness from users (technical feedback loop). When users reject an output, I don’t take it personally. I treat it like a failing test.
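Capturing that signal doesn’t need to be fancy. Here is an illustrative sketch of the kind of event you could record per decision (the field names and logging target are hypothetical, not my actual schema):

// Accepted/rejected outputs become data points, not judgments.
// A spike in rejections for one feature reads like a failing test.
interface AiFeedbackEvent {
  userId: string;
  feature: 'AiQuickPrompts' | 'AiTasks';
  outputId: string;
  decision: 'accepted' | 'rejected';
  createdAt: Date;
}

async function recordAiFeedback(event: AiFeedbackEvent): Promise<void> {
  // Persist wherever your analytics/events live (DB table, analytics tool, ...).
  console.log('ai_feedback', JSON.stringify(event));
}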
If you’re adding AI to your product this week, I’d do this:
Start with one question: What improves for the user if AI is added?
If the answer is “nothing,” you’re adding complexity without any added value.
If yes, ship the smallest AI feature that has:
- A clear input contract
- A pipeline (even a simple one)
- Logging + cost visibility
- A UX escape hatch (edit, reject, fallback)
- Security layers in place
Keep your mindset clear:
- AI is a tool, not the product
- Your UX should still make sense without it
- Don’t let the model steer the roadmap
The same principles I described above for software engineering also apply to life in general.
I think about AI as a power tool. Used well, it cuts out the mundane parts. Used poorly, it cuts into your judgment.
So I try to keep a balance. I use AI to accelerate repetitive work and explore options. But I keep the important decisions and reflections grounded in my own thinking.
If you’re building with LLMs, I’m curious: what do you wish you knew before you shipped your first LLM feature?
And if you’re into improving your critical thinking skills as an engineer, I’ve been building Jots to help. Jots uses research-backed frameworks and AI assistance to prompt you with the right questions, so you can reflect and learn more effectively.

