Mohamed Abdallah

When "no AI in the calculation" is a feature, not a bug

I work on an estimation engine where the same input always produces the same output. In 2026 that's apparently a controversial design choice.

Every other tool in the demo deck has a sparkle icon now. The investor slide that doesn't say "AI-powered" feels like it was printed in 2019 and forgotten. So when a B2B platform's pitch says, in plain text, "the calculation contains no AI" — people stop and ask if that's a typo.

It isn't. It's the product.

The thing the engine does

Let me describe the shape without describing the brand. It's a deterministic software-estimation platform I work on with a client. You feed it a structured project description — features, integrations, target platforms, team composition assumptions, a long list of normalized inputs. It returns a number: hours, cost, range. That number lands in a contract. A buyer signs it. A vendor delivers against it. If the estimate is wrong by 40%, somebody loses money.
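
For a sense of the shape, here is roughly what those inputs and outputs look like. The names are mine, invented for this post, not the platform's actual schema:

```typescript
// Hypothetical shapes, not the real schema: just enough structure to make
// the rest of this post concrete.
interface EstimateInput {
  features: string[];                       // normalized feature identifiers
  integrations: string[];                   // e.g. "stripe", "salesforce"
  platforms: ("ios" | "android" | "web")[]; // target platforms
  team: Record<string, number>;             // role -> headcount assumption
}

interface EstimateResult {
  hours: number;               // the number that lands in the contract
  costRange: [number, number]; // low and high bounds
}
```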

The engine that produces that number is a few thousand lines of TypeScript. Eighty-something modules. Rules, weights, lookups, modifiers. No model call anywhere in the calculation path. Same inputs in, same number out. Today, tomorrow, next March.
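
A toy version of that calculation path, to make "rules, weights, lookups, modifiers" concrete. Every rule is plain data and the whole engine is a pure function. The base hours are invented, and the multi-tenancy rule borrows the +18% figure from the audit example further down:

```typescript
// Base effort lookup and modifier rules. All numbers are illustrative.
const BASE_HOURS: Record<string, number> = {
  auth: 40,
  payments: 60,
  "multi-tenancy": 120,
};
const baseHours = (feature: string): number => BASE_HOURS[feature] ?? 0;

interface Modifier {
  name: string;
  applies: (input: EstimateInput) => boolean;
  factor: number;
}

const MODIFIERS: Modifier[] = [
  { name: "multi-tenancy", applies: i => i.features.includes("multi-tenancy"), factor: 1.18 },
  // ...dozens more rules, each one a reviewable line of data
];

// Pure function: same input, same number. No model call anywhere.
function estimate(input: EstimateInput): number {
  const base = input.features.reduce((sum, f) => sum + baseHours(f), 0);
  const hours = MODIFIERS
    .filter(m => m.applies(input))
    .reduce((h, m) => h * m.factor, base);
  return Math.round(hours); // contract hours are whole numbers
}
```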

When I describe this to other engineers in 2026, the second question is always: "Why not put an LLM in there? It would handle the edge cases."

It would. That's the problem.

Trust is not vibes

A contract estimate is a number two parties stake real money on. The buyer wants to know what they're paying for. The vendor wants to know what they can deliver. Both want to know the number didn't move because somebody refreshed the page.

If I run the same estimate twice and get 312 hours then 287 hours, I haven't estimated anything. I've sampled from a distribution and presented the sample as a fact. That's a category error. Estimation implies a method. A method implies repeatability. Without repeatability, "estimate" is just a confident-sounding guess in a nice UI.

LLMs sample from distributions. That's their whole architecture. You can pin temperature to zero, but you're still at the mercy of model updates, context formatting, token boundaries, and the silent retraining the vendor pushes on a Tuesday. I've seen the same prompt return materially different answers across a model version bump. In a chatbot, that's a quirk. In a contract, it's malpractice.

Audit is the killer requirement

Estimates get re-litigated. Always. A project goes 30% over. The buyer wants to know why. The vendor wants to know if scope crept or if the original number was wrong. Somebody — usually a project manager who was not in the original conversation — has to reconstruct how the number got built.

With a rule-based engine, that reconstruction is mechanical. You open the calculation log. You see which modules fired, which inputs they consumed, which weights applied. You point at the line that says "multi-tenancy modifier: +18%" and you have a conversation about whether that modifier was correct. The disagreement is bounded.
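
Building on the toy engine above, that log can be as simple as an array of entries, one per rule that fired. Field names are illustrative:

```typescript
// Each fired rule appends an entry a human can point at later.
interface LogEntry {
  module: string;      // which rule fired
  effect: string;      // human-readable delta, e.g. "+18%"
  hoursBefore: number;
  hoursAfter: number;
}

function estimateWithLog(input: EstimateInput): { hours: number; log: LogEntry[] } {
  const log: LogEntry[] = [];
  let hours = input.features.reduce((sum, f) => sum + baseHours(f), 0);
  for (const m of MODIFIERS) {
    if (!m.applies(input)) continue;
    const before = hours;
    hours *= m.factor;
    log.push({
      module: m.name,
      effect: `${m.factor >= 1 ? "+" : ""}${Math.round((m.factor - 1) * 100)}%`,
      hoursBefore: before,
      hoursAfter: hours,
    });
  }
  return { hours: Math.round(hours), log };
}
// For { features: ["auth", "multi-tenancy"], ... } the log contains:
// { module: "multi-tenancy", effect: "+18%", hoursBefore: 160, hoursAfter: ~188.8 }
```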

With an LLM in the path, the audit trail is "the model decided." You can ask it to explain itself, and it will, in fluent English that may or may not reflect what actually drove the output. There's no causal trace from input to output that a human can verify. The model's "reasoning" is a post-hoc story it told itself to keep the conversation going.

This is fine for "summarize this article." It is not fine for a number on a procurement contract. The deeper issue is liability. When the engine is deterministic, the vendor can defend the estimate by walking through the rules. When it isn't, the defense is "trust the AI" — which a lawyer will eat alive.

Reproducibility, plainly

I'll say it the blunt way. An estimator that gives different answers to the same input is not estimating. It's hallucinating with confidence. The fact that the hallucinations cluster around a plausible value most of the time makes it worse, not better, because it lets the failure mode hide.

Determinism here isn't a nostalgia thing. It's a contract with the user that says: if you change the answer, you changed an input. If you didn't change an input, the answer is the same. That contract is what makes the tool a tool and not an oracle.
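
That contract is also testable. Here is a minimal sketch of golden tests against the toy estimate() above, using Node's built-in assert. If the number for a pinned input ever changes, somebody changed a rule, and the diff shows up in code review:

```typescript
import { strict as assert } from "node:assert";

// A pinned input, checked into the repo next to its expected output.
const pinned: EstimateInput = {
  features: ["auth", "multi-tenancy"],
  integrations: ["stripe"],
  platforms: ["ios", "android"],
  team: { senior: 2, mid: 3 },
};

assert.equal(estimate(pinned), estimate(pinned)); // same input, same number, always
assert.equal(estimate(pinned), 189);              // golden value: round((40 + 120) * 1.18)
```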

Where AI actually earns its keep

I'm not anti-AI here. The MCP image-generation server I run in production sits one repo over from this argument. I'm not arguing for a museum.

The argument is about placement. There are three spots in this product where AI does real work and nobody objects:

Input parsing. Customers describe projects in prose. "We need a Flutter app with iOS and Android, two user roles, payment via Stripe, and a moderator dashboard." A model is great at turning that into the structured input the engine expects — feature flags, platform targets, role count, integrations. If it gets it wrong, the user sees the structured form and corrects it before hitting calculate. The AI's output is a draft for a human to confirm; there's a sketch of this flow just after the list.

Similar-project retrieval. Given a finished estimate, an embedding lookup over past projects ("here are five engagements with similar shape and scope; here's how they actually came in") is genuinely useful. It contextualizes the deterministic number without replacing it; the second sketch below shows how small that lookup is.

Explanation drafting. The engine outputs the number and a structured rationale. A model turns that rationale into a paragraph the buyer can read. The number doesn't move. The prose around it does.
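
Here is a sketch of the first spot, the parse-then-confirm flow. llmComplete is a placeholder for whichever model client you actually use, not a real SDK call; the schema is the hypothetical one from earlier:

```typescript
// The model drafts structured input from prose. A human confirms it.
declare function llmComplete(prompt: string): Promise<string>;

async function draftStructuredInput(prose: string): Promise<Partial<EstimateInput>> {
  const raw = await llmComplete(
    `Extract features, integrations, and platforms as JSON from: ${prose}`
  );
  try {
    return JSON.parse(raw) as Partial<EstimateInput>; // a draft, rendered into an editable form
  } catch {
    return {}; // unparseable output degrades to an empty form, never to a number
  }
}
// estimate() only ever runs on the form the human confirmed, not on this draft.
```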
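
And the second spot, retrieval. Assuming past projects are already embedded somewhere, the lookup is a few lines and never touches the estimate itself:

```typescript
interface PastProject {
  name: string;
  embedding: number[];
  actualHours: number; // how the engagement actually came in
}

function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((s, x, i) => s + x * b[i], 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b));
}

// Top-k most similar past engagements for a query embedding.
function similarProjects(query: number[], past: PastProject[], k = 5): PastProject[] {
  return [...past]
    .sort((a, b) => cosine(query, b.embedding) - cosine(query, a.embedding))
    .slice(0, k);
}
```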

Notice the pattern. AI sits at the interface between human language and structured data, in both directions. It does not sit in the calculation. The calculation is where the math has to be defensible.

The rule, generalized

Here's the rule I've started applying to every product decision involving AI:

AI fits at the human-interface layer. AI breaks anywhere the output needs to be defensible later.

If the output of a step is going to a human who will look at it and decide what to do — translation, summarization, drafting, suggestion, classification with a human-in-the-loop — AI is often the right tool. Slightly different output each time is fine, because the human is the error-correction layer.

If the output of a step is going to feed into another deterministic process, or into a contract, or into a regulated decision, or into something a court or an auditor or a finance team will examine in twelve months — AI is the wrong tool. You need a function. Functions are repeatable. Models are not.

Most products today are picking the wrong layer. They put a model in the calculation path because that's where the demo feels magical, then bolt deterministic guardrails around it to clean up the mess. The order is backwards. The deterministic core should be the product. The model should be the helpful assistant standing next to it, translating between humans and the core, never replacing it.

What this looks like as a positioning choice

Saying "no AI in the calculation" out loud, in marketing copy, in 2026 — that's a position. It's a bet that the buyer who actually has to defend an estimate to their CFO values reproducibility more than they value a sparkle icon. So far that bet keeps paying off in customer conversations. The buyers who matter ask one question after the demo: "Will this give me the same answer next month?" When the answer is yes, the conversation gets serious.

Determinism is the product when accountability is the use case. AI breaks accountability. Build the boring math first. Wrap it in the helpful interface second. Don't confuse the two.

— Mohamed
