After working with LLMs, I believe the hardest part of the transition for backend engineers isn't the math—it's unlearning determinism.
In a traditional backend system, Input A always yields Output B; if it doesn't, it's a bug. With GenAI, Input A might yield Output B today and a completely different structure tomorrow.
This breaks everything we know about stability at scale. You can't write a standard unit test for a "vibe check." You can't rely on a model to output valid JSON 100% of the time, even with strict prompting. You can't predict latency when the inference provider is overloaded.
The solution isn't better prompt engineering; it's defensive architecture. We need to shift focus from "making the model perfect" to building resilient wrappers—schema validators, circuit breakers, and automated evaluation pipelines that catch regressions before users do.
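Here's what the schema-validation layer can look like in practice. This is a minimal sketch, not a production implementation: it assumes Pydantic for validation, and `call_model()` and `TicketSummary` are hypothetical stand-ins for your real inference call and output schema. The key idea is a bounded retry loop that feeds validation errors back to the model instead of trusting the first response:

```python
from pydantic import BaseModel, ValidationError


class TicketSummary(BaseModel):
    """Hypothetical output schema, for illustration only."""
    title: str
    priority: int


def call_model(prompt: str) -> str:
    """Placeholder for a real inference call (OpenAI, vLLM, etc.)."""
    raise NotImplementedError


def summarize(prompt: str, max_retries: int = 3) -> TicketSummary:
    """Validate model output against a schema; retry on malformed replies."""
    last_error: ValidationError | None = None
    for _ in range(max_retries):
        raw = call_model(prompt)
        try:
            # Raises ValidationError on both invalid JSON and schema mismatches.
            return TicketSummary.model_validate_json(raw)
        except ValidationError as e:
            # Feed the error back so the model can self-correct on retry.
            last_error = e
            prompt = f"{prompt}\n\nYour last reply was invalid: {e}. Return only valid JSON."
    raise RuntimeError(f"Model never produced valid output: {last_error}")
```

The point isn't the retry count or the error-feedback prompt (tune both); it's that invalid output becomes a typed, catchable failure instead of a crash three services downstream.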
Treat the LLM like an untrusted, high-latency third-party API, not a magic box.
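Concretely, that means wrapping inference calls the same way you'd wrap any flaky upstream dependency. Here's a minimal circuit-breaker sketch; the thresholds are illustrative, and in production you'd likely reach for an existing resilience library rather than hand-rolling this:

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and calls are being shed."""


class CircuitBreaker:
    """Trip after N consecutive failures; probe again after a cooldown."""

    def __init__(self, failure_threshold: int = 5, reset_after_s: float = 30.0):
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self.failures = 0
        self.opened_at: float | None = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after_s:
                # Fail fast instead of queuing behind a degraded provider.
                raise CircuitOpenError("LLM circuit open; failing fast")
            self.opened_at = None  # Half-open: let one probe through.
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0  # Any success closes the circuit.
        return result
```

Wire it around the inference call (`breaker.call(summarize, prompt)`) and requests fail fast during a provider outage instead of piling up and blowing your latency budget.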