Last week I was at a tech meetup in Berlin where we got into a real debate: are teams actually making deliberate infrastructure decisions, or just reacting to AI hype? Three practitioners shared their first-hand experience. Here's what stuck with me.
Don't lock in before you understand your workload
Ömer from #Youzu talked through their migration off hyperscalers after getting trapped by credits and tight service coupling. This wasn't hypothetical; they actually went through it.
His takeaway: decouple early so you can move workloads freely. Know roughly where you're heading before you build, then migrate toward full control progressively. Portability isn't a nice-to-have, it's insurance.
Question the GPU arms race
David from #SteliaAI made a point that a lot of teams need to hear right now: most people are provisioning for scale they don't have yet.
His suggestion was almost counterintuitively simple:
- Start with half compute, half control plane
- Get customers
- Then revisit
Don't optimize for a scale you haven't reached yet, because the shape of your workload will change by the time you get there.
Observability is not optional
Felix from #Cloudeteer made the case that GPU utilization metrics alone don't tell the full story. You can be running at 100% GPU utilization and still be producing wrong outputs.
Traces — not just metrics — are what let you catch problems before they fail silently. If your AI stack doesn't have tracing today, you're flying blind with a full tank.
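To make the point concrete, here's a minimal, hypothetical sketch (not from any of the talks, and deliberately stdlib-only rather than a real tracing library like OpenTelemetry): a utilization-style metric would report the system as busy, while a span on the inference call records enough context to flag an empty completion.

```python
# Hypothetical sketch: hand-rolled spans showing what a metric alone misses.
import time
from contextlib import contextmanager
from dataclasses import dataclass, field

@dataclass
class Span:
    name: str
    attributes: dict = field(default_factory=dict)
    duration_ms: float = 0.0

trace: list[Span] = []  # collected spans for this request

@contextmanager
def span(name: str, **attributes):
    """Record a named span with attributes and wall-clock duration."""
    s = Span(name, dict(attributes))
    start = time.perf_counter()
    try:
        yield s
    finally:
        s.duration_ms = (time.perf_counter() - start) * 1000
        trace.append(s)

def run_inference(prompt: str) -> str:
    # A busy/utilization metric would look healthy here; only the trace
    # captures that the model returned nothing.
    with span("inference", prompt_tokens=len(prompt.split())) as s:
        output = ""  # pretend the model silently failed
        s.attributes["completion_tokens"] = len(output.split())
        return output

run_inference("why is the sky blue")
empty = [s for s in trace if s.attributes.get("completion_tokens") == 0]
print(f"spans with empty completions: {len(empty)}")
```

The design choice is the point: because the span carries request-level attributes (prompt and completion token counts), you can query for "busy but wrong" cases, which a utilization gauge can never surface.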
The thread running through all three talks
AI hype is driving infrastructure decisions that don't match actual workload needs. Every speaker arrived at the same place from a different direction: start lean, stay observable, don't couple yourself to a provider before you understand what you're building.
Now, over to you
- Are you provisioning GPUs reactively or from a clear workload map?
- Have you ever scaled back after realizing you over-provisioned?
- Where does observability fit in your AI stack today?
Drop your experience in the comments.