Last week I was at a tech meetup in Berlin where we got into a real debate: are teams actually making deliberate infrastructure decisions, or just reacting to AI hype? Three practitioners shared their first-hand experience. Here's what stuck with me.
Don't lock in before you understand your workload
Ömer from #Youzu talked through their migration off hyperscalers after getting trapped by credits and tight service coupling. This wasn't hypothetical; they actually went through it.
His takeaway: decouple early so you can move workloads freely. Know roughly where you're heading before you build, then migrate toward full control progressively. Portability isn't a nice-to-have, it's insurance.
Question the GPU arms race
David from #SteliaAI made a point that a lot of teams need to hear right now: most people are provisioning for scale they don't have yet.
His suggestion was almost counterintuitively simple:
- Start with half compute, half control plane
- Get customers
- Then revisit
Don't optimize for a scale you haven't reached yet, because the shape of your workload will change by the time you get there.
Observability is not optional
Felix from #Cloudeteer made the case that GPU utilization metrics alone don't tell the full story. You can be running at 100% GPU utilization and still be producing wrong outputs.
Traces — not just metrics — are what let you catch problems before they fail silently. If your AI stack doesn't have tracing today, you're flying blind with a full tank.
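To make the point concrete, here's a minimal, hypothetical sketch (not from any of the talks, and deliberately stdlib-only rather than a real tracing library like OpenTelemetry): a utilization-style metric would report the system as busy, while a span on the inference call records enough context to flag an empty completion.

```python
# Hypothetical sketch: hand-rolled spans showing what a metric alone misses.
import time
from contextlib import contextmanager
from dataclasses import dataclass, field

@dataclass
class Span:
    name: str
    attributes: dict = field(default_factory=dict)
    duration_ms: float = 0.0

trace: list[Span] = []  # collected spans for this request

@contextmanager
def span(name: str, **attributes):
    """Record a named span with attributes and wall-clock duration."""
    s = Span(name, dict(attributes))
    start = time.perf_counter()
    try:
        yield s
    finally:
        s.duration_ms = (time.perf_counter() - start) * 1000
        trace.append(s)

def run_inference(prompt: str) -> str:
    # A busy/utilization metric would look healthy here; only the trace
    # captures that the model returned nothing.
    with span("inference", prompt_tokens=len(prompt.split())) as s:
        output = ""  # pretend the model silently failed
        s.attributes["completion_tokens"] = len(output.split())
        return output

run_inference("why is the sky blue")
empty = [s for s in trace if s.attributes.get("completion_tokens") == 0]
print(f"spans with empty completions: {len(empty)}")
```

The design choice is the point: because the span carries request-level attributes (prompt and completion token counts), you can query for "busy but wrong" cases, which a utilization gauge can never surface.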
The thread running through all three talks
AI hype is driving infrastructure decisions that don't match actual workload needs. Every speaker arrived at the same place from a different direction: start lean, stay observable, don't couple yourself to a provider before you understand what you're building.
Now, over to you
- Are you provisioning GPUs reactively or from a clear workload map?
- Have you ever scaled back after realizing you over-provisioned?
- Where does observability fit in your AI stack today?
Drop your experience in the comments.