Tensor Processing Units are a technology developed and owned by Google. While you can find GPUs in every cloud provider's offering, TPUs are currently available only through Google Cloud Platform. A situation in which you invest in a technology or service that is not available anywhere else is called vendor lock-in: something salespeople love and customers try to avoid. What does this look like for TPUs? Let's see.
Myth 5: TPUs are available only through Google Cloud Platform
As of today (December 12th, 2025) it is still true that TPUs are available only through Google Cloud Platform. If you develop your application to work specifically with TPUs, leveraging all their strengths and accounting for all their limitations, moving to a different provider would be a big challenge. Luckily, as you may remember from the first myth-busting post, GPUs can do everything that TPUs do. They may not be as efficient for a given task, and scaling might be different or limited, but in many cases a move from TPUs to GPUs is possible, and much easier than the other way around.
Technically, when you decide to use TPUs, you are limited to GCP as your provider; that is true. However, leaving TPUs for GPUs is not an impossible task. Unless you rely heavily on the TPUs' exceptional scaling capabilities, a migration to GPUs and a different provider is always an option.
Myth 6: TPUs require unique software
The first TPUs were developed together with the TensorFlow library. Back in 2018, when Google released the first TPUs to their customers, it was indeed the case that an application written for TPUs would not be compatible with other accelerators. Luckily, the software landscape has changed dramatically since then. Many abstraction layers have been added, and support for TPUs is now present in popular software solutions. The JAX library, for example, supports TPUs, GPUs, and CPUs alike.
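To make the portability concrete, here is a minimal sketch of what accelerator-agnostic JAX code looks like. The toy function and array shapes are arbitrary examples I made up for illustration; the point is that nothing in the code names a specific accelerator, and JAX compiles the same function via XLA for whichever backend it finds at runtime.

```python
import jax
import jax.numpy as jnp

# The same code runs on CPU, GPU, or TPU; JAX picks the best
# available backend at runtime, with no code changes required.
print("Running on:", jax.default_backend())  # e.g. "cpu", "gpu", or "tpu"

@jax.jit  # JIT-compiled through XLA for whatever accelerator is present
def predict(w, x):
    # A toy linear layer followed by a ReLU.
    return jnp.maximum(w @ x, 0.0)

w = jnp.ones((4, 3))
x = jnp.arange(3.0)   # [0., 1., 2.]
y = predict(w, x)
print(y)              # each element is 0 + 1 + 2 = 3
```

Migrating this kind of code between TPUs and GPUs is a matter of installing the right `jax` build for the target hardware, not rewriting the model.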
The situation is especially easy when it comes to inference. vLLM supports plenty of models on TPUs as well as on GPUs, and MaxText can handle both accelerator types out of the box. If you're looking for a platform to run your models, it's a great idea to give TPUs a try, as jumping between accelerator platforms has never been easier.
What's next?
In the next post, I will dive into more technical aspects of TPUs and their supporting systems. After all, the efficiency of an AI system does not depend only on accelerator speed: networking and storage also matter. While storage works pretty much the same for TPU systems as it does for GPU systems, networking is considerably more complex. Stay tuned for the next article, and keep an eye on the official Google Cloud blog and the GCP YouTube channel!
