Julien Simon

Originally published at julsimon.Medium

Video — Deep dive: Compiling deep learning models, from XLA to PyTorch 2

Compilation is an excellent technique to accelerate the training and inference of deep learning models, especially if it can be completely automated!

In this video, we discuss deep learning compilation, from the early days of TensorFlow to PyTorch 2. Along the way, you’ll learn about key technologies such as XLA, PyTorch/XLA, OpenXLA, TorchScript, HLO, TorchDynamo, TorchInductor, and more. You’ll see where they fit and how they help accelerate models on a wide range of devices, including custom chips like Google TPU and AWS Inferentia 2.

Of course, we’ll also share some simple examples, including how to easily accelerate Hugging Face models with PyTorch 2 and torch.compile().
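To give a feel for what this looks like in practice, here is a minimal sketch of `torch.compile()` on a small module. It is not the code from the video: `TinyMLP` and `compile_model` are hypothetical names, and the tiny network is only a stand-in for a real Hugging Face model, which is also a regular `torch.nn.Module` and can be wrapped the same way. The demo uses the `"eager"` debug backend so it runs without a compiler toolchain; actual speedups would come from the default TorchInductor backend.

```python
import torch
import torch.nn as nn


class TinyMLP(nn.Module):
    """Hypothetical stand-in for a real model, e.g. a Hugging Face transformer."""

    def __init__(self, dim: int = 16):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 32), nn.GELU(), nn.Linear(32, 4))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


def compile_model(model: nn.Module, backend: str = "inductor") -> nn.Module:
    # torch.compile returns a wrapped module; TorchDynamo captures the graph
    # lazily on the first call and hands it to the chosen backend.
    return torch.compile(model, backend=backend)


torch.manual_seed(0)
model = TinyMLP().eval()
x = torch.randn(8, 16)

with torch.no_grad():
    expected = model(x)  # plain eager execution, used as the reference
    # "eager" only exercises graph capture (no code generation), which keeps
    # this sketch portable; drop the argument to use TorchInductor instead.
    compiled = compile_model(model, backend="eager")
    actual = compiled(x)

print(torch.allclose(expected, actual, atol=1e-5))
```

The compiled module is a drop-in replacement for the original one, so the rest of an inference or training loop does not need to change. Keep in mind that the first call pays the compilation cost, so any timing comparison should exclude it.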
