# You Think You Know What You're Installing
When someone says "just install PyTorch," you probably think "how bad can it be?" It's a deep learning library, right? A few hundred megabytes, maybe?
Think again.
I built pip-size to expose the hidden cost of Python packages. And what I found in the AI ecosystem is... shocking.
## The Numbers Don't Lie
I ran pip-size on the most popular AI frameworks. Here are the results:
| Framework | Package Size | Total (with deps) |
|---|---|---|
| torch | 506.0 MB | 2.5 GB 🤯 |
| tensorflow | 545.9 MB | 611.9 MB |
| paddlepaddle | 185.8 MB | 212.1 MB |
| jax | 3.0 MB | 137.1 MB |
| onnxruntime | 16.4 MB | 39.5 MB |
| transformers | 9.8 MB | 38.4 MB |
| keras | 1.6 MB | 29.5 MB |
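Where do numbers like these come from? Here is a minimal sketch of the idea using only the standard library's `importlib.metadata`; the `package_size_mb` helper is my own illustration, not pip-size's actual implementation:

```python
from importlib.metadata import PackageNotFoundError, distribution

def package_size_mb(name: str) -> float:
    """Sum the on-disk size of every file recorded for an installed package.

    Illustrative sketch only; not how pip-size itself is implemented.
    """
    try:
        files = distribution(name).files or []
    except PackageNotFoundError:
        return 0.0  # package not installed in this environment
    total = 0
    for f in files:
        path = f.locate()
        if path.exists():  # the RECORD metadata can list files that no longer exist
            total += path.stat().st_size
    return total / 1e6

print(f"pip itself: {package_size_mb('pip'):.1f} MB")
```

This measures only the package's own files; the "Total (with deps)" column additionally walks the dependency tree and sums every transitive dependency.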
## The PyTorch Surprise
Here's what happens when you `pip install torch`:

```
torch==2.11.0 506.0 MB (total: 2.5 GB)
├── nvidia-cudnn-cu13==9.19.0.56 349.1 MB
├── nvidia-cublas==13.3.0.5 384.6 MB [extra: cublas]
├── nvidia-nccl-cu13==2.28.9 187.4 MB
├── triton==3.6.0 179.5 MB
├── nvidia-cusparse==12.7.9.17 143.9 MB [extra: cusparse]
├── nvidia-cusparselt-cu13==0.8.0 162.0 MB
├── nvidia-curand==10.4.2.51 57.1 MB [extra: curand]
├── nvidia-cusolver==12.1.0.51 192.4 MB [extra: cusolver]
└── ... (more CUDA libs)
```
2.5 GB. For a "simple" deep learning library.
The package itself is 506 MB, but CUDA dependencies add another ~2 GB. This is why your Docker images are huge. This is why your CI takes forever. This is why you need a 100 GB disk just to do machine learning.
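As a sanity check on that ~2 GB figure, the GPU/CUDA dependencies shown in the tree above already account for most of it (the truncated "..." entries make up the rest):

```python
# Sizes in MB of the GPU/CUDA dependencies listed in the torch tree above
cuda_deps_mb = [
    349.1,  # nvidia-cudnn-cu13
    384.6,  # nvidia-cublas
    187.4,  # nvidia-nccl-cu13
    179.5,  # triton
    143.9,  # nvidia-cusparse
    162.0,  # nvidia-cusparselt-cu13
    57.1,   # nvidia-curand
    192.4,  # nvidia-cusolver
]
print(f"listed GPU deps alone: {sum(cuda_deps_mb) / 1000:.2f} GB")  # -> 1.66 GB
```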
## TensorFlow: The Heavy Champion
TensorFlow's own wheel is actually larger than PyTorch's, though its dependency tree is far lighter:
```
tensorflow==2.21.0 545.9 MB (total: 611.9 MB)
├── keras==3.14.0 1.6 MB (total: 8.3 MB)
│   └── ml-dtypes==0.5.4 4.8 MB
├── numpy==2.4.4 16.1 MB
├── h5py==3.14.0 4.3 MB
└── grpcio==1.80.0 6.5 MB
```
612 MB total. Keras helps (it's now bundled), but TensorFlow still brings a lot of baggage.
## JAX: The Lightweight Contender?
JAX looks small at first glance — just 3 MB! But look closer:
```
jax==0.9.2 3.0 MB (total: 137.1 MB)
├── jaxlib==0.9.2 79.4 MB
├── scipy==1.17.1 33.7 MB
└── numpy==2.4.4 16.1 MB
```
137 MB when you count everything. Still smaller than PyTorch and TensorFlow, but not "lightweight" by any means.
## The Hidden Gems
### ONNX Runtime: Only 39.5 MB
If you're deploying models and don't need the full training stack, ONNX Runtime is surprisingly compact:
```
onnxruntime==1.24.4 16.4 MB (total: 39.5 MB)
├── numpy==2.4.4 16.1 MB
└── sympy==1.14.0 6.0 MB
```
That's over 60x smaller than PyTorch's full install. For inference, this is a game-changer.
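Taking the table's numbers at face value (decimal MB and GB), the ratio works out to:

```python
torch_total_mb = 2500.0  # ~2.5 GB total, from the table above
onnx_total_mb = 39.5
print(f"{torch_total_mb / onnx_total_mb:.0f}x")  # -> 63x
```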
### Keras: Just 29.5 MB
Keras (the standalone version, not bundled with TensorFlow) is the lightest option:
```
keras==3.14.0 1.6 MB (total: 29.5 MB)
├── numpy==2.4.4 16.1 MB
├── h5py==3.16.0 4.8 MB
└── ml-dtypes==0.5.4 4.8 MB
```
Perfect for when you want something simple without the enterprise overhead.
## What This Means for You
### 1. Docker Images
If you're shipping PyTorch in a Docker image, plan for at least 3 GB. TensorFlow? 700 MB. ONNX Runtime? 50 MB.
Choose wisely based on your deployment constraints.
### 2. CI/CD
Every `pip install torch` in your CI pipeline costs time and bandwidth. Consider:
- Caching wheels
- Using lighter alternatives for testing
- Installing only what's needed
### 3. Local Development
That "quick experiment" with PyTorch? It's 2.5 GB. Maybe JAX at 137 MB is enough for your use case.
## Conclusion
The AI ecosystem is massive, literally. Before you `pip install` your next ML library, know what you're getting into.
Use pip-size to see the full picture:

```bash
pip install pip-size
pip-size torch
pip-size tensorflow
pip-size jax
```
Your disk space will thank you.
Links:
- GitHub: [github.com/mohammadraziei/pip-size](https://github.com/mohammadraziei/pip-size)
- PyPI: [pypi.org/project/pip-size](https://pypi.org/project/pip-size)
What's the biggest package surprise you've encountered? Let me know in the comments!