# You Think You Know What You're Installing
When someone says "just install PyTorch," you probably think "how bad can it be?" It's a deep learning library, right? A few hundred megabytes, maybe?
Think again.
I built pip-size to expose the hidden cost of Python packages. And what I found in the AI ecosystem is... shocking.
## The Numbers Don't Lie
I ran pip-size on the most popular AI frameworks. Here are the results:
| Framework | Package Size | Total (with deps) |
|---|---|---|
| torch | 506.0 MB | 2.5 GB 🤯 |
| tensorflow | 545.9 MB | 611.9 MB |
| paddlepaddle | 185.8 MB | 212.1 MB |
| jax | 3.0 MB | 137.1 MB |
| onnxruntime | 16.4 MB | 39.5 MB |
| transformers | 9.8 MB | 38.4 MB |
| keras | 1.6 MB | 29.5 MB |
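Where do numbers like these come from? Here is a minimal sketch of the idea using only the standard library's `importlib.metadata`; the `package_size_mb` helper is my own illustration, not pip-size's actual implementation:

```python
from importlib.metadata import PackageNotFoundError, distribution

def package_size_mb(name: str) -> float:
    """Sum the on-disk size of every file recorded for an installed package.

    Illustrative sketch only; not how pip-size itself is implemented.
    """
    try:
        files = distribution(name).files or []
    except PackageNotFoundError:
        return 0.0  # package not installed in this environment
    total = 0
    for f in files:
        path = f.locate()
        if path.exists():  # the RECORD metadata can list files that no longer exist
            total += path.stat().st_size
    return total / 1e6

print(f"pip itself: {package_size_mb('pip'):.1f} MB")
```

This measures only the package's own files; the "Total (with deps)" column additionally walks the dependency tree and sums every transitive dependency.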
## The PyTorch Surprise
Here's what happens when you `pip install torch`:

```
torch==2.11.0 506.0 MB (total: 2.5 GB)
├── nvidia-cudnn-cu13==9.19.0.56 349.1 MB
├── nvidia-cublas==13.3.0.5 384.6 MB [extra: cublas]
├── nvidia-nccl-cu13==2.28.9 187.4 MB
├── triton==3.6.0 179.5 MB
├── nvidia-cusparse==12.7.9.17 143.9 MB [extra: cusparse]
├── nvidia-cusparselt-cu13==0.8.0 162.0 MB
├── nvidia-curand==10.4.2.51 57.1 MB [extra: curand]
├── nvidia-cusolver==12.1.0.51 192.4 MB [extra: cusolver]
└── ... (more CUDA libs)
```
2.5 GB. For a "simple" deep learning library.
The package itself is 506 MB, but CUDA dependencies add another ~2 GB. This is why your Docker images are huge. This is why your CI takes forever. This is why you need a 100 GB disk just to do machine learning.
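As a sanity check on that ~2 GB figure, the GPU/CUDA dependencies shown in the tree above already account for most of it (the truncated "..." entries make up the rest):

```python
# Sizes in MB of the GPU/CUDA dependencies listed in the torch tree above
cuda_deps_mb = [
    349.1,  # nvidia-cudnn-cu13
    384.6,  # nvidia-cublas
    187.4,  # nvidia-nccl-cu13
    179.5,  # triton
    143.9,  # nvidia-cusparse
    162.0,  # nvidia-cusparselt-cu13
    57.1,   # nvidia-curand
    192.4,  # nvidia-cusolver
]
print(f"listed GPU deps alone: {sum(cuda_deps_mb) / 1000:.2f} GB")  # -> 1.66 GB
```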
## TensorFlow: The Heavy Champion
TensorFlow's own wheel is actually larger than PyTorch's, though its dependency tree is far lighter:
```
tensorflow==2.21.0 545.9 MB (total: 611.9 MB)
├── keras==3.14.0 1.6 MB (total: 8.3 MB)
│   └── ml-dtypes==0.5.4 4.8 MB
├── numpy==2.4.4 16.1 MB
├── h5py==3.14.0 4.3 MB
└── grpcio==1.80.0 6.5 MB
```
612 MB total. Keras helps (it's now bundled), but TensorFlow still brings a lot of baggage.
## JAX: The Lightweight Contender?
JAX looks small at first glance — just 3 MB! But look closer:
```
jax==0.9.2 3.0 MB (total: 137.1 MB)
├── jaxlib==0.9.2 79.4 MB
├── scipy==1.17.1 33.7 MB
└── numpy==2.4.4 16.1 MB
```
137 MB when you count everything. Still smaller than PyTorch and TensorFlow, but not "lightweight" by any means.
## The Hidden Gems
### ONNX Runtime: Only 39.5 MB
If you're deploying models and don't need the full training stack, ONNX Runtime is surprisingly compact:
```
onnxruntime==1.24.4 16.4 MB (total: 39.5 MB)
├── numpy==2.4.4 16.1 MB
└── sympy==1.14.0 6.0 MB
```
That's over 60x smaller than PyTorch's full install. For inference, this is a game-changer.
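Taking the table's numbers at face value (decimal MB and GB), the ratio works out to:

```python
torch_total_mb = 2500.0  # ~2.5 GB total, from the table above
onnx_total_mb = 39.5
print(f"{torch_total_mb / onnx_total_mb:.0f}x")  # -> 63x
```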
### Keras: Just 29.5 MB
Keras (the standalone version, not bundled with TensorFlow) is the lightest option:
```
keras==3.14.0 1.6 MB (total: 29.5 MB)
├── numpy==2.4.4 16.1 MB
├── h5py==3.16.0 4.8 MB
└── ml-dtypes==0.5.4 4.8 MB
```
Perfect for when you want something simple without the enterprise overhead.
## What This Means for You
### 1. Docker Images
If you're shipping PyTorch in a Docker image, plan for at least 3 GB. TensorFlow? 700 MB. ONNX Runtime? 50 MB.
Choose wisely based on your deployment constraints.
### 2. CI/CD
Every `pip install torch` in your CI pipeline costs time and bandwidth. Consider:
- Caching wheels
- Using lighter alternatives for testing
- Installing only what's needed
### 3. Local Development
That "quick experiment" with PyTorch? It's 2.5 GB. Maybe JAX at 137 MB is enough for your use case.
## Conclusion
The AI ecosystem is massive, literally. Before you `pip install` your next ML library, know what you're getting into.
Use pip-size to see the full picture:

```bash
pip install pip-size
pip-size torch
pip-size tensorflow
pip-size jax
```
Your disk space will thank you.
Links:
- GitHub: [github.com/mohammadraziei/pip-size](https://github.com/mohammadraziei/pip-size)
- PyPI: [pypi.org/project/pip-size](https://pypi.org/project/pip-size)
What's the biggest package surprise you've encountered? Let me know in the comments!