Mohammad Raziei

Why I Built pip-size: A Story About Obsession with Performance

It Started with a Simple Question

"How fast is it?"

That's the question I always ask when I write a Python package. Not "does it work?" — because obviously it works. The real question is: how fast is it compared to what already exists?

I've been building high-performance Python libraries for years. Libraries like:

yyaml, pygixml, serin, ctoon, novasvg, liburlparser

And the results? In many cases, 20x to 100x faster than the mainstream alternatives.

I have the benchmarks to prove it. I've spent countless hours profiling, optimizing, and benchmarking. I know exactly how fast my code runs.

But there was one question I couldn't answer easily:

"How big is it?"


The Problem Nobody Talks About

When you compare Python packages, everyone talks about:

  • Features
  • API simplicity
  • Community support
  • GitHub stars

But nobody talks about download size. And that's a problem.

Here's why: a package might be "lightweight" in source code, but its dependencies tell a different story.

Let me give you a real example. A few months ago, I was comparing HTTP libraries:

requests==2.33.1  63.4 KB  (total: 620.4 KB)
httpx==0.28.1  71.8 KB  (total: 560.0 KB)
aiohttp==3.13.5  1.7 MB  (total: 2.6 MB)

Each package looks modest on its own. But the totals tell a different story.

Now imagine you're choosing between two libraries:

  • Library A: 50 KB package, but pulls in 500 KB of dependencies
  • Library B: 200 KB package, but zero dependencies

Which one is really "lighter"?

That's the question I wanted to answer. But there was no tool to do it.


The Search for a Solution

I searched for existing tools. I found:

  • pip show — shows installed package size, but only for what's already installed
  • pip download — downloads everything to measure it (wasteful!)
  • Various size calculators — none of them considered the full dependency tree

The problem? You have to download — or even install — the package just to see its size. That's insane!

I wanted to know the size before installing. I wanted to see the full picture — the package plus every dependency, transitively.

So I did what any developer would do: I built it myself.


Introducing pip-size

pip-size calculates the real download size of PyPI packages and their dependencies. Zero downloads. No pip subprocess. Pure PyPI JSON API.

pip install pip-size
pip-size requests
🔍 Resolving 'requests'...
  ✓ requests==2.33.1  →  requests-2.33.1-py3-none-any.whl
    ✓ idna==3.11  →  idna-3.11-py3-none-any.whl
    ✓ certifi==2026.2.25  →  certifi-2026.2.25-py3-none-any.whl
    ✓ charset_normalizer==3.4.7  →  charset_normalizer-3.4.7-py3-none-any.whl
    ✓ urllib3==2.6.3  →  urllib3-2.6.3-py3-none-any.whl
  requests==2.33.1  63.4 KB  (total: 620.4 KB)
  ├── idna==3.11  69.3 KB
  ├── certifi==2026.2.25  150.1 KB
  ├── charset_normalizer==3.4.7  209.0 KB
  └── urllib3==2.6.3  128.5 KB

Now you can see:

  • The package size (63.4 KB)
  • The total size including all dependencies (620.4 KB)
  • The breakdown of each dependency
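
Under the hood, all of this comes from the PyPI JSON API — no downloads involved. As a rough sketch of the idea (my own illustration, not pip-size's actual code), extracting a wheel's download size from that API's response looks like this:

```python
import json
import urllib.request

# Official PyPI JSON API endpoint for a package's latest release
PYPI_JSON = "https://pypi.org/pypi/{name}/json"

def wheel_size(meta: dict) -> int:
    """Return the download size in bytes of the first wheel listed in a
    PyPI JSON API response, falling back to the first file of any type."""
    files = meta.get("urls", [])
    for f in files:
        if f.get("packagetype") == "bdist_wheel":
            return f["size"]
    return files[0]["size"] if files else 0

def fetch_meta(name: str) -> dict:
    """Fetch package metadata from PyPI (requires network access)."""
    with urllib.request.urlopen(PYPI_JSON.format(name=name)) as resp:
        return json.load(resp)
```

Resolving the full tree means repeating this for every entry in the package's dependency metadata, recursively — that part is where the real work is.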

Features

  • Full dependency tree — see every transitive dependency
  • Extras support — check requests[security] or fastapi[standard]
  • JSON output — integrate with scripts
  • Proxy support — for restricted networks
  • Caching — 24-hour cache to avoid repeated requests
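
The JSON output makes the tool scriptable — for example, as a CI size-budget check. Here's a sketch; note that the `--json` flag and the report's field names below are my guesses for illustration, not pip-size's documented interface:

```python
import json

def check_budget(report: dict, budget_kb: float) -> bool:
    """Return True if the reported total download size fits the budget.
    The report shape here is hypothetical."""
    return report["total_bytes"] / 1024 <= budget_kb

# e.g. report = json.loads(subprocess.check_output(["pip-size", "requests", "--json"]))
report = {"package": "requests", "total_bytes": 635_290}  # ~620.4 KB
print(check_budget(report, budget_kb=700))  # fits the budget
print(check_budget(report, budget_kb=500))  # over budget
```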

Why This Matters

When I'm developing high-performance libraries, size matters for several reasons:

1. Deployment

If you're shipping to edge devices, every megabyte counts. A library that claims to be "lightweight" but pulls in 500 MB of dependencies is not lightweight — it's a liability.

2. Cold Starts

In serverless environments (AWS Lambda, Google Cloud Functions), cold start time correlates with package size. Smaller packages = faster cold starts.

3. CI/CD

Smaller packages mean faster pip installs in your CI pipeline. Over hundreds of builds, this adds up.

4. User Trust

As a package maintainer, I want to be transparent about what I'm shipping. If my package is 100 KB but pulls in 50 MB of dependencies, users deserve to know.


The Bigger Picture

Building pip-size made me realize something: we've been comparing packages wrong.

When we see "package X is 50 KB" and "package Y is 200 KB," we assume X is lighter. But that's only half the story.

The real cost of a package is:

package size + size of all dependencies + size of their dependencies + ...

That's what pip-size reveals.
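
In code terms, that recursive sum over a dependency graph looks something like this (a toy sketch with made-up sizes, not pip-size's resolver):

```python
def total_size(pkg, sizes, deps, seen=None):
    """Sum a package's size plus all transitive dependencies,
    counting shared dependencies only once."""
    seen = set() if seen is None else seen
    if pkg in seen:
        return 0
    seen.add(pkg)
    return sizes[pkg] + sum(
        total_size(d, sizes, deps, seen) for d in deps.get(pkg, [])
    )

# Toy graph: A depends on B and C; B also depends on C.
sizes = {"A": 50, "B": 300, "C": 200}
deps = {"A": ["B", "C"], "B": ["C"]}
print(total_size("A", sizes, deps))  # → 550, C is counted once
```

The `seen` set matters: real dependency graphs share nodes heavily, and double-counting them would inflate every total.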


What's Next

I'm continuing to improve pip-size. Some ideas:

  • Compare multiple packages side-by-side
  • Show size trends over time
  • Integrate with dependency security tools
  • Add "size budget" warnings for CI

If you have ideas or want to contribute, the repo is open: github.com/mohammadraziei/pip-size


Final Thoughts

I've spent years optimizing for speed. Now I'm obsessed with size too.

Because at the end of the day, performance isn't just about how fast code runs — it's about how efficiently it reaches your users.


Have you ever been surprised by a package's hidden size? Let me know in the comments!



Top comments (4)

freerave

Brilliant tool and a much-needed mindset shift! But let’s talk about the real elephant in the room: Download Size vs. Uncompressed Disk Footprint.
You mentioned cold starts in serverless environments. Cold starts aren't just bounded by network transfer (the wheel size); they are heavily bounded by disk I/O and memory loading of the uncompressed .pyc and .so files. A lightweight 500KB wheel can easily explode into a 5MB+ installation footprint on disk.
Since you are using the PyPI API, any plans to parse the wheel metadata to estimate the actual installed size? That would be the ultimate game-changer for true performance obsession.

Mohammad Raziei

Thank you so much! Really appreciate the kind words! 🙏

And you're absolutely right — that's a brilliant point! Download size is just one piece of the puzzle. A 500KB wheel can easily explode into 5MB+ on disk once you account for uncompressed .pyc files, .so libraries, and all the metadata.

That's exactly why I made the distinction in the article: download size vs. installed footprint are two very different things.

For pip-size specifically, I have no plans to parse wheel metadata to estimate uncompressed size. Why? Because the whole point of pip-size is zero downloads. I wanted a tool that tells you the size before you install anything — no wheel download, no extraction, nothing. Just metadata from the PyPI JSON API.

So yes, pip-size gives you an estimate based on the wheel file size. It's not perfect, but it's incredibly useful for quick comparisons and catching surprisingly heavy dependencies early.

Now, for the actual disk footprint question — that's a different problem entirely. And you're in luck! I'm actually working on exactly that: pip-browse.

The idea behind pip-browse is to extract and analyze the contents of a .whl file without actually downloading it (using PyPI's file serving). This would let you see exactly what's inside a package: .py files, .so files, data files, etc. — and calculate the true installed size.

It's still in early stages (very much a work in progress), but it's the next step in this whole "know before you install" philosophy.

Would love your feedback if you have time to check it out! And if you think this is useful, please star both repos — it means a lot and helps others find these tools.

Thanks again for reading and for the thoughtful comment! 🙌

freerave

That is exactly the answer I was hoping for! Using PyPI's file serving to peek into the .whl contents without a full download is a brilliant architectural move—I'm assuming you're leveraging HTTP Range requests to read the zip central directory? That's some next-level optimization.

You completely nailed the 'know before you install' philosophy. As someone who spends a lot of time building CLI and developer tools (I maintain a project called DotSuite), I have huge respect for this zero-bloat mindset. We need more of this in the Python ecosystem.

You’ve definitely earned my stars for both repos. I’ll be watching pip-browse closely—if you ever need someone to stress-test it against some heavy Linux environments, hit me up. Keep up the amazing work!

Mohammad Raziei

Thanks! pip-browse isn't ready at the moment, but the roadmap and the algorithm are settled.
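
For the curious: the Range-request idea from the thread — fetching only the tail of a wheel and parsing the zip end-of-central-directory (EOCD) record to find the member listing — can be sketched like this. This is my own illustration of the general zip technique, not pip-browse's actual implementation:

```python
import io
import struct
import zipfile

EOCD_SIG = b"PK\x05\x06"  # end-of-central-directory signature

def central_directory_span(tail: bytes):
    """Given the last bytes of a zip file (e.g. fetched with an HTTP
    'Range: bytes=-65536' request), locate the EOCD record and return
    (offset, size) of the central directory, which lists every member
    with its compressed and uncompressed sizes."""
    i = tail.rfind(EOCD_SIG)
    if i < 0:
        raise ValueError("EOCD not found; fetch a larger tail")
    # EOCD layout after the signature: disk numbers and entry counts
    # (8 bytes), then central directory size (4) and offset (4).
    cd_size, cd_offset = struct.unpack_from("<II", tail, i + 12)
    return cd_offset, cd_size

# Demo on an in-memory zip standing in for a wheel:
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("pkg/mod.py", "x = 1\n")
data = buf.getvalue()
off, size = central_directory_span(data[-1024:])
print(data[off:off + 4] == b"PK\x01\x02")  # central-directory header found
```

A second Range request covering just those `(offset, size)` bytes then yields the full file listing without ever downloading the archive body.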