Managing dependencies is deceptively hard. Need proof? Talk to anyone who has to manage a package.json
in JavaScript. I'm sure they'll have stories.
Python is not immune to this hard problem. For years, the community rallied around the requirements.txt
file to manage dependencies, but there are some subtle flaws that make dependency handling more confusing than necessary. To fix these issues, the Python Packaging Authority, which is the group responsible for many things including pip
and PyPI, proposed a replacement for requirements.txt
called a Pipfile
. We're going to look at the two file formats to see why a Pipfile
is a better fit for the community in the future and how you can get started using one.
requirements.txt
Let's look at requirements.txt
to see where the flaws are.
A requirements.txt
file has a very primitive structure. Here's a sample file from the handroll project that I work on.
Jinja2==2.8
Markdown==2.4
MarkupSafe==0.23
PyYAML==3.11
Pygments==2.1.3
Werkzeug==0.11.4
argh==0.26.1
argparse==1.2.1
blinker==1.4
docutils==0.12
mock==1.0.1
pathtools==0.1.2
textile==2.2.2
watchdog==0.8.3
The core requirement is that each line in the file specifies one dependency.
The example adds a version specifier for each package even though that is not required. The file could have said Jinja2
instead of Jinja2==2.8
. In that small detail, we can begin to see weaknesses in the structure. Which is more correct, to specify versions or not? It depends.
Specifying the version of a package is called pinning. Files that pin versions for every dependency make it possible to reproduce the environment. This quality is very valuable for operating in a production scenario.
What's the downside? It's very hard to determine which packages are the direct dependencies. For instance, handroll directly uses Jinja2
, but MarkupSafe
is only listed because it is a dependency of a dependency. Jinja2
depends on MarkupSafe
. Thus, MarkupSafe
is a transitive dependency of handroll.
The reason to include the transitive dependency comes back to reproducing the environment. If we only listed Jinja2
, it's possible for an updated version of MarkupSafe
to be installed that could break handroll. That leads to a bad user experience.
We've reached the core problem of the older format: requirements.txt
is attempting to be two views of dependencies.
- A pinned
requirements.txt
acts as a manifest to reproduce the operating environment. - An unpinned
requirements.txt
acts as the logical list of dependencies that a package depends on.
There is also a secondary problem related to the audience. If I'm a user of handroll, I only care about the dependencies that make the tool work. If I'm a developer for handroll, I also would like the tools needed for development (e.g., a linter, translation tools, upload tools for PyPI).
At this stage, conventions begin to break down in the community. Some projects use a requirements-dev.txt
file for developer-only dependencies. Others opt for a requirements
directory that contain many different files of dependencies. Both are imperfect solutions.
We're now positioned to consider what a Pipfile
brings to the problem.
Pipfile
A Pipfile
handles the problems that requirements.txt
does not. It is important to note that a Pipfile
is not a novel creation. Pipfile is a Python implementation of a system that appears in Ruby, Rust, PHP, and JavaScript. Bundler, Cargo, Composer, and Yarn are tools from each of those languages that follow a similar pattern. What traits do these systems have in common?
- Split logical dependencies and a dependency manifest into separate files.
- Separate the sections for user and developer dependencies.
Pipfile
and Pipfile.lock
The Pipfile
manages the logical dependencies of a project. When I write "logical," I'm referring to the dependencies that a project directly depends on in its code. One way to think about the logical dependencies is as the set of dependencies excluding the transitive dependencies.
Conversely, a Pipfile.lock
is the set of dependencies including the transitive dependencies. This file acts as the dependency manifest to use when building an environment for a production setting.
The
Pipfile
is for people. ThePipfile.lock
is for computers.
Having a clear distinction between files offers a couple of benefits.
- People can read and reason about the
Pipfile
. There is no need to guess if a dependency is a direct dependency of a project. - Extra metadata can be stored in the
Pipfile.lock
. The metadata can include things likesha256
checksums that help verify the integrity of a package's content.
Users and developers
The other trait of a Pipfile
is the split between user and developer dependencies. Let's look at the Pipfile
for pytest-tap, a project that I converted recently to the Pipfile
format.
[[source]]
url = "https://pypi.python.org/simple"
verify_ssl = true
[dev-packages]
babel = "*"
flake8 = "*"
mock = "*"
requests = "*"
tox = "*"
twine = "*"
[packages]
pytest = "*"
"tap.py" = "*"
Because Pipfile
uses TOML, it can include sections when a requirements.txt
file could not. The sections give a clear delineation between user packages and developer packages.
pytest-tap is pytest plugin that produces Test Anything Protocol (TAP) output. It is a natural fit to depend on pytest
and tap.py
, a TAP library.
The other dependencies do developer specific things. tox
and mock
help with test execution, twine
is for uploading the package to PyPI, and so on.
I hope that you could have an intuition about pytest-tap dependencies even without my prose descriptions. Additionally, splitting things out permits regular users to skip installing extra packages. That's the power of a Pipfile
.
pipenv
Now that we've covered the benefits, how do you create a Pipfile
for your own project? Enter pipenv.
Kenneth Reitz, of requests
fame, created pipenv, a tool to manage a Pipfile
. pipenv helps users add and remove packages from their Pipfile
(and Pipfile.lock
) in conjunction with a virtual environment.
Rather than manipulating a virtual environment and pip directly, you use the pipenv
command, and it will do the work for you. If you come from the Ruby world, this is very similar to bundle
.
Suppose you have a project that depends on Django. You could prepare your Django project with these commands:
$ pipenv --three
$ pipenv install Django
$ pipenv lock
Those steps would:
- create a Python 3 virtual environment
- install Django and add it to a
Pipfile
- generate a
Pipfile.lock
Once the files are created, you can share your work, and others should be able to recreate your environment.
Summary
Pipfile
is still an emerging standard. In spite of that, it is very promising and solves some problems that arise when working with packages. We saw how Pipfile
beats out the venerable requirements.txt
file, and we're equipped with pipenv to make Pipfile
s for our projects.
I hope you learned something about Python dependencies and the brighter future that is accessible today.
This article first appeared on mattlayman.com.
Top comments (0)