Jason C. McDonald

Posted on Jan 13, 2019 • Edited on Apr 27, 2022

Dead Simple Python: Virtual Environments and pip

#python #beginners #coding

Like the articles? Buy the book! Dead Simple Python by Jason C. McDonald is available from No Starch Press.

Virtual environments. If you've ever done any meaningful work in Python, you've almost certainly heard of these. Chances are, you've even been told they're non-negotiable. Trouble is, you have no idea what they are, much less how to make them.

For my first several dozen attempts at using virtual environments, I managed to get something horribly wrong. They never worked. I hate to admit it, but I don't even know what I did anymore! Ever since I learned how virtual environments work, I haven't had a single problem with them.

Why Should I Care?

A virtual environment, or virtualenv as it is sometimes called, is a sandbox where you can install only the Python packages you need.

"Yeah, and what's a package?"

Well, Python is rather famous for being "batteries included." Most things "just work" with a simple import statement...

But what if you want something more than the built-in packages? For example, you might want to create a snazzy user interface, and you decide to use PySide2. PySide2, like thousands of other third-party libraries, isn't already built into Python - you have to install it.

Thankfully, installing most third-party libraries is easy! The library authors already bundled the whole library up into a package, which can be installed using a handy little Python tool called pip. (We'll get to that later.)

But this is where it gets tricky. Some packages require other packages to be installed first. Certain packages don't like certain other packages. Meanwhile, you can actually install specific versions of a package, depending on what exactly you need.

Did I mention that some of the applications and operating system components on your system rely on those Python packages?

If you're not careful, you'll end up with this mess...

This is why we have virtual environments! We can create a different little sandbox for each project, install only the packages we want in it, and keep everything neatly organized. Bonus, we never actually change what Python packages are installed on our system, so we avoid breaking important things that have nothing to do with our project.

Getting the Tools

Let's install pip and (if needed by your system) venv. Just for the sake of example, we'll also be installing Python 3. If you already have it, reinstalling doesn't hurt, or you can just skip that part.

Linux

On Linux, your distribution's package repository almost certainly has what you need.

Debian/Ubuntu: sudo apt install python3-venv python3-pip
Fedora: sudo dnf install python3-pip

Mac

On Mac, you can use either Macports or Homebrew to install.

Here's Macports, for Python 3.7. If you want 3.6, change all the instances of 37 to 36 below...



sudo port install python37 py37-pip
sudo port select --set python python37
sudo port select --set pip py37-pip

Here's Homebrew...



brew install python

Windows

On Windows, you'll only need to download and install Python. This should also automatically install pip and venv.

However, if you try to run pip in the command line and it isn't found, you should download get-pip.py onto your Desktop, navigate to that directory on your command line, and run it via python get-pip.py.

Whew! With that out of the way, let's get to the fun part...CREATING virtual environments!

Creating a Virtual Environment

Once again, a virtual environment is like a sandbox, containing only the packages you choose, and ignoring (by default) all the Python packages installed elsewhere on your system. Each virtual environment resides in a dedicated directory. Conventionally, we name this folder venv.

For each project, I typically like to create a dedicated virtual environment inside the project folder. (If you're using Git or another VCS, there's an additional setup step we'll get to in a moment.)

To create the virtual environment, we first change our working directory in our command line to our project folder. (Remember, that's the cd command.) Then, we create the virtual environment and its directory in one step.

On UNIX...



python3 -m venv venv

The last part of that command, venv, is the name of the directory you're creating for the virtual environment. Technically, you can call it whatever you want, but like I mentioned before, venv is the convention.

Note, we're explicitly specifying we want to use python3 here, although we can invoke venv with whichever specific Python executable we want to use (such as python3.6 -m venv venv).

If you look at your working directory, you'll notice that the directory venv/ has been created.
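If you ever need to do the same thing from inside a Python script, the standard library's venv module can be called directly. Here's a minimal sketch of that idea; the helper name create_env is my own invention for illustration, and the demo skips installing pip only to keep things quick.

```python
import os
import tempfile
import venv
from pathlib import Path


def create_env(env_dir, with_pip=True):
    """Create a virtual environment at env_dir, much like `python3 -m venv env_dir`."""
    # The command-line tool uses symlinks everywhere except Windows,
    # so this sketch mirrors that choice.
    venv.create(env_dir, with_pip=with_pip, symlinks=(os.name != "nt"))
    return Path(env_dir)


if __name__ == "__main__":
    # Build a throwaway environment in a temporary project folder.
    with tempfile.TemporaryDirectory() as project:
        env = create_env(Path(project) / "venv", with_pip=False)
        # List what landed in the new venv/ directory.
        print(sorted(entry.name for entry in env.iterdir()))
```

Among the entries you should spot pyvenv.cfg, the small file that marks the folder as a virtual environment.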

Activating a Virtual Environment

Great, so how do we use this thing?

Actually, it's deliciously simple.

On UNIX-like systems (Mac, Linux, etc.), just run...



source venv/bin/activate

On Windows, run...



venv\Scripts\activate.bat

Like magic, you're now using your virtual environment! On UNIX systems, you'll probably see (venv) at the start of your command line prompt, to indicate that you're using a virtual environment named venv.

Naturally, if you named your virtual environment something else, like bob, you'd need to change the activation command accordingly (source bob/bin/activate).

One of the wonderful things about virtual environments on a system with multiple versions of Python is that you no longer have to specify the path in your commands. While the virtual environment is activated, python whatever_your_command_is.py will use the version of Python you selected when you created the venv. Every time.
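If you're ever unsure whether the environment is really active, you can ask Python itself. This is a small sketch, assuming an environment made with venv on Python 3.3 or later (the older virtualenv tool signalled this differently); the function name is just for illustration.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment's folder, while
    # sys.base_prefix still points at the Python it was created from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("In a virtual environment:", in_virtual_environment())
```

Run it once with the environment activated and once after leaving it, and the interpreter path should change accordingly.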

Introducing pip

Most of us have great expectations for Python's package system. (See what I did there? No? Sigh.)

pip is pretty easy to use, and much easier than it used to be in the bad-old-days. It used to be so clunky, in fact, that someone felt they needed to create something called easy_install, but pip is now very much painless to use.

Installing Packages

To install a package, say PySide2, just run...



pip install PySide2

If you want to install a specific version of something, that's easy too.



pip install PySide2==5.11.1

Bonus, you can even use operators like >= ("at least this version, or greater"), and so forth. These are called requirement specifiers. So this...



pip install "PySide2>=5.11.1"

Would install the latest version of PySide2, as long as that version is 5.11.1 or later. (The quotes matter: without them, most shells treat > as output redirection and never pass the version to pip.) This is really helpful if you want to make sure someone actually has access to a minimum version of a package (they might not).
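To get a feel for what "at least this version" means, here's a deliberately simplified sketch of a version check. It is not how pip does it (pip follows the full version rules from PEP 440); the names satisfies and parse_version are made up for this example, and it only understands plain dotted numbers with the same number of parts.

```python
import operator

# Only a handful of the comparison operators that requirement specifiers allow.
OPERATORS = {
    "==": operator.eq,
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
}


def parse_version(text):
    """Turn '5.11.1' into (5, 11, 1) so it compares number by number."""
    return tuple(int(part) for part in text.split("."))


def satisfies(installed, op, required):
    """Check a plain numeric version against one specifier, e.g. '>=' and '5.11.1'."""
    return OPERATORS[op](parse_version(installed), parse_version(required))


if __name__ == "__main__":
    print(satisfies("5.15.2", ">=", "5.11.1"))  # True
    print(satisfies("5.9.0", ">=", "5.11.1"))   # False
    # Comparing the raw strings gets this wrong, because "9" sorts after "1".
    print("5.9.0" >= "5.11.1")                  # True
```

The last line is the reason versions get compared as numbers rather than as text.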

requirements.txt

You can actually save even more time for yourself and others by writing a requirements.txt file for your project. On each line, list the name of the pip package, exactly as you would type it into the install command.

For example, if you had a requirements.txt file like this...



PySide2>=5.11.1
appdirs

...you could install all those packages in one shot with...



pip install -r requirements.txt
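If you're curious how a file like that breaks down, here's a rough sketch that splits each line into a package name and its specifier. It assumes only the simple lines shown above; real requirements files can also hold options, URLs, and environment markers, which this hypothetical read_requirements helper just skips or mishandles.

```python
import re

# Splits "PySide2>=5.11.1" into a name and an optional specifier.
LINE_PATTERN = re.compile(r"^([A-Za-z0-9][A-Za-z0-9._-]*)\s*(.*)$")


def read_requirements(text):
    """Return (name, specifier) pairs from simple requirements.txt content."""
    requirements = []
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        match = LINE_PATTERN.match(line)
        if match:  # lines such as "-r other.txt" do not match and are skipped
            requirements.append((match.group(1), match.group(2).strip()))
    return requirements


if __name__ == "__main__":
    sample = "PySide2>=5.11.1\nappdirs\n"
    for name, specifier in read_requirements(sample):
        print(name, specifier or "(any version)")
```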

Easy, right?

Upgrading Packages

You can update an already installed package using the pip install command and the --upgrade flag. For example, to install the latest version of PySide2, just run...



pip install --upgrade PySide2

You can also upgrade all your required packages at once with...



pip install --upgrade -r requirements.txt
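Before and after an upgrade, it can be handy to check which version you actually have. A small sketch using the standard library's importlib.metadata (Python 3.8 and later); installed_version is a name I picked for illustration.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Reports on whichever environment this interpreter belongs to.
    for name in ("pip", "PySide2"):
        print(name, "->", installed_version(name))
```

Run it with your virtual environment active, and it reports on that environment rather than the system.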

Removing Packages

Removing things is just as easy.



pip uninstall PySide2

Finding Packages

Great, so we can install, upgrade, and remove things. But how do we know what packages pip even has to offer?

There are two ways. The first is to use pip itself to run a search. Say you want a package for web scraping.



pip search web scraping

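That used to give you a whole ton of results to sift through, which was helpful if you simply forgot the name of a package. Be aware that PyPI has since disabled the search API this command relies on, so pip search against PyPI now fails with an error.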

If you want something a little more browsable and informative, PyPI.org is the official Python Package Index.

Last Notes

Once you've installed the packages you need for your virtual environment, you're good to go! The next time you start the virtual environment, those packages will still be there, exactly as you left them, waiting for you.

One Warning About pip...

No matter who tells you to, never, ever ever, EVER use sudo pip on a UNIX system. It will do so many bad things to your system installation, things your system package manager cannot correct, that you will be regretting the decision for the lifetime of your system.

All the problems that sudo pip appears to fix can be solved with virtual environments.

Friends don't let friends sudo pip

Leaving a Virtual Environment

Great, so how do you get out of the virtual environment, and back to reality...er, ahem, the system.

You ready for this, UNIX users?



deactivate

I know! Simple, right?

Of course, things are just slightly more complicated on Windows...



venv\Scripts\deactivate.bat

Eh, still pretty painless. (Remember, like with activation, if you named your virtual environment something else, you'd have to change that line accordingly.)

The Whole She-Bang

So, one last little detail. You may have noticed that most Python files start with something like...



#!/usr/bin/python

First, this is called a "she-bang" (short for haSH-BANG, or #!), and it allows the script to be run without python being tacked onto the beginning of the terminal command.

Second, the above line is very, very wrong. It forces the computer to use a particular system-wide copy of Python, which more or less throws the whole virtual environment thing out the window.

Instead, you should always use the following she-bang for Python3 scripts:



#!/usr/bin/env python3

If you happen to have a script which runs on both Python2 and Python3, use:



#!/usr/bin/env python

(By the way, the rules about python vs. python2, vs. python3 officially come from PEP 394.)
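To see the difference for yourself, here's a tiny sketch of a script that reports which Python is running it. The filename which_python.py is only an example.

```python
#!/usr/bin/env python3
"""which_python.py: report which interpreter is running this script."""
import sys


def describe_interpreter():
    """Return the interpreter path and version as one readable line."""
    version = ".".join(str(part) for part in sys.version_info[:3])
    return f"{sys.executable} (Python {version})"


if __name__ == "__main__":
    print(describe_interpreter())
```

On a UNIX-like system, make it executable with chmod +x which_python.py and run it as ./which_python.py. With a virtual environment activated, the path it prints should point inside that environment.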

Virtual Environments and Git

Remember that warning earlier, about venv if you're using a VCS like Git?

Within a virtual environment's directory are the actual packages you installed with pip. Those would clutter up your repository with big, unnecessary files, and you can't necessarily copy a virtual environment folder from one computer to another and expect it to work anyway.

Thus, we don't want to track these files in our VCS. In Git, you'll have a file called .gitignore in the root directory of your repository. Create or edit that file, and add this line somewhere in it...



venv/

Naturally, if you used a different name for your virtual environment, you'd change that line to match.
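If you set up projects often, you could automate that step. This is a minimal sketch, assuming a .gitignore at the root of the repository; the ensure_ignored helper is hypothetical, and it only checks for an exact matching line.

```python
from pathlib import Path


def ensure_ignored(repo_root, entry="venv/"):
    """Add entry to repo_root/.gitignore unless it is already listed.

    Returns True if the file was changed.
    """
    gitignore = Path(repo_root) / ".gitignore"
    existing = gitignore.read_text() if gitignore.exists() else ""
    if entry in (line.strip() for line in existing.splitlines()):
        return False
    # Keep earlier content intact and give the new entry its own line.
    if existing and not existing.endswith("\n"):
        existing += "\n"
    gitignore.write_text(existing + entry + "\n")
    return True


if __name__ == "__main__":
    changed = ensure_ignored(".")
    print("Updated .gitignore" if changed else "venv/ was already ignored")
```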

Conventionally, every developer who clones your repository will build their own virtual environment, probably using that requirements.txt file you created.

If you're using a different VCS, like Subversion or Mercurial, check the documentation to see how to ignore a directory like venv.

But...

A lot of Python developers will probably frown deeply at you for putting your virtual environment in the repository folder at all. The main disadvantage to my method above is, if you name your virtual environment anything but venv (or whatever you put in your .gitignore), it's going to get committed, and that's bad.

The best habit is actually to keep your virtual environment out of the repository directory altogether. But, if we're honest, most of us actually don't. It just feels more convenient to do it the "wrong" way.

That's why we add venv to the .gitignore. You, or someone else, is probably going to stick the virtual environment in your repository directory, so this just helps prevent some accidental commits.

A Few Extra Tricks

Some of my Python developer friends on Freenode IRC, as well as folks in the comments, pointed out a few extra tricks that will be helpful to virtual environment users.

Before Python 3.3

The venv command works only if you're using Python 3.3 or later. Before that, you'll need a package from pip called virtualenv. See the virtualenv documentation if you need to use that.

If you're on a system with both Python 2 and Python 3, be sure you use python3 -m venv or whatever is appropriate. This trick doesn't work on any version of Python before 3.6.

Using a Virtual Environment Without Activating

You can also use the binaries that are part of the virtual environment without actually activating it. For example, you can execute venv/bin/python to run the virtualenv's own Python instance, or venv/bin/pip to run its instance of pip. It'll actually work the same as if you had activated the virtual environment!

For example, I could do this (assuming my virtual environment is venv)...



venv/bin/pip install pylint
venv/bin/python

>>> import pylint

...and it works! Yet, import pylint will NOT work on the system-wide Python shell still...unless, of course, you installed it on the system. ;)
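The same trick works from a Python script. Here's a sketch that builds a throwaway environment and asks its interpreter which prefix it is using, without activating anything. The helper names are my own, and the Windows path assumes the usual Scripts\python.exe layout.

```python
import os
import subprocess
import tempfile
import venv
from pathlib import Path


def venv_python(env_dir):
    """Return the path to the interpreter inside a virtual environment."""
    env_dir = Path(env_dir)
    if os.name == "nt":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def prefix_seen_by(env_dir):
    """Ask the environment's own Python which prefix it is using."""
    result = subprocess.run(
        [str(venv_python(env_dir)), "-c", "import sys; print(sys.prefix)"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as project:
        env = Path(project) / "venv"
        # pip is skipped only to keep this demonstration fast.
        venv.create(env, with_pip=False, symlinks=(os.name != "nt"))
        print(prefix_seen_by(env))
```

The printed prefix should be the environment's own folder, even though nothing was ever activated.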

The Alternative

I've heard a lot of suggestions for using pipenv, including in the comments section. I won't be covering it in this article, but it has a neat looking workflow, and many vocal fans. You can find out more here: pipenv on PyPI

Wrapping Up

I hope this guide has completely demystified virtual environments and pip for you. Naturally, I recommend you keep the documentation under your pillow:

Chris Warrick also has an excellent article that covers some other facets of virtual environments: Python Virtual Environments in Five Minutes

Comics courtesy XKCD

Thank you to grym, deniska, and ChrisWarrick (Freenode IRC #python) for suggested revisions.

Top comments (39)

Christopher Wrinn • Jan 13 '19 • Edited

Check out pipenv, it so very nicely married pip and venv/virtualenv. I've quite enjoyed that it tracks the requirements for me as I go.

kip • Jan 13 '19 • Edited

If you're gonna start with Python 3.6+, venv module (built in) is my recommendation, of course only if you will work with 3.6+ version and you're not looking for previous versions.

Good start for this good series!

Richard Lenkovits • May 28 '19

Nice one! Could add pip freeze step for requirements.txt creation, it results in better dependency management, as it handles transitive cases too.

Anna R Dunster • Jun 10 '20

What are your thoughts on using Anaconda? I've done a couple online courses and so far they've all started with installing python via anaconda. It has its own package management and virtual environment stuff (or at least its own commands for them).

Jason C. McDonald • Jun 10 '20 • Edited

Honestly, I've never used it. I know Anaconda is a Python distribution platform geared specifically towards data science. If you like it, by all means, enjoy it. :3 But you should know how regular Python works, and then find out what makes Anaconda different. I don't know much about it at all.

Anna R Dunster • Jun 10 '20

Makes sense. Everywhere I've seen commands they look pretty similar. I wouldn't be surprised if conda is using a lot of the same stuff behind the scenes, but I don't actually know. (it is definitely recommended to NOT use pip to install anything if you're using conda, though...or if you absolutely must, do it last.)

I'm not sure how I like it, comparatively, having not done anything with Python without it. Sometimes it seems like more overhead than necessary, though.

Ravinder Kumar • Jan 20 '19

Hello Jason,
I found this article very informative and point-to-point coverage on topics.
Though I still have doubts on she-bang. As you said above if we write !#/usr/bin/env as first line in python script we don't need to call it with 'python' command on terminal. So I tried this by writing 'test.py' on Linux but this script is not getting executed without typing 'python' onto the beginning of the terminal command and it gives error : 'test.py' is not a command..
So I didn't understand the use of she-bang properly.
Are there any exceptions to these?

Jason C. McDonald • Jan 20 '19

That error has nothing to do with Python at all. Running just...

test.py

...is going to search your system environment paths for an executable file called test.py, and I can practically guarantee your project folder isn't part of that environment path.

Thus, you need to run...

./test.py

The . means "current working directory", and then you build the relative path from there.

Michael Learns • Jan 13 '19

Awesome! Quick question tho on sudo pip. I've had encounters before with it.

Why is it wrong,
and what happens if you do do it,
and how do you "undo" such an action?

João Veiga • Jan 13 '19

It's not exactly wrong, it might not do what you are expecting.
People might run pip install requests and fail because of permissions. They might try sudo pip instead of pip install --user.

Sudo pip will install things in the system's python, rather than the user's python.

Jason C. McDonald • Jan 13 '19

Good explanation, @jcsvveiga .

The reason I say you should "never" use sudo pip is, on any system maintained by a dedicated package manager, it can easily conflict with the system-maintained packages, which can make a real mess, especially where dependencies are concerned.

Michael Learns • Jan 13 '19

Ah! That makes sense. So it would be much better then to do pip install --user than sudo?

João Veiga • Jan 13 '19

Yes, or if you want per-directory/project dependencies, use a virtual environment

Tomasz Buszewski • Jan 15 '19

Hi, nice article, but honestly I've abandoned virtualenv in favor of Docker. What are your thoughts about this tool?

Jason C. McDonald • Jan 15 '19 • Edited

Docker has a lot of uses, but merely as a replacement for virtual environments? It's overkill. If a virtualenv is a sandbox, Docker is a clean room. You can use it, naturally, but if the only thing you were needing was a controlled set of Python packages, it's a lot of wasted time and disk space.

Now, if you actually need to control your environment beyond what virtualenv allows you to control, Docker is fine. (It makes sense for web development, for example.)

Just not as a straight virtualenv replacement.

Thomas Junkツ • Dec 28 '19 • Edited

I do not share the point of docker being "overkill" because if one is fluent with it, there would be no "time wasted". And disk space is cheap - as long as you aren't on a Mac (SCNR - But admittedly one of my box is a Mac and I am too guilty of suffering from the lack of a bunch of GBs).

I put it more mildly: It has no advantages over virtual envs if you do not use docker for deployment.

Jason C. McDonald • Jan 20 '20 • Edited

It has no advantages over virtual envs if you do not use docker for deployment.

Well, and see, that's why I said that if it was merely a replacement for virtualenvs, it was overkill. A virtual environment only replicates the bare minimum necessary, in terms of the Python interpreter and libraries you're using.

If you're actually deploying with Docker, that's another story altogether.

First, not everyone has Docker configured on their machine. Second, resources like disk space are not always "cheap" on any system. (Macs aren't the only machines to suffer from low disk space.) Internet may be limited, slow, or expensive, making a Docker build prohibitive. Time may be of the essence, RAM might be limited, or CPU might be in demand. There are so many scenarios in which Docker is just not going to work.

Virtual environments are the one canonical "works everywhere" solution for sandboxing Python. Anyone and everyone who can run Python 3 (and most versions of 2) can create a virtual environment. They require very little overhead, generally minimal network time, and nothing in the way of extra processing power. Ironically, they're one of the fastest and most reliable ways to deploy a Python project in a Docker!

To this aim, your project should be configured properly so that it can be run in a virtual environment. (It doesn't take that much work; certainly less work than a Dockerfile does.)

The third problem is, if the Python project in question has anything to do with GUI or the like, you're going to be beyond the "just start a Docker container" situation. You'll have to configure and work with VNC, which for testing a local application is generally impractical. The same could be said of working with local filesystems (you have to plan what to access and mount it), system integration features, and the list goes on.

Docker certainly has a purpose in some Python development situations, but it should be considered a separate tool altogether, and never as a "replacement" for virtual environments.

Tomasz Buszewski • Jan 20 '20

Hey Jason, thanks for the great write-up! It certainly solves a lot of questions for me. But one is still there – what to do when I have databases and I don't want to have pq, mysql and mongo installed on my system in various versions?

Jason C. McDonald • Jan 20 '20 • Edited

Assuming SQLite is not an option, this is exactly the sort of scenario where Docker is a good solution: you aren't just needing to sandbox the Python environment itself (virtual environment), but the actual system environment.

Virtual environments really become extra helpful here, too, because it simplifies your Dockerfiles. Assuming your Python project has a requirements.txt file, you'd only need the following:

RUN python3 -m venv venv && \
    venv/bin/pip install -r requirements.txt

You'll note, I had no need to "activate" the virtual environment. You can just use the venv's binaries directly; they know to use the virtual environment.

Tomasz Buszewski • Jan 15 '19

Thanks for your reply, Jason.

I enjoy using Docker, mostly because I always have a database or something similar standing next to the actual application, plus deploying Docker is super easy, but perhaps I should try using venv more. Thank you :)

Thomas Junkツ • Dec 28 '19

Both docker and virtualenvs try to solve the same problem: separating the system side of things from the user side of things. In case of virtual envs you are getting basically a bit of "shell magic" where paths are tweaked in such a way that things work as if you were interfering with the system's python, but you aren't. That makes it impossible to have interfering installations of python packages and versions.

Docker does the same on a more fundamental level:
Docker divides the kernel-land from the user-land. So it behaves more or less like a virtual environment; but instead of relying on "shell magic" it relies on "kernel magic".

From that point of view there is no real difference or advantage of using one over the other. It totally depends on your deployment model. If your deployment involves docker, it would make more sense to use docker from the first line of code written - and it would be a minimal smoketest for what you are doing anyways. If your deployment model doesn't involve docker, I see no advantage of using docker over a virtual environment.

Jaime Lopez A.K.A. Tacoder • Jan 22 '19

Great article Jason, i love how easy is to understand virtualenvs using this.
Recently i came across a problem trying to teach a colleague how to set up his own virtualenv to simulate what i had done with mine since we were trying to mimic the entire environment of mine we faced a struggle and ended up failing :(

A day later we came across pipenv, which reproduced the exact environment for us, and problem solved... now i have to ask what's your stance on this virtualenv vs pipenv thing that seems to be happening around the python world?

and also are you planning to add it to your dead simple series? i will love to use it to teach my colleagues at college, i love the style of your writing (fan girl screaming on the background).

greetings and lots of tacos for you.

Jason C. McDonald • Jan 22 '19

Honestly, I never planned on adding pipenv to this, but I may have to cave and add it anyway! Everyone keeps asking. :P

So glad you enjoyed it!

leob • Jan 14 '19

Just started with Python a week ago and the first advice I came across: pipenv! Combine venv, pip and whatnot in one simple command. It seems to be "the new standard" and until now seems to work flawlessly.

Chris Lozac'h • Jan 14 '19

This is excellent! I did something wrong awhile back and ended up in a situation that makes that second xkcd look very familiar. 😬

(Speaking of xkcd, I don't see any attribution for the comics. While I'm not exactly sure what legal attribution compliance with the license looks like, a link to the source material seems like an appropriate minimum.)

This has me excited to try python again, this time following your clear and concise instructions, and taking advantage of venv! Thank you!

Jason C. McDonald • Jan 14 '19 • Edited

So glad you liked the article!

I went ahead and added an attribution at the bottom, thanks. XKCD is so widespread at this point, they get embedded rather commonly anymore (so much so, that Randall just hands out embed links, which I used). That's why I overlooked attribution. It's polite, anyhow. ;)
