Edward Amor

Originally published at edwardamor.xyz

Virtual Environments with Python

Like many other programming languages (R, Ruby, Scala, JavaScript), Python comes with its own way of managing the third-party packages you install for your projects. Since Python 3.4, pip has been included by default in all binary installations of Python, allowing users to install packages from the Python Package Index (PyPI), a public repository of open-source licensed packages. However, there is one major shortcoming in the way packages are managed by default: every package gets installed to, and imported from, the same place. To the uninitiated this may not seem like an issue, but it is a disaster waiting to happen.

Without going in depth on the inner workings of package managers and dependency resolution, I’ll paint a simple picture. You’re a hobbyist developer who enjoys scripting monotonous tasks, and your new project involves downloading a bunch of images from a website. You’ve read through some repositories and figure you only need the package foo to get the job done. You go to install foo using pip, but suddenly an error is raised:

ERROR: bar 1.0 has requirement requests==2.24.0, but you'll have requests 2.22.0 which is incompatible.

It appears that bar, a package you installed previously, has a dependency that conflicts with foo: the two require different versions of the requests library. To solve this you could manually go through your package list and remove things, or you could use a virtual environment.
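To make the conflict concrete, here is a minimal Python sketch of the situation, assuming both packages pin requests to an exact version. The REQUIREMENTS table and the find_conflicts function are hypothetical names used for illustration; pip’s real resolver deals with version ranges and is far more involved.

```python
# Hypothetical exact pins declared by two packages.
# foo, bar and the version numbers mirror the scenario above;
# that foo pins requests 2.22.0 is an assumption made for the example.
REQUIREMENTS = {
    "foo": {"requests": "2.22.0"},
    "bar": {"requests": "2.24.0"},
}


def find_conflicts(requirements):
    """Return {dependency: {version: [packages wanting it]}} for clashing pins."""
    wanted = {}
    for package, pins in requirements.items():
        for dependency, version in pins.items():
            wanted.setdefault(dependency, {}).setdefault(version, []).append(package)
    # A dependency conflicts when more than one distinct version is requested.
    return {dep: versions for dep, versions in wanted.items() if len(versions) > 1}


if __name__ == "__main__":
    for dep, versions in find_conflicts(REQUIREMENTS).items():
        print(f"conflict on {dep}: {versions}")
```

With a single shared package location only one version of requests can be installed at a time, so one of the two packages is always left with the wrong one.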

A virtual environment is essentially an isolated sandbox with its own instance of pip and its own set of packages (and their dependencies). This means that instead of installing packages globally, each project can have an isolated environment whose dependencies cannot conflict with those of other projects. There is also no limit on how many you create: however many projects you have, each one can have its own sandbox of packages to work with. The next question you might have is how to get started using them, and honestly there are so many ways that the options seem endless.
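If you’re ever unsure whether a script is running inside one of these sandboxes, a small check like the following can tell you. It is a sketch that relies on how environments created by the standard library behave (sys.prefix differs from sys.base_prefix); the function name is my own, and environments made by other tools may not be detected this way.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a venv-style environment."""
    # Inside a virtual environment sys.prefix points at the environment,
    # while sys.base_prefix still points at the Python it was created from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("packages install under:", sys.prefix)
    print("virtual environment active:", in_virtual_environment())
```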

The Standard Library

The simplest way to start using virtual environments is the venv module that’s available in the standard library. Simply run the following commands inside your project’s directory.

$ python -m venv env
$ source env/bin/activate


Et voilà, you’ve officially created and activated your virtual environment. You’ll know it’s active because your prompt will change, and you can verify it by running pip list. A fresh environment should list only the bootstrap packages, here pip and setuptools.

(env) $ pip list

Package    Version
---------- -------
pip        20.1.1
setuptools 47.1.0

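The same thing can be done from Python itself, since venv is an importable module. The sketch below is only an illustration: create_environment is a hypothetical helper, it builds the environment in a throwaway temporary directory, and it skips installing pip to stay quick.

```python
import tempfile
import venv
from pathlib import Path


def create_environment(project_dir, name="env"):
    """Create a virtual environment called `name` inside `project_dir`."""
    env_dir = Path(project_dir) / name
    # with_pip=False keeps this sketch fast;
    # `python -m venv env` on the command line bootstraps pip by default.
    venv.create(env_dir, with_pip=False)
    return env_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as project:
        env_dir = create_environment(project)
        # pyvenv.cfg marks the directory as a virtual environment.
        print((env_dir / "pyvenv.cfg").read_text())
```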

You’ll also have a new directory, env, in your project. Make sure not to commit it to your version control system. Instead, if you’re not already doing so, keep a requirements.txt in your project: a plain text file that lists the packages your project requires, one per line. This allows you and any collaborators to recreate the environment simply by running pip install -r requirements.txt.

The main disadvantage of this method is that you have to maintain requirements.txt yourself. Typically this means either manually appending packages to the file (if you want a human-readable version) or running pip freeze > requirements.txt (for a more explicit, machine-readable version) every time you install something new. Note that the output of pip freeze includes the exact version number of every package you’ve installed, along with those of their dependencies.
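To get a feel for what pip freeze reports, here is a rough approximation built on importlib.metadata from the standard library (Python 3.8+). It’s a sketch, not pip’s implementation: pinned_requirements is a made-up name, and cases such as editable or VCS installs are not handled.

```python
from importlib import metadata  # standard library since Python 3.8


def pinned_requirements():
    """Return sorted 'name==version' lines for every installed distribution."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries whose metadata has no name
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


if __name__ == "__main__":
    # Prints one pinned requirement per line for the current interpreter.
    print("\n".join(pinned_requirements()))
```

Run inside an active environment, this lists only that environment’s packages, which is exactly why a frozen file from a clean sandbox is so much shorter than one taken from a crowded global install.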

Pipenv

An alternative to the standard library’s venv module is pipenv, which comes from the same mind that created the popular requests library. As the pipenv repository says:

[Pipenv] automatically creates and manages a virtualenv for your projects, as well as adds/removes packages from your Pipfile as you install/uninstall packages. It also generates the ever-important Pipfile.lock, which is used to produce deterministic builds.

It is an amazing tool once you get tired of using venv. Like other projects by Kenneth Reitz, it is made for humans and is immensely simple to use.

The best way to get started with pipenv is on a fresh install of Python, although it is fine if you don’t have one. Simply run pip install pipenv and you’re all set. From then on, instead of using pip to install packages for your projects, you’ll use pipenv install <package-name>.

One thing you’ll notice when using pipenv is that, instead of a requirements.txt, it generates a Pipfile and a Pipfile.lock. Both files are important and should be committed to your version control system. The Pipfile lists your project’s dependencies, whereas the Pipfile.lock pins the exact version of every package and records sha256 hashes of the downloaded files, allowing pip to verify that you’re installing exactly what you intend to. The result is a simple way to get deterministic environments (environments which are exactly the same) without any intervention from you.
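The hash check is easy to picture with a few lines of Python. This is only a sketch of the idea, not pipenv’s or pip’s actual code; sha256_of and matches_lock are hypothetical helpers, and the bytes stand in for a downloaded archive.

```python
import hashlib


def sha256_of(data):
    """Return the sha256 digest of `data` in the 'sha256:<hex>' form used in lock files."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def matches_lock(data, allowed_hashes):
    """Accept a downloaded archive only if its digest is one of the recorded hashes."""
    return sha256_of(data) in allowed_hashes


if __name__ == "__main__":
    archive = b"pretend these bytes are a downloaded wheel"  # hypothetical payload
    recorded = [sha256_of(archive)]  # what a lock file would have stored earlier
    print(matches_lock(archive, recorded))                # same bytes: accepted
    print(matches_lock(archive + b"tampered", recorded))  # changed bytes: rejected
```

Because even a one-byte change produces a different digest, a matching hash is strong evidence that a collaborator is installing the very same file you did.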

Note that in order to run your Python files using your virtual environment, you’ll need to activate it. With pipenv it’s as simple as running pipenv shell while inside your project’s directory.

One disadvantage of pipenv, which it shares with venv, is that you need to already have Python and pip installed on your system; otherwise you won’t be able to use either tool.

Conda

conda is the de facto environment and package manager for Python data scientists for a reason. It’s a platform-agnostic binary (Python doesn’t need to already be on your system) that not only handles package management but also lets you use different versions of Python for different projects. You can download it by going to the Anaconda website and selecting the installer for your platform. Once you have it, you can use the graphical user interface, anaconda-navigator, to access and manage your virtual environments. However, if you’re like me and live in your shell, you’ll most likely want to use the conda command.

Note that there is wonderful documentation on all the configuration you can perform, which I won’t go into here, but I highly recommend you edit your .condarc to your specification.

The main parts of conda that you should get acquainted with are creating environments, installing packages, and generating an environment.yml. The environment.yml is similar to both a requirements.txt and Pipfile.

To create an environment, activate it, and install a package into it:

$ conda create --name my-env # create a new environment
$ conda activate my-env # activate the environment
(my-env) $ conda install jupyter # install jupyter in the environment


One of the most important things to do after installing packages is to create an environment.yml and commit it to version control. You can generate two types. The first, produced by conda env export > environment.yml, is a fully pinned, deterministic version, which isn’t as useful, especially when working across multiple platforms. The second, which is more generally useful, is produced by conda env export --from-history > environment.yml. These files can also be written by hand if you ever need to.
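If you do write one by hand, a minimal file only needs a name, a list of channels, and a list of dependencies. The sketch below assembles that text in Python; render_environment_yml is a hypothetical helper, the defaults channel and the python=3.8 pin are assumptions, and my-env and jupyter are reused from the commands above.

```python
def render_environment_yml(name, dependencies, channels=("defaults",)):
    """Build the text of a minimal environment.yml from plain Python values."""
    lines = [f"name: {name}", "channels:"]
    lines += [f"  - {channel}" for channel in channels]
    lines.append("dependencies:")
    lines += [f"  - {dependency}" for dependency in dependencies]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # python=3.8 is an assumed pin, included only to show the syntax.
    print(render_environment_yml("my-env", ["python=3.8", "jupyter"]), end="")
```

Saved as environment.yml, a file like this can be turned back into an environment with conda env create -f environment.yml.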

One of the biggest advantages of using conda is that it also works for multiple programming languages, not just Python. A short list of the languages it is available for includes R, Ruby, Lua, Scala, Java, JavaScript, C/C++, FORTRAN, and more.

Pyenv & Pyenv-Virtualenv

pyenv is a Python version manager written in shell scripts (available for *nix systems). As the project description says, “It’s simple, unobtrusive, and follows the UNIX tradition of single-purpose tools that do one thing well.” Coupled with pyenv-virtualenv, it lets you create virtual environments for many versions and distributions of Python, including:

  • Python 2
  • Python 3
  • activepython
  • anaconda2
  • anaconda3
  • ironpython
  • jython
  • micropython
  • miniconda3
  • pypy
  • pypy2.7
  • pypy3.6
  • pyston
  • stackless

You’ll notice that anaconda is available, so by using pyenv you’re not limited to just regular Python. pyenv really shines on GNU/Linux because it spares you from installing packages into your system version of Python. Personally, I prefer pyenv because it lets me mix and match and play around with my Python environments without worry. It’s also super simple to install, and to remove if you choose to abandon it.

The simplest way to install it is to use the automatic installer.

$ curl https://pyenv.run | bash
$ exec $SHELL


Once done, you’ll have the pyenv command available, along with some additional plugins. Most importantly, you’ll have pyenv-virtualenv, which allows you to create virtual environments for the Python versions you install.

To get started creating virtual environments, you’ll first need to install a version of Python and then use pyenv virtualenv to create one.

$ pyenv install 3.8.5 # version I want to install
$ pyenv virtualenv 3.8.5 my-env # create the environment
$ pyenv activate my-env # activate the environment


The nicest part of pyenv-virtualenv is that you can set which Python version or environment to use both globally and per directory. For example:

[me@host my-project] $ pyenv local my-env


The above command creates a new file, .python-version, in the project directory. With pyenv-virtualenv’s shell integration enabled, the environment is then activated and deactivated automatically every time you enter or leave the directory. The best part is that it’s so easy.
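Conceptually, the lookup works by searching the current directory and then each parent for a .python-version file. The following is a simplified Python illustration of that search, not pyenv’s actual shell implementation; find_version_file is a name I made up.

```python
import tempfile
from pathlib import Path


def find_version_file(start):
    """Walk from `start` up to the filesystem root looking for .python-version."""
    start = Path(start).resolve()
    for directory in (start, *start.parents):
        candidate = directory / ".python-version"
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        project = Path(root) / "my-project"
        nested = project / "src" / "pkg"
        nested.mkdir(parents=True)
        # Stand-in for what `pyenv local my-env` writes into the project.
        (project / ".python-version").write_text("my-env\n")
        found = find_version_file(nested)
        print(found.read_text().strip())
```

This is why the setting applies to every subdirectory of the project and not just its top level.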

The one disadvantage of pyenv and friends is that you will most likely need to read up on them to get comfortable. The learning curve isn’t steep, and it’s always best to know the inner workings of the tools you use. Note that on some GNU/Linux distributions you will have to install additional dependencies, which are listed in the ‘Common build problems’ section of the wiki.

Recommendations

My recommendation depends on your use case, but above all else, experiment with the options above. I’ve only listed some of the mainstream tools; many other, more obscure options are available.

Platform   Recommendation
---------  --------------------------------------------------------------
Windows    pipenv for 99% of use cases; conda if you’re into data science
Mac OSX    pyenv with pyenv-virtualenv
GNU/Linux  pyenv with pyenv-virtualenv

I hope this was of some help to you and that, moving forward, you start taking environment management seriously. It’s very powerful and will spare you some of the more serious headaches that arise from not using it.
