
dmikhr


Extend Python VENV: Organize Dependencies Your Way

Introduction

Virtual environments are a great way to organise the development process by isolating project-specific packages. Python ships with a built-in tool for creating them: venv. In this tutorial we are going to explore how to extend its functionality by implementing a feature that stores production and development packages in separate requirements files.

This tutorial will guide you through implementing such a feature, explaining the logic behind the solution and the purpose of each function in the script. Prerequisites are basic knowledge of Python and some experience with virtual environments. The examples in this tutorial require Python 3.5+.

Tools

There are different ways to split dependencies in Python. Poetry provides versatile options for configuring a Python app, including separate sets of dependencies for production, development and testing. It also offers a lot of other useful features for Python developers.

But despite this powerful functionality, there might be reasons to use simpler alternatives like venv. If you are working on a complex project, planning to publish it to a Python package repository like PyPI, or the project is historically based on Poetry, then staying with Poetry makes sense.

Other alternatives worth mentioning are virtualenv and pipenv. In data science and scientific computing, conda is also quite popular. While these tools provide options to customize your virtual environment, that's not always needed.

Simpler alternatives

If you are working on a pet project or a simple service, or just don't want to complicate the code with extra third-party tools, then venv can be an ideal solution. It's part of the Python standard library and has a limited set of commands, which makes it easy to learn. In the simplest scenario there is hardly anything to learn at all. Just navigate to the project directory and create a virtual environment:

python -m venv .venv

This calls the Python module venv and creates a virtual environment in the .venv directory under the current directory.
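Since venv is an ordinary standard-library module, the same step can also be done from Python code. Here's a minimal sketch; the helper name create_env is made up for illustration and is not part of venv:

```python
import os
import venv


def create_env(env_dir: str = ".venv", with_pip: bool = True) -> None:
    """Create a virtual environment, roughly what `python -m venv env_dir` does."""
    venv.create(
        env_dir,
        with_pip=with_pip,           # install pip into the new environment
        symlinks=(os.name != "nt"),  # mirror the command-line default
    )


# Example call (commented out so the sketch has no side effects):
# create_env(".venv")
```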

Other common names for virtual environment folders are venv, env and .env. This folder now contains the Python executables and will store installed packages. Let's install some packages. But before we can do that, the virtual environment must be activated:

source .venv/bin/activate

The command above is for UNIX-like shells; on Windows the activation script is .venv\Scripts\activate. When you no longer need the virtual environment, deactivate it by simply typing deactivate.

In our hypothetical scenario let's assume that we are building a web app with Flask. But before that, let's check that we are really in the virtual environment by calling which python. It prints the path to the Python executable currently in use. If the path leads into the current directory and ends with .venv/bin/python, then the environment is activated and we are good to go.
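The same check can be made from inside Python, which also works on systems where which isn't available. A small sketch (the function name in_virtualenv is just an illustrative choice):

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-created environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
print("interpreter:", sys.executable)
```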

Installing Flask

pip install Flask

Packages, especially complex ones like Flask, don't exist in isolation: they depend on other Python packages that are installed alongside them. To verify this, call pip freeze. It shows a list of the packages installed in our virtual environment with their corresponding versions.

A good practice is to maintain a text file, requirements.txt, with the relevant dependencies so that other developers can set up a working environment by installing all required packages with pip install -r requirements.txt.
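To get a feeling for where this list comes from, here is a rough approximation of pip freeze in Python. It's only a sketch: it relies on importlib.metadata, which requires Python 3.8+, the function name freeze_like is made up, and the real pip freeze does more (for example, it handles editable installs and by default skips a few packaging tools such as pip itself), so the output may differ.

```python
from importlib import metadata  # standard library since Python 3.8


def freeze_like() -> list:
    """Return 'name==version' strings for installed distributions, sorted by name."""
    lines = []
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip broken installs that carry no name
            lines.append("{}=={}".format(name, dist.version))
    # set() drops duplicates that appear when a package is visible on several paths
    return sorted(set(lines), key=str.lower)


for line in freeze_like():
    print(line)
```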

Creating requirements.txt

pip freeze > requirements.txt

Apart from the packages required for running an app, there is usually a need for tools used in development and testing. For this example we choose pytest for testing, and for keeping the code style in accordance with PEP 8 let's install flake8 (for manual checks) and autopep8, which will format our code automatically.

pip install pytest flake8 autopep8

Type pip freeze to see that there are now more packages. Finally, save the list to the file: pip freeze > requirements.txt

Keeping dependencies separate

Here we come to the point where venv's simplicity causes a bit of inconvenience. The currently installed packages can be divided into two groups: packages related to Flask, which are necessary for running the app, and packages used only for development and testing. It's redundant to install the latter into a production environment.

The solution is to maintain two separate files: one for app packages (requirements.txt) and another for development packages (requirements-dev.txt). In this case testing packages are also stored in the -dev file, as is done, for example, in Flasgger. It's possible to split dependencies manually by removing them from the original requirements.txt and saving them into requirements-dev.txt.

However, whenever new packages are installed this process has to be repeated. At this point we have two options: either move to a more advanced third-party virtual environment management tool or automate the process.

Automation

Basically, the process consists of two steps: save the current package list into a file with pip freeze > requirements.txt, and then have a script sort the packages, keeping app-related packages in requirements.txt while moving development packages to its development counterpart.

This sequence can be executed manually from the terminal, but on UNIX-based operating systems, including macOS, there is a more convenient solution: the make utility. Basic usage is make command-name. When called, the utility searches for a file named Makefile and executes the sequence of commands listed under the command-name section.

Let's create our Makefile and add a command to it. The command lines under freeze: must be indented with a tab character, not spaces:

freeze:
	pip freeze > requirements.txt
	python -m split_dependencies

Calling make freeze will save dependencies into requirements.txt and call the Python script to split the dependencies between separate files. For it to run correctly, let's first create the basis for the future script. Create a file split_dependencies.py and put the following code into it:

if __name__ == "__main__":
    pass

Now run make freeze. If everything has been done right, the terminal will show the list of executed commands:

pip freeze > requirements.txt
python -m split_dependencies
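make isn't available everywhere (a stock Windows install, for example), so the first step can also be approximated in Python. This is only a sketch: the helper freeze_to_file is hypothetical, and its freeze_cmd parameter exists just so the command can be swapped out.

```python
import subprocess
import sys


def freeze_to_file(fname: str = "requirements.txt", freeze_cmd=None) -> None:
    """Run a freeze command and store its output, like `pip freeze > fname`."""
    if freeze_cmd is None:
        # Use the pip that belongs to the running interpreter.
        freeze_cmd = [sys.executable, "-m", "pip", "freeze"]
    result = subprocess.run(freeze_cmd, stdout=subprocess.PIPE,
                            universal_newlines=True, check=True)
    with open(fname, "w") as f:
        f.write(result.stdout)
```

After calling freeze_to_file(), the splitting script can be run as before with python -m split_dependencies.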

At this point the working directory has the following file structure:

.
├── .venv
├── Makefile
├── requirements.txt
└── split_dependencies.py

Script architecture

It's time to start developing the script. But before coding, let's think about what our script should do. It's going to consist of multiple functions and a main function run() that will be responsible for calling the others.

It's good practice to follow the single responsibility principle when writing functions, so every function will have only one task. For example, the list of packages is loaded from a file by one function (load_requirements), while stripping the newline characters from the data is left to another. To obtain the cleaned list of packages, these functions are called in sequence, one passing data to the next, forming a data pipeline.

Here is the list of functions that will be used in the script, with their interfaces, grouped by responsibility:
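To see why a separate cleaning step is needed at all, here is a tiny demonstration with an in-memory stand-in for requirements.txt (the package versions are made up):

```python
import io

# An in-memory stand-in for requirements.txt; package versions are made up.
fake_file = io.StringIO("Flask==2.2.3\npytest==7.2.2\n")

raw = fake_file.readlines()               # each item still ends with "\n"
cleaned = [line.strip() for line in raw]  # the cleaning step removes it

print(raw)      # ['Flask==2.2.3\n', 'pytest==7.2.2\n']
print(cleaned)  # ['Flask==2.2.3', 'pytest==7.2.2']
```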

Load

  • load_requirements(fname="requirements.txt") - load the list of packages from a file
  • clean_list(items: list) - remove the newline character from each string loaded from requirements.txt

Logic

  • is_dev_requirement(item: str) - check if the package belongs to the development category
  • is_prod_requirement(item: str) - check if the package belongs to the production (or app running) category
  • extract(criteria: Callable, data: list) - filter the list of packages by a criteria function (is_dev_requirement, is_prod_requirement)

Save

  • prepare_data(data: list) - prepare data for saving (joining list of filtered packages to string)
  • save_requirements(fname: str, data: str) - save requirements to file (requirements.txt and requirements-dev.txt)

Since packages related to development and testing are usually far fewer than app-related packages, it makes sense to store their names and use them for filtering. Here is an example of a list of packages that might be used for development:

DEV_REQUIREMENTS = ["autopep8", "black", 
                    "flake8", "pytest-asyncio", 
                    "pytest", "Faker"]
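One thing to keep in mind: the names in this list are compared as plain strings, so faker or pytest_asyncio wouldn't match Faker or pytest-asyncio, although pip treats such spellings as the same package. A possible way to make the lookup more tolerant, assuming a hypothetical normalize helper that follows the name normalization rule from PEP 503:

```python
import re

DEV_REQUIREMENTS = ["autopep8", "black", "flake8",
                    "pytest-asyncio", "pytest", "Faker"]


def normalize(name: str) -> str:
    """Normalize a package name the way PEP 503 describes."""
    return re.sub(r"[-_.]+", "-", name).lower()


# Precompute the normalized names once for fast membership checks.
DEV_NORMALIZED = {normalize(name) for name in DEV_REQUIREMENTS}

print(normalize("Faker") in DEV_NORMALIZED)           # True
print(normalize("pytest_asyncio") in DEV_NORMALIZED)  # True
print(normalize("Flask") in DEV_NORMALIZED)           # False
```

The script in this tutorial keeps the simpler exact comparison, which works as long as the list uses the same spelling that pip freeze prints.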

Also, while coding we are going to use Python type hints to make the code more readable in terms of what data types are used; your IDE can use this information for autocompletion and warnings.

Furthermore, type checkers like mypy can use these hints for static type checking (Python itself is a dynamically typed language and doesn't enforce the hints at runtime). Type hints have been supported since Python 3.5, and additional typing constructs, such as Callable, are provided by the typing module from the standard library.
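As an illustration, here is how the extract function from the list above could be annotated more precisely than with a bare Callable and list. This is a sketch rather than the version used in the final script; starts_with_py is a made-up criteria function and the package versions are invented.

```python
from typing import Callable, List


def extract(criteria: Callable[[str], bool], data: List[str]) -> List[str]:
    """Keep only the items for which criteria returns True."""
    return [item for item in data if criteria(item)]


def starts_with_py(item: str) -> bool:
    return item.startswith("py")


print(extract(starts_with_py, ["pytest==7.2.2", "Flask==2.2.3"]))
# ['pytest==7.2.2']
```

With annotations like these, a type checker can flag a call that passes, say, a string instead of a function as criteria before the script is ever run.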

Coding

Using the architecture developed above, let's code the script. It's going to be managed by the run() function, where the script logic is implemented.

Step 1: loading list of packages from requirements.txt and cleaning it from newline characters.

Step 2: forming two lists of packages - for production and development

Step 3: preparing data and saving it into two text files.

Here is the final version of the script. Feel free to experiment with it:

from typing import Callable

DEV_REQUIREMENTS = ["autopep8", "black", 
                    "flake8", "pytest-asyncio", 
                    "pytest", "Faker"]


def run():
    dependencies = clean_list(load_requirements())
    dev_dependencies = extract(is_dev_requirement, 
                               dependencies)
    prod_dependencies = extract(is_prod_requirement, 
                                dependencies)
    save_requirements("requirements-dev.txt", 
                      prepare_data(dev_dependencies))
    save_requirements("requirements.txt", 
                      prepare_data(prod_dependencies))


def extract(criteria: Callable, data: list) -> list:
    return list(filter(criteria, data))


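def is_dev_requirement(item: str) -> bool:
    # partition() keeps working for lines without "==" (e.g. editable installs),
    # where unpacking the result of split("==") would raise ValueError
    package_name = item.partition("==")[0]
    return package_name in DEV_REQUIREMENTS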


def is_prod_requirement(item: str) -> bool:
    return not is_dev_requirement(item)


def load_requirements(fname="requirements.txt") -> list:
    with open(fname, "r") as f:
        dependencies = f.readlines()
    return dependencies


def save_requirements(fname: str, data: str) -> None:
    with open(fname, "w") as f:
        f.write(data)


def prepare_data(data: list) -> str:
    return "\n".join(data) + "\n"


def clean_list(items: list) -> list:
    return list(map(lambda item: item.strip(), items))


if __name__ == "__main__":
    run()

Commentary

  • load_requirements - loads packages from the file into a list using the readlines method
  • clean_list - removes the newline character from each string using map, which applies item.strip() to each element of the items list
  • extract - filters the initial list using the built-in function filter
  • is_dev_requirement - takes a string with package information (example: Flask==2.2.3), extracts the package name and checks whether it is in the list of development packages
  • is_prod_requirement - checks whether a package is production related: calls is_dev_requirement and returns the negated result
  • prepare_data - joins the list of packages with newline characters and adds one at the end, since by convention on UNIX-based operating systems a text file ends with a newline
  • save_requirements - writes data to a file; it's called twice since packages are saved into two separate files. The file is opened with the w flag, which opens it for writing in text mode and overwrites any existing content
  • run - implements the script logic and ensures data exchange between functions
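To see the filtering logic at work without touching any files, here is a self-contained sketch that applies it to a made-up excerpt of pip freeze output. It uses str.partition to pull out the package name, a choice made here so that lines without == don't raise an error.

```python
DEV_REQUIREMENTS = ["autopep8", "black", "flake8",
                    "pytest-asyncio", "pytest", "Faker"]


def is_dev_requirement(item: str) -> bool:
    # partition() also copes with lines that contain no "==".
    package_name = item.partition("==")[0]
    return package_name in DEV_REQUIREMENTS


# A made-up excerpt of `pip freeze` output, already stripped of newlines.
frozen = ["click==8.1.3", "flake8==6.0.0", "Flask==2.2.3",
          "pytest==7.2.2", "Werkzeug==2.2.3"]

dev = [item for item in frozen if is_dev_requirement(item)]
prod = [item for item in frozen if not is_dev_requirement(item)]

print(dev)   # ['flake8==6.0.0', 'pytest==7.2.2']
print(prod)  # ['click==8.1.3', 'Flask==2.2.3', 'Werkzeug==2.2.3']
```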

Now, if someone wants to work on the project, they should set it up by installing dependencies from both requirements files:

pip install -r requirements.txt
pip install -r requirements-dev.txt
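As a side note, pip requirements files can include other requirements files with a -r line, so the two commands could be reduced to one. This isn't part of the script above; a possible variant of prepare_data for the development file, under the hypothetical name prepare_dev_data, might look like this:

```python
def prepare_dev_data(data: list, base_file: str = "requirements.txt") -> str:
    """Join dev packages and reference the production file on the first line."""
    lines = ["-r {}".format(base_file)] + list(data)
    return "\n".join(lines) + "\n"


print(prepare_dev_data(["flake8==6.0.0", "pytest==7.2.2"]), end="")
# -r requirements.txt
# flake8==6.0.0
# pytest==7.2.2
```

With such a file, pip install -r requirements-dev.txt alone would install both the production and the development packages.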

Conclusions

In this tutorial we looked at a way of extending venv functionality with a simple Python script that can be shipped with a project. There are plenty of ways to extend this script further. For example, packages could be split into three groups instead of two by introducing a testing stage, or the package lists could be sorted to make them more convenient to navigate. The latter can be especially handy in large projects with lots of dependencies.
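Both ideas can be sketched in a few lines. The grouping below (which package counts as dev and which as test) and the names GROUPS, group_of and split_sorted are assumptions made for illustration, not part of the script from this tutorial:

```python
# Hypothetical assignment of packages to groups; anything else counts as "prod".
GROUPS = {
    "dev": {"autopep8", "black", "flake8"},
    "test": {"pytest", "pytest-asyncio", "Faker"},
}


def group_of(item: str) -> str:
    """Return the group a frozen requirement belongs to ('prod' by default)."""
    package_name = item.partition("==")[0]
    for group, names in GROUPS.items():
        if package_name in names:
            return group
    return "prod"


def split_sorted(items: list) -> dict:
    """Split requirements into groups, each sorted case-insensitively."""
    result = {"prod": [], "dev": [], "test": []}
    for item in items:
        result[group_of(item)].append(item)
    return {group: sorted(lines, key=str.lower)
            for group, lines in result.items()}


# Made-up package versions, deliberately out of order.
frozen = ["pytest==7.2.2", "Flask==2.2.3", "black==23.1.0", "click==8.1.3"]
groups = split_sorted(frozen)
print(groups["prod"])  # ['click==8.1.3', 'Flask==2.2.3']
print(groups["dev"])   # ['black==23.1.0']
print(groups["test"])  # ['pytest==7.2.2']
```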

You can find the files and code used in this tutorial here: https://github.com/dmikhr/split-dependencies-demo
