Nowadays almost every programming project uses dependencies, often open source ones. Each of these dependencies has its own license. Before you launch your project, you need to make sure that every one of those licenses permits your intended use.
In this article, I will show you how to set up a fully automatic step in your build pipeline for Python projects.
In an earlier project of mine, I showed that 8000 Python packages are not licensed under the GNU GPL themselves but do depend on a package that is. This might* be fine for the package itself, but if you use such a package in enterprise software, you have just created a product that uses GNU GPL code.
*This is not tested in a court as far as I know, and every court of every country, state, county, federation, etc. could rule differently. Consult a lawyer if you need legal advice, not a random article on the internet.
This article continues from "8000+ python packages might have to change to GNU Public License".
To avoid this, I will show two ways of checking licenses: scanning just your installed environment after building, and scanning your full Docker image before shipping. I will demonstrate them with GitHub Actions and Jenkins.
General setup
We are going to use the Python package license-scanner. This package reads pyproject.toml for whitelisted licenses and packages. It works by finding all packages installed in the Python environment and checking, for each one, whether its license or the package itself is whitelisted. If one or more packages fail this check, an error is raised.
Below is an example; it contains a common list of licenses that can be used in commercial software.
[tool.license_scanner]
allowed-licenses = [
    "Apache software license v2",
    "Apache software license v3",
    "Apache software license",
    "BSD 0-clause license",
    "BSD 2-clause license",
    "BSD 3-clause license",
    "BSD license",
    "GNU lesser general public license v2",
    "GNU lesser general public license v3",
    "GNU lesser general public license",
    "Historical Permission Notice and Disclaimer (HPND)",
    "ISC",
    "MIT",
    "Mozilla public license 2.0 (mpl 2.0)",
    "mozilla",
    "Python software foundation license",
]
allowed-packages = [
    "License_scanner",
    "this_package",
    "my_own_internal_package",
    "a_package_I_check_manually",
]
To test this locally, navigate to the folder that contains pyproject.toml and run license_scanner -m whitelist.
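To make the whitelist idea concrete, here is a minimal sketch of such a check in plain Python. It is not how license_scanner is implemented; it only illustrates the principle. The function names are made up for this example, and it assumes that reading the License metadata field is good enough, while many real packages only declare their license through classifiers.

```python
from importlib import metadata


def find_violations(packages, allowed_licenses, allowed_packages):
    """Return the sorted names of packages that pass neither whitelist.

    `packages` maps a package name to its declared license string.
    Matching is case-insensitive here, which is an assumption of this
    sketch and not necessarily what license_scanner does.
    """
    licenses = {lic.lower() for lic in allowed_licenses}
    names = {name.lower() for name in allowed_packages}
    return sorted(
        name
        for name, lic in packages.items()
        if name.lower() not in names and (lic or "").lower() not in licenses
    )


def installed_packages():
    """Map every installed distribution to its declared License field."""
    result = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:
            result[name] = dist.metadata.get("License") or "unknown"
    return result


def check_environment(allowed_licenses, allowed_packages):
    """Raise an error if any installed package fails the whitelist check."""
    bad = find_violations(installed_packages(), allowed_licenses, allowed_packages)
    if bad:
        raise RuntimeError("Non-whitelisted packages: " + ", ".join(bad))
```

Calling check_environment(["MIT", "BSD license"], ["pip"]) in a build step would stop the build with an error as soon as one installed package matches neither list, which is the behaviour the pipelines below rely on.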
GitHub Actions: check the dependencies of my package
When you build a package, you want this step to run automatically, but you only want to check the package's own dependencies, not the build or test tools. To keep those tools out of the scan, first install the dependencies using pip install -r requirements.txt or pip install . and run the license scanner. Only after that, install other tools such as pytest or build if needed.
name: Pytest & Licenses check
on:
  pull_request:
    branches:
      - main
permissions:
  contents: read
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v3
        with:
          python-version: '3.x'
      - name: Check for licenses
        run: |
          python -m pip install --upgrade pip
          python -m pip install -r requirements.txt
          python -m pip install license_scanner
          python -m license_scanner -m whitelist
      - name: Pytest
        run: |
          pip install pytest
          python -m pytest
Jenkins: check a full Docker image
If you are deploying a Docker image as a product, it is better to test the entire image. To avoid cluttering the production image with test code, we create a separate test image based on the production image.
Create a file named Dockerfilelicensescan and populate it as below. Note that the base image is passed in as a build argument.
# Make the base docker variable
ARG IMAGE_NAME
FROM $IMAGE_NAME
# Get the license allowed to use
COPY pyproject.toml .
# Do the scan
RUN python3 -m pip install license_scanner
CMD python3 -m license_scanner -m whitelist
In the build pipeline (Jenkins in this example), we add a build stage and a test stage. In the test stage, we build the test image on top of the production image and then simply run the container. If the scanner raises an exception, the container exits with an error and the build pipeline fails.
stage ("Build Docker Image")
{
    steps
    {
        sh("""
            docker build -t myawsomedocker:latest .
        """)
    }
}
stage ("Check the licenses")
{
    steps
    {
        sh ("""
            docker build -t licensetest --build-arg IMAGE_NAME=myawsomedocker:latest -f Dockerfilelicensescan .
            docker run -t licensetest
        """)
    }
}
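If you want to try the same check on your own machine before wiring it into Jenkins, the two Docker commands can be driven from a small Python script. This is only a sketch: the function names are made up for this example, and it assumes Docker is installed and Dockerfilelicensescan is in the current folder.

```python
import subprocess


def license_scan_commands(image_name, dockerfile="Dockerfilelicensescan", tag="licensetest"):
    """Return the two docker invocations used in the license check stage."""
    build = [
        "docker", "build", "-t", tag,
        "--build-arg", f"IMAGE_NAME={image_name}",
        "-f", dockerfile, ".",
    ]
    run = ["docker", "run", "-t", tag]
    return [build, run]


def run_license_scan(image_name):
    """Run both commands in order and stop at the first failure.

    check=True makes subprocess raise CalledProcessError when a command
    exits with a non-zero status, similar to a failing sh step in Jenkins.
    """
    for command in license_scan_commands(image_name):
        subprocess.run(command, check=True)
```

Calling run_license_scan("myawsomedocker:latest") builds the test image and runs the scan; a non-whitelisted license surfaces as a CalledProcessError.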
Conclusion
It is relatively simple to set up an automatic way to check your Python licenses. The hard part is to find out what licenses you can use within your project or to refactor your code if a dependency uses a license you want to avoid.