- Basic Usage
- Requirements Management
- Python Project Structured Modules
- Resource Directories
- Entry Points
- Docker Deployments
- pex Tools
- Conclusion
pex stands for Python EXecutable, and is a way to produce an easy-to-distribute python package. One important thing to note is that pex doesn't have reliable Windows support, so you'll want to run it on *NIX systems. This article will showcase some of the things you can do with pex to make distributing different types of python projects easier.
Basic Usage
Given that the python interpreter being used for pex packaging matters, it's highly recommended to utilize a virtual environment. As an example I'll use a python 3.11 environment:
$ virtualenv --python=python3.11 venv
$ source venv/bin/activate
$ python -m pip install pex
Looking in indexes: https://pypi.org/simple, https://www.piwheels.org/simple
Collecting pex
Using cached https://www.piwheels.org/simple/pex/pex-2.1.144-py2.py3-none-any.whl (2.9 MB)
Installing collected packages: pex
Successfully installed pex-2.1.144
The general format for pex CLI execution is:
pex [MODULES] [OPTIONS]
where [MODULES] is a space-separated list of modules in pip-style dependency declaration strings:
$ pex "requests" "setproctitle==1.3.2" "uvicorn[standard]"
Python 3.11.4 (main, Aug 17 2023, 03:18:09) [GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>>
Without any other options pex will drop into an interactive shell and the modules provided will be available within:
>>> import requests
>>> import setproctitle
>>> import uvicorn
>>>
After closing out of the console we can see that the virtual environment packages are not affected at all:
$ pip list
Package Version
---------- -------
pex 2.1.144
pip 23.2.1
setuptools 65.5.0
$
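If you drive pex from a build script rather than typing the command by hand, the `pex [MODULES] [OPTIONS]` shape can be assembled as an argument list. The following is a minimal sketch; `build_pex_argv` is a hypothetical helper name, not part of pex, and it only builds the command without running it. Printing the shell-quoted form also shows why requirement strings with extras such as `uvicorn[standard]` are worth quoting on the command line.

```python
import shlex


def build_pex_argv(modules, options=()):
    """Return the argv list for `pex [MODULES] [OPTIONS]` (hypothetical helper)."""
    return ["pex", *modules, *options]


argv = build_pex_argv(["requests", "setproctitle==1.3.2", "uvicorn[standard]"])

# shlex.join quotes any argument the shell could otherwise misinterpret.
command_line = shlex.join(argv)
print(command_line)
```

The resulting list could be handed to `subprocess.run(argv, check=True)` on a machine where pex is installed.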
Requirements Management
As listing out each module is generally not ideal, two alternative methods can be used to pass in requirements. The first solution is to use a requirements.txt file:
requirements.txt
requests
setproctitle==1.3.2
uvicorn[standard]
Then pex can be run with the -r option and the requirements.txt file passed in:
$ pex -r requirements.txt
Python 3.11.4 (main, Aug 17 2023, 03:18:09) [GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>>
The -r argument can also be passed multiple times, for example when several projects with their own requirements files are being bundled. If you have a virtual environment already set up then you can pass the output of pip freeze to pex:
$ pex $(pip freeze)
Python 3.11.4 (main, Aug 17 2023, 03:18:09) [GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>>
The requirements.txt method is a good fit if you have a lot of modules to work with. pip freeze is good for cases where a virtualenv is already set up.
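To make the link between the two approaches concrete, here is a rough sketch of turning the contents of a requirements file into the same list of strings that would otherwise be typed as positional modules. `parse_requirements` is a hypothetical helper, and it is deliberately simplified: it only skips blank lines and full-line comments, and does not handle the options, includes, or line continuations that real requirements files may contain.

```python
def parse_requirements(text):
    """Return requirement strings from requirements.txt-style text (simplified)."""
    requirements = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and full-line comments.
        if not line or line.startswith("#"):
            continue
        requirements.append(line)
    return requirements


sample = """\
# web dependencies
requests
setproctitle==1.3.2

uvicorn[standard]
"""

modules = parse_requirements(sample)
print(modules)
```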
Python Project Structured Modules
pex also supports python packages as modules which have a structure similar to the basic one in the python packaging docs. For this example I'll be using the project layout in this git repository. It includes a basic layout with a README, LICENSE, a simple module, and a pyproject.toml. This is enough for it to be recognized by pex much like a development mode pip install:
$ pex .
Python 3.11.4 (main, Aug 17 2023, 03:18:09) [GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> from simple_pex import simple_math
>>> simple_math(3,4)
7
>>>
This was all made possible without having to build the project itself.
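The module itself is not shown in this article. Judging only from the session above, where `simple_math(3,4)` returns 7, it behaves like an addition function. A hypothetical stand-in, which may differ from what is actually in the repository, could look like this:

```python
# Hypothetical stand-in for the simple_pex module; the real implementation
# in the repository is not shown in the article and may differ.
def simple_math(a: int, b: int) -> int:
    """Return the sum of two integers."""
    return a + b


print(simple_math(3, 4))  # 7
```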
Resource Directories
pex can also add in directories for important items such as test data and configurations. In the app repository there's a resources directory which contains a test_data.json file that looks like this:
{
"a": 1,
"b": 2
}
We can use pex with the -D argument to add a specific directory for bundling. Then it can be used within the script/interactive prompt like so:
$ pex . -D resources
Python 3.11.4 (main, Aug 17 2023, 03:18:09) [GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> from simple_pex import simple_math
>>> import json
>>> fp = open('resources/test_data.json', 'r')
>>> data = json.load(fp)
>>> fp.close()
>>> simple_math(data['a'], data['b'])
3
>>>
As you can see, the JSON data is loaded in and then passed to the simple_math function where the proper result is returned.
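In a script, as opposed to the interactive prompt, the same load is usually written with a context manager so the file is closed even if parsing fails. The sketch below assumes a hypothetical helper name, `load_operands`, and runs against a temporary copy of the JSON shown above so that it is self-contained. Keep in mind that `open()` resolves a relative path such as `resources/test_data.json` against the current working directory, which is why the helper takes the path as a parameter.

```python
import json
import tempfile
from pathlib import Path


def load_operands(path):
    """Read the JSON file at `path` and return its "a" and "b" values."""
    with Path(path).open("r", encoding="utf-8") as fp:
        data = json.load(fp)
    return data["a"], data["b"]


# Demo against a throwaway copy of the test data from the article.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "test_data.json"
    sample.write_text('{"a": 1, "b": 2}', encoding="utf-8")
    operands = load_operands(sample)

print(sum(operands))  # 3
```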
Entry Points
One feature of python scripts is the ability to set an entry point as if running a basic program. For this example I'll be using code hosted in this repository. What makes this work is the declaration of a console script like so:
[project.scripts]
adder = "cli_pex:run"
This will produce a script called "adder" which will execute run from the cli_pex package:
import argparse


def run():
    parser = argparse.ArgumentParser()
    parser.add_argument("--integer1", type=int, help="First Integer")
    parser.add_argument("--integer2", type=int, help="Second Integer")
    args = parser.parse_args()
    print(args.integer1 + args.integer2)
While not a very practical program it gets the job done of showing off how pex works with console scripts. To showcase this:
$ pex . -o adder.pex -c adder
$ ./adder.pex --integer1 3 --integer2 4
7
Using -c tells pex we want to use the adder script defined in pyproject.toml. Now when we package everything it acts just like a basic program. There's also an option to utilize fixed arguments so only the execution of the .pex file is necessary:
$ pex . -o adder.pex -c adder --inject-args "--integer1 3 --integer2 4"
$ ./adder.pex
7
This is useful for making easy to deploy server scripts which take arguments such as bind ports and hostnames.
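One way to reason about fixed arguments is to let the entry point accept an explicit argument list, which also makes it testable without a shell. The sketch below is a variation on the `run` function above, not the code from the repository: `add_from_args` is a hypothetical name, and `required=True` is added so a missing flag produces a usage error rather than a failed addition. Splitting the string with `shlex.split` only approximates what passing `--inject-args` achieves; how pex applies the arguments internally is not covered here.

```python
import argparse
import shlex


def add_from_args(argv=None):
    """Parse two integers from argv (or sys.argv when None) and return their sum."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--integer1", type=int, required=True, help="First Integer")
    parser.add_argument("--integer2", type=int, required=True, help="Second Integer")
    args = parser.parse_args(argv)
    return args.integer1 + args.integer2


def run():
    """Console-script entry point: read sys.argv and print the result."""
    print(add_from_args())


# Simulate the fixed arguments from the article by splitting them shell-style.
injected = shlex.split("--integer1 3 --integer2 4")
print(add_from_args(injected))  # 7
```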
Docker Deployments
To put this all together I'll make a Docker deployment of a pex web application. It will bundle gunicorn with a flask app which will act as the entrypoint for the container. The code that's used in this example can be found here. In this setup there is a simple flask app, a gunicorn configuration file, and a Dockerfile to enable deployment. This time the pyproject.toml declares some dependencies:
dependencies = [
"flask",
"gunicorn",
"setproctitle",
]
Another thing to consider is that pex will need the setup of the system packaging it to be fairly close to the target system. That means I'll build on an Ubuntu box and my container will be based off Debian (slimmer, and close enough system wise). A few other things that need to be done:
- The pex executable needs to point to the gunicorn console script to run the server
- The gunicorn config file will need to be copied over to the system
- --inject-args will need to have the --config argument set to the gunicorn config
- The resulting .pex file will need to be set as an entry point
Looking over the requirements, the resulting pex call will be:
pex . -o web_pex.pex -c gunicorn --inject-args "--config /home/gunicorn/app/gunicorn.config.py"
While the Dockerfile will look like:
FROM python:3.11.4-bullseye
USER root
RUN useradd -d /home/gunicorn -r -m -U -s /bin/bash gunicorn
USER gunicorn
RUN mkdir /home/gunicorn/app
COPY config/gunicorn.config.py /home/gunicorn/app
COPY web_pex.pex /home/gunicorn/app
ENTRYPOINT /home/gunicorn/app/web_pex.pex
EXPOSE 8000
Given that my interpreter building the .pex bundle is python 3.11, I set that as the base image. Now all that remains is to build the Dockerfile and then run the resulting image:
$ docker buildx build -f Dockerfile -t flask/web-pex:latest .
$ docker run -it -p 8000:8000 flask/web-pex:latest
[2023-08-25 00:13:11 +0000] [7] [INFO] Starting gunicorn 21.2.0
[2023-08-25 00:13:11 +0000] [7] [INFO] Listening at: http://0.0.0.0:8000 (7)
[2023-08-25 00:13:11 +0000] [7] [INFO] Using worker: sync
[2023-08-25 00:13:11 +0000] [8] [INFO] Booting worker with pid: 8
[2023-08-25 00:13:11 +0000] [9] [INFO] Booting worker with pid: 9
This will run the newly created flask/web-pex:latest image and expose port 8000. Now to test with curl:
$ curl http://127.0.0.1:8000
Hello World
Thanks to setproctitle the process list also comes out cleaner:
$ ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
gunicorn 1 0.0 0.0 2480 512 pts/0 Ss+ 00:13 0:00 /bin/sh -c /home/gunicorn/app/web_pex.pex
gunicorn 7 4.5 0.2 53904 48244 pts/0 S+ 00:13 0:00 gunicorn: master [gunicorn]
gunicorn 8 1.1 0.3 63244 52084 pts/0 S+ 00:13 0:00 gunicorn: worker [gunicorn]
gunicorn 9 0.6 0.3 62024 51644 pts/0 S+ 00:13 0:00 gunicorn: worker [gunicorn]
gunicorn 10 0.5 0.0 6052 3784 pts/1 Ss 00:13 0:00 /bin/bash
gunicorn 17 0.0 0.0 8648 3276 pts/1 R+ 00:13 0:00 ps aux
This makes it easier to discern the various gunicorn processes on the container.
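Since the pex call only passes --config to gunicorn, the config file has to carry everything else, including which application to serve. The gunicorn configuration file is not reproduced in this article, so the following is a hypothetical sketch whose values are inferred from the log and process output above. In particular, the `web_pex:app` module path is an assumption made for illustration. The setting names themselves (`bind`, `workers`, `worker_class`, `proc_name`, `wsgi_app`) are standard gunicorn settings.

```python
# Hypothetical gunicorn.config.py. Values are inferred from the article's
# log output, not copied from the repository.

bind = "0.0.0.0:8000"     # matches "Listening at: http://0.0.0.0:8000"
workers = 2               # two "Booting worker" lines appear in the log
worker_class = "sync"     # matches "Using worker: sync"
proc_name = "gunicorn"    # the name shown in brackets in the process list

# Assumed "module:variable" location of the flask app. Without an app
# argument on the command line, gunicorn needs this to find the application.
wsgi_app = "web_pex:app"
```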
pex Tools
Another interesting feature is that pex also has some tools available which let us create a more performant docker image. To make this work we need to add --include-tools to the pex build command:
$ pex . -o web_pex.pex -c gunicorn --inject-args "--config /home/gunicorn/app/gunicorn.config.py" --include-tools
The Dockerfile will also be updated to a multi-stage build to produce a finalized image:
FROM python:3.11.4-bullseye as deps
RUN mkdir -p /home/gunicorn/app
COPY web_pex.pex /home/gunicorn/
RUN PEX_TOOLS=1 /usr/local/bin/python3.11 /home/gunicorn/web_pex.pex venv --scope=deps --compile /home/gunicorn/app
FROM python:3.11.4-bullseye as srcs
RUN mkdir -p /home/gunicorn/app
COPY web_pex.pex /home/gunicorn
COPY config/gunicorn.config.py /home/gunicorn/app
RUN PEX_TOOLS=1 /usr/local/bin/python3.11 /home/gunicorn/web_pex.pex venv --scope=srcs --compile /home/gunicorn/app
FROM python:3.11.4-bullseye
RUN useradd -d /home/gunicorn -r -m -U -s /bin/bash gunicorn
COPY --from=deps --chown=gunicorn:gunicorn /home/gunicorn/app /home/gunicorn/app
COPY --from=srcs --chown=gunicorn:gunicorn /home/gunicorn/app /home/gunicorn/app
USER gunicorn
ENTRYPOINT /home/gunicorn/app/pex
EXPOSE 8000
This separates the dependency and source compilation into their own build stages. Compiling ahead of time produces interpreter-specific bytecode, so python doesn't have to generate it when the container starts, which mainly helps startup time. The docker build's only change is a different Dockerfile while the run command stays the same:
$ docker buildx build -f Dockerfile_pex_tools -t flask/web-pex:latest .
$ docker run -it -p 8000:8000 flask/web-pex:latest
[2023-08-25 01:25:47 +0000] [7] [INFO] Starting gunicorn 21.2.0
[2023-08-25 01:25:47 +0000] [7] [INFO] Listening at: http://0.0.0.0:8000 (7)
[2023-08-25 01:25:47 +0000] [7] [INFO] Using worker: sync
[2023-08-25 01:25:47 +0000] [8] [INFO] Booting worker with pid: 8
[2023-08-25 01:25:47 +0000] [9] [INFO] Booting worker with pid: 9
Looking inside the container you can see the layout of pex in the ~/app directory of the gunicorn user:
$ cd ~/app
$ ls
PEX-INFO __main__.py __pycache__ bin gunicorn.config.py include lib lib64 pex pyvenv.cfg
The cache files also carry timestamps (01:03) that are earlier than the gunicorn startup (01:25), which shows they were compiled during the image build rather than generated by python at runtime:
$ ls -lah lib/python3.11/site-packages/flask/__pycache__/
total 388K
drwxr-xr-x 2 gunicorn gunicorn 4.0K Aug 25 01:03 .
drwxr-xr-x 4 gunicorn gunicorn 4.0K Aug 25 01:03 ..
-rw-r--r-- 1 gunicorn gunicorn 4.0K Aug 25 01:03 __init__.cpython-311.pyc
-rw-r--r-- 1 gunicorn gunicorn 249 Aug 25 01:03 __main__.cpython-311.pyc
-rw-r--r-- 1 gunicorn gunicorn 86K Aug 25 01:03 app.cpython-311.pyc
-rw-r--r-- 1 gunicorn gunicorn 32K Aug 25 01:03 blueprints.cpython-311.pyc
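The file names in that listing follow python's standard bytecode cache naming. As a small illustration of ahead-of-time compilation in general, and not of what pex runs internally, the standard library's `py_compile` module produces the same kind of `__pycache__` entry for a throwaway source file:

```python
import importlib.util
import py_compile
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "hello.py"
    source.write_text("print('Hello World')\n", encoding="utf-8")

    # Compile ahead of time; the return value is the path of the .pyc file.
    pyc_path = Path(py_compile.compile(str(source), doraise=True))

    # The location python itself would use for this source file's cache.
    expected_path = Path(importlib.util.cache_from_source(str(source)))

    pyc_exists = pyc_path.exists()
    pyc_name = pyc_path.name
    pyc_parent = pyc_path.parent.name

# The name embeds the interpreter's cache tag, such as cpython-311 on CPython 3.11.
print(pyc_parent, pyc_name, sys.implementation.cache_tag)
```

Because the cache tag is part of the file name, bytecode compiled by one interpreter version is not picked up by another, which fits with building on the same python 3.11 base image that runs the app.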
Conclusion
This concludes a look at using pex for packaging python code. It's an interesting system and, judging from a GitHub issue, also has the potential for reproducible builds. Including the tools gives you both an easy-to-work-with single-file deploy and a more performant option via multi-stage compilation. I encourage taking a look to see how it can enhance your python projects.