- On Benchmarks
- gunicorn Overview
- Basic Running
- Configuration
- Notable HTTP Features
- PEX bundling
- Process Titles
- Project Support Overview
- Conclusion
In the last section I introduced the basics of WSGI along with some helpful enhancements via werkzeug. Many servers implement WSGI and are suited to more production-level environments. In this article I'll be looking at gunicorn as one of those WSGI server solutions. My original plan was to introduce all WSGI servers in a single post, but I've found that each has enough content for an article of its own. With this in mind, future posts in this series will cover each WSGI server solution in its own dedicated article.
On Benchmarks
There won't be much in the way of benchmarks here or in future server posts. Software is constantly evolving, and what is slow now could become very performant in a week. WSGI servers in a production environment can also be part of clusters or diversified to meet different application needs. Even where performance metrics are important, they should be measured in a controlled environment that's close to where deployment will occur.
gunicorn Overview
Gunicorn is a pure Python WSGI server which runs on a pre-fork worker model, with other worker types available as alternatives. Being pure Python, it loads a WSGI app via module import. The pure Python architecture also means easier integration with PyPy, which provides JIT optimizations for long-running processes.
Basic Running
gunicorn can be installed via pip install gunicorn. Given a simple WSGI application:
wsgi_test.py
def application(env, start_response):
    data = b'Hello World'
    status = '200 OK'
    response_headers = [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(data))),
    ]
    start_response(status, response_headers)
    return [data]
gunicorn would be executed like this:
$ gunicorn --workers=2 wsgi_test:application
[2023-08-23 05:00:33 +0000] [13126] [INFO] Starting gunicorn 21.2.0
[2023-08-23 05:00:33 +0000] [13126] [INFO] Listening at: http://127.0.0.1:8000 (13126)
[2023-08-23 05:00:33 +0000] [13126] [INFO] Using worker: sync
[2023-08-23 05:00:33 +0000] [13176] [INFO] Booting worker with pid: 13176
[2023-08-23 05:00:33 +0000] [13177] [INFO] Booting worker with pid: 13177
where wsgi_test is the module (wsgi_test.py without the extension), followed by a :, and then the name of the callable (the application function in this case).
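To make the module:callable format more concrete, here is a minimal sketch of how such a string could be resolved with a plain module import. This is not gunicorn's actual loader; resolve_app is a hypothetical helper name, and the demo app from the standard library's wsgiref is used only so the example runs without any extra files.

```python
from importlib import import_module
from wsgiref.util import setup_testing_defaults


def resolve_app(spec):
    """Split 'module:callable' and return the named callable (hypothetical helper)."""
    module_name, _, attr = spec.partition(":")
    module = import_module(module_name)
    # Assumption for this sketch: fall back to "application" when no name is given.
    return getattr(module, attr or "application")


# wsgiref ships a small demo WSGI app, used here in place of wsgi_test.py.
app = resolve_app("wsgiref.simple_server:demo_app")

environ = {}
setup_testing_defaults(environ)  # fill in the minimum keys a WSGI app expects

captured = {}


def start_response(status, headers, exc_info=None):
    # Record what the app reports instead of writing to a socket.
    captured["status"] = status
    captured["headers"] = headers


body = b"".join(app(environ, start_response))
print(captured["status"])
```

Calling the app directly like this is also a quick way to sanity check a WSGI callable before putting a server in front of it.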
Configuration
A python file can be used for configuration purposes. The configuration pulls from settings described in the documentation. It also works as a standard python file meaning you can do something like:
gunicorn.config.py
import multiprocessing
bind = "127.0.0.1:8000"
workers = multiprocessing.cpu_count() * 2 + 1
wsgi_app = "wsgi_test:application"
Then the configuration can be checked to ensure it's valid:
$ gunicorn --check-config -c gunicorn.config.py
Finally run gunicorn with the --config/-c option and the name of the config file:
$ gunicorn -c gunicorn.config.py
[2023-08-23 02:50:37 +0000] [18928] [INFO] Starting gunicorn 21.2.0
[2023-08-23 02:50:37 +0000] [18928] [INFO] Listening at: http://127.0.0.1:8000 (18928)
[2023-08-23 02:50:37 +0000] [18928] [INFO] Using worker: sync
[2023-08-23 02:50:37 +0000] [18984] [INFO] Booting worker with pid: 18984
[2023-08-23 02:50:37 +0000] [18985] [INFO] Booting worker with pid: 18985
[2023-08-23 02:50:37 +0000] [18986] [INFO] Booting worker with pid: 18986
[2023-08-23 02:50:37 +0000] [18987] [INFO] Booting worker with pid: 18987
[2023-08-23 02:50:37 +0000] [18988] [INFO] Booting worker with pid: 18988
[2023-08-23 02:50:37 +0000] [18989] [INFO] Booting worker with pid: 18989
[2023-08-23 02:50:37 +0000] [18990] [INFO] Booting worker with pid: 18990
[2023-08-23 02:50:37 +0000] [18991] [INFO] Booting worker with pid: 18991
[2023-08-23 02:50:37 +0000] [18992] [INFO] Booting worker with pid: 18992
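Since the configuration is a standard Python file, it can also compute values at load time. The following is a sketch of a slightly fuller config under a couple of assumptions: GUNICORN_WORKERS is an environment variable name I made up for illustration, and the extra settings (worker_class, timeout, accesslog, loglevel) are ones from the gunicorn settings documentation that I'd expect to be commonly tuned.

```python
import multiprocessing
import os

bind = "127.0.0.1:8000"

# GUNICORN_WORKERS is a hypothetical variable name chosen for this sketch;
# when it is unset, fall back to the same formula used in the config above.
workers = int(os.environ.get("GUNICORN_WORKERS", multiprocessing.cpu_count() * 2 + 1))

worker_class = "sync"  # the worker type shown in the startup log above
timeout = 30           # seconds before an unresponsive worker is restarted
accesslog = "-"        # "-" sends the access log to stdout
loglevel = "info"
wsgi_app = "wsgi_test:application"
```

Running gunicorn --check-config against a file like this is a cheap way to catch typos before a deployment.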
Notable HTTP Features
Here I'll look at the ability to support chunked input, chunked output, and range headers. These features may or may not be necessary depending on your use case.
Chunked Input Support
gunicorn supports chunked input via wsgi.input_terminated, as shown by this simple application:
wsgi_chunked_input
def application(environ, start_response):
    input = environ['wsgi.input']
    with open('test.json', 'wb') as stream_fp:
        stream_fp.write(input.read())
    status = '200 OK'
    body = b'Hello World\n'
    response_headers = [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(body))),
    ]
    start_response(status, response_headers)
    return [body]
Sending a 25MB JSON file comes back with:
$ curl -v -H "Transfer-Encoding: chunked" -d @large-file.json http://127.0.0.1:8000
* Trying 127.0.0.1:8000...
* Connected to 127.0.0.1 (127.0.0.1) port 8000 (#0)
> POST / HTTP/1.1
> Host: 127.0.0.1:8000
> User-Agent: curl/7.74.0
> Accept: */*
> Transfer-Encoding: chunked
> Content-Type: application/x-www-form-urlencoded
> Expect: 100-continue
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 100 Continue
* Signaling end of chunked upload via terminating chunk.
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Server: gunicorn
< Date: Tue, 22 Aug 2023 17:47:21 GMT
< Connection: close
< Content-Type: text/plain
< Content-Length: 12
<
Hello World
* Closing connection 0
Which shows up just fine on the server side:
$ ls -lh test.json
-rw-r--r-- 1 john doe 25M Aug 22 18:47 test.json
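The application above reads the whole upload into memory with a single read() call. For larger uploads, a variation that copies the input in fixed-size pieces may be preferable. This is only a sketch: save_stream and the 64 KiB chunk size are my own choices for illustration, and an in-memory stream stands in for wsgi.input in the quick check at the end.

```python
import io
import os
import tempfile


def save_stream(stream, path, chunk_size=64 * 1024):
    """Copy a file-like object to path in fixed-size pieces; return bytes written."""
    total = 0
    with open(path, 'wb') as out_fp:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:  # an empty read means the input is exhausted
                break
            out_fp.write(chunk)
            total += len(chunk)
    return total


def application(environ, start_response):
    # Same behavior as the example above, minus holding the full body in memory.
    save_stream(environ['wsgi.input'], 'test.json')
    body = b'Hello World\n'
    start_response('200 OK', [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(body))),
    ])
    return [body]


# Quick check without a server: io.BytesIO stands in for wsgi.input.
with tempfile.TemporaryDirectory() as tmp_dir:
    target = os.path.join(tmp_dir, 'upload.bin')
    written = save_stream(io.BytesIO(b'x' * 200_000), target)
    size_on_disk = os.path.getsize(target)
```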
Chunked Response Support
Chunked responses work as well, and simply need the Transfer-Encoding: chunked header added, as in this modified example:
class TestIter(object):
    def __iter__(self):
        lines = [b'line 1\n', b'line 2\n']
        for line in lines:
            yield line

def app(environ, start_response):
    status = '200 OK'
    response_headers = [
        ('Content-type', 'text/plain'),
        ('Transfer-Encoding', "chunked"),
    ]
    start_response(status, response_headers)
    return TestIter()
Running curl against it produces:
$ curl -iv --raw -H "Transfer-Encoding: chunked" http://127.0.0.1:8000/
* Trying 127.0.0.1:8000...
* Connected to 127.0.0.1 (127.0.0.1) port 8000 (#0)
> GET / HTTP/1.1
> Host: 127.0.0.1:8000
> User-Agent: curl/7.74.0
> Accept: */*
> Transfer-Encoding: chunked
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
HTTP/1.1 200 OK
< Server: gunicorn
Server: gunicorn
< Date: Tue, 22 Aug 2023 21:40:12 GMT
Date: Tue, 22 Aug 2023 21:40:12 GMT
< Connection: close
Connection: close
< Transfer-Encoding: chunked
Transfer-Encoding: chunked
< Content-type: text/plain
Content-type: text/plain
<
7
line 1
7
line 2
0
* Closing connection 0
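The iterator class is not required; any iterable of byte strings will do. As a sketch, the same response can be written with a nested generator function, which tends to be handier when the lines are produced on the fly. In the curl output above each yielded item arrived as its own chunk, and I'd expect the same here, though I have only exercised this version by calling it directly.

```python
def app(environ, start_response):
    status = '200 OK'
    response_headers = [
        ('Content-type', 'text/plain'),
        ('Transfer-Encoding', 'chunked'),
    ]
    start_response(status, response_headers)

    def generate():
        # Each yielded bytes object is one piece of the response body.
        for number in (1, 2):
            yield f'line {number}\n'.encode()

    return generate()


# Quick check without a server: record what start_response receives.
seen = []
chunks = list(app({}, lambda status, headers: seen.append((status, headers))))
print(chunks)
```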
Range Support
Ranges do not have an explicit wrapper and would require a helper function such as werkzeug.http.parse_range_header. The value to be passed is available through HTTP_RANGE:
from werkzeug.http import parse_range_header

def application(environ, start_response):
    range = parse_range_header(environ['HTTP_RANGE'])
    # werkzeug reports each range as (start, end) with an exclusive end
    start, end = range.ranges[0]
    with open('large-file.json', 'rb') as stream_fp:
        stream_fp.seek(start)
        data = stream_fp.read(end - start)
    status = '200 OK'
    response_headers = [
        ('Content-type', 'application/json')
    ]
    start_response(status, response_headers)
    return [data]
Running this with curl:
$ curl -v -r 1200-1299 http://127.0.0.1:8000/
* Trying 127.0.0.1:8000...
* Connected to 127.0.0.1 (127.0.0.1) port 8000 (#0)
> GET / HTTP/1.1
> Host: 127.0.0.1:8000
> Range: bytes=1200-1299
> User-Agent: curl/7.74.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Server: gunicorn
< Date: Wed, 23 Aug 2023 10:59:27 GMT
< Connection: close
< Transfer-Encoding: chunked
< Content-type: application/json
<
* Closing connection 0
pt"},"message":"Fix main header height on mobile","distinct":true,"url":"https://api.github.com/repo
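The example above answers with 200 OK, while HTTP clients generally expect a partial response to come back as 206 Partial Content with a Content-Range header. Below is a sketch of what that could look like. To keep it self-contained it does not use werkzeug; parse_single_range and make_range_app are hypothetical helpers of my own, and the parser only understands a single bytes=start-end range, which is a deliberate simplification.

```python
import os
import re
import tempfile


def parse_single_range(header, size):
    """Parse 'bytes=start-end' into (start, stop), stop exclusive, or None."""
    match = re.fullmatch(r"bytes=(\d+)-(\d*)", header)
    if not match:
        return None
    start = int(match.group(1))
    # An open-ended range ("bytes=100-") runs to the end of the file.
    last = int(match.group(2)) if match.group(2) else size - 1
    last = min(last, size - 1)
    if start > last:
        return None
    return start, last + 1


def make_range_app(path):
    def application(environ, start_response):
        size = os.path.getsize(path)
        header = environ.get("HTTP_RANGE")
        if header is None:
            span, status = (0, size), "200 OK"  # no Range header: whole file
        else:
            span, status = parse_single_range(header, size), "206 Partial Content"
        if span is None:
            # Simplification: anything this sketch cannot satisfy is rejected.
            start_response("416 Range Not Satisfiable", [
                ("Content-Range", f"bytes */{size}"),
                ("Content-Length", "0"),
            ])
            return [b""]
        start, stop = span
        with open(path, "rb") as stream_fp:
            stream_fp.seek(start)
            data = stream_fp.read(stop - start)
        headers = [
            ("Content-Type", "application/json"),
            ("Content-Length", str(len(data))),
        ]
        if header is not None:
            headers.append(("Content-Range", f"bytes {start}-{stop - 1}/{size}"))
        start_response(status, headers)
        return [data]
    return application


# Quick check without a server, using a ten-byte sample file.
responses = []
with tempfile.TemporaryDirectory() as tmp_dir:
    sample = os.path.join(tmp_dir, "sample.json")
    with open(sample, "wb") as fp:
        fp.write(b"0123456789")
    range_app = make_range_app(sample)
    partial = b"".join(range_app(
        {"HTTP_RANGE": "bytes=2-5"},
        lambda status, headers: responses.append((status, dict(headers))),
    ))
```

Setting Content-Length here should also keep the response from falling back to chunked encoding as it did in the curl output above, though I have not verified that against a running gunicorn.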
PEX bundling
Being pure Python, gunicorn can also be packaged using pex (Python EXecutable). As an example, I'll create a simple WSGI application wsgi_test.py along with a simplified gunicorn.config.py:
wsgi_test.py
def application(env, start_response):
    data = b'Hello World'
    status = '200 OK'
    response_headers = [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(data))),
    ]
    start_response(status, response_headers)
    return [data]
gunicorn.config.py
bind = "127.0.0.1:8000"
workers = 2
wsgi_app = "wsgi_test:application"
I'll put these in a dedicated folder so it's easy to copy over. Then the executable will need to be packaged:
$ pex gunicorn -c gunicorn -o wsgi_app.pex --python pypy --inject-args "--config gunicorn.config.py"
The first part of this is the modules to include. We'll be including gunicorn since it's what we want to run our WSGI app with. Next is the entry point (-c gunicorn), which is the console_script gunicorn defined in the project's setup.py. This allows the resulting executable to be run just as if we were running gunicorn on the command line. --python pypy sets pypy as the Python binary to use. You'll want to make sure this matches a Python available on the target machine. --inject-args makes it so that the config file argument is always passed in and doesn't have to be written out at execution. Finally, -o wsgi_app.pex is the executable file to output. If I copy this to another Linux system with pypy installed:
$ scp server:~/projects/pex_example/* .
$ ./wsgi_app.pex
[2023-08-23 05:00:33 +0000] [13126] [INFO] Starting gunicorn 21.2.0
[2023-08-23 05:00:33 +0000] [13126] [INFO] Listening at: http://127.0.0.1:8000 (13126)
[2023-08-23 05:00:33 +0000] [13126] [INFO] Using worker: sync
[2023-08-23 05:00:33 +0000] [13176] [INFO] Booting worker with pid: 13176
[2023-08-23 05:00:33 +0000] [13177] [INFO] Booting worker with pid: 13177
Everything is running despite gunicorn not being installed on the target machine. Note that this works because the config file points to wsgi_test:application and wsgi_test.py was copied over alongside the executable. To have the application bundled into the executable as well, you would need to have it installed through a setup.py/pyproject.toml style build system. Then it could be bundled like:
$ pex wsgi_test gunicorn -c gunicorn -o wsgi_app.pex --python pypy --inject-args "--config gunicorn.config.py"
Process Titles
Installing setproctitle will allow gunicorn to name its worker processes in a way that makes them easier to manage. Simply install it through pip install setproctitle and the process list will show something similar to:
28928 pts/2 00:00:01 gunicorn: maste
28984 pts/2 00:00:00 gunicorn: worke
28985 pts/2 00:00:00 gunicorn: worke
28986 pts/2 00:00:00 gunicorn: worke
28987 pts/2 00:00:00 gunicorn: worke
28988 pts/2 00:00:00 gunicorn: worke
28989 pts/2 00:00:00 gunicorn: worke
28990 pts/2 00:00:00 gunicorn: worke
28991 pts/2 00:00:00 gunicorn: worke
28992 pts/2 00:00:00 gunicorn: worke
This makes it easy to differentiate between master and worker processes.
Project Support Overview
This section looks at how the project is maintained, for those considering it for production use. As with any software evaluation, be sure to test things in an environment close to what you expect your production environment to be.
Documentation
gunicorn's main site has a simple documentation page. For more extensive documentation there's a full docs site. The documentation itself is fairly well organized. Multi-version documentation allows pointing to a specific version or a version pointer such as latest and stable.
Source Maintainability
At the time of writing, the last commit is around one month old. There are 274 issues and 85 open PRs. A good amount of recent activity seems to be a one-person effort. Given the size of the code base, forking is fairly reasonable if the need arises.
Conclusion
I think the overall package is good. Getting up and running is fairly straightforward if you want to prototype quickly. The availability of multiple worker types means you'll want to evaluate each one to find the best fit for production deployments (the tornado worker is probably not a good idea, though, as WSGI is a minor bonus for tornado rather than its main focus). As for the source repository, having primarily one maintainer is a point of concern. Deciding on a few co-maintainers and bug wranglers would really help the project out.