I've used Python since version 3.4, and I love the language. However, there are some things I wish I had never experienced.
Necessity to pin dependencies with external tools
Python is an extremely popular language, and an awful lot of wonderful people have developed many useful libraries for it. You are probably using at least one of them. If you are the only person on the project, you might install all dependencies once in your venv, but most likely you have somebody to collaborate with. So you decide to write a requirements.txt file. It will probably look similar to this:
fastapi
uvicorn
repid
...
etc.
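A first step is to pin exact versions yourself, for example by running pip freeze after installing everything. A pinned file looks something like this (the version numbers here are purely illustrative):

```
fastapi==0.75.0
uvicorn==0.17.6
repid==0.3.0
```

But a hand-pinned file doesn't capture transitive dependencies or tell you which updates are safe to take.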
One day you or somebody else will want to update these dependencies. Any given update might be just a bugfix, or it might be a compatibility-breaking release. You want to avoid unexpected breakage, don't you? Python libraries mostly follow semantic versioning, and to take advantage of it you can use a lock-file and a dependency resolver. Here is an example:
[[package]]
name = "numpy"
version = "1.22.2"
description = "NumPy is the fundamental package for array computing with Python."
category = "main"
optional = false
python-versions = ">=3.8"
[[package]]
name = "opencv-python"
version = "4.5.5.62"
description = "Wrapper package for OpenCV python bindings."
category = "main"
optional = false
python-versions = ">=3.6"
...
etc.
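This is what a resolver checks under the hood: whether a concrete version satisfies a semver-style range. A minimal sketch using the packaging library (assumed to be installed; pip install packaging):

```python
from packaging.specifiers import SpecifierSet
from packaging.version import Version

# A caret-style constraint: any 1.22.x release at or above 1.22.2
spec = SpecifierSet(">=1.22.2,<1.23")

print(Version("1.22.5") in spec)  # True  -- a safe bugfix update
print(Version("1.23.0") in spec)  # False -- potentially breaking
```

A resolver does this for every package and every transitive dependency at once, then records the winning versions in the lock-file.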
However, Python's pip can't do that, so eventually you will have to use a third-party tool like poetry. And Python won't support lock-files natively any time soon either, because PEP 665 has just been rejected.
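With poetry, the day-to-day workflow looks roughly like this (the package names are just an example):

```
poetry init                  # create pyproject.toml interactively
poetry add fastapi uvicorn   # resolve versions and write poetry.lock
poetry install               # a collaborator reproduces the exact env
```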
Docker image size
Docker image size is essential because:
- you don't want all your developers to wait for a 2GB download to complete
- the larger the image, the larger the attack surface
- a big image makes your deployments slower
- you name it
Let's take another language for comparison.
Have you ever built a container with golang? Let's test its size!
First of all, let's write a "hello world" app:
package main
import "fmt"
func main() {
	fmt.Println("hello world")
}
Then we pack it up with docker:
FROM golang:1.16-alpine
WORKDIR /app
COPY *.go ./
RUN go build -o /hello hello.go
CMD [ "/hello" ]
> docker image list
REPOSITORY TAG IMAGE ID CREATED SIZE
hellogo latest e5e575eacc7d 42 seconds ago 304MB
However, if we take advantage of a multi-stage build...
FROM golang:1.16-buster AS build
WORKDIR /app
COPY hello.go ./
RUN go build hello.go
FROM gcr.io/distroless/base-debian10
WORKDIR /
COPY --from=build /app/hello /hello
CMD ["/hello"]
we get a significant size drop:
> docker image list
REPOSITORY TAG IMAGE ID CREATED SIZE
hellogo latest efef42f5a20c 11 seconds ago 21.1MB
That's because we used a smaller base image for our compiled binary. And you can push it even further if you want and/or need to.
Now, let's talk Python. Even if we use multi-stage builds, we still need an interpreter in the final image. An Alpine Linux version of the Python Docker image does exist, but we can't use it properly becaaause...
C-compiler dependency
To build Python from source you need gcc. If a package you're using is only available as an sdist, you will need gcc to compile it too. So if you're a musl user, you are in a bad spot.
PEP 656 addresses this issue, but it's just a recommendation, not a requirement. We will probably never see musl wheels for every package on PyPI (good luck building NumPy in Alpine Linux).
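The usual compromise is to stay on glibc and use a slim Debian-based image instead of Alpine. Here's a sketch of a multi-stage Python build under that assumption (the dependencies and entrypoint are placeholders):

```
FROM python:3.10-slim AS build
WORKDIR /app
COPY requirements.txt ./
# build wheels once, in a stage where compilers can be installed
RUN pip wheel --wheel-dir /wheels -r requirements.txt

FROM python:3.10-slim
WORKDIR /app
COPY --from=build /wheels /wheels
RUN pip install --no-index --find-links /wheels /wheels/*.whl
COPY . .
CMD ["python", "main.py"]
```

It's nowhere near a 21MB Go binary, but it keeps gcc and the build artifacts out of the final image.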
GIL
The Global Interpreter Lock, or GIL for short, is probably one of the things that made Python popular back in the day.
I would say that the design decision of the GIL is one of the things that made Python as popular as it is today.
— Larry Hastings, PyCon 2015
The GIL prevents race conditions and ensures thread safety by simply not allowing more than one thread to execute Python bytecode at a time. However, it also means that if you want to use multi-threading to boost a CPU-bound application's performance, your threads will simply run one after another, not simultaneously, often performing worse than a single-threaded run due to thread management overhead.
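You can see this yourself with a CPU-bound function: splitting the work across two threads doesn't make it faster, because only one thread holds the GIL at a time. A small sketch (the workload is arbitrary):

```python
import threading
import time

def count_down(n):
    # pure-Python CPU-bound loop: the GIL serializes it across threads
    while n > 0:
        n -= 1

N = 5_000_000

start = time.perf_counter()
count_down(N)
single = time.perf_counter() - start

start = time.perf_counter()
threads = [threading.Thread(target=count_down, args=(N // 2,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

# On a GIL build, "threaded" is typically no faster than "single"
print(f"single: {single:.3f}s  two threads: {threaded:.3f}s")
```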
You can use multi-processing if you need multi-core performance, or you can take advantage of asyncio if you are waiting on I/O that is independent of your application. Nevertheless, it's a shame that I can't run multi-threaded code in Python the way I do in other languages.