Discussion on: Python, Celery, Kubernetes, and Memory

For the first question, I assume you will have memory-heavy tasks (as you said, tasks that use pandas), and your original idea is to use a listener that is responsible for launching subprocesses, right?

I'd say that will work, but there is an easier solution that relies on the design of K8s and Celery themselves.

First, we can dedicate a specific queue to the memory-heavy tasks. There are many ways to route tasks to the queues you want; here is an example from the Celery documentation (docs.celeryq.dev/en/stable/usergui...).
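As a rough illustration of the routing step, here is a minimal sketch of a Celery settings module. The module name `celeryconfig`, the queue names `default` and `heavy`, and the task paths under `reports.` are all placeholders I made up, so swap in your own.

```python
# celeryconfig.py: a hypothetical settings module. Load it on your
# Celery app with app.config_from_object("celeryconfig").

# Tasks that have no explicit route fall back to this queue.
task_default_queue = "default"

# Send the memory-heavy tasks to a dedicated queue.
# The task names and the "heavy" queue name are placeholders.
task_routes = {
    "reports.tasks.build_dataframe": {"queue": "heavy"},
    # Glob patterns are accepted as keys as well.
    "reports.heavy.*": {"queue": "heavy"},
}
```

If you prefer to decide per call instead of in the config, passing `queue="heavy"` to `apply_async` should have the same effect for that one call.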

Second, we can run a Celery worker on K8s with more memory resources and specify the queue name in the worker startup command (docs.celeryq.dev/en/stable/usergui...). You can define the resources of the worker container on K8s by following this example (kubernetes.io/docs/concepts/config...).
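To make the worker side concrete, here is a small sketch that builds a Deployment manifest for such a worker as a Python dict and prints it as JSON, which `kubectl apply -f -` accepts. Everything specific here is an assumption: the app module `proj`, the image name, the queue name `heavy`, and the memory figures are placeholders, not recommendations.

```python
import json


def heavy_worker_deployment(image, queue="heavy",
                            memory_request="2Gi", memory_limit="4Gi"):
    """Build a Deployment manifest for a worker bound to one queue."""
    name = "celery-worker-" + queue
    labels = {"app": name}
    container = {
        "name": "worker",
        "image": image,
        # -Q restricts this worker to the given queue; "proj" is a
        # placeholder for your Celery app module.
        "command": ["celery", "-A", "proj", "worker",
                    "-Q", queue, "--concurrency=1"],
        # Memory request and limit for the worker container.
        "resources": {
            "requests": {"memory": memory_request},
            "limits": {"memory": memory_limit},
        },
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [container]},
            },
        },
    }


if __name__ == "__main__":
    manifest = heavy_worker_deployment("registry.example.com/proj:latest")
    print(json.dumps(manifest, indent=2))
```

Saved as, say, `make_worker.py`, this could be applied with `python make_worker.py | kubectl apply -f -`. I kept the concurrency at 1 so that a single pandas task gets the whole memory limit, but that is a judgment call for your workload.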

That way, the worker with more memory receives all the memory-heavy tasks that the application sends and processes them.

Hope it helps.