<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Lukas Klein</title>
    <description>The latest articles on DEV Community by Lukas Klein (@lukasklein).</description>
    <link>https://dev.to/lukasklein</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F395709%2Fb748c95b-6e02-45ea-9a0a-ad77394ebbd8.jpeg</url>
      <title>DEV Community: Lukas Klein</title>
      <link>https://dev.to/lukasklein</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/lukasklein"/>
    <language>en</language>
    <item>
      <title>Counting the queued Celery tasks</title>
      <dc:creator>Lukas Klein</dc:creator>
      <pubDate>Thu, 13 May 2021 14:44:31 +0000</pubDate>
      <link>https://dev.to/lukasklein/counting-the-queued-celery-tasks-4bn</link>
      <guid>https://dev.to/lukasklein/counting-the-queued-celery-tasks-4bn</guid>
      <description>&lt;p&gt;If you are using Redis for your Celery queues, there's a way to count the number of queued tasks grouped by the task without the need for an external tool. Note that this approach assumes that you're using the JSON serializer for your tasks.&lt;/p&gt;

&lt;p&gt;Using a Redis client of your choice (I chose &lt;a href="https://github.com/IBM-Cloud/redli"&gt;redli&lt;/a&gt;, which supports TLS), use &lt;code&gt;LRANGE&lt;/code&gt; to fetch the queued tasks, pipe them through &lt;code&gt;jq&lt;/code&gt; to extract &lt;code&gt;.headers.task&lt;/code&gt;, then &lt;code&gt;sort&lt;/code&gt; them and count the unique names with &lt;code&gt;uniq -c&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; redli &lt;span class="o"&gt;[&lt;/span&gt;your connection parameters] lrange celery 0 30000 | jq &lt;span class="s1"&gt;'.headers.task'&lt;/span&gt; | &lt;span class="nb"&gt;sort&lt;/span&gt; | &lt;span class="nb"&gt;uniq&lt;/span&gt; &lt;span class="nt"&gt;-c&lt;/span&gt;

 533 &lt;span class="s2"&gt;"devices.adapters.lorawan.tasks.run_lora_payload_decoder"&lt;/span&gt;
   2 &lt;span class="s2"&gt;"devices.adapters.particle.tasks.run_particle_payload_decoder"&lt;/span&gt;
  92 &lt;span class="s2"&gt;"devices.tasks.call_device_function"&lt;/span&gt;
8556 &lt;span class="s2"&gt;"devices.tasks.ping_device"&lt;/span&gt;
9682 &lt;span class="s2"&gt;"devices.tasks.process_device_field_rules"&lt;/span&gt;
   5 &lt;span class="s2"&gt;"devices.tasks.send_device_offline_email"&lt;/span&gt;
   2 &lt;span class="s2"&gt;"dzeroos.tasks.call_command"&lt;/span&gt;
   8 &lt;span class="s2"&gt;"dzeroos.tasks.flush_command_queue"&lt;/span&gt;
   1 &lt;span class="s2"&gt;"dzeroos.tasks.publish_device_config"&lt;/span&gt;
   1 &lt;span class="s2"&gt;"dzeroos.tasks.publish_protobuf_config"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
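&lt;p&gt;The same grouping can also be done from Python instead of the shell. Here is a sketch using &lt;code&gt;collections.Counter&lt;/code&gt; (the queue name &lt;code&gt;celery&lt;/code&gt;, the connection details, and redis-py as the client are assumptions):&lt;/p&gt;

```python
import json
from collections import Counter

def count_queued_tasks(raw_messages):
    # With the JSON serializer, each queued Celery message is a JSON
    # document whose headers carry the task name.
    return Counter(json.loads(raw)["headers"]["task"] for raw in raw_messages)

# Usage sketch with redis-py (assumed client and queue name):
#   import redis
#   r = redis.Redis()  # your connection parameters
#   for name, count in count_queued_tasks(r.lrange("celery", 0, 30000)).most_common():
#       print(count, name)
```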



&lt;h2&gt;
  Bonus: statistics on task details
&lt;/h2&gt;

&lt;p&gt;In my case, I needed some insight into one of the parameters of a single task (I was investigating a loop that caused a long queue). This required some more &lt;code&gt;jq&lt;/code&gt; and bash magic and probably won't fit your use case exactly, but I'm pasting it here for reference. Note that &lt;code&gt;base64 -D&lt;/code&gt; is the macOS flag; on Linux, use &lt;code&gt;base64 -d&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; redli &lt;span class="o"&gt;[&lt;/span&gt;your connection parameters] lrange celery 0 1000 | jq &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s1"&gt;'. | select(.headers.task | contains("taskname")) .body'&lt;/span&gt; | &lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="nb"&gt;read&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; line&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$line&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | &lt;span class="nb"&gt;tr&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'"'&lt;/span&gt; | &lt;span class="nb"&gt;base64&lt;/span&gt; &lt;span class="nt"&gt;-D&lt;/span&gt; | jq &lt;span class="s1"&gt;'.[0][0]'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;done&lt;/span&gt; | &lt;span class="nb"&gt;sort&lt;/span&gt; | &lt;span class="nb"&gt;uniq&lt;/span&gt; &lt;span class="nt"&gt;-c&lt;/span&gt;

  15 &lt;span class="s2"&gt;"04576f6e-d5d1-45f4-8eef-a17e015335f4"&lt;/span&gt;
   9 &lt;span class="s2"&gt;"05264cc7-ae60-4f4f-9a18-2451e8d83f65"&lt;/span&gt;
  25 &lt;span class="s2"&gt;"4e240129-b84e-4e70-9f85-0e06f7a01875"&lt;/span&gt;
 224 &lt;span class="s2"&gt;"6c6a9aeb-10c7-417f-a928-791399d8adb9"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>django</category>
      <category>celery</category>
      <category>redis</category>
      <category>bash</category>
    </item>
    <item>
      <title>Debouncing a Celery task</title>
      <dc:creator>Lukas Klein</dc:creator>
      <pubDate>Thu, 28 Jan 2021 13:42:12 +0000</pubDate>
      <link>https://dev.to/lukasklein/debouncing-a-celery-task-6pl</link>
      <guid>https://dev.to/lukasklein/debouncing-a-celery-task-6pl</guid>
      <description>&lt;p&gt;&lt;a href="https://docs.celeryproject.org/en/stable/getting-started/introduction.html"&gt;Celery&lt;/a&gt; is a powerful task queue with many great features already built-in. Unfortunately, debouncing tasks is not one of them (or I didn't find it in the docs), but fortunately, it's not that hard to build it yourself.&lt;/p&gt;

&lt;h2&gt;
  What is debouncing
&lt;/h2&gt;

&lt;p&gt;What is debouncing, you may ask? Let's start with a simple example that, while not directly applicable to the Python world, should make the concept easier to understand. Say you have an input element on a webpage. To implement auto-completion, you have to query a server-side API with the user's input. If you did that on every keystroke, you would probably fry your server. Instead, you only want to send the request once the user has stopped typing for, say, 500ms.&lt;/p&gt;

&lt;p&gt;To do this, you could start a timeout after a keystroke that, once expired, queries the server API. Instead of starting a separate timeout after every keystroke, you restart the same timeout over and over again.&lt;br&gt;
This way, the timeout expires and queries the server 500ms after the user has stopped typing.&lt;/p&gt;
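&lt;p&gt;Outside the browser, the same restart-the-timer idea can be sketched in a few lines of plain Python (a toy illustration using &lt;code&gt;threading.Timer&lt;/code&gt;, not the Celery approach described in this post; all names here are made up):&lt;/p&gt;

```python
import threading

class Debouncer:
    """Restart a timer on every call; run fn only once calls have stopped."""

    def __init__(self, wait_seconds, fn):
        self.wait_seconds = wait_seconds
        self.fn = fn
        self._timer = None

    def call(self, *args):
        # Cancel any pending timer and start a fresh one, so fn fires
        # exactly once, wait_seconds after the most recent call.
        if self._timer is not None:
            self._timer.cancel()
        self._timer = threading.Timer(self.wait_seconds, self.fn, args)
        self._timer.start()
```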
&lt;h2&gt;
  Why I needed it
&lt;/h2&gt;

&lt;p&gt;Another real-world application is the so-called Downlinks functionality on &lt;a href="https://datacake.co"&gt;Datacake&lt;/a&gt;. Datacake is an IoT platform that not only lets you receive data from devices ("Uplink") but also send data back to them ("Downlink"). Downlinks on Datacake can be triggered by different events, which could lead to the same Downlink being sent several times within a short period. To avoid this, I implemented debouncing (this time for a Celery task, finally) so that only the last Downlink within a given period is actually sent.&lt;/p&gt;
&lt;h2&gt;
  Ingredients used
&lt;/h2&gt;

&lt;p&gt;Aside from Celery itself, I used Redis to store a temporary counter. Since we were already using Redis as our Celery broker, it was available anyway.&lt;/p&gt;
&lt;h2&gt;
  Just show me the code
&lt;/h2&gt;

&lt;p&gt;Say you have a task like this that you want to debounce:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;celery_app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;send_downlink&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;downlink&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Do some magic
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;First, I renamed the real task to a "private, do not use directly" one:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;celery_app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_send_downlink&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;downlink&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Do some magic
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I then created a new task under the official name. Every time it is called, it increments a Redis counter keyed by a unique identifier built from the task and its arguments (whatever you want to debounce), and then schedules the internal task with a countdown (your debounce time):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;celery_app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;send_downlink&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;downlink&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;redis_con&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;incr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s"&gt;"debounce-downlink-&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;-&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;downlink&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;_send_downlink&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;apply_async&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;downlink&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;countdown&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally, in the internal task, I decrement the counter and check the new value: if it is still greater than 0, the task was called again during the debounce period, so this run returns early. Only the "last call" (counter &amp;lt;= 0) does the actual work:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;celery_app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;send_downlink&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;downlink&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;redis_con&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;decr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s"&gt;"debounce-downlink-&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;-&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;downlink&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;debug&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="s"&gt;"Debounce hit for &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;downlink&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; on &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;device&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt;
    &lt;span class="c1"&gt;# Do some magic
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The result is a Celery task that you can call as many times as you want, but that only executes once, after calls have stopped for your chosen debounce timeout.&lt;/p&gt;

</description>
      <category>celery</category>
      <category>python</category>
      <category>django</category>
      <category>redis</category>
    </item>
    <item>
      <title>Migrating a Digital Ocean LoadBalancer to another Kubernetes cluster</title>
      <dc:creator>Lukas Klein</dc:creator>
      <pubDate>Thu, 28 Jan 2021 10:34:46 +0000</pubDate>
      <link>https://dev.to/lukasklein/migrating-a-digital-ocean-loadbalancer-to-another-kubernetes-cluster-44dl</link>
      <guid>https://dev.to/lukasklein/migrating-a-digital-ocean-loadbalancer-to-another-kubernetes-cluster-44dl</guid>
      <description>&lt;p&gt;At &lt;a href="https://datacake.co"&gt;Datacake&lt;/a&gt;, we provide our customers with the option to white-label their IoT platform. To do this, they set their DNS to point to a Load Balancer on our Kubernetes cluster, which is running on Digital Ocean.&lt;/p&gt;

&lt;p&gt;In an ideal world, all customers would use a &lt;code&gt;CNAME&lt;/code&gt; on a subdomain or an &lt;code&gt;ALIAS&lt;/code&gt; on an apex (root) domain. In the real world, not all DNS providers support the &lt;a href="https://blog.dnsimple.com/2011/11/introducing-alias-record/"&gt;introduced-in-2011 ALIAS&lt;/a&gt; record type, so most customers with an apex domain resort to a classic A record.&lt;/p&gt;

&lt;p&gt;Recently, we migrated to a new Kubernetes cluster (for various reasons, we couldn’t use the old one anymore). Unfortunately, Load Balancers on Digital Ocean still don’t support floating IPs in 2021, so we somehow had to move the existing Load Balancer to spare our customers from updating their DNS. According to their support, there’s no way to do this. &lt;strong&gt;Spoiler: there is.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;While going through the resource definition of the old service, I noticed an annotation called &lt;code&gt;kubernetes.digitalocean.com/load-balancer-id&lt;/code&gt;. Its value looked a lot like a UUID and, as it turned out, is exactly what it says it is: the ID of the Load Balancer object on Digital Ocean.&lt;/p&gt;

&lt;p&gt;So I spun up a test Load Balancer, copied its ID, created a new Kubernetes service of type &lt;code&gt;LoadBalancer&lt;/code&gt; with an annotation containing that ID and, voilà, the Load Balancer was quickly populated with our nodes. To keep the old cluster from interfering with the new one, I first created another temporary Load Balancer and pointed the old service’s annotation at its ID; then I created a service on the new cluster carrying the ID of the original Load Balancer. After that, I deleted the old service, which also removed the temporary Load Balancer.&lt;br&gt;
Mission accomplished.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Tl;Dr: You can create a Kubernetes service of type LoadBalancer with an annotation called &lt;code&gt;kubernetes.digitalocean.com/load-balancer-id&lt;/code&gt;, which will tell the system to use an existing Digital Ocean Load Balancer instead of creating a new one.&lt;/p&gt;
&lt;/blockquote&gt;
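&lt;p&gt;For reference, a minimal sketch of such a service (the name, selector, ports, and the Load Balancer ID below are placeholders; the annotation value must be the ID of the existing Load Balancer):&lt;/p&gt;

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-loadbalancer  # placeholder name
  annotations:
    # ID of the existing Digital Ocean Load Balancer to adopt
    kubernetes.digitalocean.com/load-balancer-id: "00000000-0000-0000-0000-000000000000"
spec:
  type: LoadBalancer
  selector:
    app: my-app  # placeholder selector
  ports:
    - port: 443
      targetPort: 8443
```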

</description>
      <category>kubernetes</category>
      <category>digitalocean</category>
    </item>
  </channel>
</rss>
