This post is on how to utilize postgres advisory locks to create introspectable asynchronous job orchestration between microservices
Pretext
Recently I found myself with a task which required creating asynchronous, introspectable, jobs.
Now, this issue is nothing new. In fact, there's plenty of really good solutions out there. This can be easily achieved with tools such as kafka, mqs, there's even tools like cadence which take this to the next level.
However, while these are all very good solutions, they can also be too much. What if all you have is a single postgres instance? Well, that's the situation I was in, and the solution was fairly simple โ advisory locks!
Before I go further, let me explain the situation I was in.
Situation
Lets say a user request something from your system which requires some process to be completed in the background.
It could take a minute, or an hour, or 3 months. No matter how long the task takes, the point is the response cannot be returned immediately.
In a simple world
In this example the users ask from the system to generate them an invoice.
Creating a database entry in a tasks table would be simple enough, it's in pending state while being worked on, and deleted or set to done when it's finished.
All we really need to do is be able to resume the task if the system running the task crashes or is interrupted.
However, we're running in a microservice environment, and that makes everything that used to be simple an adventure.
The microservice adventure
So what do we do now? We don't want 20 instances performing the same task at the same time.
In reality the "just make a database entry" approach is still valid, but now we need to know whether a process should resume processing any pending tasks once it restarts.
Of course, if a service crashes it won't have the chance to update the entry to say "available for pickup".
A simple solution, if you're using postgres, is to use advisory locks.
Advisory locks
PostgreSQL provides a means for creating locks that have application-defined meanings. These are called advisory locks, because the system does not enforce their use โ it is up to the application to use them correctly.
โ From Postgres Documentation
Essentially what advisory locks allow you to do is to create a flag without actually locking any resources.
In the case of transaction level advisory locks the lock will go away when you
A. Commit the transaction
B. Rollback the transaction (which includes the client disconnecting)
This is great for us, because we don't actually want to lock the table, as that would prevent any other tasks from using it. It also means the advisory lock will go away if the process crashes.
Here's how we can utilize it
This looks great, but in practice, there's a few gaps we need to fill.
One of them is that each advisory lock requires a bigint
as its identifier.
If your user IDs are already ints, then this is not an issue (you might only want to namespace them).
If they are IDs, such as KSUID or UUID then you need to hash them into an int. I won't go into this as there's plenty of existing sources already.
We can create a transaction-level lock using the below query
SELECT pg_try_advisory_xact_lock(42) -- 42 is the User ID
Query for checking if a lock exists would look something like this
SELECT granted FROM pg_locks WHERE locktype = 'advisory' AND objid = '42' -- 42 is the LockID which is our User ID
Resume on restart
The one last thing which remains is to pick up the existing tasks upon startup.
It could look something like this.
Essentially, upon restart, query the database for any tasks in pending state and see if there is an advisory lock for any of them. If there isn't one, that means that no one is processing them and it should be picked up!
That's it!
That's all it takes to create simple, asynchronous, interopable tasks.
Top comments (0)