Understanding Row-level locking in databases.

#database #webdev #backenddevelopment #python

Recently, on an ongoing side project, I was tasked with implementing endpoints to manage API Keys, one of the main requirements is to enforce a limit to the number of keys that can be allocated to a user. The obvious way to approach this would be creating a global variable carrying the maximum number of requestable keys and then checking for the number of user keys in the database against the value of that variable when the create endpoint is queried, then a key is allocated to the user or not. That should be about it, right? No! I would later find out; that enforcing that API key limit would take more than that. This is where row-level locking comes in.

Assume that you are a boring person (you are, and you know it) and only have two friends. One of your friends (let's call him Samuel) who traveled had told you and Ahmed (your second friend) that he had an item delivered to him awaiting pickup, but only one of you is to take his place and pick up the item. Because you're a boring and annoying person, you and Ahmed had a heated argument and stopped talking two days before one of you was supposed to go pick up the item. Unsure as to who will go, because you two had failed to communicate that because of the argument, you both choose to act independently even though you knew that only one of you should go as instructed by Joseph. You go out to do it yourself because you think your friend won't pick it up. Simultaneously, your friend departs with an identical thought process.

You are both surprised to encounter each other at the pickup station, as well as perplexed. Even worse, the station's attendant is unsure about who to give the item to.

Since the terms of the agreement were unclear, this has disrupted the process of obtaining the item. You two have no idea what the other is doing, and if you had coordinated and communicated earlier, this uncertainty would have been averted.

In database terms, this is what happens when row-level locking isn't used properly during critical operations. If multiple requests (like you and your friend) try to access or update the same piece of data (pick up the item), without row-level locking, there's no coordination or restriction to prevent them from working on the same record simultaneously. This can lead to race conditions, data inconsistency, or even conflicts that disrupt the intended workflow.

The case between you and your friend, with the lack of communication, can be tied to the need for setting up row-level locking on the endpoint that handles creating an API key, so as to not allow un-intentional or malicious concurrent request try to bypass the limit.

What is row locking?

By now, you probably have an overview as to what row-level locking is, row-level locking is one of three levels of data locking that involves managing access to the modification or retrieval of a particular item or row, whilst allowing other rows that do not need to be accessed. Other levels of data locking are table locking and page locking. Row-level locking is widely praised over the other two because it increases concurrency and availability by allowing more transactions to access and modify the same table concurrently, without blocking or waiting for each other.

Types of row locking

Row-level locking differs based on the condition and operation we are trying to enforce. They are two most commonly used types of row-level locking, namely:

1. Shared Lock (S Lock)

S-lock allows multiple transactions to read a row but prevents any from writing to it and is useful in cases when transactions only need to read data without making any modification. For example, multiple users can view the same data row, but none can update it until the shared lock is released.

2. Exclusive Lock (X Lock)

X lock is used to prevent any other transaction from reading or writing to the locked row and usually comes into play when a transaction needs to both read and write to a row, ensuring that no other transaction can modify the row until the lock is released. In my case, when a user is requesting to generate an API key, an exclusive lock ensures that the row can't be accessed by another request until the operation is complete.

Other types of row locking include:

Update Lock
Intent Lock
Key-Range Lock
Deadlock Avoidance Locks

Pitfalls of Row-level locking

While row-level locking provides fine-grained control and helps maintain high concurrency, it comes with trade-offs, some of which are:

Deadlocks
Deadlocks happen when two or more transactions are waiting for each other to release locks, which causes a circular wait that cannot be resolved without intervention (e.g., transaction rollbacks). Due to many row-level locks being held by different transactions, the likelihood of deadlocks increases.
Increased Overhead
When managing row-level locks, it often requires more system resources (memory and CPU) than table-level or page-level locks, because the database needs to track each individual row being locked, which can become resource-intensive, especially with high-concurrency environments.
Potential for Lock Escalation
In some databases, if a transaction locks too many rows, the database system might escalate the row-level locks to page-level or even table-level locks to conserve resources which leads to the blocking of other transactions that should have only been blocked on specific rows. Lock escalation can reduce concurrency and cause performance issues in some cases.

Implementation of Row-level locking

Imagine you have a banking app where users can transfer money between accounts. Without row-level locking, two concurrent requests could lead to incorrect balances (e.g., both transactions reading the same balance before updating it, which would cause the account to become overdrawn).

We'll implement row-level locking with Django Rest Framework transaction module and select_for_update() ORM method in other to lock the user account row while updating balances.

After having set up your DRF(Django Rest Framework) project.

Models:

Firstly, we'll create the schema for our database table, named Account, with three columns to carry the owner of the account, and the balance in the account.

from django.db import models

class Account(models.Model):
    user = models.ForeignKey('auth.User', on_delete=models.CASCADE)
    balance = models.DecimalField(max_digits=12, decimal_places=2)

    def __str__(self):
        return f"Account of {self.user.username} with balance {self.balance}"

Serializer:
Then we will setup our serializer to handle three fields that will be passed along with the request.

from rest_framework import serializers
from .models import Account

class TransferSerializer(serializers.Serializer):
    from_account_id = serializers.IntegerField()
    to_account_id = serializers.IntegerField()
    amount = serializers.DecimalField(max_digits=12, decimal_places=2)

    def validate_amount(self, value):
        if value <= 0:
            raise serializers.ValidationError("Transfer amount must be greater than zero.")
        return value

View:

After we have our models in place, our view should look like the below snippet. Here we utilize the transaction module and DRFs select_for_update() ORM method.

from rest_framework import status, views
from rest_framework.response import Response
from django.db import transaction
from .models import Account
from .serializers import TransferSerializer

class InsufficientFundsException(Exception):
    """Custom exception for insufficient funds."""
    pass

class TransferFundsView(views.APIView):

    def post(self, request, *args, **kwargs):
        serializer = TransferSerializer(data=request.data)
        serializer.is_valid(raise_exception=True)

        from_account_id = serializer.validated_data['from_account_id']
        to_account_id = serializer.validated_data['to_account_id']
        amount = serializer.validated_data['amount']

        # Perform the transfer inside a transaction with row-level locking
        with transaction.atomic():
            # Lock the rows for both the sender's and receiver's accounts
            from_account = Account.objects.select_for_update().get(id=from_account_id)
            to_account = Account.objects.select_for_update().get(id=to_account_id)

            # Check if the sender has sufficient balance
            if from_account.balance < amount:
                raise InsufficientFundsException("Insufficient funds in the sender's account.")

            # Perform the transfer by deducting from sender and adding to receiver
            from_account.balance -= amount
            to_account.balance += amount

            # Save both account updates
            from_account.save()
            to_account.save()

        return Response({'status': 'success', 'message': f'Transferred {amount} successfully'}, status=status.HTTP_200_OK)

    def handle_exception(self, exc):
        """Handle custom exceptions."""
        if isinstance(exc, InsufficientFundsException):
            return Response({'detail': str(exc)}, status=status.HTTP_400_BAD_REQUEST)
        return super().handle_exception(exc)