In the world of web development, concurrency is a double-edged sword. While it allows our applications to serve thousands of users simultaneously, it also introduces a class of subtle, insidious bugs known as race conditions. These bugs occur when the outcome of an operation depends on the unpredictable sequence of events, like multiple requests trying to modify the same piece of data at the same time. The result? Corrupted data, inconsistent application state, and very confused users.
Imagine a system for managing freelance projects. A project is posted, proposals are submitted, and a project manager selects the winning bid. What happens if two managers, viewing the same project, click the "Accept Proposal" button for two different freelancers at nearly the same instant? Without proper safeguards, you might end up with two accepted proposals for a project that only needs one, leading to a logistical and contractual nightmare. This is a classic race condition.
While application-level locks exist, they can be complex to manage in a distributed environment. Fortunately, for those of us using robust transactional databases like PostgreSQL with Django, there's a powerful, built-in solution: the row-level lock. In this deep dive, we'll explore how to master PostgreSQL's SELECT ... FOR UPDATE statement through the Django ORM to build resilient, race-condition-free applications.
The Anatomy of a Race Condition
At its core, a race condition is born from the "read-modify-write" cycle. A process reads a value from the database, makes a decision based on that value, and then writes a new value back. The danger lies in the tiny gap between the read and the write. If another process slips in and modifies the same data during that interval, the first process's decision becomes invalid, yet it proceeds with its write, overwriting the other process's changes and corrupting the state.
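The read-modify-write gap can be made concrete with a deterministic, plain-Python sketch. No database is involved here; the "database" is just a dict, and the dangerous interleaving is written out by hand so the lost update happens every time:

```python
# A deterministic, in-memory sketch of the lost-update problem.
db = {"status": "OPEN", "accepted": []}

# Step 1: both requests READ before either one writes.
status_seen_by_a = db["status"]   # A reads 'OPEN'
status_seen_by_b = db["status"]   # B reads 'OPEN' too -- A hasn't written yet

# Step 2: both requests decide based on their (now stale) reads, then write.
if status_seen_by_a == "OPEN":
    db["status"] = "ASSIGNED"
    db["accepted"].append("proposal_1")  # A's write

if status_seen_by_b == "OPEN":           # still True: B checked a stale value
    db["status"] = "ASSIGNED"
    db["accepted"].append("proposal_2")  # B's write -- the double acceptance

print(db["accepted"])  # ['proposal_1', 'proposal_2'] -- two accepted proposals
```

Both checks pass because each decision was made before the other write landed, which is exactly the failure mode we're about to reproduce with real Django models.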
Let's model our project management scenario in a simplified Django view. We have Project and Proposal models. A project can be 'OPEN' or 'ASSIGNED'.
```python
# models.py
from django.db import models


class Project(models.Model):
    STATUS_CHOICES = [('OPEN', 'Open'), ('ASSIGNED', 'Assigned')]

    title = models.CharField(max_length=200)
    status = models.CharField(max_length=10, choices=STATUS_CHOICES, default='OPEN')


class Proposal(models.Model):
    STATUS_CHOICES = [('PENDING', 'Pending'), ('ACCEPTED', 'Accepted'), ('REJECTED', 'Rejected')]

    project = models.ForeignKey(Project, on_delete=models.CASCADE, related_name='proposals')
    freelancer_name = models.CharField(max_length=100)
    status = models.CharField(max_length=10, choices=STATUS_CHOICES, default='PENDING')
```
Now, here's a naive implementation of the view to accept a proposal. This code is dangerously susceptible to a race condition:
```python
# views.py (Flawed Version)
from django.http import HttpResponse, HttpResponseBadRequest

from .models import Project, Proposal


def accept_proposal(request, proposal_id):
    # 1. Read the proposal and its associated project
    try:
        proposal = Proposal.objects.get(pk=proposal_id)
        project = proposal.project
    except Proposal.DoesNotExist:
        return HttpResponseBadRequest("Proposal not found.")

    # 2. Check if the project is still open (THE DANGER ZONE)
    if project.status == 'OPEN':
        # 3. Modify the state in memory
        project.status = 'ASSIGNED'
        proposal.status = 'ACCEPTED'
        # 4. Write the changes back to the database
        project.save()
        proposal.save()
        # Reject all other proposals for this project
        project.proposals.filter(status='PENDING').update(status='REJECTED')
        return HttpResponse("Proposal accepted successfully!")
    else:
        return HttpResponseBadRequest("This project has already been assigned.")
```
Let's trace the potential disaster:
- Request A loads proposal #1 for project #100. It executes the `if project.status == 'OPEN':` check, which passes.
- The operating system preempts Request A's thread and switches to Request B, which is trying to accept proposal #2 for the same project #100.
- Request B also executes the `if project.status == 'OPEN':` check. Since Request A hasn't written anything back yet, this check also passes.
- Request B sets `project.status = 'ASSIGNED'`, saves the project, saves its proposal as 'ACCEPTED', and commits its transaction.
- The OS switches back to Request A. It has already passed its check, so it's oblivious to what Request B just did. It also sets `project.status = 'ASSIGNED'`, saves the project (overwriting B's save), and saves its proposal as 'ACCEPTED'.
The result: The database now shows project #100 as assigned, but two proposals are marked as 'ACCEPTED'. Data integrity is lost.
The Database Lock: SELECT ... FOR UPDATE
This is where the database itself can enforce order. A transaction is a sequence of operations performed as a single logical unit of work. The ACID properties (Atomicity, Consistency, Isolation, Durability) of transactions are a great start, but the default isolation levels don't always prevent this specific read-modify-write problem.
PostgreSQL gives us a more explicit tool: pessimistic locking. The SQL statement SELECT ... FOR UPDATE does exactly what it sounds like: it selects rows and immediately places a write lock on them. Any other transaction that attempts to acquire a write lock on those same rows will be forced to wait until the first transaction either COMMITs or ROLLBACKs.
The raw SQL looks like this:
```sql
BEGIN;
-- This query will block if another transaction has already locked this row.
SELECT * FROM myapp_project WHERE id = 100 FOR UPDATE;
-- ... perform application logic ...
UPDATE myapp_project SET status = 'ASSIGNED' WHERE id = 100;
UPDATE myapp_proposal SET status = 'ACCEPTED' WHERE id = 1;
COMMIT;
```
When a second transaction tries to run the same SELECT ... FOR UPDATE on id = 100, it will pause execution and wait. Once the first transaction commits, the second transaction will acquire the lock, read the now-updated row (where status is already 'ASSIGNED'), and its logic will correctly determine that it can no longer proceed.
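If you want to feel this blocking behaviour without a PostgreSQL instance handy, Python's standard-library `sqlite3` can serve as a rough stand-in. Note the caveat in the comments: SQLite takes a write lock on the whole database file rather than on individual rows, but the "second writer waits until the first commits" semantics are analogous:

```python
import os
import sqlite3
import tempfile

# SQLite as a stdlib stand-in: it locks the whole database (not per row),
# but the "second writer must wait for COMMIT" behaviour is analogous to
# SELECT ... FOR UPDATE. isolation_level=None gives us manual transactions.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn_a = sqlite3.connect(path, timeout=0.2, isolation_level=None)
conn_b = sqlite3.connect(path, timeout=0.2, isolation_level=None)

conn_a.execute("CREATE TABLE project (id INTEGER PRIMARY KEY, status TEXT)")
conn_a.execute("INSERT INTO project VALUES (100, 'OPEN')")

conn_a.execute("BEGIN IMMEDIATE")  # transaction A takes the write lock
conn_a.execute("UPDATE project SET status = 'ASSIGNED' WHERE id = 100")

blocked = False
try:
    conn_b.execute("BEGIN IMMEDIATE")  # transaction B must wait for the lock...
except sqlite3.OperationalError:
    blocked = True                     # ...and times out: 'database is locked'

conn_a.execute("COMMIT")               # releasing the lock lets B proceed
conn_b.execute("BEGIN IMMEDIATE")      # now succeeds immediately
status = conn_b.execute("SELECT status FROM project WHERE id = 100").fetchone()[0]
conn_b.execute("ROLLBACK")

print(blocked, status)  # True ASSIGNED -- B waited, then saw A's committed state
```

The pattern to notice: B's attempt to start a write transaction fails while A holds the lock, and after A commits, B reads the already-updated state.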
The Django Way: transaction.atomic and .select_for_update()
Writing raw SQL is powerful, but Django provides a clean, Pythonic, and database-agnostic way to leverage this feature. The solution involves combining two key pieces of the ORM: django.db.transaction.atomic and the queryset method .select_for_update().
Let's refactor our broken view into a robust, race-condition-proof version:
```python
# views.py (Corrected Version)
from django.db import transaction
from django.http import HttpResponse, HttpResponseBadRequest

from .models import Project, Proposal


def accept_proposal_safe(request, proposal_id):
    try:
        # Start an atomic transaction block.
        with transaction.atomic():
            proposal = Proposal.objects.select_related('project').get(pk=proposal_id)
            # Lock the associated project row. The lock is acquired here!
            project = Project.objects.select_for_update().get(pk=proposal.project_id)

            # The check now happens inside the locked transaction.
            if project.status == 'OPEN':
                # Modify and write changes.
                project.status = 'ASSIGNED'
                project.save()
                proposal.status = 'ACCEPTED'
                proposal.save()
                project.proposals.filter(status='PENDING').update(status='REJECTED')
                return HttpResponse("Proposal accepted successfully!")
            else:
                # If the project is not open, nothing has been modified,
                # so the transaction commits with no changes.
                return HttpResponseBadRequest("This project has already been assigned.")
    except Proposal.DoesNotExist:
        return HttpResponseBadRequest("Proposal not found.")
```
Let's break down why this works:
- `with transaction.atomic():` This context manager ensures that all database operations within the block are wrapped in a single transaction. If any exception is raised, the transaction is automatically rolled back, leaving the database in its original state.
- `Project.objects.select_for_update().get(...)`: This is the magic. When this line is executed, Django issues a `SELECT ... FOR UPDATE` statement to the database. It finds the project row, and PostgreSQL places a lock on it. This lock is held until the `atomic` block is exited (either by successful completion or by an exception).
Now, when our two concurrent requests arrive:
- Request A enters the `atomic` block and executes `.select_for_update()`, acquiring a lock on project #100.
- Request B enters its `atomic` block and also attempts to execute `.select_for_update()` on project #100. PostgreSQL sees the existing lock and makes Request B's database connection wait. It is now blocked.
- Request A proceeds, sees the project status is 'OPEN', updates it to 'ASSIGNED', updates the proposal, and successfully completes its `atomic` block. The transaction is committed, and the lock on project #100 is released.
- Request B is immediately unblocked. Its `.select_for_update().get()` call now completes, and it receives the project data as it exists after Request A's commit. It sees that `project.status` is now 'ASSIGNED'.
- Request B's `if` condition fails, and it correctly returns the `HttpResponseBadRequest`, preventing a double acceptance.
Data integrity is preserved, and the race condition is eliminated.
Advanced Locking Strategies and Best Practices
select_for_update() has more tricks up its sleeve for fine-grained control.
Non-Blocking Locks: nowait=True
Sometimes, you don't want a process to wait. You'd rather it fail fast and try again later, or inform the user immediately. Passing `nowait=True` will cause the query to raise a `django.db.DatabaseError` (specifically its `OperationalError` subclass) if it cannot acquire the lock immediately.
```python
from django.db import DatabaseError

# ... inside an atomic block
try:
    project = Project.objects.select_for_update(nowait=True).get(pk=project_id)
    # ... proceed with logic
except DatabaseError:
    # Could not acquire the lock; handle it gracefully
    return HttpResponse(
        "This project is being processed by another user. "
        "Please try again in a moment.",
        status=409,
    )
```
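A common companion to `nowait=True` is a small retry helper. The sketch below is generic Python, not a Django API: `retry`, `attempts`, and `delay` are illustrative names, and the flaky callable stands in for a lock acquisition that fails a couple of times before succeeding:

```python
import time

# Generic retry helper -- `retry`, `attempts`, and `delay` are illustrative
# names, not Django APIs. You might wrap a nowait=True lookup in something
# like this, retrying briefly when the lock could not be acquired.
def retry(fn, exceptions, attempts=3, delay=0.05):
    for attempt in range(attempts):
        try:
            return fn()
        except exceptions:
            if attempt == attempts - 1:
                raise                # out of attempts: re-raise to the caller
            time.sleep(delay)        # brief backoff before trying again

# Stub demonstration: fails twice with a lock-style error, then succeeds.
calls = {"n": 0}
def flaky_lookup():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("could not obtain lock")
    return "locked row"

print(retry(flaky_lookup, RuntimeError))  # 'locked row' on the third attempt
```

Keep the attempt count and delay small: if a lock is regularly contested for longer than a few retries, that's a signal to redesign the critical section rather than wait harder.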
Skipping Locked Rows: skip_locked=True
This is incredibly useful for implementing work queues. Imagine multiple worker processes wanting to grab the next available task from a Task table. You don't want them all waiting for the same row; you want each worker to grab a different, unlocked row.
```python
# Worker process logic
with transaction.atomic():
    # Get the first available, unlocked task
    task = (
        Task.objects.select_for_update(skip_locked=True)
        .filter(status='PENDING')
        .first()
    )
    if task:
        task.status = 'PROCESSING'
        task.save()
        # Now process the task...
```
With this pattern, if five workers run this code simultaneously, they will each grab a different pending task without blocking each other, dramatically improving throughput.
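The skip-don't-wait idea has a direct analogue in plain Python: a non-blocking `lock.acquire(blocking=False)`. This sketch is an analogy, not the ORM; each in-memory task carries its own lock, and a worker claims a task only if the non-blocking acquire succeeds, skipping ahead otherwise:

```python
import threading

# Plain-Python analogy for skip_locked: each task has a lock, and a worker
# claims a task only when a non-blocking acquire succeeds -- otherwise it
# skips ahead instead of waiting, just like SKIP LOCKED.
tasks = [{"id": i, "lock": threading.Lock(), "claimed_by": None} for i in range(5)]

def worker(name):
    for task in tasks:
        if task["lock"].acquire(blocking=False):  # analogous to skip_locked=True
            task["claimed_by"] = name             # safe: we hold this task's lock
            # (in the database version, the lock is released at COMMIT)

threads = [threading.Thread(target=worker, args=(f"w{i}",)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every task was claimed by exactly one worker: no blocking, no double claims.
print(all(t["claimed_by"] is not None for t in tasks))  # True
```

The guarantee mirrors the database version: whichever worker wins the acquire owns that task, and the others move on without stalling.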
Best Practices
- Keep Locked Transactions Short: A lock is a bottleneck by design. Hold it for the shortest time possible. Perform any slow, non-database operations (like calling external APIs or performing heavy computations) before you enter the `atomic` block and acquire the lock.
- Acquire Locks in a Consistent Order: If you need to lock multiple rows (e.g., a User and their Wallet), always lock them in the same order everywhere in your codebase (e.g., always lock User, then Wallet). Locking in inconsistent orders can lead to deadlocks, where Transaction A is waiting for a lock held by B, and B is waiting for a lock held by A.
- Test Your Locking Logic: Concurrency bugs are hard to reproduce manually. Write automated tests to validate your locks. Python's built-in `threading` module is perfect for this.
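The consistent-ordering rule itself can be illustrated with plain `threading.Lock` objects before we get to the database test. In this sketch, each thread asks for the two locks in the opposite order, but sorting them (here by `id()`, an arbitrary-but-global key) before acquiring means both threads take them in the same order, so neither can hold one lock while waiting on the other:

```python
import threading

# Consistent ordering with plain locks: both threads want the same two
# locks. Each sorts them before acquiring, so every caller locks in one
# global order and a deadlock cycle can never form.
lock_user = threading.Lock()
lock_wallet = threading.Lock()
finished = []

def transfer(name, first, second):
    a, b = sorted([first, second], key=id)  # one global acquisition order
    with a:
        with b:
            finished.append(name)

# Each thread *requests* the locks in the opposite order...
t1 = threading.Thread(target=transfer, args=("t1", lock_user, lock_wallet))
t2 = threading.Thread(target=transfer, args=("t2", lock_wallet, lock_user))
t1.start(); t2.start()
t1.join(); t2.join()

print(sorted(finished))  # ['t1', 't2'] -- both completed; no deadlock
```

Remove the `sorted(...)` and acquire `first` then `second` directly, and the same two threads can deadlock under unlucky scheduling; the ordering is the entire fix.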
Here's a simplified Django test case to prove our fix works:
```python
# tests.py
import threading

from django.db import connection, transaction
from django.test import TransactionTestCase

from .models import Project, Proposal


class ProjectLockingTest(TransactionTestCase):
    # TransactionTestCase (not TestCase) is required here: TestCase wraps each
    # test in a transaction that worker threads, running on their own database
    # connections, would never be able to see.
    def test_concurrent_proposal_acceptance(self):
        project = Project.objects.create(title="Test Project", status='OPEN')
        p1 = Proposal.objects.create(project=project, freelancer_name='Alice')
        p2 = Proposal.objects.create(project=project, freelancer_name='Bob')

        # Use a barrier to synchronize the threads' start times
        barrier = threading.Barrier(2)
        results = {}

        def accept_worker(proposal_id, worker_name):
            # A simplified call to the core logic from our view
            try:
                barrier.wait()
                with transaction.atomic():
                    proj_locked = Project.objects.select_for_update().get(pk=project.id)
                    if proj_locked.status == 'OPEN':
                        prop = Proposal.objects.get(pk=proposal_id)
                        proj_locked.status = 'ASSIGNED'
                        prop.status = 'ACCEPTED'
                        proj_locked.save()
                        prop.save()
                        results[worker_name] = 'success'
                    else:
                        results[worker_name] = 'failure'
            except Exception as e:
                results[worker_name] = str(e)
            finally:
                # Each thread gets its own DB connection; close it explicitly.
                connection.close()

        thread1 = threading.Thread(target=accept_worker, args=(p1.id, 'worker1'))
        thread2 = threading.Thread(target=accept_worker, args=(p2.id, 'worker2'))
        thread1.start()
        thread2.start()
        thread1.join()
        thread2.join()

        # Assertions
        project.refresh_from_db()
        self.assertEqual(project.status, 'ASSIGNED')
        self.assertEqual(Proposal.objects.filter(status='ACCEPTED').count(), 1)
        # Check that one worker succeeded and one failed
        self.assertIn('success', results.values())
        self.assertIn('failure', results.values())
```
Conclusion
Race conditions are a silent threat to data integrity in any concurrent application. While they may seem complex, Django and PostgreSQL provide a robust and surprisingly elegant solution. By understanding the read-modify-write cycle and leveraging the power of transaction.atomic with .select_for_update(), you can transform critical operations from sources of intermittent bugs into bastions of reliability.
The key takeaways are simple: identify critical sections of your code where state is read and then written, wrap them in an atomic transaction, and acquire a row-level lock on the data at the beginning of that transaction. By making this a standard part of your development practice, you can build more scalable, resilient, and correct applications that stand up to the pressures of the real world.