Recently, we discussed concurrency, its uses and benefits, and how it differs from sequential processing. We also covered implementing concurrency with Goroutines. While implementing concurrent processes with Goroutines, we encountered the Race Condition problem, which can cause serious issues down the road if not handled properly, and we discussed solutions for handling Race Conditions in Goroutines. You can read more about it in this Dev.to Post
Even after handling race conditions at the Go language level, we can still run into similar problems at the database level. Why? Because multiple instances of our application may connect to the same database and access the same table - even the same row - at the same time.
Here's the breakdown:
- Consider a concert ticket booking system where JKT48 will perform in Jakarta with only 5 VIP seats remaining
- The ticketing system shows these 5 seats as available to all users browsing the website
- Twenty excited fans find these seats at exactly 2:00 PM when ticket sales open
- Each fan selects a seat and clicks "Purchase" at nearly the same moment
If these purchase requests are processed concurrently:
- The system might check seat availability for all requests simultaneously 💫
- Each check would show that seats are still available ✔
- The system would proceed with all purchase attempts 📒
- Multiple fans might end up being assigned the same seat ❤️🩹
- The database would try to record more bookings than available 🧯
- This could lead to various issues like double bookings, incorrect seat assignments, or system errors 🤯
You can probably guess what would happen next and what your boss would do to you! 💀
But don't worry - this is where ACID compliance comes to the rescue ✨. Most OLTP databases like PostgreSQL and MySQL implement ACID compliance, and one of its core principles is 'I' (Isolation). The Isolation principle helps us prevent database race condition problems.
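As a quick aside, Go's database/sql even lets you request a stricter isolation level per transaction. Below is a minimal, hypothetical sketch of that API (the helper name processOrderSerializable and its parameters are ours, it uses the products table we set up below, and it assumes context, database/sql, and errors are imported) - the rest of this article sticks with the default isolation level and uses explicit row locks instead:
// Minimal sketch (not used later in this article): asking database/sql for a
// stricter isolation level. The helper name and parameters are hypothetical.
func processOrderSerializable(ctx context.Context, db *sql.DB, productID int, qty int32) error {
	tx, err := db.BeginTx(ctx, &sql.TxOptions{Isolation: sql.LevelSerializable})
	if err != nil {
		return err
	}
	defer tx.Rollback()

	var stock int32
	if err := tx.QueryRowContext(ctx, `SELECT stock FROM products WHERE id = $1`, productID).Scan(&stock); err != nil {
		return err
	}
	if stock < qty {
		return errors.New("insufficient stock")
	}
	if _, err := tx.ExecContext(ctx, `UPDATE products SET stock = $1 WHERE id = $2`, stock-qty, productID); err != nil {
		return err
	}

	// Under SERIALIZABLE, one of two conflicting transactions fails with a
	// serialization error and has to be retried by the caller.
	return tx.Commit()
}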
Let's demonstrate this concept. We'll continue with our e-commerce example, which was previously discussed in this Dev.to Post
First, let's set up our database:
CREATE DATABASE writing;

CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    name VARCHAR(255) NOT NULL,
    stock INT NOT NULL CHECK (stock >= 0),
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

INSERT INTO products (name, stock)
VALUES ('J.Co - Snow White Donuts', 10000);
Now, let's set up a simple database connection using Go's standard library. We'll use PostgreSQL in this example, but don't worry - this approach works with MySQL too. You can simply switch the driver while keeping the same methodology:
// Note: this assumes a PostgreSQL driver is registered, e.g. _ "github.com/lib/pq" in your imports
func PgSQLConnection() *sql.DB {
	connStr := "host=localhost port=5432 user=yourusername password=yourpassword dbname=writing sslmode=disable"

	// Open connection
	db, err := sql.Open("postgres", connStr)
	if err != nil {
		panic(err)
	}

	// Set connection pool settings
	db.SetMaxOpenConns(95)                 // Maximum number of open connections
	db.SetMaxIdleConns(10)                 // Maximum number of idle connections
	db.SetConnMaxLifetime(5 * time.Minute) // Maximum lifetime of a connection
	db.SetConnMaxIdleTime(5 * time.Minute) // Maximum idle time of a connection

	// Test connection
	err = db.Ping()
	if err != nil {
		panic(err)
	}

	return db
}
In this code, you can adjust the configuration based on your database settings. Note that we've implemented Connection Pooling here. We'll discuss the advantages of Connection Pooling later - for now, let's continue with this implementation.
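Since we mentioned that switching to MySQL only requires swapping the driver, here is roughly what that variant could look like using the go-sql-driver/mysql package (the credentials below are placeholders; also remember that MySQL uses ? placeholders in queries instead of PostgreSQL's $1, $2):
// Hypothetical MySQL variant of the same connection setup.
// Requires the driver import: _ "github.com/go-sql-driver/mysql"
func MySQLConnection() *sql.DB {
	connStr := "yourusername:yourpassword@tcp(localhost:3306)/writing?parseTime=true"

	db, err := sql.Open("mysql", connStr)
	if err != nil {
		panic(err)
	}

	// Same connection pool settings as the PostgreSQL version
	db.SetMaxOpenConns(95)
	db.SetMaxIdleConns(10)
	db.SetConnMaxLifetime(5 * time.Minute)
	db.SetConnMaxIdleTime(5 * time.Minute)

	if err := db.Ping(); err != nil {
		panic(err)
	}
	return db
}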
Let's create a simple flow to process customer orders:
func (p *Product) ProcessOrderWithNormalQuery(db *sql.DB, orderQuantity int32) error {
	// Open a transaction
	tx, err := db.Begin()
	if err != nil {
		return err
	}
	defer tx.Rollback()

	// Simulate another process in our system that takes 500ms
	time.Sleep(500 * time.Millisecond)

	// Query the current stock
	var stock int32
	err = tx.QueryRow(`SELECT stock FROM products WHERE id = $1;`, p.ID).Scan(&stock)
	if err != nil {
		return err
	}

	// Validate stock
	if stock < orderQuantity {
		return errors.New("insufficient stock")
	}

	// Calculate the new stock
	newStock := stock - orderQuantity
	time.Sleep(100 * time.Millisecond) // Simulate a 100ms business process while decreasing stock

	// Update the stock
	_, err = tx.Exec(`
		UPDATE products
		SET stock = $1,
		    updated_at = CURRENT_TIMESTAMP
		WHERE id = $2`,
		newStock, p.ID)
	if err != nil {
		return err
	}

	return tx.Commit()
}
Here's the breakdown of how this code works:
- First, we open a database transaction (essential when working with ACID-compliant databases)
- We add a simulated delay of 500ms to represent other processes in our system
- After this delay, we query the database to find the current product stock
- We then validate if enough stock is available for the order
- If the validation passes, we calculate the new stock by subtracting the order quantity from the current stock
- We add another simulated 100ms delay to represent business logic processing during stock reduction
- Then we update the stock in the database
- Finally, we commit the transaction. If anything fails during this process, the transaction automatically rolls back thanks to our defer tx.Rollback() statement, ensuring our database remains consistent
The defer tx.Rollback() statement provides a safety net - if anything goes wrong, the transaction will be rolled back, and no changes will be made to our database.
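If you want to reuse this safety net, it can be factored into a tiny helper. Here is a minimal sketch (the withTx name is ours, not from the original code); note that once Commit succeeds, the deferred Rollback simply returns sql.ErrTxDone and changes nothing:
// Sketch of the defer-rollback pattern used throughout this article.
func withTx(db *sql.DB, fn func(tx *sql.Tx) error) error {
	tx, err := db.Begin()
	if err != nil {
		return err
	}
	defer tx.Rollback() // no-op (sql.ErrTxDone) if Commit already succeeded

	if err := fn(tx); err != nil {
		return err // the deferred Rollback undoes any partial work
	}
	return tx.Commit()
}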
Let's verify our implementation by creating this test:
func TestUnsafeOrder(t *testing.T) {
	// Open two connections to simulate two Go apps running against the same database
	db1 := PgSQLConnection()
	db2 := PgSQLConnection()
	defer db1.Close()
	defer db2.Close()

	product := &Product{ID: 1}
	wg := &sync.WaitGroup{}

	// Get initial stock
	var initialStock int32
	err := db1.QueryRow("SELECT stock FROM products WHERE id = $1", product.ID).Scan(&initialStock)
	if err != nil {
		t.Fatal(err)
	}
	fmt.Printf("Initial stock: %d\n", initialStock)

	batchSize := 20    // Process orders in batches of 20 concurrent requests
	totalOrders := 100 // Total orders used to demonstrate the race

	for i := 0; i < totalOrders; i += batchSize {
		for j := 0; j < batchSize; j++ {
			wg.Add(1)
			go func(orderNum int) {
				defer wg.Done()
				// Alternate between connections to simulate distributed access
				dbConn := db1
				if orderNum%2 == 0 {
					dbConn = db2
				}
				err := product.ProcessOrderWithNormalQuery(dbConn, 1)
				if err != nil {
					fmt.Printf("Error: %v\n", err)
				}
			}(i + j)
		}
		wg.Wait()
	}

	var finalStock int32
	err = db1.QueryRow("SELECT stock FROM products WHERE id = $1", product.ID).Scan(&finalStock)
	if err != nil {
		t.Fatal(err)
	}
	fmt.Printf("Final stock: %d\n", finalStock)
	fmt.Printf("Expected stock: %d\n", initialStock-int32(totalOrders))
}
In this test, we simulate multiple applications accessing the same database by opening two connections. First, we read the initial stock for reference. Then, we fire 100 purchase orders in concurrent batches of 20, with each order buying 1 unit.
The test case looks fine, but when we run it:
/opt/homebrew/Cellar/go/1.22.0/libexec/bin/go tool test2json -t /private/var/folders/q4/lfxksx5x7knd4xjt3qqhyq4m0000gn/T/GoLand/___TestUnsafeOrder_in_race_condition.test -test.v -test.paniconexit0 -test.run ^\QTestUnsafeOrder\E$
=== RUN TestUnsafeOrder
Initial stock: 10000
Final stock: 9994
Expected stock: 9900
--- PASS: TestUnsafeOrder (3.55s)
PASS
Process finished with the exit code 0
Something's wrong! We only ordered 100 donuts, and with an initial stock of 10000, we expected 9900 donuts to remain. Instead, we got 9994 - meaning only 6 of the 100 stock decrements actually made it into the database; the rest were overwritten. This race condition is causing our inventory management to fail.
Here's what's happening (the breakdown):
- Let's say two customers (A and B) try to buy donuts at exactly the same time
- Customer A's process checks the stock (sees 10000 available)
- Customer B's process checks the stock (also sees 10000 available)
- Customer A calculates new stock (10000 - 1 = 9999)
- Customer B calculates new stock (10000 - 1 = 9999)
- Customer A updates the stock to 9999
- Customer B updates the stock to 9999
The problem: Even though two orders were placed, the stock only decreased by 1! This is happening because both processes read the same initial value before either one completed their update. This is known as the "Lost Update" problem in concurrent database operations.
Now that we've simulated how database race conditions can cause failures during concurrent operations, let's discuss the solution. As mentioned earlier, the "I" (Isolation) principle in ACID helps us here: to prevent database race conditions, we need to apply the right isolation and locking strategy.
What does this mean in practice? It means we'll lock the rows our queries touch. When Customer A and Customer B try to check the stock simultaneously, the row is locked so that Customer B must wait until Customer A's transaction finishes.
For implementing locking mechanisms, there are generally two approaches: Optimistic and Pessimistic locking. In this example, we'll implement Pessimistic locking.
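For contrast, optimistic approaches usually skip the explicit lock and instead make the UPDATE itself conditional - a compare-and-swap on the stock value (a version column is another common variant). Here is a minimal, hypothetical sketch of that idea, not used in the rest of this article (the method name ProcessOrderOptimistic is ours):
// Hypothetical optimistic approach: the UPDATE only succeeds if enough stock
// is still there, so concurrent orders can never push stock below zero.
func (p *Product) ProcessOrderOptimistic(db *sql.DB, orderQuantity int32) error {
	res, err := db.Exec(`
		UPDATE products
		SET stock = stock - $1,
		    updated_at = CURRENT_TIMESTAMP
		WHERE id = $2
		  AND stock >= $1`,
		orderQuantity, p.ID)
	if err != nil {
		return err
	}

	affected, err := res.RowsAffected()
	if err != nil {
		return err
	}
	if affected == 0 {
		// Either the product does not exist or stock was insufficient;
		// a real implementation might retry or report this to the user.
		return errors.New("insufficient stock")
	}
	return nil
}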
Pessimistic Locking can be implemented using any of these SQL locking clauses:
FOR UPDATE
FOR UPDATE NOWAIT
FOR UPDATE SKIP LOCKED
FOR SHARE
FOR SHARE NOWAIT
FOR SHARE SKIP LOCKED
Each of these locking clauses has a specific use case:
- FOR UPDATE: Locks the selected rows for updating. Other transactions must wait until the current transaction finishes.
- FOR UPDATE NOWAIT: Same as FOR UPDATE, but instead of waiting, it immediately returns an error if the rows are locked.
- FOR UPDATE SKIP LOCKED: Similar to FOR UPDATE, but skips any rows that are already locked by other transactions.
- FOR SHARE: Locks the rows for reading only. Other transactions can also read but cannot modify the locked rows.
- FOR SHARE NOWAIT: Same as FOR SHARE, but immediately returns an error if the rows are locked.
- FOR SHARE SKIP LOCKED: Similar to FOR SHARE, but skips any rows that are already locked.
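To give a feel for the NOWAIT variants: if another transaction already holds the lock, PostgreSQL aborts the statement with a lock_not_available error (SQLSTATE 55P03) instead of waiting, and your Go code can decide how to react. A minimal, hypothetical sketch (the function name and the "try again later" behaviour are ours):
// Hypothetical example: try to lock the row without waiting and tell the
// caller to come back later if someone else already holds the lock.
func (p *Product) TryReadStockNoWait(tx *sql.Tx) (int32, error) {
	var stock int32
	err := tx.QueryRow(`
		SELECT stock
		FROM products
		WHERE id = $1
		FOR UPDATE NOWAIT;`, p.ID).Scan(&stock)
	if err != nil {
		// With FOR UPDATE NOWAIT, PostgreSQL returns SQLSTATE 55P03
		// (lock_not_available) here when the row is locked elsewhere.
		return 0, fmt.Errorf("row is busy, try again later: %w", err)
	}
	return stock, nil
}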
You can choose the locking keyword that best suits your use case. In this example, we'll use FOR UPDATE:
func (p *Product) ProcessOrderWithLock(db *sql.DB, orderQuantity int32) error {
	// Open a transaction
	tx, err := db.Begin()
	if err != nil {
		return err
	}
	defer tx.Rollback()

	// Simulate another process in our system that takes 500ms
	time.Sleep(500 * time.Millisecond)

	// Query the current stock WITH a row lock
	var stock int32
	err = tx.QueryRow(`
		SELECT stock
		FROM products
		WHERE id = $1
		FOR UPDATE;`, p.ID).Scan(&stock)
	if err != nil {
		return err
	}

	// Validate stock
	if stock < orderQuantity {
		return errors.New("insufficient stock")
	}

	// Calculate the new stock
	newStock := stock - orderQuantity
	time.Sleep(100 * time.Millisecond) // Simulate a 100ms business process while decreasing stock

	// Update the stock
	_, err = tx.Exec(`
		UPDATE products
		SET stock = $1,
		    updated_at = CURRENT_TIMESTAMP
		WHERE id = $2`,
		newStock, p.ID)
	if err != nil {
		return err
	}

	return tx.Commit()
}
Here's the test case implementation:
func TestOrderWithLockSQL(t *testing.T) {
	// Open two connections to simulate two Go apps running against the same database
	db1 := PgSQLConnection()
	db2 := PgSQLConnection()
	defer db1.Close()
	defer db2.Close()

	product := &Product{ID: 1}
	wg := &sync.WaitGroup{}

	// Get initial stock
	var initialStock int32
	err := db1.QueryRow("SELECT stock FROM products WHERE id = $1", product.ID).Scan(&initialStock)
	if err != nil {
		t.Fatal(err)
	}
	fmt.Printf("Initial stock: %d\n", initialStock)

	batchSize := 20    // Process orders in batches of 20 concurrent requests
	totalOrders := 100 // Same workload as the unsafe test

	for i := 0; i < totalOrders; i += batchSize {
		for j := 0; j < batchSize; j++ {
			wg.Add(1)
			go func(orderNum int) {
				defer wg.Done()
				// Alternate between connections to simulate distributed access
				dbConn := db1
				if orderNum%2 == 0 {
					dbConn = db2
				}
				err := product.ProcessOrderWithLock(dbConn, 1)
				if err != nil {
					fmt.Printf("Error: %v\n", err)
				}
			}(i + j)
		}
		wg.Wait()
	}

	var finalStock int32
	err = db1.QueryRow("SELECT stock FROM products WHERE id = $1", product.ID).Scan(&finalStock)
	if err != nil {
		t.Fatal(err)
	}
	fmt.Printf("Final stock: %d\n", finalStock)
	fmt.Printf("Expected stock: %d\n", initialStock-int32(totalOrders))
}
After resetting the stock to 10000 for the same demonstration, here's the result:
/opt/homebrew/Cellar/go/1.22.0/libexec/bin/go tool test2json -t /private/var/folders/q4/lfxksx5x7knd4xjt3qqhyq4m0000gn/T/GoLand/___TestOrderWithLockSQL_in_race_condition.test -test.v -test.paniconexit0 -test.run ^\QTestOrderWithLockSQL\E$
=== RUN TestOrderWithLockSQL
Initial stock: 10000
Final stock: 9900
Expected stock: 9900
--- PASS: TestOrderWithLockSQL (13.77s)
PASS
Process finished with the exit code 0
Perfect! After implementing the lock, everything works as expected.
Here's the breakdown of what's happening now:
- Let's say Customer A and Customer B try to buy donuts simultaneously
- Customer A's process starts and executes SELECT ... FOR UPDATE (This locks the row in the products table for Customer A's transaction)
- Customer B's process also tries to execute SELECT ... FOR UPDATE (Instead of reading the same stock value, it's forced to wait because the row is locked)
Customer A completes their entire process:
- Reads stock (10000)
- Validates quantity
- Updates stock to 9999
- Commits transaction
- Releases the lock
Only then can Customer B proceed:
- Reads the new stock (9999)
- Validates quantity
- Updates stock to 9998
- Commits transaction
The key difference from our previous version:
- Before: Both processes could read the same initial value
- Now: The second process must wait for the first one to complete
- This ensures each order correctly decrements the stock once
- Notice the test took longer to run (13.77s vs 3.55s) because of the waiting time
- But we got the correct final stock of 9900, exactly as expected after 100 orders
This demonstrates how database-level locking prevents race conditions by ensuring each transaction processes completely before the next one can begin.