Why Soft Delete Can Backfire on Data Consistency

Mrakdon.com

“Soft delete is a safety net, but if you’re not careful it can become a data integrity nightmare.”

Introduction

Many modern applications need to preserve historical records while still allowing users to “remove” data. The most common pattern is soft delete: instead of physically deleting a row, we set a flag (deleted_at, is_deleted, etc.) and treat the record as invisible in the UI.

At first glance this looks like a win‑win:

  • Auditability – you can always trace who did what.
  • Recoverability – accidental deletions are reversible.
  • Simpler business logic – no need for cascade deletes.

However, the moment you combine soft delete with unique constraints (e.g., email addresses, usernames, invoice numbers), the simplicity evaporates. A regular unique index does not differentiate between active and soft‑deleted rows, so inserting a new “active” record that collides with a “deleted” one throws a constraint violation.

Developers often respond by moving the uniqueness check to application code. Unfortunately, that opens the door to race conditions and ultimately breaks the very consistency the database was supposed to guarantee.

In this post we’ll dive deep into the problem, illustrate the pitfalls, and present the PostgreSQL partial index as an elegant, database‑level solution.

What You Will Learn

  • Why relying on application‑level validation for uniqueness is unsafe.
  • How race conditions manifest with soft delete.
  • The syntax and semantics of partial indexes in PostgreSQL.
  • A step‑by‑step implementation example, including migration scripts.
  • Best‑practice testing strategies to ensure the solution holds under concurrency.

The Soft Delete Dilemma

1. Unique Constraints vs. Soft‑Deleted Rows

Consider a simple users table:

CREATE TABLE users (
    id          SERIAL PRIMARY KEY,
    email       TEXT NOT NULL,
    deleted_at  TIMESTAMP WITH TIME ZONE,
    CONSTRAINT uq_users_email UNIQUE (email)
);

When a user deletes their account, we run:

UPDATE users SET deleted_at = now() WHERE id = 42;

Now we try to register a new user with the same email address. PostgreSQL raises:

ERROR:  duplicate key value violates unique constraint "uq_users_email"

The database treats the soft‑deleted row exactly the same as an active row because the unique index has no knowledge of the deleted_at column.

2. Shifting Validation to the Application Layer

A common workaround is:

# Python / SQLAlchemy: the racy check-then-insert pattern
# (Query.exists() yields a SQL construct, not a boolean, so use first())
active = db.session.query(User).filter(
    User.email == new_email, User.deleted_at.is_(None)).first()
if active is None:
    db.session.add(User(email=new_email))
    db.session.commit()

At first this seems fine, but under concurrent requests two processes can both find no active row, then both insert, causing a duplicate key error at commit time, or worse, silently duplicated data if the unique index has been dropped.

3. The Race Condition Explained

Process A                         Process B
---------------------------       ---------------------------
SELECT … WHERE email='x'          SELECT … WHERE email='x'
-- returns 0 rows                 -- returns 0 rows
INSERT INTO users …               INSERT INTO users …
COMMIT                            COMMIT

Both processes see no active row because the soft‑deleted one is filtered out, so both proceed to insert: a classic check‑then‑act race. The only reliable guard is a database‑enforced constraint that understands the soft‑delete flag.


Partial Indexes – The PostgreSQL Answer

What Is a Partial Index?

A partial index is an index that only includes rows matching a given WHERE predicate. PostgreSQL evaluates the predicate for every existing row when the index is created, and again whenever a row is inserted or updated, so the index only ever contains rows that satisfy it.

“Think of a partial index as a filtered view of your data that the planner can use for fast look‑ups while ignoring irrelevant rows.”
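Uniqueness is only one use case. As a quick illustration with a hypothetical orders table, a partial index can also keep a hot query path fast by indexing only the rows that matter:

-- Hypothetical example: most queries touch only unshipped orders,
-- so index just those rows and keep the index small
CREATE INDEX idx_orders_unshipped ON orders (order_date)
WHERE shipped_at IS NULL;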

Defining a Conditional Unique Index

For the users example we want uniqueness only for rows that are not soft‑deleted. First drop the unconditional constraint from the earlier definition, then create a partial unique index in its place:

ALTER TABLE users DROP CONSTRAINT uq_users_email;

CREATE UNIQUE INDEX uq_users_email_active ON users (email)
WHERE deleted_at IS NULL;

Now the database enforces:

  • Active rows (deleted_at IS NULL) must have unique emails.
  • Deleted rows are excluded from the index, so their email values can be reused.

Verifying the Behavior

-- Insert an active user
INSERT INTO users (email) VALUES ('alice@example.com');

-- Soft‑delete the user
UPDATE users SET deleted_at = now() WHERE email = 'alice@example.com';

-- Re‑use the same email – succeeds because the row is excluded from the index
INSERT INTO users (email) VALUES ('alice@example.com');

If you try to insert a duplicate while the original row is still active, PostgreSQL blocks the operation with a clear error.
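For completeness, a sketch of the failure path (dave@example.com is a hypothetical address; both rows are active):

-- The first insert succeeds; the second violates the partial unique index
INSERT INTO users (email) VALUES ('dave@example.com');
INSERT INTO users (email) VALUES ('dave@example.com');
-- ERROR:  duplicate key value violates unique constraint "uq_users_email_active"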


Full Implementation Walkthrough

1. Table Definition

CREATE TABLE users (
    id          BIGSERIAL PRIMARY KEY,
    email       TEXT NOT NULL,
    name        TEXT NOT NULL,
    deleted_at  TIMESTAMP WITH TIME ZONE,
    created_at  TIMESTAMP WITH TIME ZONE DEFAULT now(),
    updated_at  TIMESTAMP WITH TIME ZONE DEFAULT now()
);

2. Adding the Partial Unique Index

-- Migration script (e.g., using Flyway or Alembic)
BEGIN;
    CREATE UNIQUE INDEX uq_users_email_active ON users (email)
    WHERE deleted_at IS NULL;
COMMIT;
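On a large, busy table you may prefer building the index concurrently so writers aren't blocked during the build. Note that CREATE INDEX CONCURRENTLY cannot run inside a transaction block, so this variant drops the BEGIN/COMMIT wrapper:

-- Builds the index without taking a long write‑blocking lock;
-- must run outside an explicit transaction
CREATE UNIQUE INDEX CONCURRENTLY uq_users_email_active ON users (email)
WHERE deleted_at IS NULL;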

3. Soft Delete Helper (SQL Function)

CREATE OR REPLACE FUNCTION soft_delete_user(p_user_id BIGINT)
RETURNS VOID AS $$
BEGIN
    UPDATE users
    SET deleted_at = now(), updated_at = now()
    WHERE id = p_user_id AND deleted_at IS NULL;
END;
$$ LANGUAGE plpgsql;
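A quick smoke test of the helper (user id 42 and the email value are hypothetical):

-- Soft‑delete user 42, freeing whatever email that row held
SELECT soft_delete_user(42);
-- A new active row may now reuse that email
INSERT INTO users (email, name) VALUES ('user42@example.com', 'New Owner');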

4. Application‑Level Insert (No Manual Uniqueness Check Needed)

# Example using SQLAlchemy (Python)
from sqlalchemy.exc import IntegrityError

new_user = User(email='bob@example.com', name='Bob')
session.add(new_user)
try:
    session.commit()
except IntegrityError as e:  # raised when the partial unique index fires
    session.rollback()
    raise ValueError('Email already in use') from e

The only source of truth for uniqueness is now the partial index – no extra code, no race conditions.
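Catching IntegrityError treats every integrity failure as a duplicate email. If the table later gains other constraints, you can disambiguate by the violated constraint's name. A minimal sketch, assuming psycopg2 as the DBAPI driver:

# Sketch, assuming psycopg2: inspect which constraint actually fired
from sqlalchemy.exc import IntegrityError

try:
    session.commit()
except IntegrityError as e:
    session.rollback()
    # psycopg2 exposes diagnostics on the underlying DBAPI exception
    if e.orig.diag.constraint_name == 'uq_users_email_active':
        raise ValueError('Email already in use') from e
    raise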


Testing Under Concurrency

To prove the solution works, simulate concurrent inserts with pgbench or a custom script:

#!/usr/bin/env bash
# concurrent_insert.sh – runs two psql sessions in parallel
psql -d mydb -c "BEGIN; INSERT INTO users (email, name) VALUES ('charlie@example.com', 'Charlie'); COMMIT;" &
psql -d mydb -c "BEGIN; INSERT INTO users (email, name) VALUES ('charlie@example.com', 'Charles'); COMMIT;" &
wait

One transaction will succeed; the other will abort with:

ERROR:  duplicate key value violates unique constraint "uq_users_email_active"

Because the constraint lives in the database, PostgreSQL serializes the conflicting operations safely.


When Partial Indexes Aren’t Enough

  • Multiple soft‑delete columns – you may need a composite predicate (see the sketch below).
  • Complex business rules – sometimes a CHECK constraint combined with a partial index is required.
  • Cross‑table uniqueness – consider a separate “lookup” table with an ON DELETE SET NULL cascade.

In such cases, the same principle applies: push the rule down to the DB, not the application.
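As a minimal sketch of the first case, assuming a hypothetical archived_at column alongside deleted_at, the predicate simply grows:

-- Hypothetical: enforce uniqueness only for rows that are neither
-- soft‑deleted nor archived
CREATE UNIQUE INDEX uq_users_email_live ON users (email)
WHERE deleted_at IS NULL AND archived_at IS NULL;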


Conclusion

Soft delete is a powerful pattern, but when you pair it with unique constraints you must avoid moving the guard rail to application code. Doing so invites race conditions and erodes data integrity.

By leveraging PostgreSQL partial indexes, you get:

  • Conditional uniqueness – active rows stay unique, deleted rows are ignored.
  • Low overhead – the index is maintained automatically and, because deleted rows are excluded, stays smaller than a full index.
  • Robust concurrency handling – the database enforces the rule atomically.

“Let the database do what it does best: guarantee consistency. Your application can then focus on delivering value.”

If you’ve encountered similar challenges or have alternative strategies (e.g., UUID surrogate keys, versioned tables), share your thoughts in the comments. Happy coding!

Top comments (2)

Pascal CESCATO • Edited

I took a different route with MariaDB: instead of soft deletes, I keep a clean customers table and write every create/update/delete into a customers_history table via triggers.
The goal isn’t restoration UX but auditability and query consistency over time.
And there’s no universal “right” pattern here — only trade-offs depending on whether you optimize for simplicity, performance, or long-term auditability.

Mrakdon.com

Great point. Segregating history into a separate table is a very robust way to handle this, and it keeps the active index small and clean.

One of the reasons I love PostgreSQL is that Partial Indexes allow us to keep the 'Soft Delete' logic in a single table without paying the penalty on unique constraints. In databases without partial indexes (or where they are less flexible), the History Table approach you mentioned is almost mandatory to maintain integrity. Thanks for sharing the alternative!