Gabriel Anhaia

Posted on Jun 13

Soft Delete vs Archive Table: The Choice That Haunts Your Queries

#postgres #database #sql #backend

Book: Database Playbook: Choosing the Right Store for Every System You Build
Also by me: Thinking in Go (2-book series) — Complete Guide to Go Programming + Hexagonal Architecture in Go
My project: Hermes IDE | GitHub — an IDE for developers who ship with Claude Code and other AI coding tools
Me: xgabriel.com | GitHub

Someone on your team added a deleted_at column three years ago. It felt safe. Nothing gets destroyed, an accidental delete is one UPDATE away from recovery, and the product manager who asked for "undo" stopped asking.

Now every query in the codebase carries the same tail: AND deleted_at IS NULL. Miss it once and a deleted user shows up in a billing report. The ORM bolts it onto every read with a global scope, so you stopped seeing it. The table is 40 million rows, 31 million of them deleted, and the hot index is mostly pointers to rows nobody will ever read again.

That is the soft-delete tax. You pay it on every query, forever. The question is whether the safety is worth the bill, and when an archive table buys you the same safety for less.

What soft delete actually costs

Soft delete is a flag, usually a nullable timestamp:

ALTER TABLE orders
  ADD COLUMN deleted_at timestamptz;

The appeal is honest. Deletes become reversible. Foreign keys don't cascade into oblivion. Audit trails stay intact. For a table where deletes are rare and the row count stays small, soft delete is the right call and the costs below never show up.

The costs show up when deletes are common and the table grows. Three of them.

Every query gets a predicate. The live rows are a subset of the table now, so every read has to say so:

SELECT * FROM orders
WHERE customer_id = $1
  AND deleted_at IS NULL;

Forget the second line in one place and you leak deleted rows. The fix most teams reach for is a default scope in the ORM, which works until you write raw SQL for a report and the scope isn't there.

Indexes bloat with dead rows. A plain index on customer_id indexes all 40 million rows, including the 31 million deleted ones. The index is larger than it needs to be, scans touch dead entries, and the planner's row estimates drift because most of what it counts is invisible to live queries.

Uniqueness breaks. This is the subtle one. A unique constraint on email now blocks a user from re-registering with an address that belongs to a soft-deleted row:

-- this fails if a deleted row already
-- holds 'alice@example.com'
INSERT INTO users (email) VALUES ('alice@example.com');

You end up rewriting the constraint to ignore deleted rows, which means a partial unique index, which means more index machinery to maintain.

The partial index that makes soft delete bearable

If you keep soft delete, stop indexing the dead rows. A partial index covers only the live ones:

CREATE INDEX idx_orders_customer_live
ON orders (customer_id)
WHERE deleted_at IS NULL;

Now the index holds 9 million entries instead of 40 million. It's smaller on disk, faster to scan, and the planner's estimates match reality because the index only knows about rows your queries can see.

The catch: the query predicate has to match the index predicate exactly, or Postgres won't use the index. This works:

SELECT * FROM orders
WHERE customer_id = $1
  AND deleted_at IS NULL;

This does not, because the planner can't prove the predicates line up:

SELECT * FROM orders
WHERE customer_id = $1
  AND deleted_at > '2026-01-01';

Partial indexes also fix the uniqueness problem cleanly:

CREATE UNIQUE INDEX idx_users_email_live
ON users (email)
WHERE deleted_at IS NULL;

Now email is unique among live users, and a deleted row holding the same address no longer blocks re-registration. This is the pattern that makes soft delete survivable at scale. If you've shipped soft delete without partial indexes, this is the first thing to change.

The archive table: move the rows out

The other approach is to physically move deleted rows to a separate table. The live table stays small and every query against it skips the deleted_at filter entirely.

The shape:

CREATE TABLE orders_archive (
  LIKE orders INCLUDING DEFAULTS,
  archived_at timestamptz NOT NULL DEFAULT now()
);

Deleting a row becomes a move. Do it in one transaction so a crash never leaves the row in both tables or neither:

BEGIN;

WITH moved AS (
  DELETE FROM orders
  WHERE id = $1
  RETURNING *
)
INSERT INTO orders_archive
SELECT *, now() FROM moved;

COMMIT;

The DELETE ... RETURNING feeds the INSERT in a single statement, so the row leaves orders and lands in orders_archive atomically. No window where it exists twice.

What you get back: the live orders table holds only live rows. No deleted_at IS NULL on any query. No partial indexes to keep aligned. The hot table's indexes are exactly as big as the working set, which is the thing that actually drives cache hit rates and query latency.

What you give up: undo is harder. Restoring a row means moving it back, and any foreign keys pointing at the archived row are now dangling unless you archived the children too. Cross-table reports that need both live and dead rows have to UNION the two tables or query the archive directly. The move itself is a write, so a high-delete workload pays write cost at delete time instead of read cost at query time.

Where the trade actually lands

The choice isn't moral. It's about where you want to pay.

Soft delete pays at read time: every query carries the filter, every index carries the dead rows, and the cost scales with how often you read and how many dead rows accumulate. The benefit is trivial undo and zero delete-time work.

Archive tables pay at delete time and recovery time: each delete is a move, restoring is manual, and cross-state queries are awkward. The benefit is a permanently small hot table and queries that don't think about deletion at all.

A rough decision:

Signal	Lean soft delete	Lean archive
Delete rate	Low	High
Dead-row ratio over time	Stays small	Grows large
Undo frequency	Often	Rare
Reports need dead rows	Yes, mixed with live	No, or separately
Hot-table size pressure	Low	High

A user account that gets deactivated once and reactivated next week wants soft delete. A notifications table that deletes 90% of its rows within 30 days wants an archive (or partitioning, which is the archive pattern formalized).

The third option people forget: partitioning

If your deletes are time-based, neither flag nor move is the cleanest answer. Range partitioning lets you drop whole chunks of old data instantly:

CREATE TABLE events (
  id bigserial,
  created_at timestamptz NOT NULL,
  payload jsonb
) PARTITION BY RANGE (created_at);

CREATE TABLE events_2026_06
  PARTITION OF events
  FOR VALUES FROM ('2026-06-01')
  TO ('2026-07-01');

Retiring June's data is one statement, and it returns in milliseconds because it's a catalog operation, not a row-by-row delete:

ALTER TABLE events
  DETACH PARTITION events_2026_06;

The detached partition becomes a standalone table you can archive to cold storage or drop. No deleted_at filter, no per-row move, no vacuum churn from a mass delete. For append-heavy, time-expiring data this is the move that scales the furthest.

How to decide for the table in front of you

Ask three questions about the actual table, not the abstraction.

First, what fraction of rows end up deleted, and over what window? If it climbs past a third and keeps going, soft delete is taxing every read for a minority of live rows. That's an archive or partition case.

Second, how often do you actually undo? If "we might need it back" has happened twice in two years, you're paying a daily read tax for a twice-a-year event. Archive it and write a restore script for the rare case.

Third, do your reports need dead rows alongside live ones? If analytics joins deleted orders with live customers constantly, splitting the tables makes every report a UNION. Soft delete with partial indexes keeps that simple.

If you keep soft delete, add the partial indexes today, both the lookup ones and the unique ones. If you go archive, wrap the move in a transaction with DELETE ... RETURNING so a row never lives in two places. If your deletes are time-shaped, partition and detach.

The default that haunts queries is the one nobody chose on purpose. Pick the trade with open eyes and the deleted_at tail stops being a tax you forgot you were paying.

What's the worst soft-delete situation you've inherited? A table that's 80% dead rows? A report that quietly counted deleted users for a year? Drop it in the comments.

If this was useful

This post pulls from the data-modeling thread that runs through the Database Playbook: Choosing the Right Store for Every System You Build. Deletion strategy is one of those decisions that looks like a column and turns out to be an architecture, and the book walks through the rest of them: partitioning vs sharding, when a separate store beats another table, and how the shape of your reads should drive the shape of your schema.