DEV Community

Anish Anantharaman
Anish Anantharaman

Posted on

๐Ÿš€ Implement Soft Deletes Without Destroying Performance

Soft deletes sound simple: instead of deleting a record, you mark it as deleted.

UPDATE users SET deleted = true WHERE id = 123;
Enter fullscreen mode Exit fullscreen mode

Done, right?

Not quite.

At small scale, this works fine. But as your data grows, soft deletes can quietly wreck query performance, bloat indexes, and complicate logic.

Letโ€™s break down how to do it properlyโ€”without turning your database into a slow-moving legacy monster.


๐Ÿง  Why Soft Deletes Exist

We use soft deletes when:

  • You need auditability
  • You want undo/restore capability
  • Data must be retained for compliance or business logic

Instead of deleting:

DELETE FROM orders WHERE id = 10;
Enter fullscreen mode Exit fullscreen mode

You do:

UPDATE orders SET deleted_at = NOW() WHERE id = 10;
Enter fullscreen mode Exit fullscreen mode

โš ๏ธ The Hidden Problems

1. Query Performance Degrades

Every query now needs:

WHERE deleted_at IS NULL;
Enter fullscreen mode Exit fullscreen mode

If you forget it? ๐Ÿ‘€ Congrats, you just exposed deleted data.

If you include it everywhere? Your indexes might not help anymore.


2. Indexes Become Less Effective

A typical index:

CREATE INDEX idx_users_email ON users(email);
Enter fullscreen mode Exit fullscreen mode

Now your query:

SELECT * FROM users 
WHERE email = 'a@b.com' 
AND deleted_at IS NULL;
Enter fullscreen mode Exit fullscreen mode

๐Ÿ‘‰ That index is no longer optimal.


3. Table Bloat

Soft-deleted rows still sit there:

  • Increasing table size
  • Slowing scans
  • Affecting cache efficiency

โœ… The Right Way to Implement Soft Deletes

1. Use deleted_at Instead of Boolean

Avoid:

deleted = true
Enter fullscreen mode Exit fullscreen mode

Use:

deleted_at TIMESTAMP NULL
Enter fullscreen mode Exit fullscreen mode

Why?

  • Stores when deletion happened
  • Helps in cleanup jobs
  • Enables auditing

2. Use Partial Indexes (Game Changer)

If you're using PostgreSQL:

CREATE INDEX idx_users_active_email 
ON users(email) 
WHERE deleted_at IS NULL;
Enter fullscreen mode Exit fullscreen mode

Now this query is fast again:

SELECT * FROM users 
WHERE email = 'a@b.com' 
AND deleted_at IS NULL;
Enter fullscreen mode Exit fullscreen mode

๐Ÿ‘‰ Only active rows are indexed = smaller, faster index.


3. Default Scope / Global Filters (Use Carefully)

In ORMs like Spring Data / Hibernate:

@Where(clause = "deleted_at IS NULL")
Enter fullscreen mode Exit fullscreen mode

This ensures deleted records are automatically excluded.

โš ๏ธ But:

  • Can confuse developers
  • Harder to debug when data "disappears"

Use it, but document it clearly.


4. Archive Old Data

Soft delete โ‰  keep forever.

Move old deleted data:

INSERT INTO users_archive 
SELECT * FROM users 
WHERE deleted_at < NOW() - INTERVAL '90 days';

DELETE FROM users 
WHERE deleted_at < NOW() - INTERVAL '90 days';
Enter fullscreen mode Exit fullscreen mode

๐Ÿ‘‰ Keeps main table lean and fast.


5. Be Smart with Unique Constraints

Problem:

UNIQUE(email)
Enter fullscreen mode Exit fullscreen mode

Soft delete a user โ†’ can't reuse email โŒ

Fix with partial unique index:

CREATE UNIQUE INDEX uniq_active_email 
ON users(email) 
WHERE deleted_at IS NULL;
Enter fullscreen mode Exit fullscreen mode

Now:

  • Active users must be unique โœ…
  • Deleted users donโ€™t block reuse โœ…

6. Batch Your Deletes

Avoid:

UPDATE users SET deleted_at = NOW();
Enter fullscreen mode Exit fullscreen mode

Instead:

  • Process in batches (e.g., 1000 rows at a time)
  • Prevent locks and performance spikes

7. Donโ€™t Overuse Soft Deletes

Not everything needs it.

Avoid soft deletes for:

  • High-frequency log tables
  • Temporary/session data
  • Event streams

Sometimes, hard delete + backup is cleaner.


๐Ÿงช Real-World Pattern (Recommended Setup)

  • deleted_at TIMESTAMP NULL
  • Partial indexes on active data
  • Background job for cleanup/archival
  • Explicit queries (not overly magical ORM behavior)

โšก TL;DR

Soft deletes are not free.

If you:

  • Add deleted_at
  • Use partial indexes
  • Archive old data
  • Handle uniqueness correctly

๐Ÿ‘‰ You get the benefits without killing performance.

If you donโ€™tโ€ฆ

๐Ÿ‘‰ Youโ€™re slowly building a performance bug that shows up at scale (usually at the worst possible time ๐Ÿ˜„).


๐Ÿ’ฌ Final Thought

Soft deletes are like adding a feature flag to your data layerโ€”powerful, but easy to misuse.

Design it right early, or youโ€™ll end up debugging โ€œwhy is this query suddenly slow?โ€ six months later.

Top comments (0)