I still remember the exact moment I realized soft deletes weren't just a feature—they were a confession.
We were seven months into a CRM project. The product manager came to my desk with a panicked look. "A user accidentally deleted 3,000 contact records. Can we get them back?"
My stomach turned. I had designed the database with hard deletes—clean, efficient, and unforgiving. DELETE FROM contacts WHERE id = ? and it was gone. Forever. The backup restoration would take hours, and we'd lose a day's worth of new data.
That night, as I wrote a painful data recovery script, I promised myself: never again. I joined the church of soft deletes. I added an is_deleted column to every table, updated all the queries, and felt a wave of relief.
Then the performance problems started.
Six months later, every query had WHERE deleted_at IS NULL, indexes were bloated, and a junior dev accidentally included soft‑deleted records in a report, causing chaos. I realized that soft deletes are not a silver bullet—they are a design pattern that requires artistry, discipline, and a deep understanding of trade‑offs.
This is the story of how I learned to wield soft deletes like a craftsman, not a cowboy.
The Two Paths: Hard vs. Soft
Let's be honest—hard deletes are seductive. They keep your tables lean, your indexes fast, and your foreign keys clean. You never have to filter out "dead" rows. Joins are simple. Performance is predictable.
But hard deletes make a dangerous assumption: that data is only valuable when it's current.
Every senior developer knows this is false. Data has a half‑life. A customer who deleted their account last year might come back. A moderator who deleted a comment might need an audit trail. An accidental bulk delete can ruin your business.
Soft deletes acknowledge that deletion is often just a change of state. You're not destroying data—you're hiding it from the default view. But hiding is easy. Making that hiding performant, queryable, and maintainable? That's the art.
The Naïve Pattern and Its Pitfalls
The simplest soft delete pattern is a nullable timestamp column:
ALTER TABLE contacts ADD COLUMN deleted_at TIMESTAMP NULL;
CREATE INDEX idx_contacts_deleted_at ON contacts(deleted_at);
Every query becomes:
SELECT * FROM contacts WHERE deleted_at IS NULL AND ...;
This works for small tables. But as your data grows, you'll hit three walls:
1. Index Bloat – An index on deleted_at is mostly empty if few records are deleted. But the index still has to scan all those NULL entries. On PostgreSQL, you can use a partial index: CREATE INDEX ... WHERE deleted_at IS NULL. That's a game changer. It indexes only the active rows, keeping the index small and fast.
2. Forgotten Filters – Someone will forget the deleted_at IS NULL clause. A junior dev, a reporting tool, a quick debugging query. Suddenly, soft‑deleted data leaks into production. I've seen this cause regulatory nightmares (showing deleted customer data in exports) and embarrassing UI bugs (showing "deleted" comments).
3. Foreign Key Confusion – If you have a orders table with a foreign key to customers, and you soft‑delete a customer, what should happen to their orders? If the foreign key uses ON DELETE CASCADE, hard deletes remove the orders. But with soft deletes, you need to decide: hide the orders too, or show them as orphaned? There's no database constraint to help you. You have to enforce this in application logic.
The Art of the Partial Index
Let me show you the single most important optimization for soft deletes: partial indexes.
Most databases (PostgreSQL, SQLite, newer MySQL) support indexes with a WHERE clause. For soft deletes, you want:
CREATE INDEX idx_active_contacts ON contacts (last_name, first_name) WHERE deleted_at IS NULL;
Now, queries for active contacts use a tiny index that excludes every deleted row. Inserts and updates to active rows are still fast. Even if you have millions of deleted records, the index never grows.
I learned this lesson the hard way. We had a transactions table with 50 million rows, 40 million of which were soft‑deleted (archived). A simple SELECT SUM(amount) FROM transactions WHERE deleted_at IS NULL was taking 30 seconds because the index was scanning through all 50 million entries. After adding a partial index, the same query took 200 milliseconds.
Partial indexes are art because they require you to think about what "active" means in your domain. Sometimes deleted_at IS NULL is enough. Sometimes you also need status != 'archived'. The index becomes a living statement of your data's lifecycle.
The Query Pattern: Views and Row‑Level Security
To solve the "forgotten filter" problem, we created a set of views that automatically exclude soft‑deleted records.
CREATE VIEW active_contacts AS
SELECT * FROM contacts WHERE deleted_at IS NULL;
Then we trained the team to query the view, not the table. The view acts as a contract: this is what you see when you want live data. For admin reports that need to include deleted records, we forced explicit SELECT * FROM contacts.
In PostgreSQL, we went a step further and used row‑level security (RLS) to enforce the filter at the database level. Every query to the contacts table automatically had WHERE deleted_at IS NULL injected, unless the user had a specific role. This made it impossible to accidentally leak soft‑deleted data.
But RLS has a performance cost. Test it thoroughly.
The Two‑Column Strategy: Deleted At vs. Deleted By
A simple deleted_at timestamp is enough for most cases. But I've found that adding a deleted_by column (user ID) transforms soft deletes from a technical pattern into an audit trail.
ALTER TABLE contacts ADD COLUMN deleted_at TIMESTAMP NULL;
ALTER TABLE contacts ADD COLUMN deleted_by_id INT NULL REFERENCES users(id);
Now, when you restore a record, you know who deleted it and when. This has saved my team countless times during customer support inquiries: "Who marked this lead as deleted last week?" Having the answer in the same row (instead of a separate audit log) makes queries trivial.
We also added a deleted_reason text column for cases where users had to explain why. This turned out to be invaluable for spotting patterns—like a particular salesperson marking leads as deleted instead of following up.
The Performance of Restoration
Restoring a soft‑deleted record is just an update:
UPDATE contacts SET deleted_at = NULL WHERE id = ?;
But what if you need to restore all records that were deleted in a date range? That's a bulk update. And if your table has a partial index on deleted_at IS NULL, that update will be fast because the database only needs to touch the active index entries for the restored rows.
However, there's a hidden cost: if your soft‑deleted records have foreign key relationships to other soft‑deleted records (e.g., a customer and their orders), restoring the customer doesn't automatically restore the orders. You have to decide whether to cascade the restoration or leave the orders deleted. We built a stored procedure that restores a customer and all their associated records in a transaction. This is the kind of complexity that senior developers must anticipate.
The Art of Hard Deletion (Yes, Sometimes)
Despite everything I've said, there are times when you need true, hard deletion.
- GDPR right to erasure – If a user requests that all their personal data be permanently removed, a soft delete doesn't satisfy the law. You need to hard delete.
- Storage costs – If you're storing large binary data (images, videos) and the deleted records are accumulating petabytes, soft deletes become a financial liability.
- Regulatory retention periods – You might be required to keep data for 7 years, then purge it. Soft deletes can mark it as "eligible for hard deletion," then a scheduled job physically removes it.
We built a two‑phase system:
-
Soft delete –
deleted_atset, data hidden from normal queries. - Hard delete – After 90 days (or retention period), a background job permanently removes soft‑deleted records and logs the action to an audit table.
This gives users a grace period to restore mistakes while still complying with data retention laws. The background job uses batched deletes (DELETE FROM contacts WHERE deleted_at < NOW() - INTERVAL '90 days' LIMIT 1000) to avoid locking the table.
The Journey: From Panic to Peace
That CRM I mentioned at the beginning? After the hard‑delete disaster, we added soft deletes. After the performance problems, we added partial indexes. After the forgotten filters, we added views and RLS. After the GDPR request, we added the two‑phase hard delete.
Today, that system handles 50 million soft‑deleted records without breaking a sweat. Queries are fast. Audits are transparent. Support can restore accidental deletions with a single click. And I sleep better at night.
Soft deletes are not a lazy shortcut. They are a deliberate design choice that says: data has memory, and our system respects that. But like any powerful tool, they require craftsmanship.
The Senior Developer's Checklist
Before you sprinkle deleted_at across your schema, ask yourself:
- Do I actually need to recover this data? For ephemeral logs or caches, hard deletes are fine.
- How will I enforce the filter across all queries? Consider views, RLS, or a dedicated query builder layer.
- What indexes will keep active queries fast? Partial indexes are your best friend.
- How will foreign key relationships behave? Decide on cascade behavior for both deletion and restoration.
- What is the hard‑deletion policy? Soft deletes are not forever. Plan for eventual purging.
And most importantly: document your soft‑delete strategy. Leave a comment in your migration file. Explain why you chose a timestamp over a boolean (timestamps give you ordering). Clarify which tables use soft deletes and which don't. Your future self—and your team—will thank you.
The Art of Invisibility
The best soft‑delete implementation is the one users never notice. They don't see WHERE deleted_at IS NULL in the logs. They don't wonder why a "deleted" record still appears in an export. They just know that when they accidentally click the wrong button, someone can fix it.
That's the art: creating resilience behind the scenes, turning a potential catastrophe into an oops‑I‑can‑fix‑that moment. It's not glamorous. But it's the kind of quiet craftsmanship that separates senior engineers from the rest.
Top comments (0)