DEV Community

Anish Anantharaman
Anish Anantharaman

Posted on

🚀 Implement Soft Deletes Without Destroying Performance

Soft deletes sound simple: instead of deleting a record, you mark it as deleted.

UPDATE users SET deleted = true WHERE id = 123;
Enter fullscreen mode Exit fullscreen mode

Done, right?

Not quite.

At small scale, this works fine. But as your data grows, soft deletes can quietly wreck query performance, bloat indexes, and complicate logic.

Let’s break down how to do it properly—without turning your database into a slow-moving legacy monster.


🧠 Why Soft Deletes Exist

We use soft deletes when:

  • You need auditability
  • You want undo/restore capability
  • Data must be retained for compliance or business logic

Instead of deleting:

DELETE FROM orders WHERE id = 10;
Enter fullscreen mode Exit fullscreen mode

You do:

UPDATE orders SET deleted_at = NOW() WHERE id = 10;
Enter fullscreen mode Exit fullscreen mode

⚠️ The Hidden Problems

1. Query Performance Degrades

Every query now needs:

WHERE deleted_at IS NULL;
Enter fullscreen mode Exit fullscreen mode

If you forget it? 👀 Congrats, you just exposed deleted data.

If you include it everywhere? Your indexes might not help anymore.


2. Indexes Become Less Effective

A typical index:

CREATE INDEX idx_users_email ON users(email);
Enter fullscreen mode Exit fullscreen mode

Now your query:

SELECT * FROM users 
WHERE email = 'a@b.com' 
AND deleted_at IS NULL;
Enter fullscreen mode Exit fullscreen mode

👉 That index is no longer optimal.


3. Table Bloat

Soft-deleted rows still sit there:

  • Increasing table size
  • Slowing scans
  • Affecting cache efficiency

✅ The Right Way to Implement Soft Deletes

1. Use deleted_at Instead of Boolean

Avoid:

deleted = true
Enter fullscreen mode Exit fullscreen mode

Use:

deleted_at TIMESTAMP NULL
Enter fullscreen mode Exit fullscreen mode

Why?

  • Stores when deletion happened
  • Helps in cleanup jobs
  • Enables auditing

2. Use Partial Indexes (Game Changer)

If you're using PostgreSQL:

CREATE INDEX idx_users_active_email 
ON users(email) 
WHERE deleted_at IS NULL;
Enter fullscreen mode Exit fullscreen mode

Now this query is fast again:

SELECT * FROM users 
WHERE email = 'a@b.com' 
AND deleted_at IS NULL;
Enter fullscreen mode Exit fullscreen mode

👉 Only active rows are indexed = smaller, faster index.


3. Default Scope / Global Filters (Use Carefully)

In ORMs like Spring Data / Hibernate:

@Where(clause = "deleted_at IS NULL")
Enter fullscreen mode Exit fullscreen mode

This ensures deleted records are automatically excluded.

⚠️ But:

  • Can confuse developers
  • Harder to debug when data "disappears"

Use it, but document it clearly.


4. Archive Old Data

Soft delete ≠ keep forever.

Move old deleted data:

INSERT INTO users_archive 
SELECT * FROM users 
WHERE deleted_at < NOW() - INTERVAL '90 days';

DELETE FROM users 
WHERE deleted_at < NOW() - INTERVAL '90 days';
Enter fullscreen mode Exit fullscreen mode

👉 Keeps main table lean and fast.


5. Be Smart with Unique Constraints

Problem:

UNIQUE(email)
Enter fullscreen mode Exit fullscreen mode

Soft delete a user → can't reuse email ❌

Fix with partial unique index:

CREATE UNIQUE INDEX uniq_active_email 
ON users(email) 
WHERE deleted_at IS NULL;
Enter fullscreen mode Exit fullscreen mode

Now:

  • Active users must be unique ✅
  • Deleted users don’t block reuse ✅

6. Batch Your Deletes

Avoid:

UPDATE users SET deleted_at = NOW();
Enter fullscreen mode Exit fullscreen mode

Instead:

  • Process in batches (e.g., 1000 rows at a time)
  • Prevent locks and performance spikes

7. Don’t Overuse Soft Deletes

Not everything needs it.

Avoid soft deletes for:

  • High-frequency log tables
  • Temporary/session data
  • Event streams

Sometimes, hard delete + backup is cleaner.


🧪 Real-World Pattern (Recommended Setup)

  • deleted_at TIMESTAMP NULL
  • Partial indexes on active data
  • Background job for cleanup/archival
  • Explicit queries (not overly magical ORM behavior)

⚡ TL;DR

Soft deletes are not free.

If you:

  • Add deleted_at
  • Use partial indexes
  • Archive old data
  • Handle uniqueness correctly

👉 You get the benefits without killing performance.

If you don’t…

👉 You’re slowly building a performance bug that shows up at scale (usually at the worst possible time 😄).


💬 Final Thought

Soft deletes are like adding a feature flag to your data layer—powerful, but easy to misuse.

Design it right early, or you’ll end up debugging “why is this query suddenly slow?” six months later.

Top comments (0)