DEV Community

Cover image for Your Database Is Slow Because Everything Is Hot
ALI MANSOOR
ALI MANSOOR

Posted on

Your Database Is Slow Because Everything Is Hot

At some point, every growing system starts collecting ghosts.

So you have a job as a software engineer.

You build systems. APIs. Workers. Queues. Dashboards. Databases.

Life is good.

Well... not that good.

Somewhere in your infrastructure, there’s a database table quietly growing in size every second.

Maybe it's:

  • orders
  • logs
  • chat messages
  • notifications
  • webhook events
  • analytics

In the beginning, everything is beautiful.

Your queries are fast.
Your CPU is relaxed.
Your dashboards load instantly.

Then the data grows.

Good problem to have.

So naturally, you do what every engineer does.

You add indexes.

CREATE INDEX idx_orders_created_at
ON orders(created_at);
Enter fullscreen mode Exit fullscreen mode

And suddenly:

Performance returns.
Life is good again.

For now.

Then Reality Arrives

A few months later:

  • New query patterns appear
  • More indexes get added
  • Old indexes become useless
  • Write performance starts dropping
  • Backups start dragging
  • CPU usage spikes
  • RAM usage starts looking offensive

And the database begins cursing at you in 0's and 1's.

The Internet Tells You To Shard Everything

So you go searching for answers.

Maybe you ask:

  • ChatGPT
  • Claude
  • Meta AI (I'm not judging)
  • that one senior engineer who says "just use Cassandra" (he's right tho)

And suddenly the suggestions begin:

  • partition the data
  • shard the database
  • horizontal scaling
  • replicas
  • sacrifice a goat to Kubernetes

ughhh.

maybe the issue is temperature...

Most Data Is Cold

This is the important realization.

A lot of production systems only actively use recent data.

Do you really need a 3-year-old webhook event sitting inside your primary production table?

Probably not.

But...

You still need to keep it.

For:

  • compliance
  • finance
  • legal
  • audits
  • analytics
  • "just in case"

It needs to go away (not actually go away, but still go away)

so what do you do?

Freeze The Data

The solution is surprisingly simple.

You stop treating old data like active data.

You freeze it.

Meaning:

  • keep recent data queryable
  • move old data elsewhere
  • reduce table size
  • reduce index size
  • reduce backup size
  • reduce IO pressure

Your database becomes smaller again.

Smaller databases are faster databases.

What Do You Mean Freeze It? Where Does Cold Data Go?

You have options.

Option 1 — Archive Tables

orders
orders_archive
Enter fullscreen mode Exit fullscreen mode

Simple and effective.

Good when:

  • same database
  • rare access needed
  • low operational overhead

Option 2 — Data Warehouse

Perfect for:

BI teams
analytics
finance reporting

Examples:

  • BigQuery
  • Snowflake
  • ClickHouse
  • Redshift

Option 3 — Object Storage

Honestly?

Sometimes CSV files in S3 are enough.

Especially for:

  • logs
  • audit trails
  • compliance archives

You can export:

  • JSON
  • CSV

Cheap. Durable. Simple.

The Migration Strategy

So how do you do this?
If you dont know, I am worried about you....

You write a cron, it moves data which has surpassed the ttl to cold storage.

Do not move everything at once.

That is how you create incidents.

Instead:

  • small batches
  • gradual movement
  • continuous cleanup

In the beginning:

Run the cron every hour.

Move:

  • 5k records
  • maybe 10k
  • maybe less

Observe:

  • lock times
  • replication lag
  • CPU spikes
  • IO usage

Then gradually increase retention movement.

Eventually:

  • daily jobs
  • weekly jobs
  • biweekly jobs

Your system stabilizes.

Referential Integrity Matters

Data is usually connected.

Example:

orders
├── payments
├── invoices
├── shipment_logs
└── audit_events

Moving only the parent record can create:

  • orphaned rows
  • broken analytics
  • failed joins
  • compliance problems

Archive related entities together.

I'll be back again (don't know when, to share some more insights...maybe another problem to solve)

Top comments (0)