Why database backups don’t fix integration failures (and what actually does)

Sverre Senneset — Mon, 20 Apr 2026 12:39:27 +0000

I used to treat integration failures as data problems.

Restore the database.

Re-run the job.

Patch the gap.

It works — until it doesn’t.

Because most of the time, the data isn’t missing.

The event just never made it where it needed to go.

The gap no one owns

Most systems are built around state.

Databases, backups, snapshots — all focused on what the data looks like.

But integrations are about how data moves.

A typical flow:
ERP → Service A → Service B → API

Now imagine:

ERP emits an order.created event
Service A forwards it
Service B times out
No retry is triggered
The upstream system assumes success

Nothing crashes.

No alerts fire.

Until someone notices the order was never fulfilled.
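
To make the silent failure concrete, here is a minimal sketch of that flow. The function names (`service_a_forward`, `service_b_handle`) and the `DownstreamTimeout` exception are hypothetical stand-ins, not taken from any real system; the point is only that a swallowed timeout looks like success to the caller.

```python
class DownstreamTimeout(Exception):
    """Hypothetical error raised when Service B does not answer in time."""


def service_b_handle(event: dict) -> None:
    # Simulate Service B timing out on every call.
    raise DownstreamTimeout("service B did not respond")


def service_a_forward(event: dict) -> dict:
    """Forward an event to Service B, swallowing any timeout."""
    try:
        service_b_handle(event)
    except DownstreamTimeout:
        # No retry, no alert, no record that delivery failed.
        pass
    # The upstream caller sees success either way.
    return {"status": "accepted"}


response = service_a_forward({"id": "evt-1", "type": "order.created"})
print(response)
```

The order event is gone, and nothing in the response or in any database says so.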

Why backups fall short

When this happens, teams usually:

restore a backup
re-run a job
patch data manually

But backups only restore state.

They don’t tell you:

what failed
what never arrived

And they can’t reconstruct a missing event.

The real problem: delivery

At some point it becomes clear:

This isn’t a data problem.

It’s a delivery problem.

Most systems rely on:

webhooks
ad hoc retry logic
logs spread across services

When something fails mid-flight, debugging becomes guesswork.

And if something is lost entirely, recovery becomes manual.

A different approach: replay

Instead of restoring state, replay the flow.

If events are:

captured
stored
replayable

You can recover without guessing.

Not by re-running jobs.

Not by patching data.

But by replaying what actually happened.
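
A rough sketch of what "captured, stored, replayable" could look like, assuming an in-memory store. `EventLog`, `StoredEvent`, and the handler functions are illustrative names of my own; a real system would persist events durably and deliver them over the network.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StoredEvent:
    event_id: str
    type: str
    payload: dict
    status: str = "pending"  # pending | delivered | failed


class EventLog:
    """Hypothetical in-memory event store; stands in for durable storage."""

    def __init__(self) -> None:
        self._events: dict[str, StoredEvent] = {}

    def capture(self, event_id: str, type: str, payload: dict) -> None:
        # Store the event before any delivery is attempted.
        self._events[event_id] = StoredEvent(event_id, type, payload)

    def deliver(self, event_id: str, handler: Callable[[dict], None]) -> bool:
        # Attempt delivery and record the outcome instead of losing it.
        event = self._events[event_id]
        try:
            handler(event.payload)
        except Exception:
            event.status = "failed"
            return False
        event.status = "delivered"
        return True

    def replay_failed(self, handler: Callable[[dict], None]) -> list[str]:
        # Re-deliver only events whose last attempt failed.
        failed = [e for e in self._events.values() if e.status == "failed"]
        return [e.event_id for e in failed if self.deliver(e.event_id, handler)]

    def status(self, event_id: str) -> str:
        return self._events[event_id].status


def broken_service_b(payload: dict) -> None:
    raise TimeoutError("service B timed out")


def healthy_service_b(payload: dict) -> None:
    pass  # delivery succeeds


log = EventLog()
log.capture("evt-1", "order.created", {"order_id": "A-100"})
log.deliver("evt-1", broken_service_b)           # recorded as failed
replayed = log.replay_failed(healthy_service_b)  # recovered once B is back
```

Because the event was captured before delivery, the timeout leaves a `failed` record behind instead of nothing at all.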

What this enables

Re-deliver only failed events
Trace what happened to a specific event
Apply updated retry or routing policies
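
The three capabilities above can be sketched together, under some assumptions: `DeliveryRecord`, `RetryPolicy`, and `redeliver_failed` are hypothetical names, and the policy here only caps the number of attempts (a real one would likely add backoff between them).

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class RetryPolicy:
    max_attempts: int


@dataclass
class DeliveryRecord:
    event_id: str
    payload: dict
    delivered: bool = False
    history: list[str] = field(default_factory=list)  # one entry per attempt


def redeliver(record: DeliveryRecord,
              handler: Callable[[dict], None],
              policy: RetryPolicy) -> bool:
    """Retry one stored event under the given policy, logging each attempt."""
    for attempt in range(1, policy.max_attempts + 1):
        try:
            handler(record.payload)
        except Exception as exc:
            record.history.append(f"attempt {attempt}: failed ({exc})")
            continue
        record.history.append(f"attempt {attempt}: delivered")
        record.delivered = True
        return True
    return False


def redeliver_failed(records: list[DeliveryRecord],
                     handler: Callable[[dict], None],
                     policy: RetryPolicy) -> list[str]:
    """Re-deliver only the events that are not yet delivered."""
    return [r.event_id for r in records
            if not r.delivered and redeliver(r, handler, policy)]


calls = {"n": 0}


def flaky_handler(payload: dict) -> None:
    # Fails on the first two calls, then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("service B timed out")


records = [
    DeliveryRecord("evt-1", {"order_id": "A-100"}, delivered=True),
    DeliveryRecord("evt-2", {"order_id": "A-101"}),
]
recovered = redeliver_failed(records, flaky_handler, RetryPolicy(max_attempts=3))
trace = records[1].history  # what happened to evt-2, attempt by attempt
```

The already-delivered event is never touched, and `trace` answers "what happened to this event?" without digging through logs in several services.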

Recovery becomes predictable — assuming idempotent consumers.
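
The idempotency caveat matters because a replay can hand a consumer an event it has already processed. One common way to handle that is to deduplicate on an event ID. The sketch below assumes every event carries a unique `id` field; `OrderConsumer` and the event shape are hypothetical.

```python
class OrderConsumer:
    """Hypothetical consumer that ignores events it has already processed."""

    def __init__(self) -> None:
        self.processed_ids: set[str] = set()
        self.fulfilled_orders: list[str] = []

    def handle(self, event: dict) -> bool:
        """Return True if the event was applied, False if it was a duplicate."""
        if event["id"] in self.processed_ids:
            return False  # already handled, so a replay has no effect
        self.fulfilled_orders.append(event["order_id"])
        self.processed_ids.add(event["id"])
        return True


consumer = OrderConsumer()
event = {"id": "evt-1", "type": "order.created", "order_id": "A-100"}

first = consumer.handle(event)   # original delivery
second = consumer.handle(event)  # replayed delivery of the same event
```

With a consumer like this, replaying too much is harmless: the order is fulfilled once no matter how many times the event arrives. In a real service the set of processed IDs would need to live in durable storage, not in memory.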

Trade-offs

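Storage overhead: events have to be retained long enough to be replayed.
Idempotency requirements: consumers must tolerate seeing the same event more than once.
Architectural shift: delivery becomes something you capture and track, not a side effect of a webhook call.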

Final thought

If you can’t replay what happened between your systems,

you don’t really have a recovery strategy.

You have a snapshot.

DEV Community: Sverre Senneset