DEV Community

Taras

Webhooks Are Broken by Design — So I Built a Fix

If you've ever integrated a third-party service, you've dealt with webhooks. Payment processed? Webhook. New subscriber? Webhook. File uploaded? Webhook.

They feel simple. A POST request hits your server and you handle it. Done.

Except it's not that simple. Not even close.


The Problem Nobody Talks About

Webhooks are a "fire and forget" mechanism. The sender makes one HTTP request to your endpoint. If that request fails — your server is down, restarting, overloaded, or just returned a 500 — most senders either give up or retry a handful of times with no real strategy.

And you never know it happened.

No error in your dashboard. No alert. The event is just gone.

This is a fundamental design flaw that affects every system using webhooks:

  • E-commerce platforms — an order.paid webhook drops, your fulfillment system never triggers
  • CI/CD pipelines — a push event is missed, your deployment never kicks off
  • SaaS integrations — a subscription.cancelled webhook fails, you keep charging a customer who already left
  • IoT and data pipelines — sensor events silently disappear under load

The downstream consequences can be severe: lost revenue, broken workflows, angry customers, and hours of debugging with no clear trail.


Why This Is Hard to Solve on Your Side

The natural reaction is "I'll just make my endpoint more reliable." But that only solves half the problem.

Even with 99.9% uptime, which still allows roughly 8.8 hours of downtime a year, you'll have:

  • Planned deployments (your server restarts)
  • Database connection spikes
  • Cold starts on serverless functions
  • Network blips between the sender and your server

And here's the thing: you don't control the sender. Stripe, GitHub, Shopify and every other provider decide for themselves how many times they retry and when. Some retry 3 times over an hour. Some retry once. Some don't retry at all.

You're building your system around delivery guarantees you don't actually have.


What a Real Solution Looks Like

The right fix is a reliability layer that sits between the sender and your application:

1. Accept the webhook immediately: store the raw payload durably, then return 200
2. Queue async delivery to your actual endpoint
3. Retry with increasing backoff: 30s, 5min, 30min, 2h, 24h
4. Track every attempt — status codes, errors, timestamps
5. Let you inspect and manually retry anything that failed

This decouples receipt from processing. The sender's job is done the moment your relay accepts the request. Everything after that is your problem to solve — reliably.
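To make the "accept first, process later" idea concrete, here is a minimal sketch using only the Python standard library. It is not the Webhook Relay Layer code: `make_store`, `receive`, the `events` table and the in-memory SQLite database are all hypothetical stand-ins for a real HTTP handler, a durable database and a task queue. One assumption worth calling out: the sketch answers 200 only after the payload is committed, and answers 503 if storage fails, so that a sender with its own retries gets another chance.

```python
import json
import queue
import sqlite3
import uuid


def make_store() -> sqlite3.Connection:
    """Create a throwaway in-memory store (a real relay would use a durable DB)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events ("
        " id TEXT PRIMARY KEY,"
        " headers TEXT NOT NULL,"
        " body BLOB NOT NULL,"
        " received_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
    )
    return conn


def receive(conn: sqlite3.Connection, delivery_queue: queue.Queue,
            headers: dict, body: bytes) -> int:
    """Store the raw payload, enqueue it for delivery, return an HTTP status."""
    event_id = str(uuid.uuid4())
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO events (id, headers, body) VALUES (?, ?, ?)",
                (event_id, json.dumps(headers), body),
            )
    except sqlite3.Error:
        # Nothing was stored, so do not claim success to the sender.
        return 503
    # Delivery happens later, outside the request path.
    delivery_queue.put(event_id)
    return 200
```

The handler never parses or validates the body. It keeps the exact bytes it was sent, which is what makes a later replay possible.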


Why I Built My Own

I ran into this problem on a project integrating multiple payment and subscription providers. Events were being missed. I couldn't tell if it was my server, the network, or the sender. There was no audit trail.

I looked at existing solutions. Some were too expensive for a side project. Some were too complex to self-host. Most were black boxes I couldn't modify or extend.

So I built Webhook Relay Layer — an open, self-hostable reliability platform.

The stack is straightforward: FastAPI for async webhook ingestion, Celery + Redis for the task queue and retry logic, PostgreSQL for durable event storage, and a simple dashboard to monitor everything in real time.
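As a rough illustration of the retry logic, here is a standard-library sketch of the schedule listed earlier (30s, 5min, 30min, 2h, 24h). It treats the schedule as a fixed table of delays rather than a strict doubling. The names `RETRY_DELAYS`, `Attempt`, `next_delay` and `deliver_with_retries` are made up for this example, and `send` and `sleep` are injected so the loop can run without a network or real waiting. In the stack described above, the waiting would presumably be handed to the task queue (for example a Celery retry with a countdown) instead of a blocking sleep.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Delays between attempts, in seconds: 30s, 5min, 30min, 2h, 24h.
RETRY_DELAYS = [30, 5 * 60, 30 * 60, 2 * 60 * 60, 24 * 60 * 60]


@dataclass
class Attempt:
    number: int                 # 1-based attempt counter
    status_code: Optional[int]  # None when no response was received
    error: Optional[str]        # exception text, if the request itself failed


def next_delay(failed_attempts: int) -> Optional[int]:
    """Seconds to wait after `failed_attempts` failures, or None to give up."""
    if failed_attempts < 1:
        raise ValueError("failed_attempts must be at least 1")
    index = failed_attempts - 1
    return RETRY_DELAYS[index] if index < len(RETRY_DELAYS) else None


def deliver_with_retries(send: Callable[[], int],
                         sleep: Callable[[int], None]) -> List[Attempt]:
    """Call `send` until it returns a 2xx status or the schedule runs out."""
    attempts: List[Attempt] = []
    while True:
        number = len(attempts) + 1
        try:
            status, error = send(), None
        except Exception as exc:  # connection refused, timeout, and so on
            status, error = None, str(exc)
        attempts.append(Attempt(number, status, error))
        if status is not None and 200 <= status < 300:
            return attempts
        delay = next_delay(number)
        if delay is None:
            return attempts  # schedule exhausted; leave it for manual retry
        sleep(delay)
```

With five delays in the table, an event gets at most six attempts: the initial delivery plus five retries. The returned list doubles as the attempt log.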

The core principle: no webhook ever disappears silently again.

Every event is stored on receipt. Every delivery attempt is logged. Every failure is retryable — automatically or manually. You always know what happened.
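One way to picture that audit trail is two tables: one row per event, one row per delivery attempt. The sketch below uses SQLite from the standard library purely for illustration. The table names, columns and the `undelivered_event_ids` helper are assumptions for this example, not the project's actual schema.

```python
import sqlite3

SCHEMA = """
CREATE TABLE events (
    id INTEGER PRIMARY KEY,
    body TEXT NOT NULL
);
CREATE TABLE delivery_attempts (
    id INTEGER PRIMARY KEY,
    event_id INTEGER NOT NULL REFERENCES events(id),
    status_code INTEGER,  -- NULL when no response was received
    error TEXT
);
"""


def undelivered_event_ids(conn: sqlite3.Connection) -> list:
    """Return IDs of events that have no successful (2xx) attempt yet."""
    rows = conn.execute(
        """
        SELECT e.id
        FROM events e
        WHERE NOT EXISTS (
            SELECT 1
            FROM delivery_attempts a
            WHERE a.event_id = e.id
              AND a.status_code BETWEEN 200 AND 299
        )
        ORDER BY e.id
        """
    ).fetchall()
    return [row[0] for row in rows]
```

An event that was stored but never attempted shows up in this list too, which is the point: anything not confirmed as delivered stays visible and can be retried.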


What's Next

This is the first post in a series where I'll go deeper on:

  • How the retry engine works under the hood
  • Webhook security: HMAC signatures and replay attack prevention
  • Handling high-throughput ingestion without dropping events
  • Going from local Docker setup to production

If you've been burned by dropped webhooks before, I'd love to hear your story in the comments.


Tired of webhooks ghosting you? Your events deserve better treatment.

Give them a reliable home → https://webhookrelay.org/
