Andrew Lencmanis

Posted on May 10

Building a Replay-Safe Webhook Event Gateway

#api #architecture #backend #systemdesign

Webhooks look simple at first: receive an HTTP request, forward it somewhere, return 200 OK.

That simplicity disappears the moment the system becomes production-critical. Providers retry. Consumers fail. Payloads arrive twice. Teams need to debug what happened yesterday. A single endpoint becomes dozens of integrations, each with its own routing, retry, transformation, and observability needs.

At that point, webhooks stop being “just HTTP callbacks” and become an event gateway problem.

This is the layer FastHook is built around: receive webhook requests, route them through connections, deliver events to destinations, and make every step inspectable and retryable.

The Core Model
A useful webhook gateway needs clear internal concepts.

In FastHook, a source is where inbound webhooks arrive. It represents the public endpoint that providers send requests to.

A destination is where routed events are delivered: for example, an internal API endpoint, a worker, or another HTTP service.

A connection links one source to one destination. This is where delivery behavior lives: filtering, transformation, retry, delay, and deduplication rules.

A request is the inbound webhook received by FastHook. It contains the original payload and request metadata.

An event is the outbound delivery created from an accepted request and a matching connection. Events can be queued, processed, delivered, failed, or ignored.

That separation matters. If one webhook request ends up matching several routing paths, you want to inspect the original request separately from each delivery attempt. Even with a single source-to-destination connection, keeping request and event separate makes retries, debugging, and metrics much cleaner.
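To make the request/event split concrete, here is a minimal Python sketch of the model. It is not FastHook's actual schema: the class names, the fields, the `accepts` filter, and the `route` helper are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable
import itertools

_event_ids = itertools.count(1)


@dataclass
class Request:
    """Inbound webhook: the original payload plus request metadata."""
    id: str
    source_id: str
    payload: dict
    headers: dict = field(default_factory=dict)


@dataclass
class Connection:
    """Links one source to one destination and carries the delivery rules."""
    id: str
    source_id: str
    destination_id: str
    # Hypothetical filter rule: True means the request becomes an event.
    accepts: Callable[[Request], bool] = lambda request: True


@dataclass
class Event:
    """Outbound delivery created from one request and one matching connection."""
    id: str
    request_id: str
    connection_id: str
    destination_id: str
    status: str = "queued"  # queued, processed, delivered, failed, or ignored


def route(request: Request, connections: list[Connection]) -> list[Event]:
    """Create one event for every connection that matches the request."""
    events = []
    for conn in connections:
        if conn.source_id == request.source_id and conn.accepts(request):
            event_id = f"evt_{next(_event_ids)}"
            events.append(Event(event_id, request.id, conn.id, conn.destination_id))
    return events
```

Each event keeps a `request_id`, so every delivery attempt can be traced back to the one stored request that produced it.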

Why Replay Safety Matters
Retries are not optional in webhook systems. They are the normal path.

A destination can return 500. A network request can time out. A deploy can briefly break an endpoint. If the only recovery path is “ask the provider to resend it,” your system is fragile.

A replay-safe gateway stores enough information to retry later without losing context.

FastHook exposes this at two levels:

curl -X POST \
  "https://api.fasthook.io/v1/requests/req_.../retry" \
  -H "Authorization: Bearer $API_KEY" \
  -H "x-team-id: tm_..."

Request retry replays an inbound request through routing again.

curl -X POST \
  "https://api.fasthook.io/v1/events/evt_.../retry" \
  -H "Authorization: Bearer $API_KEY" \
  -H "x-team-id: tm_..."

Event retry retries a specific delivery event.

Those two operations solve different problems. If routing or connection rules changed, retrying the request can create fresh routed events. If only one destination delivery failed, retrying the event is more precise.
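The difference between the two retries is easier to see in a toy in-memory gateway. This is only a sketch under simple assumptions: the `Gateway` class, its methods, and the routing table are hypothetical and say nothing about how FastHook implements retries internally.

```python
from dataclasses import dataclass


@dataclass
class Event:
    id: str
    request_id: str
    destination: str
    status: str = "queued"
    attempts: int = 0


class Gateway:
    """Toy in-memory gateway; every name here is hypothetical."""

    def __init__(self, routes, deliver):
        self.routes = routes      # source_id -> list of destination names
        self.deliver = deliver    # callable(destination, payload) -> bool
        self.requests = {}        # request_id -> (source_id, payload)
        self.events = {}          # event_id -> Event

    def ingest(self, request_id, source_id, payload):
        # Store the original request so it can be replayed later.
        self.requests[request_id] = (source_id, payload)
        return self._route(request_id)

    def _route(self, request_id):
        source_id, _ = self.requests[request_id]
        created = []
        for destination in self.routes.get(source_id, []):
            event = Event(f"evt_{len(self.events) + 1}", request_id, destination)
            self.events[event.id] = event
            self._attempt(event)
            created.append(event)
        return created

    def _attempt(self, event):
        _, payload = self.requests[event.request_id]
        event.attempts += 1
        ok = self.deliver(event.destination, payload)
        event.status = "delivered" if ok else "failed"

    def retry_request(self, request_id):
        """Replay the stored request through the current routing rules."""
        return self._route(request_id)

    def retry_event(self, event_id):
        """Re-attempt one delivery without touching routing."""
        event = self.events[event_id]
        self._attempt(event)
        return event
```

In this sketch `retry_event` reuses the existing event and only increments its attempt counter, while `retry_request` creates new events from whatever routes exist at replay time.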

Bulk Retry Is Where Operations Become Real
Manual retry handles one-off failures. Bulk retry is what makes recovery practical after an outage.

Imagine a destination was down for 30 minutes. You do not want to click retry 800 times. You want to filter failed events by time range, source, status, or search term, then create a controlled bulk operation.

For example:

curl -X POST \
  "https://api.fasthook.io/v1/events/bulk_operations" \
  -H "Authorization: Bearer $API_KEY" \
  -H "x-team-id: tm_..." \
  -H "Content-Type: application/json" \
  -d '{
    "from": "2026-04-17T00:00:00.000Z",
    "to": "2026-04-17T23:59:59.999Z",
    "status": "failed",
    "source_id": "src_...",
    "q": "stripe"
  }'

The important part is that replay is filtered and tracked as an operation, not treated as an invisible background action.
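One way to picture "tracked as an operation" is a record that keeps the filter and the outcome for every matched event. The sketch below is an in-memory illustration only. The `BulkOperation` fields, the `run_bulk_retry` helper, and the substring behavior of `q` are assumptions, not FastHook's documented semantics.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Event:
    id: str
    source_id: str
    status: str
    created_at: datetime
    body: str


@dataclass
class BulkOperation:
    """A tracked replay: the filter it used and what happened to each event."""
    id: str
    filters: dict
    matched: list = field(default_factory=list)
    retried: list = field(default_factory=list)
    still_failing: list = field(default_factory=list)


def matches(event: Event, filters: dict) -> bool:
    """Apply time range, status, source, and search filters to one event."""
    return (
        filters["from"] <= event.created_at <= filters["to"]
        and event.status == filters["status"]
        and event.source_id == filters["source_id"]
        # Assumption: "q" is treated as a plain substring search.
        and filters.get("q", "") in event.body
    )


def run_bulk_retry(op_id, events, filters, deliver) -> BulkOperation:
    """Retry every matching event and record the outcome on the operation."""
    op = BulkOperation(op_id, filters)
    for event in events:
        if not matches(event, filters):
            continue
        op.matched.append(event.id)
        if deliver(event):
            event.status = "delivered"
            op.retried.append(event.id)
        else:
            op.still_failing.append(event.id)
    return op
```

Because the operation records `matched`, `retried`, and `still_failing`, an operator can later see exactly which events a replay touched and which ones need another look.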

Observability Starts With Count APIs
Before retrying, teams usually ask: how many objects match this filter?

FastHook supports count endpoints for requests and events:

curl \
  "https://api.fasthook.io/v1/events/count?from=2026-04-17T00%3A00%3A00.000Z&to=2026-04-17T23%3A59%3A59.999Z&status=failed&source_id=src_..." \
  -H "Authorization: Bearer $API_KEY" \
  -H "x-team-id: tm_..."

A typical response includes both the exact total and the applied time range:

{
  "total": 1234,
  "count": 1234,
  "total_exact": true,
  "applied_range": {
    "from": "2026-04-17T00:00:00.000Z",
    "to": "2026-04-17T23:59:59.999Z"
  }
}

That sounds small, but it changes the workflow. Operators can estimate blast radius before starting a replay.
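A count-then-replay check might look like the following Python sketch. The base URL and query parameters come from the curl example above. The `replay_allowed` guard and its `max_events` limit are hypothetical, and the HTTP call itself is left out so the snippet runs without network access.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.fasthook.io/v1"  # taken from the curl examples above


def count_url(filters: dict) -> str:
    """Build the events count URL; urlencode percent-encodes the timestamps."""
    return f"{BASE_URL}/events/count?{urlencode(filters)}"


def replay_allowed(count_response: dict, max_events: int) -> bool:
    """Hypothetical guard: replay only if the total is exact and within a limit."""
    is_exact = bool(count_response.get("total_exact"))
    return is_exact and count_response["total"] <= max_events


# Example with the response body shown above.
response = {"total": 1234, "count": 1234, "total_exact": True}
print(replay_allowed(response, max_events=5000))
```

Refusing to replay when `total_exact` is false is a deliberately conservative choice in this sketch; a team could instead accept an approximate count with a wider safety margin.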

Routing Rules Belong On Connections
Webhook providers usually send too much data to too few endpoints. A gateway needs routing rules between ingress and delivery.

FastHook connection rules can cover common operational needs:

filter rules decide whether a request should become an event for a connection.
transform rules modify payloads before delivery.
retry rules control retry behavior.
delay rules postpone delivery.
deduplicate rules help ignore duplicate deliveries inside a configured window.

This keeps source endpoints stable while letting each destination have its own behavior.
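As a rough illustration of how filter, deduplicate, and transform rules could combine on one connection, here is a small Python sketch. The rule shapes, the order in which they run, and the status strings returned by `apply_rules` are assumptions for the example, not FastHook's rule format.

```python
class Deduplicator:
    """Treat a key as a duplicate if it was accepted within the last window."""

    def __init__(self, window_seconds: float):
        self.window_seconds = window_seconds
        self._last_seen = {}  # dedup key -> time of the last accepted delivery

    def is_duplicate(self, key: str, now: float) -> bool:
        last = self._last_seen.get(key)
        if last is not None and now - last < self.window_seconds:
            return True
        self._last_seen[key] = now
        return False


def apply_rules(payload, now, *, accept, dedup, dedup_key, transform):
    """Run filter, then deduplicate, then transform (order is an assumption)."""
    if not accept(payload):
        return ("skipped", None)   # the filter rejected it: no event is created
    if dedup.is_duplicate(dedup_key(payload), now):
        return ("ignored", None)   # duplicate inside the configured window
    return ("queued", transform(payload))


# Usage: a 60-second window keyed on a hypothetical payload "id" field.
dedup = Deduplicator(window_seconds=60)
rules = dict(
    accept=lambda p: p["type"].startswith("invoice."),
    dedup=dedup,
    dedup_key=lambda p: p["id"],
    transform=lambda p: {"invoice": p["id"]},
)
first = apply_rules({"id": "a", "type": "invoice.paid"}, now=0, **rules)
second = apply_rules({"id": "a", "type": "invoice.paid"}, now=30, **rules)
```

Here `first` is queued and `second` is ignored because it arrives 30 seconds later with the same key. A real deduplicator would also need to expire old keys, which this sketch skips.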

Transformations Need Their Own Audit Trail
Transformations are powerful, but they can also hide bugs. If a payload changes before delivery, teams need to know what ran and what came out.

A webhook gateway should treat transformation execution as inspectable data, not just inline code.

FastHook exposes transformation resources and execution history through the API, so teams can list transformations, run tests, and inspect executions tied to delivery behavior.
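The idea of treating each transformation run as data can be sketched in a few lines of Python. The `TransformationExecution` record and the `run_transformation` helper are hypothetical names for illustration and do not mirror FastHook's API resources.

```python
import copy
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TransformationExecution:
    """One audited run: what went in, what came out, and any error."""
    transformation_id: str
    event_id: str
    input: dict
    output: Optional[dict]
    error: Optional[str]


def run_transformation(
    transformation_id: str,
    event_id: str,
    fn: Callable[[dict], dict],
    payload: dict,
    log: list,
) -> TransformationExecution:
    """Run fn on a copy of the payload and append an execution record to log."""
    original = copy.deepcopy(payload)  # snapshot, so the record keeps the real input
    try:
        output, error = fn(copy.deepcopy(payload)), None
    except Exception as exc:  # record the failure instead of losing it
        output, error = None, f"{type(exc).__name__}: {exc}"
    record = TransformationExecution(transformation_id, event_id, original, output, error)
    log.append(record)
    return record
```

Failed runs are recorded alongside successful ones, so a delivery that went out with an unexpected payload, or never went out at all, can be traced back to a specific execution.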

What To Avoid
The biggest mistake in webhook infrastructure is treating delivery as a single HTTP forward.

That loses the real story.

A production webhook gateway should not only answer “did we receive it?” It should answer:

Was the inbound request accepted or rejected?
Which connection matched it?
Which event was created?
Was the event delivered, failed, ignored, or retried?
What was the destination response?
Can we replay one event safely?
Can we bulk retry a filtered set?
Can we count the impact before doing it?

Once teams have those answers, webhook debugging stops being guesswork.

Closing
Webhooks are often the first integration surface a product exposes, but they are rarely treated as infrastructure until something breaks.

A replay-safe, observable event gateway gives teams a better foundation: stable source URLs, explicit routing, destination delivery tracking, retries, bulk operations, transformations, deduplication, and metrics.

That is the direction we are building with FastHook: not just receiving webhooks, but making webhook delivery understandable, recoverable, and safe to operate.

If you are building webhook infrastructure and want this as a managed product, FastHook is available at https://www.fasthook.io.
