
Thesius Code

Posted on • Originally published at datanest-stores.pages.dev

Webhook Framework

Building reliable webhook delivery is deceptively complex — you need retry logic with exponential backoff, cryptographic signature verification, idempotent processing, dead letter queues, and monitoring to know when deliveries fail. This framework gives you all of it: a complete webhook delivery engine in Python with configurable retry policies, HMAC-SHA256 signature generation and verification, event sourcing for full audit trails, and a monitoring dashboard that shows delivery success rates, latency percentiles, and failing endpoints in real time.

Key Features

  • Reliable Delivery Engine — Queue-based webhook dispatcher with at-least-once delivery semantics, configurable concurrency, and per-endpoint circuit breaking
  • Exponential Backoff Retry — Configurable retry policy with jitter (1s, 2s, 4s, 8s, ...) up to a maximum number of attempts, with dead letter queue for permanently failed deliveries
  • HMAC-SHA256 Signatures — Every webhook request includes a signature header so receivers can cryptographically verify the payload originated from your system
  • Event Sourcing — Immutable event log with full payload snapshots, enabling replay of any webhook to any endpoint at any time
  • Endpoint Health Tracking — Automatic disable of consistently failing endpoints after configurable failure thresholds, with email alerts and manual re-enable
  • Monitoring Dashboard Data — JSON API endpoints for delivery metrics: success rate, p50/p95/p99 latency, failures by error type, and per-endpoint status
  • Idempotency Support — Unique event IDs and delivery attempt tracking so receivers can safely deduplicate retried deliveries
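
As a rough illustration of the monitoring numbers listed above, the sketch below derives a success rate and latency percentiles from a list of delivery attempts. The function name `delivery_metrics` and the `(succeeded, latency_ms)` tuple shape are assumptions made for this example, not the framework's actual API.

```python
import statistics


def delivery_metrics(attempts: list[tuple[bool, float]]) -> dict:
    """Summarize (succeeded, latency_ms) pairs. Hypothetical helper for illustration."""
    latencies = [latency_ms for _, latency_ms in attempts]
    # quantiles(n=100) returns 99 cut points; index 49 is p50, 94 is p95, 98 is p99.
    cuts = statistics.quantiles(latencies, n=100, method="inclusive")
    return {
        "success_rate": sum(ok for ok, _ in attempts) / len(attempts),
        "p50_ms": cuts[49],
        "p95_ms": cuts[94],
        "p99_ms": cuts[98],
    }


metrics = delivery_metrics([(True, 120.0), (True, 95.0), (False, 10000.0), (True, 140.0)])
print(metrics)
```

Grouping the same calculation by endpoint ID gives the per-endpoint view recommended later in this article.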

Quick Start

  1. Configure your webhook endpoints:
# config.example.yaml
webhook_engine:
  max_concurrent_deliveries: 50
  default_timeout_seconds: 10
  signing_secret: "YOUR_WEBHOOK_SECRET_HERE"

retry_policy:
  max_attempts: 5
  initial_delay_seconds: 1
  max_delay_seconds: 300
  backoff_multiplier: 2
  jitter: true

endpoints:
  - id: "ep_001"
    url: "https://api.example.com/webhooks/orders"
    events: ["order.created", "order.updated", "order.cancelled"]
    secret: "YOUR_ENDPOINT_SECRET_HERE"
    active: true

dead_letter:
  enabled: true
  max_age_days: 30

health:
  failure_threshold: 10
  check_interval_seconds: 60
  2. Send a webhook event:
import asyncio

from webhook_framework import WebhookEngine
from webhook_framework.config import load_config

async def main() -> None:
    config = load_config("config.example.yaml")
    engine = WebhookEngine(config)

    # dispatch() is awaited, so it has to run inside a coroutine
    await engine.dispatch(
        event_type="order.created",
        payload={
            "order_id": "ord_12345",
            "customer_email": "user@example.com",
            "total": 99.99,
            "currency": "USD",
            "created_at": "2026-03-23T10:30:00Z"
        }
    )

asyncio.run(main())
  3. Verify a webhook on the receiving end:
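import hmac, hashlib

def verify_webhook(payload_bytes: bytes, timestamp: str, signature: str, secret: str) -> bool:
    # The sender signs "<timestamp>.<body>", so rebuild that exact message from the
    # X-Webhook-Timestamp header value and the raw (unparsed) request body.
    message = timestamp.encode() + b"." + payload_bytes
    expected = hmac.new(secret.encode(), message, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, which avoids leaking the signature via timing
    return hmac.compare_digest(f"sha256={expected}", signature)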

Architecture

webhook-framework/
├── src/webhook_framework/       # core, delivery, signing, retry, event_store, health, monitoring
├── docs/                        # Overview, patterns, checklists
└── config.example.yaml

Event flow: dispatch() -> Event Store (persist) -> Route to endpoints -> Sign payload -> HTTP POST -> On failure: Retry Queue -> Backoff -> Retry -> After max attempts: Dead Letter Queue.
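
The "Route to endpoints" step can be pictured as a filter over the `endpoints` list from the config. This is a minimal sketch under that assumption: the `route` function and the second endpoint (`ep_002`) are made up for illustration, and the framework's real router may work differently.

```python
def route(event_type: str, endpoints: list[dict]) -> list[dict]:
    """Return the active endpoints subscribed to event_type (illustrative helper)."""
    return [
        ep for ep in endpoints
        if ep.get("active", True) and event_type in ep.get("events", [])
    ]


endpoints = [
    {"id": "ep_001", "events": ["order.created", "order.updated"], "active": True},
    # Hypothetical second endpoint: subscribed to the event, but disabled.
    {"id": "ep_002", "events": ["order.created"], "active": False},
]

matched_ids = [ep["id"] for ep in route("order.created", endpoints)]
print(matched_ids)  # ['ep_001']
```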

Usage Examples

Retry Policy with Exponential Backoff

from dataclasses import dataclass
import asyncio, random, logging

logger = logging.getLogger(__name__)

@dataclass
class RetryPolicy:
    max_attempts: int = 5
    initial_delay: float = 1.0
    max_delay: float = 300.0
    multiplier: float = 2.0
    jitter: bool = True

    def get_delay(self, attempt: int) -> float:
        delay = min(self.initial_delay * (self.multiplier ** attempt), self.max_delay)
        return delay * (0.75 + random.random() * 0.5) if self.jitter else delay

async def deliver_with_retry(delivery_fn, payload: dict, url: str, policy: RetryPolicy) -> bool:
    for attempt in range(policy.max_attempts):
        try:
            # delivery_fn is expected to return the HTTP status code
            status = await delivery_fn(url, payload)
            if 200 <= status < 300:
                return True
            logger.warning("Attempt %d returned HTTP %d", attempt + 1, status)
        except Exception as e:
            logger.warning("Attempt %d failed: %s", attempt + 1, e)
        # Sleep only when another attempt remains
        if attempt < policy.max_attempts - 1:
            await asyncio.sleep(policy.get_delay(attempt))
    return False
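
To make the schedule concrete, here is a small sketch that lists the waits such a policy produces with jitter switched off. The helper `backoff_schedule` is hypothetical and only mirrors the delay formula shown above.

```python
def backoff_schedule(max_attempts: int = 5, initial_delay: float = 1.0,
                     multiplier: float = 2.0, max_delay: float = 300.0) -> list[float]:
    """Return the wait before each retry, without jitter (illustrative helper)."""
    # max_attempts total tries means max_attempts - 1 waits between them.
    return [
        min(initial_delay * multiplier ** attempt, max_delay)
        for attempt in range(max_attempts - 1)
    ]


schedule = backoff_schedule()
print(schedule)       # [1.0, 2.0, 4.0, 8.0]
print(sum(schedule))  # 15.0
```

With the default settings, five attempts spend about 15 seconds waiting in total before jitter is applied. The 300-second cap only starts to matter from the tenth wait onward, where the uncapped delay would be 512 seconds.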

Signature Generation

import hmac, hashlib, json, time

def sign_payload(payload: dict, secret: str) -> dict:
    ts = int(time.time())
    body = json.dumps(payload, separators=(",", ":"), sort_keys=True)
    sig = hmac.new(secret.encode(), f"{ts}.{body}".encode(), hashlib.sha256).hexdigest()
    return {
        "X-Webhook-Signature": f"sha256={sig}",
        "X-Webhook-Timestamp": str(ts),
        "X-Webhook-Id": payload.get("event_id", ""),
    }
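
A round-trip check makes the signing contract easier to see. The sketch below assumes the sender transmits exactly the serialized body it signed and that the receiver rebuilds the `<timestamp>.<body>` message from the `X-Webhook-Timestamp` header. The helper names `sign_body` and `verify_body` are illustrative, not framework APIs.

```python
import hashlib
import hmac
import json
import time


def sign_body(body: bytes, ts: str, secret: str) -> str:
    """Sign '<ts>.<body>' with HMAC-SHA256 (illustrative helper)."""
    message = ts.encode() + b"." + body
    return "sha256=" + hmac.new(secret.encode(), message, hashlib.sha256).hexdigest()


def verify_body(body: bytes, ts: str, signature: str, secret: str) -> bool:
    """Recompute the signature from the raw body and compare in constant time."""
    return hmac.compare_digest(sign_body(body, ts, secret), signature)


secret = "YOUR_ENDPOINT_SECRET_HERE"
payload = {"event_id": "evt_1", "order_id": "ord_12345"}
body = json.dumps(payload, separators=(",", ":"), sort_keys=True).encode()
ts = str(int(time.time()))
signature = sign_body(body, ts, secret)

ok = verify_body(body, ts, signature, secret)
tampered = verify_body(body.replace(b"ord_12345", b"ord_99999"), ts, signature, secret)
# Re-serializing with default separators changes the bytes, so verification fails.
reserialized = verify_body(json.dumps(payload).encode(), ts, signature, secret)
print(ok, tampered, reserialized)  # True False False
```

The third case is the most common failure in practice: the payload is semantically identical, but the bytes differ, so the HMAC no longer matches.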

Configuration

| Key | Type | Default | Description |
|---|---|---|---|
| webhook_engine.max_concurrent_deliveries | int | 50 | Max parallel HTTP requests |
| webhook_engine.default_timeout_seconds | int | 10 | HTTP timeout per delivery |
| webhook_engine.signing_secret | string | required | Secret used for HMAC signing (placeholder: YOUR_WEBHOOK_SECRET_HERE) |
| retry_policy.max_attempts | int | 5 | Total delivery attempts |
| retry_policy.initial_delay_seconds | float | 1.0 | First retry delay |
| retry_policy.backoff_multiplier | float | 2.0 | Delay multiplier per attempt |
| dead_letter.enabled | bool | true | Enable dead letter queue |
| health.failure_threshold | int | 10 | Consecutive failures before an endpoint is disabled |

Best Practices

  • Always include a timestamp in signatures. Without it, intercepted webhooks can be replayed. Receivers should reject payloads older than 5 minutes.
  • Use idempotency keys on the receiver side. At-least-once delivery means duplicates — track X-Webhook-Id and skip duplicates.
  • Set aggressive delivery timeouts. 10s max. Don't let slow endpoints block the pipeline.
  • Monitor per-endpoint, not just globally. A 98% global success rate can hide a single endpoint that is failing 100% of the time.
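
The first two practices can be combined on the receiver. The following is a minimal in-memory sketch, meant to run after the signature check has passed. `ReplayGuard` is a hypothetical name, and a real receiver would more likely keep seen IDs in a shared store with expiry rather than in a Python set.

```python
import time
from typing import Optional


class ReplayGuard:
    """Flags stale timestamps and already-seen event IDs (illustrative sketch)."""

    def __init__(self, max_age_seconds: int = 300):  # the 5-minute window
        self.max_age_seconds = max_age_seconds
        self._seen: set[str] = set()

    def check(self, event_id: str, timestamp: str, now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        try:
            sent_at = int(timestamp)
        except ValueError:
            return "stale"  # treat a malformed X-Webhook-Timestamp as untrusted
        if abs(now - sent_at) > self.max_age_seconds:
            return "stale"  # too old, or too far in the future
        if event_id in self._seen:
            return "duplicate"  # X-Webhook-Id was already processed
        self._seen.add(event_id)
        return "accept"


guard = ReplayGuard()
print(guard.check("evt_1", "1000", now=1010.0))  # accept
print(guard.check("evt_1", "1005", now=1015.0))  # duplicate
print(guard.check("evt_2", "1000", now=2000.0))  # stale
```

One reasonable design choice is to answer a "duplicate" with a 2xx response so the sender stops retrying, while rejecting a "stale" request outright.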

Troubleshooting

Webhooks received but signature verification fails
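Verify against the raw request body bytes, not parsed-and-re-serialized JSON. The signing side serializes with sort_keys=True and compact separators, then signs the string "<timestamp>.<body>", so the receiver must prepend the X-Webhook-Timestamp value and a dot to the raw body before computing the HMAC.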

Endpoint disabled unexpectedly
Transient network issues can trigger failure_threshold. Increase it or add a health check probe that re-enables endpoints after recovery.
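
A sketch of the idea, assuming the threshold counts consecutive failures as the configuration table describes. The `EndpointHealth` class and its method names are hypothetical and only illustrate the disable and re-enable logic.

```python
from dataclasses import dataclass


@dataclass
class EndpointHealth:
    """Tracks consecutive failures for one endpoint (illustrative sketch)."""
    failure_threshold: int = 10
    consecutive_failures: int = 0
    active: bool = True

    def record(self, success: bool) -> None:
        if success:
            # Any success resets the streak, so isolated blips never accumulate.
            self.consecutive_failures = 0
            return
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.failure_threshold:
            self.active = False

    def probe_succeeded(self) -> None:
        """Call when a health-check probe gets a 2xx from a disabled endpoint."""
        self.consecutive_failures = 0
        self.active = True


health = EndpointHealth(failure_threshold=3)
for ok in (False, False, True, False, False, False):
    health.record(ok)
print(health.active)  # False

health.probe_succeeded()
print(health.active)  # True
```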

Retry storm overwhelming a recovering endpoint
Enable jitter and consider a per-endpoint concurrency limit to spread retries after an outage.
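
One way to sketch a per-endpoint concurrency limit is an `asyncio.Semaphore` per endpoint ID. The `EndpointLimiter` class is hypothetical, and `fake_send` stands in for the real HTTP POST.

```python
import asyncio


class EndpointLimiter:
    """Caps in-flight deliveries per endpoint (illustrative sketch)."""

    def __init__(self, per_endpoint_limit: int = 2):
        self._limit = per_endpoint_limit
        self._semaphores: dict[str, asyncio.Semaphore] = {}

    async def run(self, endpoint_id: str, send):
        # Create each endpoint's semaphore lazily, inside the running event loop.
        sem = self._semaphores.setdefault(endpoint_id, asyncio.Semaphore(self._limit))
        async with sem:
            return await send()


async def demo() -> int:
    limiter = EndpointLimiter(per_endpoint_limit=2)
    in_flight = 0
    peak = 0

    async def fake_send() -> None:
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)  # stand-in for the HTTP POST
        in_flight -= 1

    # Ten queued retries for the same endpoint, released two at a time.
    await asyncio.gather(*(limiter.run("ep_001", fake_send) for _ in range(10)))
    return peak


peak = asyncio.run(demo())
print(peak)  # 2
```

The global `max_concurrent_deliveries` setting still bounds total throughput; a per-endpoint cap like this only keeps one recovering endpoint from absorbing the whole retry backlog at once.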


This is 1 of 7 resources in the API Developer Pro toolkit. Get the complete [Webhook Framework] with all files, templates, and documentation for $29.

Get the Full Kit →

Or grab the entire API Developer Pro bundle (7 products) for $79 — save 30%.

Get the Complete Bundle →

