Wolyra

Posted on May 8 • Originally published at wolyra.ai

Webhook Security Best Practices


Webhooks are how modern systems actually talk to each other. A payment processor notifies an ERP when a charge settles. A CRM pings a marketing tool when a lead converts. A document signing platform fires off to a contract management system when a signature is captured. At scale, a single mid-market company receives and sends tens of thousands of webhook calls a day.

Webhooks are also one of the most common integration attack surfaces. They are HTTP endpoints exposed to the public internet, they accept data that is typically written directly into business-critical systems, and they are often treated as an afterthought by teams that built the “real” API with far more rigor.

The security patterns for webhooks are well understood. They are just inconsistently applied. This piece is a short, opinionated guide to the controls that matter, the mistakes that repeat, and what a production-grade webhook receiver looks like.

Signing: the non-negotiable control

The single most important webhook security control is cryptographic signing. The sender computes an HMAC (typically HMAC-SHA256) over the request body using a shared secret, and delivers the signature in a header. The receiver recomputes the signature using the same secret and compares.

Without signing, a webhook endpoint is effectively anonymous. Anyone who learns the URL — and URLs leak, through logs, browser history, and copy-paste into Slack — can send a crafted payload that looks like a legitimate event. With signing, an attacker would need the shared secret, which never travels over the wire.

Three implementation details matter:

Sign the exact payload the receiver will process. If the signature covers only the body but the receiver also trusts query-string parameters or headers, those untrusted inputs are an attack surface. The sender and receiver must agree on exactly what is signed, documented in the integration contract.

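Use a constant-time comparison. Naive string comparison on signatures can reveal, through response timing, how many leading characters matched, which makes timing attacks possible. Most languages ship a constant-time comparison helper in their standard or crypto library (for example `hmac.compare_digest` in Python, `crypto.timingSafeEqual` in Node.js, `subtle.ConstantTimeCompare` in Go); use it.

Pick a real algorithm. HMAC-SHA256 is the sensible default. MD5-based signatures (still seen in older integrations) are no longer acceptable. If you control the sender, choose HMAC-SHA256 and move on.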
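As a minimal sketch of the pattern described above, assuming the signature arrives as a hex string and covers only the raw request body (the function names here are illustrative, not any vendor's API):

```python
import hashlib
import hmac


def sign_body(secret: bytes, body: bytes) -> str:
    """Return the hex HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Recompute the signature over the raw bytes and compare in constant time."""
    expected = sign_body(secret, body)
    # Compare as bytes: hmac.compare_digest rejects non-ASCII str arguments,
    # and an attacker controls the received header value.
    return hmac.compare_digest(
        expected.encode("ascii"), received_signature.encode("utf-8")
    )
```

Note that the check runs over the bytes exactly as received. Parsing the JSON and re-serializing it before verification can change whitespace or key order and break an otherwise valid signature.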

Timestamp validation to prevent replay

A signed request is authentic but not necessarily fresh. An attacker who captures a legitimate signed webhook (from a log, a proxy, or a man-in-the-middle on a misconfigured network) can replay it indefinitely. If the webhook delivers “payment succeeded,” replay means repeated fulfillment.

The control is timestamp validation. The sender includes a timestamp in the signed payload or in a signed header. The receiver rejects any request whose timestamp is more than a small window old — five minutes is a common choice. The timestamp must be inside the signed envelope, so it cannot be tampered with.

For idempotency on top of replay protection, include a unique event ID in every webhook and maintain a processed-events table. A redelivered event, whether it is an attacker's replay inside the timestamp window or a legitimate retry after a network blip (same ID, possibly a new timestamp), is recognized and acknowledged without re-executing the business logic.
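A processed-events table can be as small as one column with a uniqueness constraint. The sketch below uses SQLite for illustration; the table and function names are assumptions, and a production receiver would use whatever durable store it already has.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Create the processed-events table if it does not exist yet."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_events ("
        "event_id TEXT PRIMARY KEY, received_at REAL NOT NULL)"
    )
    return conn


def claim_event(conn: sqlite3.Connection, event_id: str, received_at: float) -> bool:
    """Return True the first time an event ID is seen, False on a duplicate."""
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "INSERT OR IGNORE INTO processed_events (event_id, received_at) "
            "VALUES (?, ?)",
            (event_id, received_at),
        )
    # The primary key makes the insert a no-op for an ID already recorded.
    return cursor.rowcount == 1
```

Relying on the primary-key constraint, rather than a separate "select then insert", keeps the check safe when two deliveries of the same event arrive at the same time.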

IP allowlisting: defense in depth, not primary control

Many SaaS vendors publish the IP ranges their webhooks originate from. Restricting your webhook endpoint to those ranges at the firewall or reverse proxy is a reasonable defense-in-depth measure.

Reasonable, but not primary. Three caveats:

Vendor IP ranges change. Vendors add data centers, move to new cloud regions, and occasionally rotate ranges without clear advance notice. An allowlist that is not actively maintained becomes a source of silent integration outages.

Shared cloud IP space means the allowlist is often wider than it looks. A range owned by a major cloud provider includes thousands of other tenants.

Allowlisting alone is not a substitute for signing. If the signature is absent or weak, an attacker who gains access to any server in the allowlisted range can forge webhooks.

Treat IP allowlisting as a second gate, not a first one. If the signing is correct, the allowlist is a useful way to reduce noise in logs and discourage opportunistic scanning. If the signing is wrong, the allowlist is a false sense of security.
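Allowlisting is usually configured at the firewall or reverse proxy, but the "second gate" idea can also be sketched in application code. The ranges below are documentation-only addresses standing in for a vendor's published list, and the `enforce` flag is a hypothetical switch between log-only and blocking modes.

```python
import ipaddress
import logging

logger = logging.getLogger("webhooks.allowlist")

# Placeholder ranges (reserved documentation blocks), not a real vendor's list.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def allow_source(remote_addr: str, enforce: bool = False) -> bool:
    """Return whether the request may proceed to signature verification.

    In soft mode (enforce=False) a miss is logged but still allowed through.
    """
    try:
        address = ipaddress.ip_address(remote_addr)
    except ValueError:
        logger.warning("unparseable source address: %r", remote_addr)
        return not enforce
    if any(address in network for network in ALLOWED_NETWORKS):
        return True
    logger.warning("source %s is outside the allowlisted ranges", address)
    return not enforce
```

Either way, a request that passes this gate still has to pass signature verification.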

TLS enforcement

This should be obvious, and in 2026 it mostly is. Webhook endpoints must be HTTPS only. The receiver should reject HTTP entirely — not redirect, not warn, reject. HSTS should be enabled on the domain. Modern TLS (1.2 minimum, 1.3 preferred) with strong cipher suites is the floor.

One subtlety worth mentioning: mutual TLS (mTLS) is sometimes proposed as an alternative to HMAC signing for webhook authentication. It works, and in some regulated environments it is preferred. The operational cost is higher (certificate rotation, private key management, trust store updates), and for most integrations HMAC is the better fit. Plain server-side TLS, by contrast, authenticates only the receiver; do not use it as an excuse to skip signing.

Secret rotation

The shared secret that powers HMAC signing is long-lived credential material. It must be rotatable without downtime.

The usual pattern: the receiver accepts signatures from any of a small set of active secrets (current and previous). The sender is updated to use the new secret; after a grace period the old secret is retired. Rotations happen on a planned cadence — annually is a common choice — and immediately on any suspected compromise.

The rotation plan is part of the integration design, not a future concern. If your webhook implementation cannot rotate secrets without coordinating a simultaneous cutover with every integration partner, the implementation is not finished.
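On the receiving side, the overlap period can be sketched as "accept a signature that matches any active secret". The function below assumes hex HMAC-SHA256 signatures over the raw body; the name and the list-of-secrets shape are illustrative.

```python
import hashlib
import hmac
from typing import Iterable


def verify_with_rotation(active_secrets: Iterable[bytes], body: bytes,
                         received_signature: str) -> bool:
    """Accept the request if any currently active secret produces the signature."""
    received = received_signature.encode("utf-8")
    matched = False
    for secret in active_secrets:
        expected = hmac.new(secret, body, hashlib.sha256).hexdigest().encode("ascii")
        # Check every secret instead of returning early, so response time
        # does not reveal which secret (current or previous) matched.
        if hmac.compare_digest(expected, received):
            matched = True
    return matched
```

During a rotation the list holds the new and the previous secret; retiring the old one after the grace period is then a configuration change rather than a code change.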

Sample-rate logging for forensics

When something goes wrong with a webhook flow — a missed event, a duplicate, a suspicious pattern — the investigation depends on having logs. Logging every webhook body in full is usually too expensive and creates a sensitive-data retention problem. Logging nothing is worse.

The pragmatic compromise: log every webhook’s metadata (ID, timestamp, source, signature validity, event type, size) indefinitely. Log the full body for only a small sampled percentage of requests, with automatic redaction of obvious sensitive fields. On a validation failure, log the full context regardless of sampling, because that is the event you will want to investigate later.
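A rough sketch of that policy, assuming the body has already been parsed into a dictionary; the sensitive key names and the 1% sample rate are placeholders to adapt to your own payloads:

```python
import random

SENSITIVE_KEYS = {"card_number", "email", "ssn"}  # hypothetical field names
BODY_SAMPLE_RATE = 0.01  # assumed: full bodies for roughly 1% of requests


def redact(value):
    """Return a copy with sensitive fields masked, recursing into dicts and lists."""
    if isinstance(value, dict):
        return {
            key: "[REDACTED]" if key in SENSITIVE_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value


def build_log_record(metadata: dict, body: dict, signature_valid: bool,
                     rng=random.random) -> dict:
    """Always keep metadata; attach the redacted body on a sample or on failure."""
    record = dict(metadata, signature_valid=signature_valid)
    if not signature_valid or rng() < BODY_SAMPLE_RATE:
        record["body"] = redact(body)
    return record
```

The `rng` parameter is only there so the sampling decision can be pinned in tests.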

The mistakes that repeat

Four patterns account for the majority of webhook security failures we have seen in production audits.

Signing the body but not the query string. The sender signs the JSON body. The receiver uses query parameters — perhaps for routing, perhaps for a trusted hint — that are not covered by the signature. An attacker who can forge query parameters can alter behavior without touching the signed portion. Sign the full request surface the receiver depends on, or explicitly document what is untrusted.

Accepting any timestamp. Timestamp validation is in the spec but the receiver never enforces it. Replay attacks become possible. Add the check, with a reasonable window, and alert on stale timestamps.

No rotation plan. The secret was generated at integration time and has never been rotated. If the secret is compromised, the incident response is “panic and coordinate a global cutover.” Build rotation in from day one.

Treating webhook input as trusted. Signing proves the payload came from the expected sender. It does not prove the payload is well-formed, free of injection patterns, or internally consistent. Webhook data flows into databases, downstream services, and sometimes rendered UIs. Validate schema, enforce size limits, sanitize before persistence, and do not trust a signed payload with unsanitized SQL or HTML any more than you would trust a user form submission.
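For the last point, a minimal validation step might look like the sketch below. The size limit and the required fields are hypothetical; a real receiver would encode the sender's documented schema, possibly with a dedicated validation library.

```python
import json

MAX_BODY_BYTES = 256 * 1024  # assumed limit; tune to the sender's largest event
REQUIRED_FIELDS = {"id": str, "type": str, "data": dict}  # hypothetical schema


class InvalidPayload(ValueError):
    """Raised when a signed payload is still not acceptable input."""


def parse_event(raw_body: bytes) -> dict:
    """Enforce a size limit and a minimal schema on an already-verified body."""
    if len(raw_body) > MAX_BODY_BYTES:
        raise InvalidPayload("body too large")
    try:
        event = json.loads(raw_body)
    except (ValueError, RecursionError) as exc:
        # ValueError covers malformed JSON and bad encodings;
        # RecursionError covers pathologically deep nesting.
        raise InvalidPayload("body is not valid JSON") from exc
    if not isinstance(event, dict):
        raise InvalidPayload("top-level value must be an object")
    for name, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(event.get(name), expected_type):
            raise InvalidPayload(f"field {name!r} is missing or has the wrong type")
    return event
```

Validation does not replace safe handling downstream: values from the event should still go into databases through parameterized queries and into rendered pages through escaping.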

A production-grade receiver in one paragraph

A webhook endpoint, built correctly, does the following in order: accepts the connection only over TLS, checks the source IP against a soft allowlist (log and alert on misses rather than blocking, initially), reads the request body, extracts the signature and timestamp, rejects if the timestamp is outside the acceptable window, computes the expected HMAC and compares in constant time, rejects on mismatch, looks up the event ID in an idempotency table, returns 200 immediately on duplicate, validates the payload schema, writes the event to durable storage before acknowledging, and only then returns 200. Business logic runs asynchronously off the durable record. Every step has a metric attached, every rejection has a log entry, and the receiver survives a broker outage, a downstream outage, and a secret rotation without losing events.

That is the shape of a webhook receiver you want to be running in production. Anything less is a post-incident writeup waiting to be authored.
