
Ama
I ship a lot of API/webhook integrations. Here’s how I make them NOT hurt in production 🔥

If you do freelance backend long enough, you start noticing a pattern:

Clients don’t pay for “beautiful code”.
They pay for it working tomorrow.

And webhook integrations are the fastest way to get random chaos:

  • duplicate events
  • out-of-order delivery
  • retries that DDoS you
  • and the classic “it worked yesterday 🤡”

So here’s my real-world baseline for building webhook/API integrations that don’t wake me up at 3AM.

No theory. Just a practical checklist + a simple architecture that scales.


1) Assume the webhook will be duplicated. Because it will. ✅

If you process every incoming request as “unique”, you’re cooked.

Rule: every webhook must be idempotent.

That means you need an event id or a hash that lets you say:

“Seen it. Skipping.”

Real workflow:

  • extract event_id from payload (or generate a hash from stable fields)
  • store it with a status
  • on repeat: return 200 OK and do nothing

Because if you return a 500, the provider will just retry harder.
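That whole workflow fits in a few lines. Here's a sketch in Python, with an in-memory dict standing in for a DB table with a unique index (`dedup_key`, `seen_events`, and the `received_at` exclusion are all illustrative names I made up, not any provider's API):

```python
import hashlib
import json

# in production: a DB table with a UNIQUE index on the dedup key
seen_events = {}

def dedup_key(payload: dict) -> str:
    """Use the provider's event id if present, else hash the stable fields."""
    if "event_id" in payload:
        return payload["event_id"]
    # "received_at" is an example of a volatile field you'd exclude from the hash
    stable = {k: v for k, v in payload.items() if k != "received_at"}
    return hashlib.sha256(json.dumps(stable, sort_keys=True).encode()).hexdigest()

def handle_webhook(payload: dict) -> int:
    key = dedup_key(payload)
    if key in seen_events:
        return 200  # "Seen it. Skipping." -- still 200, never 500
    seen_events[key] = "received"
    # ...enqueue for async processing here...
    return 200
```

Same payload twice, one processing run, two calm 200s.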


2) Acknowledge fast. Process async. ⚡

A webhook handler that does real work inside the HTTP request is a trap.

It feels fine until:

  • your DB is slow for 5 seconds
  • the provider timeout hits
  • retries begin
  • now you’re processing the same event 5 times

My default:

  1. Receive webhook
  2. Validate signature / basic checks
  3. Save event to DB (raw payload + metadata)
  4. Return 200 OK fast
  5. Process the event in a worker/job queue

This makes your system calm.
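Here's roughly what that five-step flow looks like, with Python's stdlib `queue.Queue` standing in for a real job queue (SQS, Celery, Sidekiq, whatever you run) and the signature check stubbed to a header-presence test:

```python
import json
import queue

event_queue = queue.Queue()  # stand-in for a real job queue

def receive_webhook(raw_body: bytes, headers: dict) -> int:
    # step 2: validate signature / basic checks (stubbed here)
    if "X-Signature" not in headers:
        return 400
    # step 3: persist raw payload + metadata (here: carried in the queue item)
    event = {"raw": raw_body.decode(), "headers": headers, "status": "received"}
    event_queue.put(event)
    # step 4: return 200 fast -- steps 5 happens in a worker, not in this request
    return 200

def process_next() -> dict:
    """Worker side: pull one stored event and do the real work."""
    event = event_queue.get()
    payload = json.loads(event["raw"])
    event["status"] = "processed"
    return payload
```

The HTTP handler never touches business logic; it just validates, stores, and acks.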


3) Store raw payloads. Future you will thank you 🧠

When something breaks, the client will say:

“I don’t know, it just didn’t send.”

If you don’t store raw payloads, you have no evidence and no replay.

I always store:

  • full raw JSON payload
  • headers (at least important ones)
  • provider name
  • received timestamp
  • processing status
  • error message if failed

Then you can:

  • replay events
  • debug edge cases
  • prove what happened

It turns “guessing” into “knowing”.
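A minimal version of that event table, sketched with SQLite (column names are my convention; adapt to your stack). Bonus: the `UNIQUE` constraint on `event_id` gives you dedup for free:

```python
import json
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE webhook_events (
        id           INTEGER PRIMARY KEY,
        provider     TEXT NOT NULL,
        event_id     TEXT UNIQUE,          -- dedup key
        raw_payload  TEXT NOT NULL,        -- full raw JSON, untouched
        headers      TEXT,                 -- at least the important ones
        received_at  TEXT NOT NULL,
        status       TEXT DEFAULT 'received',
        error        TEXT                  -- short human-readable error if failed
    )
""")

def store_event(provider: str, event_id: str, raw_payload: str, headers: dict) -> None:
    # INSERT OR IGNORE + the UNIQUE index: duplicates are silently dropped
    conn.execute(
        "INSERT OR IGNORE INTO webhook_events"
        " (provider, event_id, raw_payload, headers, received_at)"
        " VALUES (?, ?, ?, ?, ?)",
        (provider, event_id, raw_payload, json.dumps(headers),
         datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()
```

Replay is now a `SELECT` away instead of an argument with the client.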


4) Security: verify signatures or don’t pretend it’s secure 🔒

If the provider supports signatures, verify them.

Not later. Not “we’ll add it after MVP”.

Right away.

Because otherwise you’re basically running:

public endpoint that triggers actions

That’s how you get spam, abuse, or worse.
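Most providers that sign webhooks use some flavor of HMAC over the raw request body (header name and encoding vary per provider, so check their docs). The two details that matter: hash the raw bytes, not the re-serialized JSON, and compare in constant time:

```python
import hashlib
import hmac

def verify_signature(raw_body: bytes, received_sig: str, secret: bytes) -> bool:
    """HMAC-SHA256 over the raw body, hex-encoded, constant-time compare."""
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    # hmac.compare_digest avoids timing side-channels that `==` would leak
    return hmac.compare_digest(expected, received_sig)
```

Reject with a 401 before the payload ever touches your event store.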


5) Rate limits and backoff: retries are not your enemy, your implementation is 😅

When processing fails, don’t do instant retries like a maniac.

Use backoff:

  • 1 min
  • 5 min
  • 30 min
  • 2 hours
  • dead-letter queue (manual review)

Most integrations fail because:

  • temporary provider downtime
  • temporary DB issue
  • network nonsense

Backoff makes it survive like a tank.
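That schedule is trivial to encode. A sketch (delays in seconds; `None` meaning "stop retrying, park it in the dead-letter queue" is my convention):

```python
# 1 min, 5 min, 30 min, 2 hours -- in seconds
BACKOFF_SCHEDULE = [60, 300, 1800, 7200]

def next_retry_delay(attempt: int):
    """Seconds to wait before retry number `attempt` (0-based).

    Returns None once the schedule is exhausted: stop retrying and
    send the event to the dead-letter queue for manual review."""
    if attempt < len(BACKOFF_SCHEDULE):
        return BACKOFF_SCHEDULE[attempt]
    return None
```

Your job runner reads the delay, re-enqueues with that wait, and dead-letters on `None`.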


6) Logging that actually helps, not “we logged something” 📝

I log at two layers:

Request layer

  • request id
  • provider
  • event id
  • status returned

Job layer

  • event id
  • job attempt
  • result
  • full error stack (if any)

And one extra rule:
If job fails, save a short human-readable error near the event record.

So later I can scan the DB and instantly see patterns.


7) My minimal scalable structure (simple but powerful)

I like separating responsibilities like this:

  • webhook_controller
    accepts HTTP, validates, stores event, returns response fast

  • event_store
    saves raw payloads, dedup keys, statuses

  • processor
    contains business logic: “what do we do with this event”

  • adapters
    provider-specific mapping (CRM A vs CRM B)

  • queue/worker
    runs processing asynchronously with retry rules

This lets you add new integrations without rewriting everything.

You just add a new adapter.
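The adapter layer is the part people skip. A toy sketch of the idea: each adapter maps its provider's payload shape into one normalized shape, so the processor never knows or cares which CRM the event came from (the CRM names and field names here are invented):

```python
from abc import ABC, abstractmethod

class Adapter(ABC):
    """Provider-specific mapping: raw payload -> one normalized event shape."""
    @abstractmethod
    def normalize(self, payload: dict) -> dict: ...

class CrmAAdapter(Adapter):
    def normalize(self, payload: dict) -> dict:
        return {"contact_id": payload["id"], "email": payload["email_address"]}

class CrmBAdapter(Adapter):
    def normalize(self, payload: dict) -> dict:
        return {"contact_id": payload["contactId"], "email": payload["email"]}

ADAPTERS = {"crm_a": CrmAAdapter(), "crm_b": CrmBAdapter()}

def process(provider: str, payload: dict) -> dict:
    # the processor only ever sees the normalized shape
    return ADAPTERS[provider].normalize(payload)
```

New integration = one new class + one registry entry. The controller, store, and worker don't change.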


Common production “gotchas” (learned the annoying way) 🤝

Out-of-order events

You might receive “updated” before “created”.

Solution:

  • allow upserts
  • store event history
  • process based on current state
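A last-write-wins upsert keyed on the event's own timestamp handles "updated before created" without special-casing (in-memory dict standing in for your table; real code would read whatever timestamp field your provider sends):

```python
# record_id -> current state; stands in for your actual table
records = {}

def apply_event(record_id: str, state: dict, event_ts: int) -> None:
    """Upsert where the latest provider timestamp wins.

    If 'updated' (ts=2) lands before 'created' (ts=1), the late
    'created' is simply a no-op instead of clobbering newer data."""
    current = records.get(record_id)
    if current is None or event_ts >= current["updated_at"]:
        records[record_id] = {"state": state, "updated_at": event_ts}
```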

Provider sends partial data

Sometimes they send only IDs and you must fetch details.

Solution:

  • use a “hydration step” in the worker (API pull)
  • cache if needed
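The hydration step is really just "look up by ID, with a cache in front". A sketch where `provider_api` is a hypothetical callable wrapping the provider's REST endpoint:

```python
_cache = {}  # object_id -> details; use a TTL cache in real life

def fetch_details(provider_api, object_id: str) -> dict:
    """Hydration step: the webhook only carried an ID, pull the rest via API.

    `provider_api` is any callable taking an object id and returning its
    details -- in production, a thin wrapper over the provider's REST API."""
    if object_id in _cache:
        return _cache[object_id]
    details = provider_api(object_id)
    _cache[object_id] = details
    return details
```

Run it in the worker, never in the webhook request, so a slow provider API can't cause timeouts.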

Webhook timeouts

If you process inside request, you lose.

Solution:

  • fast ACK, async processing

TL;DR 🧾

If you want webhook integrations that behave in production:

  • idempotency is mandatory
  • acknowledge fast, process async
  • store raw payloads
  • verify signatures
  • implement sane retries
  • log like you’ll debug it later (because you will)

If you’ve ever shipped webhooks in production, you already know:

it’s never “done”.

it’s “stable enough to survive real traffic” 😄

Drop your worst webhook horror story below 👇
