Stripe sent one webhook. Your database has three orders. What happened?
This is not a hypothetical. I've seen it in production on vatnode.dev, on pikkuna.fi, and on pi-pi.ee. The Stripe webhook hits your endpoint, your server takes too long to respond, and Stripe times out and queues a retry. Meanwhile, your server did complete the work. Now you have a duplicate. Then another one arrives 30 minutes later.
The naive implementation — parse the JSON, run your business logic, return 200 — fails in ways that only show up under real traffic. Here's what production actually looks like.
Why the Naive Implementation Breaks
Stripe's retry policy is more aggressive than most developers expect. If your endpoint returns anything other than a 2xx status code, or takes longer than 30 seconds to respond, Stripe retries. The schedule: immediately, then at 5 minutes, 30 minutes, 2 hours, 5 hours, 10 hours, and so on — up to 72 hours and roughly 15–18 total attempts.
That means if your server returned 500 at 9 AM on Monday due to a database hiccup, Stripe will still be trying to deliver the same event on Tuesday morning. If your server is back up, it will process the event — possibly creating a duplicate subscription, a duplicate order, or sending a duplicate email to your customer.
The naive handler looks like this:
// app/api/webhooks/stripe/route.ts (DON'T do this)
export async function POST(request: Request) {
  const body = await request.json(); // Wrong: parsed body won't verify
  const event = body as Stripe.Event;
  if (event.type === "payment_intent.succeeded") {
    await createOrder(event.data.object); // No idempotency check
  }
  return new Response("ok", { status: 200 });
}
Three problems here: no signature verification (anyone who finds the URL can POST a forged event), no idempotency check against retried or replayed events, and synchronous processing that blocks the response while your business logic runs. If createOrder takes longer than Stripe is willing to wait, you'll get retries even when the order was created successfully.
How to handle Stripe webhooks reliably in production
Estimated time: 3 hours. Tools: Next.js 16, TypeScript, Stripe SDK, BullMQ, ioredis, Drizzle ORM, PostgreSQL.
1. Verify the signature using raw bytes. Use request.arrayBuffer(), not request.json(), to get the raw body before passing it to stripe.webhooks.constructEvent. Parsing the body first changes the bytes and makes signature verification always fail.
2. Implement dual-layer idempotency. Store processed Stripe event IDs in PostgreSQL as the source of truth. Add a Redis pre-check layer for sub-millisecond lookups on hot paths. Return 200 immediately when a duplicate is detected.
3. Enqueue the event with BullMQ. Return 200 from the HTTP handler within 50ms by enqueuing to BullMQ using event.id as the jobId for deduplication. Never run business logic synchronously inside the webhook handler.
4. Process events in a separate worker. Run a BullMQ worker with concurrency 5 that handles payment_intent.succeeded, customer.subscription.deleted, and invoice.payment_failed. Double-check idempotency inside the worker to cover edge cases like mid-job restarts.
5. Wrap critical handlers in transactions. Use a database transaction for payment_intent.succeeded so that partial writes never leave inconsistent state. Send confirmation emails outside the transaction: a failed email shouldn't roll back the order.
6. Test with Stripe CLI locally. Use stripe listen --forward-to localhost:3000/api/webhooks/stripe and stripe trigger payment_intent.succeeded to replay events. Verify idempotency by re-sending the same event ID and confirming the duplicate response.
Idempotency — The Foundation
Before any queue, before any worker, you need idempotency: the guarantee that processing the same event twice produces the same result as processing it once.
Stripe makes this straightforward — every event has a unique id field (e.g., evt_1OqXyz...). Store this ID when you process the event. On subsequent attempts, check the store first and short-circuit.
I use PostgreSQL for durable idempotency keys and Redis for a fast pre-check layer. Here's the Drizzle ORM schema:
// packages/db/schema/webhook-events.ts
import { pgTable, text, timestamp, jsonb } from "drizzle-orm/pg-core";

export const webhookEvents = pgTable("webhook_events", {
  id: text("id").primaryKey(), // Stripe event ID, e.g. evt_1OqXyz...
  type: text("type").notNull(), // payment_intent.succeeded, etc.
  processedAt: timestamp("processed_at").notNull().defaultNow(),
  payload: jsonb("payload").notNull(),
});
And the idempotency check — run this before any business logic:
// lib/webhook-idempotency.ts
import { db } from "@/packages/db";
import { webhookEvents } from "@/packages/db/schema/webhook-events";
import { eq } from "drizzle-orm";
import { redis } from "@/lib/redis"; // ioredis instance

const CACHE_TTL_SECONDS = 86400 * 7; // 7 days
const cacheKey = (eventId: string) => `webhook:processed:${eventId}`;

export async function isAlreadyProcessed(eventId: string): Promise<boolean> {
  // Fast path: Redis check first (sub-millisecond)
  const cached = await redis.get(cacheKey(eventId));
  if (cached) return true;

  // Slow path: database check (handles Redis eviction or restarts)
  const existing = await db
    .select({ id: webhookEvents.id })
    .from(webhookEvents)
    .where(eq(webhookEvents.id, eventId))
    .limit(1);

  if (existing.length > 0) {
    // Restore to Redis cache so future checks are fast
    await redis.set(cacheKey(eventId), "1", "EX", CACHE_TTL_SECONDS);
    return true;
  }
  return false;
}

export async function markAsProcessed(
  eventId: string,
  type: string,
  payload: unknown
): Promise<void> {
  // Write to DB first: this is the source of truth.
  // onConflictDoNothing keeps a concurrent duplicate from throwing
  // on the primary key.
  await db
    .insert(webhookEvents)
    .values({ id: eventId, type, payload })
    .onConflictDoNothing();

  // Then cache in Redis for fast future lookups
  await redis.set(cacheKey(eventId), "1", "EX", CACHE_TTL_SECONDS);
}
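One caveat with a check-then-mark pair: two deliveries of the same event that arrive close together can both see "not processed" before either one is marked. A common way to close that window is to claim the ID first and let a uniqueness guarantee pick the winner. Below is a minimal sketch of that idea with an in-memory stand-in for the table; InMemoryEventStore and processOnce are hypothetical names, and in PostgreSQL the claim would be an INSERT ... ON CONFLICT DO NOTHING against the primary key.

```typescript
// In-memory stand-in for the webhook_events table (hypothetical helper).
// In PostgreSQL the same "first writer wins" behaviour would come from the
// primary key plus INSERT ... ON CONFLICT DO NOTHING.
class InMemoryEventStore {
  private readonly ids: { [eventId: string]: true } = {};

  // Returns true only for the caller that recorded the ID first.
  claim(eventId: string): boolean {
    if (this.ids[eventId]) return false;
    this.ids[eventId] = true;
    return true;
  }

  // Gives the claim back so a later retry can process the event again.
  release(eventId: string): void {
    delete this.ids[eventId];
  }
}

async function processOnce(
  store: InMemoryEventStore,
  eventId: string,
  handler: () => Promise<void>
): Promise<"processed" | "duplicate"> {
  if (!store.claim(eventId)) return "duplicate";
  try {
    await handler();
    return "processed";
  } catch (err) {
    // The handler failed, so release the claim and let the retry run.
    store.release(eventId);
    throw err;
  }
}
```

The trade-off is that a crash between claim and release leaves the event claimed but unprocessed, so a real implementation would likely need a status column or a claim timeout on top of this.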
Signature Verification in Next.js App Router
Here's the subtle thing that breaks almost every Next.js webhook tutorial: request.json() returns a parsed object. Stripe's signature verification requires the raw bytes of the original request body. Once parsed, the signature check will always fail.
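To see why the raw bytes matter, here is a minimal sketch of the signing scheme using only node:crypto. It assumes Stripe's documented scheme (HMAC-SHA256 over the timestamp, a dot, and the raw body). The secret, timestamp, and payload are made-up values, and signPayload and matchesSignature are illustrative helpers, not part of the Stripe SDK.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// HMAC-SHA256 over `${timestamp}.${rawBody}`, returned as a hex digest.
function signPayload(rawBody: string, timestamp: number, secret: string): string {
  return createHmac("sha256", secret)
    .update(`${timestamp}.${rawBody}`, "utf8")
    .digest("hex");
}

// Constant-time comparison of the recomputed digest with the expected one.
function matchesSignature(
  rawBody: string,
  timestamp: number,
  secret: string,
  expectedHex: string
): boolean {
  const actual = Buffer.from(signPayload(rawBody, timestamp, secret), "hex");
  const expected = Buffer.from(expectedHex, "hex");
  return actual.length === expected.length && timingSafeEqual(actual, expected);
}

// Hypothetical values, for illustration only.
const secret = "whsec_test_secret";
const timestamp = 1700000000;
const rawBody = '{\n  "id": "evt_test_123",\n  "type": "payment_intent.succeeded"\n}';
const signature = signPayload(rawBody, timestamp, secret);

// Round-tripping through JSON.parse/JSON.stringify drops the whitespace,
// so the bytes (and therefore the HMAC) no longer match.
const reserialized = JSON.stringify(JSON.parse(rawBody));

const rawOk = matchesSignature(rawBody, timestamp, secret, signature);
const reserializedOk = matchesSignature(reserialized, timestamp, secret, signature);
console.log({ rawOk, reserializedOk });
```

In a real handler you would still call stripe.webhooks.constructEvent rather than rolling your own check; the sketch only shows that any change to the body bytes changes the digest.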
In the Pages Router, you'd disable bodyParser. In the App Router, you use request.arrayBuffer():
// app/api/webhooks/stripe/route.ts
import Stripe from "stripe";
import { NextResponse } from "next/server";
import { isAlreadyProcessed } from "@/lib/webhook-idempotency";
import { webhookQueue } from "@/lib/webhook-queue";

const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!, {
  apiVersion: "2025-01-27.acacia",
});
const webhookSecret = process.env.STRIPE_WEBHOOK_SECRET!;

export async function POST(request: Request) {
  // arrayBuffer() gives us raw bytes, which signature verification requires
  const rawBody = await request.arrayBuffer();
  const signature = request.headers.get("stripe-signature");
  if (!signature) {
    return NextResponse.json({ error: "Missing signature" }, { status: 400 });
  }

  let event: Stripe.Event;
  try {
    // Convert ArrayBuffer to Buffer for the Stripe SDK
    event = stripe.webhooks.constructEvent(Buffer.from(rawBody), signature, webhookSecret);
  } catch (err) {
    console.error("Webhook signature verification failed:", err);
    return NextResponse.json({ error: "Invalid signature" }, { status: 400 });
  }

  // Check idempotency before doing anything else
  const alreadyProcessed = await isAlreadyProcessed(event.id);
  if (alreadyProcessed) {
    // Return 200 so Stripe stops retrying; this is intentional
    return NextResponse.json({ received: true, duplicate: true });
  }

  // Enqueue the event; don't process synchronously
  await webhookQueue.add(
    event.type,
    { event },
    {
      jobId: event.id, // BullMQ ignores an add() whose jobId already exists in the queue
      attempts: 5,
      backoff: { type: "exponential", delay: 5000 },
    }
  );

  // Return immediately; Stripe considers this success
  return NextResponse.json({ received: true });
}
Two things to notice. First, the idempotency check runs before enqueuing — so if Stripe retries and the job is still in the queue (not yet processed), you return 200 and BullMQ's jobId deduplication handles the rest. Second, the endpoint returns immediately after enqueuing. This response time is typically under 50ms, well within Stripe's timeout window.
BullMQ Queue Architecture
Processing webhooks synchronously in the HTTP handler means your response time is tied to every downstream API call — CRM updates, email sends, database writes. One slow dependency and you're getting retries.
The right architecture: webhook endpoint enqueues, a separate worker processes.
// lib/webhook-queue.ts
import { Queue } from "bullmq";
import { redis } from "@/lib/redis";

export const webhookQueue = new Queue("stripe-webhooks", {
  connection: redis,
  defaultJobOptions: {
    attempts: 5,
    backoff: {
      type: "exponential",
      delay: 5000, // waits of 5s, 10s, 20s, 40s between the 5 attempts
    },
    removeOnComplete: { count: 1000 }, // Keep last 1000 for debugging
    removeOnFail: false, // Keep failed jobs in DLQ for inspection
  },
});
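To make the retry budget concrete, here is a small sketch that lists the waits between attempts. It assumes BullMQ's built-in exponential strategy doubles the base delay after each failed attempt (delay * 2^(failedAttempts - 1)); retryDelays is a hypothetical helper, not a BullMQ API.

```typescript
// Waits between attempts, assuming delay * 2^(failedAttempts - 1).
function retryDelays(attempts: number, baseDelayMs: number): number[] {
  const delays: number[] = [];
  // N attempts means at most N - 1 retries, so N - 1 waits.
  for (let failed = 1; failed < attempts; failed++) {
    delays.push(baseDelayMs * Math.pow(2, failed - 1));
  }
  return delays;
}

const delays = retryDelays(5, 5000);
const totalMs = delays.reduce((sum, d) => sum + d, 0);
console.log(delays, totalMs);
```

With attempts: 5 and delay: 5000 that is four waits adding up to 75 seconds, so an outage longer than roughly a minute will push jobs into the failed state. If your downstream dependencies can be down for longer, a larger base delay or more attempts may be worth considering.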
// workers/stripe-webhook.worker.ts
import { Worker, Job } from "bullmq";
import Stripe from "stripe";
import { redis } from "@/lib/redis"; // BullMQ workers need maxRetriesPerRequest: null on this connection
import { isAlreadyProcessed, markAsProcessed } from "@/lib/webhook-idempotency";
import { handlePaymentSucceeded } from "./handlers/payment-succeeded";
import { handleSubscriptionDeleted } from "./handlers/subscription-deleted";
import { handleInvoicePaymentFailed } from "./handlers/invoice-payment-failed";

const worker = new Worker(
  "stripe-webhooks",
  async (job: Job<{ event: Stripe.Event }>) => {
    const { event } = job.data;

    // Double-check idempotency inside the worker (edge case: worker restart mid-job)
    const alreadyProcessed = await isAlreadyProcessed(event.id);
    if (alreadyProcessed) {
      return { skipped: true, reason: "duplicate" };
    }

    switch (event.type) {
      case "payment_intent.succeeded":
        await handlePaymentSucceeded(event.data.object as Stripe.PaymentIntent);
        break;
      case "customer.subscription.deleted":
        await handleSubscriptionDeleted(event.data.object as Stripe.Subscription);
        break;
      case "invoice.payment_failed":
        await handleInvoicePaymentFailed(event.data.object as Stripe.Invoice);
        break;
      default:
        console.log(`Unhandled webhook type: ${event.type}`);
        return { skipped: true, reason: "unhandled_type" };
    }

    // Mark processed only after the handler succeeds.
    // If the handler throws, BullMQ retries the job and we don't mark it done.
    await markAsProcessed(event.id, event.type, event.data.object);
    return { processed: true };
  },
  {
    connection: redis,
    concurrency: 5,
  }
);

worker.on("failed", (job, err) => {
  // "failed" fires on every failed attempt, so only alert once retries are exhausted
  const maxAttempts = job?.opts.attempts ?? 1;
  if (job && job.attemptsMade >= maxAttempts) {
    console.error(`Webhook job ${job.id} failed permanently:`, err);
  }
});
The dead letter queue behavior is built into BullMQ — failed jobs (after all retries) stay in the failed state. I keep a Telegram alert on the failed event so I know immediately when something needs manual intervention.
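As a rough sketch of that alert: as far as I know, the worker's failed event fires for each failed attempt, so the check below only treats a failure as final once every configured attempt is used up. The request shape follows the Telegram Bot API sendMessage method; isFinalFailure and buildTelegramAlert are hypothetical helpers, and the bot token and chat ID are placeholders you would supply yourself.

```typescript
// True once the job has used up every configured attempt.
function isFinalFailure(attemptsMade: number, maxAttempts: number | undefined): boolean {
  return attemptsMade >= (maxAttempts ?? 1);
}

// Builds the HTTP request for the Telegram Bot API sendMessage method.
function buildTelegramAlert(
  botToken: string,
  chatId: string,
  jobId: string,
  errorMessage: string
): { url: string; init: { method: string; headers: Record<string, string>; body: string } } {
  return {
    url: `https://api.telegram.org/bot${botToken}/sendMessage`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        chat_id: chatId,
        text: `Stripe webhook job ${jobId} failed permanently: ${errorMessage}`,
      }),
    },
  };
}
```

Inside worker.on("failed", ...) this would be used as: if isFinalFailure(job.attemptsMade, job.opts.attempts) is true, build the alert and pass alert.url and alert.init to fetch.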
Handling the Key Events
payment_intent.succeeded — the most critical handler. Wrap it in a database transaction so partial writes don't leave inconsistent state:
// workers/handlers/payment-succeeded.ts
import Stripe from "stripe";
import { db } from "@/packages/db";
import { orders, users } from "@/packages/db/schema";
import { eq } from "drizzle-orm";
// Assumed location of the email helper; adjust to wherever yours lives
import { sendOrderConfirmation } from "@/lib/email";

export async function handlePaymentSucceeded(paymentIntent: Stripe.PaymentIntent): Promise<void> {
  // Webhook payloads carry the customer as an ID string (not an expanded object)
  const customerId = paymentIntent.customer as string;
  const metadata = paymentIntent.metadata;

  await db.transaction(async (tx) => {
    await tx.insert(orders).values({
      stripePaymentIntentId: paymentIntent.id,
      customerId,
      amount: paymentIntent.amount,
      currency: paymentIntent.currency,
      status: "paid",
    });

    if (metadata.plan) {
      await tx
        .update(users)
        .set({ plan: metadata.plan, planActivatedAt: new Date() })
        .where(eq(users.stripeCustomerId, customerId));
    }
  });

  // Send the confirmation email outside the transaction: failure here doesn't
  // roll back the order. This assumes sendOrderConfirmation handles its own
  // retries and does not throw; if it throws, BullMQ re-runs the whole job.
  await sendOrderConfirmation({ paymentIntentId: paymentIntent.id });
}
customer.subscription.deleted — downgrade the user, don't delete their data. Keep at least 30 days of data post-cancellation. Stripe fires this for both immediate cancellations and end-of-period ones — check cancel_at_period_end to differentiate.
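A minimal sketch of the "downgrade, don't delete" rule, assuming you schedule data cleanup from the subscription's ended_at timestamp (Unix seconds). The "free" plan name, the DowngradePlan shape, and planDowngrade are hypothetical; the 30 days comes from the guideline above.

```typescript
const RETENTION_DAYS = 30; // from the guideline above; adjust to your own policy
const MS_PER_DAY = 24 * 60 * 60 * 1000;

interface DowngradePlan {
  plan: "free"; // hypothetical name for the fallback plan
  deleteDataAfter: Date; // earliest moment a cleanup job may remove data
}

// endedAtUnix is the subscription's ended_at value, in Unix seconds.
function planDowngrade(endedAtUnix: number): DowngradePlan {
  const endedAtMs = endedAtUnix * 1000;
  return {
    plan: "free",
    deleteDataAfter: new Date(endedAtMs + RETENTION_DAYS * MS_PER_DAY),
  };
}
```

The handler would then write plan and deleteDataAfter to the user row and leave the actual deletion to a scheduled job.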
invoice.payment_failed — send a dunning email, don't immediately revoke access. Stripe's Smart Retries will attempt the charge again. Give the customer a grace period to update their payment method.
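One way to express that grace period is to branch on whether Stripe has another retry scheduled. The sketch below reads only next_payment_attempt and attempt_count from the invoice; the action names and decideDunningAction are hypothetical, and what you do on the final failure is a policy choice, not something Stripe dictates.

```typescript
// Only the invoice fields this sketch reads.
interface FailedInvoice {
  attempt_count: number;
  next_payment_attempt: number | null; // Unix seconds, or null if no retry is scheduled
}

type DunningAction = "send_reminder" | "send_final_notice";

function decideDunningAction(invoice: FailedInvoice): DunningAction {
  // A non-null next_payment_attempt means another retry is scheduled,
  // so the customer still has time to update their payment method.
  return invoice.next_payment_attempt !== null ? "send_reminder" : "send_final_notice";
}

function describeFailure(invoice: FailedInvoice): string {
  return `Payment attempt ${invoice.attempt_count} failed; action: ${decideDunningAction(invoice)}`;
}
```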
Testing with Stripe CLI
# Forward all events to your local Next.js dev server
stripe listen \
--forward-to localhost:3000/api/webhooks/stripe
# Trigger specific events
stripe trigger payment_intent.succeeded
stripe trigger customer.subscription.deleted
stripe trigger invoice.payment_failed
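For testing idempotency: copy the event ID from the CLI output and resend it with stripe events resend followed by that ID, so the same signed event reaches your endpoint a second time. Once the worker has marked the first delivery as processed, the second call should return { received: true, duplicate: true } within milliseconds.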
Gotchas That Cost Me Real Time
request.json() silently breaks signature verification. The raw body and the serialized JSON aren't byte-for-byte identical. Always use request.arrayBuffer(). This is the most common issue I see in Next.js webhook implementations.
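BullMQ jobId deduplication only lasts as long as the job exists in Redis. While a job with that ID is waiting, active, or still retained as completed or failed, BullMQ ignores a second add() with the same ID. Once the job is removed (here removeOnComplete keeps only the last 1000), BullMQ will accept the same ID again. That's why the database idempotency check is still necessary: it covers duplicates and manual resends that arrive after the original job has been cleaned up.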
STRIPE_WEBHOOK_SECRET differs between environments. The signing secret that stripe listen prints for local forwarding is different from the secret of an endpoint configured in the dashboard, even though both start with whsec_. Set the variable separately in each environment.
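A small guard at startup can catch a missing or mis-pasted secret before the first webhook arrives. This is only a sketch: requireWebhookSecret is a hypothetical helper, and the whsec_ prefix check is a sanity check, not proof that the secret belongs to the right environment.

```typescript
// Reads and sanity-checks the webhook signing secret from an env-like object.
function requireWebhookSecret(env: { [key: string]: string | undefined }): string {
  const secret = env.STRIPE_WEBHOOK_SECRET;
  if (!secret) {
    throw new Error("STRIPE_WEBHOOK_SECRET is not set");
  }
  if (secret.indexOf("whsec_") !== 0) {
    throw new Error("STRIPE_WEBHOOK_SECRET does not look like a webhook signing secret");
  }
  return secret;
}
```

In the route file this would replace the non-null assertion, for example by calling requireWebhookSecret(process.env) once at module load.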
Never trust event.data.object amounts without checking currency. EUR amounts are in cents (100 = €1.00). JPY has no minor unit (100 = ¥100). Always pair the amount with the currency field when storing.
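Here is a minimal sketch of converting a Stripe amount into major units by currency. The zero-decimal set below is deliberately partial and only illustrative; check Stripe's currency documentation for the full list, including the three-decimal currencies this sketch ignores. toMajorUnits and formatAmount are hypothetical helpers.

```typescript
// Partial, illustrative list of zero-decimal currencies (lowercase, as Stripe sends them).
const ZERO_DECIMAL_CURRENCIES = ["jpy", "krw", "vnd", "clp"];

// Converts a Stripe integer amount to major units for the given currency.
function toMajorUnits(amount: number, currency: string): number {
  const isZeroDecimal = ZERO_DECIMAL_CURRENCIES.indexOf(currency.toLowerCase()) !== -1;
  return isZeroDecimal ? amount : amount / 100;
}

// Formats the amount for display; Intl expects an uppercase ISO 4217 code.
function formatAmount(amount: number, currency: string, locale: string = "en-US"): string {
  return new Intl.NumberFormat(locale, {
    style: "currency",
    currency: currency.toUpperCase(),
  }).format(toMajorUnits(amount, currency));
}

console.log(formatAmount(100, "eur"), formatAmount(100, "jpy"));
```

Storing the raw integer amount together with the currency code, and converting only for display, keeps the database free of rounding surprises.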
What This Looks Like in Production
On vatnode.dev, webhooks hit the endpoint and return within 40–60ms. The BullMQ worker processes the actual subscription logic asynchronously, with 5 concurrent workers handling bursts. Zero duplicate subscriptions created since launch.
On pikkuna.fi, the same architecture drives the full order pipeline — Stripe fires the webhook, the worker triggers Zoho CRM, PostNord shipment creation, Netvisor invoice, and Mailgun confirmation email in sequence. The webhook endpoint returns 200 within 50ms; the full chain completes in under 2 minutes.
If you're building a SaaS or e-commerce platform with Stripe, you'll hit exactly these problems — usually at the worst moment, like during a product launch or after a server restart.
I've built reliable Stripe integrations across several production systems, from subscription SaaS (vatnode.dev) to high-volume international e-commerce (pikkuna.fi). Once the webhook pipeline is solid, the order automation layer on top becomes straightforward — see how the full e-commerce order pipeline works.
If you need a senior developer who can own the payment infrastructure end-to-end — get in touch. I'm available for API integration and e-commerce development projects and long-term engagements.
Related projects: Pikkuna E-commerce Platform — full order pipeline with Stripe, Zoho CRM, and PostNord integration. Vatnode VAT validation SaaS — subscription billing where this webhook architecture is in production.