When we launched DBOS Pro in April, we needed its subscription management and billing system to be secure, reliable, scalable, and easy to maintain. To achieve this, we “ate our own dog food”: we developed a subscription management and billing app with DBOS Transact and Stripe and deployed it to DBOS Cloud. The entire production application, including cloud deployment and CI/CD, takes fewer than 500 lines of code. In this blog post, we’ll explain how it works.
The entire production application is available on GitHub.
Building a Reliable Webhook
When a customer clicks “Upgrade to DBOS Pro” on the DBOS website, the subscription management app redirects them to Stripe, where they enter payment information. After they pay, the following happens:
- Stripe sends an event to a webhook endpoint on the subscription management app.
- The webhook retrieves the customer’s updated subscription status from Stripe.
- The webhook maps the customer’s Stripe customer ID to a DBOS Cloud account ID.
- The webhook updates the customer’s subscription status in DBOS Cloud.
While this may sound simple, there are some notoriously difficult challenges that any webhook must solve to be robust and scalable. We’ll explain how hard it is to solve those challenges today, and how we solved them with Stripe and DBOS.
Challenge 1: Webhook endpoints must respond quickly.
According to Stripe best practices, upon receiving an event, a webhook should immediately acknowledge it by returning a successful 2XX HTTP status code, and do any complex processing asynchronously. A typical strategy is to have the webhook write to a message queue, then build another service to consume messages from that queue. This works, but introduces a lot of complexity. A “simple” subscription management app now needs three services: the webhook, the message queue, and the queue consumer.
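To make the conventional pattern concrete, here is a minimal sketch of "acknowledge first, process later". Everything in it is a hypothetical stand-in: `InMemoryQueue` plays the role of a real message queue, and `handleWebhookSketch` and `drainQueueSketch` stand in for the webhook and the queue consumer. None of this is taken from the production app.

```typescript
// Hypothetical event shape; real Stripe events carry many more fields.
type StripeEventLike = { id: string; type: string };

// Hypothetical in-memory stand-in for a real message queue service.
class InMemoryQueue<T> {
  private items: T[] = [];

  enqueue(item: T): void {
    this.items.push(item);
  }

  dequeue(): T | undefined {
    return this.items.shift();
  }
}

const eventQueue = new InMemoryQueue<StripeEventLike>();
const processedEventLog: string[] = [];

// The webhook does no real work: it enqueues the event and acknowledges it.
function handleWebhookSketch(event: StripeEventLike): { status: number } {
  eventQueue.enqueue(event);
  return { status: 200 };
}

// A separate consumer does the slow processing at some later point.
function drainQueueSketch(): void {
  let event = eventQueue.dequeue();
  while (event !== undefined) {
    processedEventLog.push(event.id); // placeholder for the real processing
    event = eventQueue.dequeue();
  }
}
```

Even in this toy form, the three moving parts are visible: the handler, the queue, and the consumer each have to be built, deployed, and monitored.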
Challenge 2: Event processing must run to completion.
Each time the subscription management app receives an event, it should process the event to completion and change the user’s subscription status in DBOS Cloud, regardless of what interruptions and transient failures may occur along the way. Otherwise, a user might pay for DBOS Pro without unlocking Pro-tier features. A typical strategy is to use an orchestration service like AWS Step Functions and configure it to automatically retry each step in the workflow. This works, but also adds complexity (and Step Functions cost!) to the increasingly less simple app.
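As a rough illustration of what "automatically retry each step" involves, here is a small retry helper with exponential backoff. The name `withRetries` and its parameters are our own for this sketch and do not come from Step Functions or from the production app.

```typescript
// Retry an async step up to maxAttempts times, doubling the delay after each failure.
// Hypothetical helper for illustration only.
async function withRetries<T>(
  step: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 100,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await step();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts) {
        // Wait baseDelayMs, then 2x, then 4x, and so on before the next attempt.
        const delayMs = baseDelayMs * 2 ** (attempt - 1);
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}
```

Note that a helper like this only survives failures inside a running process. If the process itself crashes mid-retry, nothing remembers that the event was in flight, which is why an orchestration service usually enters the picture.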
Challenge 3: Event processing must correctly handle duplicates.
According to Stripe, webhook endpoints may receive the same event more than once. However, for each event we should only upgrade or downgrade a customer once. A common practice is to make event processing idempotent, for example by recording which events have been processed in a persistent store and never processing those events again. This works, but adds yet another service, the persistent store, to the now-rather-complex app.
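The bookkeeping described above can be sketched in a few lines. Here an in-memory `Set` stands in for the persistent store, and `processOnce` is a hypothetical helper name. A real implementation would need a durable store, and would need the check and the record to happen atomically.

```typescript
// Hypothetical stand-in for a persistent store of processed event IDs.
const processedEventIDs = new Set<string>();

// Run the handler only if this event ID has not been processed before.
// Returns true if the handler ran, false if the event was a duplicate.
function processOnce(eventID: string, handler: () => void): boolean {
  if (processedEventIDs.has(eventID)) {
    return false; // duplicate delivery: skip it
  }
  handler();
  // Record the ID only after the handler succeeds, so a failed attempt can be retried.
  processedEventIDs.add(eventID);
  return true;
}
```

For example, calling `processOnce("evt_123", upgradeCustomer)` twice, where `upgradeCustomer` is whatever function performs the upgrade, runs it only the first time.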
To sum it all up, here’s the architecture diagram you’d need to implement to build a robust and scalable webhook on AWS today following best practices. In addition to the resources we already mentioned above, we need to configure API Gateway and EventBridge to trigger Lambda or Step Functions. To put it mildly, this isn’t easy to build, test, or maintain.
Reliable Event Processing with DBOS Transact
Now, let's look at how we implement webhook event processing with DBOS. Here's the architecture diagram. As you can imagine, this is much easier to build, test, and maintain.
To ensure the event processing code always runs to completion (Challenge 2), we implement it as a DBOS Transact reliable workflow. The workflow retrieves the customer’s DBOS Cloud user ID and Stripe subscription status, then sends a request to DBOS Cloud to update the customer’s subscription status. DBOS Transact guarantees that if a workflow is interrupted for any reason, it automatically resumes from where it left off. Moreover, it automatically retries transient failures in external API calls such as those to Stripe and DBOS Cloud, following declaratively configurable policies.
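To build intuition for "resumes from where it left off", here is a toy sketch of one way such a guarantee can work: record each step's output as it completes, and on a re-run return the recorded output instead of executing the step again. This is an illustration under our own assumptions, not DBOS Transact's actual implementation. `ToyWorkflowRun`, the `Map`-based store, and `toySubscriptionWorkflow` are all hypothetical, and a real system would keep its checkpoints in a database rather than in memory.

```typescript
// Toy illustration only: checkpoints live in a Map instead of a durable database.
class ToyWorkflowRun {
  private stepIndex = 0;
  private workflowID: string;
  private store: Map<string, unknown>;

  constructor(workflowID: string, store: Map<string, unknown>) {
    this.workflowID = workflowID;
    this.store = store;
  }

  // Run a step, or return its recorded output if it already completed in an earlier run.
  step<T>(fn: () => T): T {
    const key = `${this.workflowID}:${this.stepIndex}`;
    this.stepIndex += 1;
    if (this.store.has(key)) {
      return this.store.get(key) as T;
    }
    const output = fn();
    this.store.set(key, output); // checkpoint the output before moving on
    return output;
  }
}

// A two-step workflow loosely shaped like the subscription flow:
// look up a status, then apply a plan based on it.
function toySubscriptionWorkflow(
  run: ToyWorkflowRun,
  lookupStatus: () => string,
  applyPlan: (status: string) => string,
): string {
  const status = run.step(lookupStatus);
  return run.step(() => applyPlan(status));
}
```

If the second step throws on the first run, starting a new `ToyWorkflowRun` with the same workflow ID and store skips the status lookup and goes straight to applying the plan.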
Here’s the full production implementation of the workflow (at the time of this writing) as a simple TypeScript function (source code).
```typescript
@Workflow()
static async stripeEventWorkflow(ctxt: WorkflowContext, subscriptionID: string, customerID: string) {
  // Retrieve the updated subscription from Stripe
  const status = await ctxt.invoke(Utils).getSubscriptionStatus(subscriptionID);
  // Map the Stripe customer ID to DBOS Cloud user ID
  const dbosAuthID = await ctxt.invoke(Utils).findAuth0UserID(customerID);
  // Send a request to the DBOS Cloud admin API to change the user's subscription status
  switch (status) {
    case 'active':
    case 'trialing':
      await ctxt.invoke(Utils).updateCloudEntitlement(dbosAuthID, DBOSPlans.pro);
      break;
    case 'canceled':
    case 'unpaid':
    case 'paused':
      await ctxt.invoke(Utils).updateCloudEntitlement(dbosAuthID, DBOSPlans.free);
      break;
    default:
      ctxt.logger.info(`Do nothing for ${status} status.`);
  }
}
```
The webhook endpoint simply starts this workflow, then responds with a 204 HTTP status code without waiting for the workflow to finish (Challenge 1). It can do this safely because, as we mentioned earlier, DBOS workflows are reliable: once started, they always run to completion regardless of interruptions, no message queue required.
To make sure each event is processed exactly once in the presence of duplicates (Challenge 3), the webhook invokes the workflow using Stripe’s event.id as an idempotency key. Here’s the full production implementation of the webhook (source code):
```typescript
@PostApi('/stripe_webhook')
static async stripeWebhook(ctxt: HandlerContext) {
  // Verify the request is actually from Stripe
  const req = ctxt.request;
  const event = Utils.verifyStripeEvent(req.rawBody, req.headers);
  switch (event.type) {
    // Handle events when a user subscribes, cancels, or updates their subscription
    case 'customer.subscription.created':
    case 'customer.subscription.deleted':
    case 'customer.subscription.updated': {
      const subscription = event.data.object as Stripe.Subscription;
      // Start the workflow with event.id as the idempotency key without waiting for it to finish
      await ctxt.startWorkflow(Utils, event.id).stripeEventWorkflow(subscription.id, subscription.customer as string);
      break;
    }
    // Handle the event when a user completes payment for a subscription
    case 'checkout.session.completed': {
      const checkout = event.data.object as Stripe.Checkout.Session;
      if (checkout.mode === 'subscription') {
        await ctxt.startWorkflow(Utils, event.id).stripeEventWorkflow(checkout.subscription as string, checkout.customer as string);
      }
      break;
    }
    default:
      ctxt.logger.info(`Unhandled event type ${event.type}`);
  }
}
```
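The body of `Utils.verifyStripeEvent` isn't shown in this post. In practice, the official Stripe Node library provides `stripe.webhooks.constructEvent` for this job. For readers curious about what that check involves, here is a hand-rolled sketch of Stripe's documented signature scheme using only Node's `crypto` module: the `Stripe-Signature` header carries a timestamp `t` and one or more `v1` signatures, each an HMAC-SHA256 of `<timestamp>.<raw body>` keyed with the endpoint secret. The function names and the `nowSeconds` parameter are our own assumptions for illustration.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helper: build a header of the form "t=<timestamp>,v1=<hex HMAC>".
// Useful for local testing; in production, Stripe generates this header.
function signPayloadSketch(rawBody: string, endpointSecret: string, timestamp: number): string {
  const signature = createHmac("sha256", endpointSecret)
    .update(`${timestamp}.${rawBody}`, "utf8")
    .digest("hex");
  return `t=${timestamp},v1=${signature}`;
}

// Hypothetical verifier: returns true only if a v1 signature matches
// and the timestamp is within the tolerance window (300 seconds here).
function verifyStripeSignatureSketch(
  rawBody: string,
  signatureHeader: string,
  endpointSecret: string,
  nowSeconds: number,
  toleranceSeconds = 300,
): boolean {
  let timestamp: string | undefined;
  const candidates: string[] = [];
  for (const part of signatureHeader.split(",")) {
    const idx = part.indexOf("=");
    if (idx < 0) continue;
    const key = part.slice(0, idx).trim();
    const value = part.slice(idx + 1).trim();
    if (key === "t") timestamp = value;
    if (key === "v1") candidates.push(value);
  }
  if (timestamp === undefined || candidates.length === 0) return false;

  // Reject timestamps outside the tolerance window to limit replay attacks.
  const ts = Number(timestamp);
  if (!Number.isFinite(ts) || Math.abs(nowSeconds - ts) > toleranceSeconds) return false;

  const expected = createHmac("sha256", endpointSecret)
    .update(`${timestamp}.${rawBody}`, "utf8")
    .digest();

  // Compare in constant time; timingSafeEqual requires equal-length buffers.
  return candidates.some((hex) => {
    const given = Buffer.from(hex, "hex");
    return given.length === expected.length && timingSafeEqual(given, expected);
  });
}
```

The check must run against the raw request body, byte for byte, which is why the handler above passes `req.rawBody` rather than a parsed JSON object.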
Simple and Scalable Deployment with DBOS Cloud
We host the cloud subscription app serverlessly on DBOS Cloud. Using a serverless solution dramatically simplifies cloud deployment: instead of provisioning fixed infrastructure, we can simply deploy the app and rely on the cloud platform to scale it. Here’s the entire production deployment script (source code), which runs as a GitHub action:
```bash
#!/bin/bash
npx dbos-cloud login --with-refresh-token ${DBOS_DEPLOY_REFRESH_TOKEN}
npx dbos-cloud db status ${DBOS_APP_DB_NAME} > /dev/null
if [[ $? -ne 0 ]]; then
  npx dbos-cloud db provision ${DBOS_APP_DB_NAME} -U subscribe -W ${DBOS_DB_PASSWORD}
fi
npx dbos-cloud app status
if [[ $? -ne 0 ]]; then
  npx dbos-cloud app register -d ${DBOS_APP_DB_NAME}
fi
npx dbos-cloud app deploy
```
Use DBOS Transact and DBOS Cloud for free
To get started with DBOS Transact, check out the quickstart and docs. After you’ve built an application, you can serverlessly deploy it to DBOS Cloud for free. To join our community:
- Check out our code and contribute on GitHub.
- Join us on Discord.
The entire production subscription management app is open-source and available here.
Top comments (4)
This is great @qianl15 . Good explanation of DBOS 👍
Thanks Varshith! 🤗
This is a very insightful post! Could you dive deeper into how DBOS Transact handles duplicate Stripe events?
Thank you! In our case, we invoke the event processing workflow using Stripe’s event.id as an idempotency key. DBOS Transact records the event.id when a workflow starts. If a duplicate Stripe event arrives, DBOS Transact checks if the idempotency key (event.id) has been processed before and doesn't start a new workflow. Therefore, Transact guarantees a workflow executes exactly once.