<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: mithileshchellappan</title>
    <description>The latest articles on DEV Community by mithileshchellappan (@mithileshchellappan).</description>
    <link>https://dev.to/mithileshchellappan</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F624820%2F51ced713-f0be-4546-a8e7-4c232a25fd7d.png</url>
      <title>DEV Community: mithileshchellappan</title>
      <link>https://dev.to/mithileshchellappan</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/mithileshchellappan"/>
    <language>en</language>
    <item>
      <title>Exactly-once billing for LLM apps with one DynamoDB transaction</title>
      <dc:creator>mithileshchellappan</dc:creator>
      <pubDate>Mon, 29 Jun 2026 06:22:01 +0000</pubDate>
      <link>https://dev.to/mithileshchellappan/exactly-once-billing-for-llm-apps-with-one-dynamodb-transaction-39e4</link>
      <guid>https://dev.to/mithileshchellappan/exactly-once-billing-for-llm-apps-with-one-dynamodb-transaction-39e4</guid>
      <description>&lt;p&gt;If your product sells credits that users spend on AI calls, there's a decent chance you've already shipped this bug without noticing. It never shows up in a demo. It shows up in production, on your provider invoice.&lt;/p&gt;

&lt;p&gt;Here's the bug, and the database pattern that makes it impossible instead of just unlikely.&lt;/p&gt;

&lt;h2&gt;The bug everyone ships first&lt;/h2&gt;

&lt;p&gt;Every AI product sells usage somehow: credit packs, token bundles, plan quotas. And most of them enforce it the same wrong way.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;balance&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;getBalance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="c1"&gt;// read&lt;/span&gt;
&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;balance&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;cost&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;reject&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;        &lt;span class="c1"&gt;// compare&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;charge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;cost&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                 &lt;span class="c1"&gt;// write&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Read, compare, write. It looks fine in review and works in every demo. Then two requests land in the same millisecond, both read the same balance, both pass the check, and both charge. A user with 5 credits fires 20 parallel requests and gets 20 answers. You eat the provider bill for the other 15.&lt;/p&gt;

&lt;p&gt;This is the default failure mode of "budget" features in LLM gateways, because the check and the write are two separate operations with a gap between them. You can paper over it with locks, but a lock on a hot per-user row brings its own latency and contention problems.&lt;/p&gt;

&lt;p&gt;The fix isn't a better check. It's removing the gap.&lt;/p&gt;

&lt;h2&gt;Billing enforcement is a transaction, not middleware&lt;/h2&gt;

&lt;p&gt;Card networks solved this a long time ago with authorize, capture, release. Before the goods ship, you place a hold for the estimated maximum. When the real cost is known, you capture it and release the remainder. The authorization is atomic, so there's no window where two charges both think the money is there.&lt;/p&gt;

&lt;p&gt;The same three steps map cleanly onto an LLM request:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Estimate a max cost (input tokens plus &lt;code&gt;max_tokens&lt;/code&gt;, priced at the model's rate) and place a hold.&lt;/li&gt;
&lt;li&gt;Stream the response.&lt;/li&gt;
&lt;li&gt;Capture the actual cost from the provider's reported usage, and release the difference back to the wallet.&lt;/li&gt;
&lt;/ol&gt;
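
&lt;p&gt;As a rough sketch of the estimate in step 1, the hold can be computed from the padded input size plus the full &lt;code&gt;max_tokens&lt;/code&gt; budget. The &lt;code&gt;ModelPrice&lt;/code&gt; shape, the per-1,000-token pricing unit, and the 10% default padding are assumptions made for illustration, not values from the project.&lt;/p&gt;

```typescript
// Hypothetical price shape: milli-credits charged per 1,000 tokens.
interface ModelPrice {
  inputPer1k: number;
  outputPer1k: number;
}

// Worst-case cost of one request, in milli-credits: the padded input
// plus the entire max_tokens output budget. Rounds up so the hold
// never undershoots the real cost by a fraction of a milli-credit.
function estimateHold(
  price: ModelPrice,
  inputTokens: number,
  maxTokens: number,
  inputPaddingPct = 10, // assumed safety margin on the input estimate
): number {
  // Integer percent keeps the padding free of floating-point drift.
  const paddedInput = Math.ceil((inputTokens * (100 + inputPaddingPct)) / 100);
  const cost = (paddedInput * price.inputPer1k + maxTokens * price.outputPer1k) / 1000;
  return Math.ceil(cost);
}

// Example: 1,000 input tokens and max_tokens = 500 at the assumed rates.
const exampleHold = estimateHold({ inputPer1k: 100, outputPer1k: 300 }, 1000, 500);
console.log(exampleHold); // 260
```

&lt;p&gt;Rounding up on both the padding and the final cost keeps the estimate conservative, which is what lets the refund at capture time stay non-negative.&lt;/p&gt;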

&lt;p&gt;The important part is step 1. Make it a single atomic, conditional operation and concurrency can't slip through. DynamoDB does this natively.&lt;/p&gt;

&lt;h2&gt;The reserve: one TransactWriteItems&lt;/h2&gt;

&lt;p&gt;A wallet is one item, keyed by user. The reserve is a single &lt;code&gt;TransactWriteItems&lt;/code&gt; with a conditional decrement plus an idempotency record.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;ddb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;TransactWriteCommand&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;TransactItems&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;Update&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;PK&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`T#&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;tid&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;SK&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`U#&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;uid&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="c1"&gt;// Decrement only if the balance covers the hold, the user is active,&lt;/span&gt;
        &lt;span class="c1"&gt;// and the period quota isn't exhausted, all in one expression.&lt;/span&gt;
        &lt;span class="na"&gt;ConditionExpression&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
          &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;bal &amp;gt;= :hold AND #st = :active AND (attribute_not_exists(#q) OR #q &amp;lt; :qmax)&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;UpdateExpression&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;SET bal = bal - :hold ADD #q :one&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;ExpressionAttributeNames&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;#st&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;status&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;#q&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`q_&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;period&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;_fast`&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="na"&gt;ExpressionAttributeValues&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;:hold&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;hold&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;:active&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;active&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;:qmax&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;qmax&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;:one&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="na"&gt;ReturnValuesOnConditionCheckFailure&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ALL_OLD&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="c1"&gt;// Exactly-once: the idempotency key can be written at most once.&lt;/span&gt;
      &lt;span class="na"&gt;Put&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;Item&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;PK&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`T#&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;tid&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;SK&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`IDEM#&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;idemKey&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;reqId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;ttl&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="na"&gt;ConditionExpression&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;attribute_not_exists(PK)&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;}))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The condition and the write are the same operation. Under any number of concurrent requests, DynamoDB serializes the conditional updates on that item: the first N that fit succeed, and the rest fail the &lt;code&gt;ConditionExpression&lt;/code&gt;. No lock, no read-modify-write loop in your app. And because it's a transaction, the wallet decrement and the idempotency &lt;code&gt;Put&lt;/code&gt; either both happen or neither does.&lt;/p&gt;

&lt;p&gt;When the transaction is cancelled, you read &lt;code&gt;CancellationReasons&lt;/code&gt; to map the failure to the right HTTP status:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;condition failed on the wallet item maps to &lt;code&gt;402 insufficient_credits&lt;/code&gt; (or &lt;code&gt;429&lt;/code&gt; if it was the quota counter; inspect &lt;code&gt;ALL_OLD&lt;/code&gt; to tell them apart)&lt;/li&gt;
&lt;li&gt;condition failed on the idempotency item maps to &lt;code&gt;409 duplicate_request&lt;/code&gt;: return the original result and charge nothing&lt;/li&gt;
&lt;/ul&gt;
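
&lt;p&gt;The mapping above could look roughly like this. It is a sketch under assumptions: the reasons are taken to arrive in the same order as &lt;code&gt;TransactItems&lt;/code&gt; (wallet first, idempotency second), &lt;code&gt;Item&lt;/code&gt; is assumed to be already converted to plain values, and the &lt;code&gt;403&lt;/code&gt;, &lt;code&gt;503&lt;/code&gt;, and &lt;code&gt;500&lt;/code&gt; branches are illustrative choices that the list above does not specify.&lt;/p&gt;

```typescript
// One entry of CancellationReasons, with Item already in plain-value form
// (an assumption of this sketch).
interface Reason {
  Code?: string;
  Item?: { [attr: string]: unknown };
}

interface Verdict {
  status: number;
  error: string;
}

function classifyCancellation(
  reasons: Reason[],
  hold: number,
  qmax: number,
  quotaAttr: string, // e.g. the q_${period}_fast attribute name
): Verdict {
  const [wallet, idem] = reasons; // same order as TransactItems

  // A duplicate wins over everything else: nothing is charged twice.
  if (idem?.Code === "ConditionalCheckFailed") {
    return { status: 409, error: "duplicate_request" };
  }

  if (wallet?.Code === "ConditionalCheckFailed") {
    // ALL_OLD returns the wallet as it was when the condition failed.
    const old = wallet.Item ?? {};
    const bal = Number(old["bal"] ?? 0);
    const used = Number(old[quotaAttr] ?? 0);
    if (hold > bal) return { status: 402, error: "insufficient_credits" };
    if (used >= qmax) return { status: 429, error: "quota_exhausted" };
    return { status: 403, error: "account_inactive" }; // assumed mapping
  }

  // No condition failed, so this was contention, not a verdict.
  if (reasons.some((r) => r.Code === "TransactionConflict")) {
    return { status: 503, error: "retry_later" }; // assumed mapping
  }
  return { status: 500, error: "reserve_failed" }; // assumed fallback
}
```

&lt;p&gt;Checking the idempotency reason first means a replayed request gets &lt;code&gt;409&lt;/code&gt; even when the wallet has since run dry.&lt;/p&gt;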

&lt;h2&gt;Capture and release: no dangling money&lt;/h2&gt;

&lt;p&gt;The hold is deliberately conservative. It caps output at &lt;code&gt;max_tokens&lt;/code&gt; and pads the input estimate. On completion you capture the real cost and refund the rest.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;actual&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;creditsForUsage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;price&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;billedOutputTokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;inputTokens&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;refund&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;hold&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;actual&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;         &lt;span class="c1"&gt;// &amp;gt;= 0 by construction&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;ddb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;UpdateCommand&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;walletKey&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;UpdateExpression&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;SET bal = bal + :refund&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;ExpressionAttributeValues&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;:refund&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;refund&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;}))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every error path releases the full hold. The invariant is that no code path exits with money still held. As a backstop, the hold and idempotency items carry a TTL, so an abandoned request expires itself. TTL as a financial primitive.&lt;/p&gt;

&lt;p&gt;One LLM-specific gotcha: reasoning models bill "thinking" tokens, and some providers don't report them in &lt;code&gt;completion_tokens&lt;/code&gt;. Charge &lt;code&gt;max(completion_tokens, total_tokens - prompt_tokens)&lt;/code&gt; so reasoning gets captured no matter how the provider buckets it.&lt;/p&gt;
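
&lt;p&gt;As a small illustration of that formula, here is a helper over an OpenAI-style usage object. The &lt;code&gt;Usage&lt;/code&gt; interface and the function name are assumptions for the example.&lt;/p&gt;

```typescript
// OpenAI-style usage block; field names follow the formula above.
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
}

// Output tokens to bill: whichever is larger of the reported completion
// count and everything in the total that was not prompt.
function billableOutputTokens(u: Usage): number {
  return Math.max(u.completion_tokens, u.total_tokens - u.prompt_tokens);
}

// Reasoning counted only in the total: 100 prompt, 50 visible, 400 total.
console.log(billableOutputTokens({ prompt_tokens: 100, completion_tokens: 50, total_tokens: 400 })); // 300

// Reasoning already folded into completion_tokens: the two agree.
console.log(billableOutputTokens({ prompt_tokens: 100, completion_tokens: 300, total_tokens: 400 })); // 300
```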

&lt;h2&gt;The proof: 50 parallel requests, exactly 5 succeed&lt;/h2&gt;

&lt;p&gt;The test that matters: seed a wallet with 5 credits (5,000 milli-credits), fire 50 parallel reserves of 1 credit each, and assert that exactly 5 succeed and the balance lands on exactly 0.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;firing 50 parallel reserves of 1000 milli-credits each…
  insufficient_credits: 45
  ok: 5
  final balance: 0 milli-credits
RACE TEST PASSED
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No oversell, no undersell, no negative balance. It holds across any number of stateless gateway instances, because the correctness lives in the database, not in process memory.&lt;/p&gt;

&lt;p&gt;A practical note from running this against real DynamoDB: under heavy same-item contention you'll see &lt;code&gt;TransactionConflict&lt;/code&gt; cancellations. Those mean "the conditions weren't evaluated," not "the condition failed." Retry them with jittered backoff, and only treat &lt;code&gt;ConditionalCheckFailed&lt;/code&gt; as a verdict. The SDK does not auto-retry &lt;code&gt;TransactionConflict&lt;/code&gt; for you.&lt;/p&gt;
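
&lt;p&gt;A minimal sketch of the two decisions in that retry loop: whether a cancellation is retryable at all, and how long to wait before the next attempt. The helper names, the 25 ms base, and the 1,000 ms cap are assumptions, and the backoff shown is the common "full jitter" variant rather than anything specific to this project.&lt;/p&gt;

```typescript
// Only the Code field of each cancellation reason matters here.
interface Reason {
  Code?: string;
}

// Retry only when contention was the sole cause. If any item reported
// ConditionalCheckFailed, that is a verdict and must not be retried.
function isRetryable(reasons: Reason[]): boolean {
  if (reasons.some((r) => r.Code === "ConditionalCheckFailed")) return false;
  return reasons.some((r) => r.Code === "TransactionConflict");
}

// Full-jitter exponential backoff: a random delay between 0 and
// min(capMs, baseMs * 2^attempt). rand is injectable for testing.
function backoffMs(
  attempt: number,
  baseMs = 25,
  capMs = 1000,
  rand: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(rand() * ceiling);
}

console.log(isRetryable([{ Code: "TransactionConflict" }, { Code: "None" }])); // true
console.log(isRetryable([{ Code: "ConditionalCheckFailed" }, { Code: "None" }])); // false
```

&lt;p&gt;In the calling loop, the transaction is resent only while &lt;code&gt;isRetryable&lt;/code&gt; returns true, sleeping for &lt;code&gt;backoffMs(attempt)&lt;/code&gt; between attempts and giving up after a bounded number of tries.&lt;/p&gt;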

&lt;h2&gt;Why DynamoDB for this, and Aurora DSQL for the ledger&lt;/h2&gt;

&lt;p&gt;The money-critical operation is a conditional update on a single, highly contended item: one user's wallet under parallel fire. That's DynamoDB's native workload. You get flat single-digit-millisecond conditional writes regardless of table size, plus &lt;code&gt;TransactWriteItems&lt;/code&gt; to bundle the decrement, the quota increment, and the idempotency record into one atomic unit.&lt;/p&gt;

&lt;p&gt;The financial record is a different shape. A double-entry ledger, config history, and an audit log want SQL, with joins and time-range queries. That lives in Aurora DSQL (Postgres, strong consistency, optimistic concurrency). OCC is basically free for an append-only ledger, since appends never conflict. It would abort-storm on a hot wallet row, though, which is exactly why the wallet stays in DynamoDB. Each store gets the workload it was built for, and the ledger write happens off the request latency path (via Vercel's &lt;code&gt;after()&lt;/code&gt;), with a DynamoDB-backed retry queue to fall back on if it ever fails.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F888oj16d6qkcee0fg9y2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F888oj16d6qkcee0fg9y2.png" alt="Architecture: the hot path enforces credits in one DynamoDB transaction; the ledger settles asynchronously in Aurora DSQL" width="800" height="528"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This billing core is one piece of a larger project I'm building, Switchboard: a bring-your-own-key LLM gateway where the integration is two lines (point your SDK at the gateway, swap the model name for a flag), and everything else lives in a dashboard: pricing, routing, A/B tests, kill switches, and refills via Stripe and RevenueCat webhooks. The billing pattern above is the part I'd reach for in any product that sells usage, gateway or not.&lt;/p&gt;

&lt;p&gt;So that's the idea in one line: don't enforce budgets in middleware that reads then writes. Make the authorization a conditional transaction, and overspend stops being a bug you catch after the fact. It becomes a state the database refuses to enter.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built for the H0 Hackathon (Hack the Zero Stack with Vercel v0 and AWS Databases). Switchboard runs on Amazon DynamoDB and Aurora DSQL, deployed on Vercel, scaffolded with v0. #H0Hackathon&lt;/em&gt;&lt;/p&gt;

</description>
      <category>h0</category>
      <category>v0</category>
      <category>dynamodb</category>
      <category>awschallenge</category>
    </item>
  </channel>
</rss>
