Engineering Nano Banana Pro: Hybrid Media Pipelines and Transactional Credits

When building an AI SaaS like Nano Banana Pro, you quickly run into two major engineering headaches that simple tutorials don't cover:

  1. The "Hot Potato" Asset Problem: AI APIs return temporary URLs that expire. You need to persist them to your own storage (S3/R2) without bankrupting your backend bandwidth.
  2. The "Credit Race Condition": How do you deduct credits for a job that takes 30 seconds to complete and might fail?

Here is how we solved these problems in our GenerationService.

1. The Hybrid "Frontend-First" Media Pipeline

Most AI providers (like Fal, Replicate, etc.) return a temporary URL to the generated image. If you don't save it, it's gone in an hour.

The Naive Approach:
Backend receives the webhook -> Backend downloads the file (10MB) -> Backend uploads to S3 -> Backend saves URL.
Result: your server's bandwidth bill explodes, and your Node.js backend ties up memory and connections streaming large files it never needed to touch.

Our Solution: Optimistic Frontend Transfer with Backend Fallback

We implemented a hybrid strategy in GenerationService.ts.

Step A: The Frontend Try

When the generation finishes, we send the temporary URL to the client. The frontend immediately attempts to upload it to our R2 storage using a presigned URL. This offloads 90% of the bandwidth cost to the user's browser.
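
Here is roughly what that client-side path looks like. This is a minimal sketch: the /api/uploads/presign and confirm endpoints, the response shape, and transferToR2 itself are illustrative names, not our exact API, and it assumes the provider's CDN allows cross-origin GETs.

// Client-side transfer sketch (browser). Endpoint paths and response shapes
// are hypothetical; assumes the provider's temporary URL allows CORS reads.
async function transferToR2(mediaFileId: string, tempUrl: string): Promise<void> {
  // 1. Pull the asset from the provider's temporary URL.
  const blob = await (await fetch(tempUrl)).blob();

  // 2. Ask our backend for a presigned R2 PUT URL.
  const presignRes = await fetch('/api/uploads/presign', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ mediaFileId, contentType: blob.type }),
  });
  const { uploadUrl, cdnUrl } = await presignRes.json();

  // 3. Upload straight from the browser to R2 -- the file never touches our server.
  await fetch(uploadUrl, {
    method: 'PUT',
    body: blob,
    headers: { 'Content-Type': blob.type },
  });

  // 4. Tell the backend the transfer succeeded so it can swap cdn_url to the permanent URL.
  await fetch(`/api/media/${mediaFileId}/confirm`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ cdnUrl }),
  });
}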

Step B: The Backend Safety Net

But what if the user closes the tab, or their network drops mid-upload?
We can't lose a generation the user has already paid for.

In GenerationService.completeGeneration, we schedule a delayed check:

// lib/services/generation.ts (Simplified)

private static scheduleTransferCheck(mediaFileId: string, originalUrl: string) {
  // Wait 60 seconds
  setTimeout(async () => {
    const mediaFile = await MediaFileModel.findByUuid(mediaFileId);
    if (!mediaFile) return; // Record was deleted in the meantime, nothing to do

    // Check if the URL is still the external provider's URL
    const isStillExternal = !mediaFile.cdn_url.includes(process.env.STORAGE_DOMAIN!);

    if (isStillExternal) {
      console.log('Frontend transfer failed, executing backend fallback...');
      // Backend takes over: Download -> Upload to R2 -> Update DB
      await FileTransferService.transferFromUrl(originalUrl, ...);
    }
  }, 60 * 1000);
}

This "lazy check" ensures that we only pay for bandwidth when absolutely necessary (the fallback scenario), while guaranteeing data persistence.

2. Transactional Credit System (Freeze & Consume)

Deducting credits before generation is bad UX (if it fails, you have to refund).
Deducting after is risky (user might spend the same credits twice in parallel requests).

We implemented a Two-Phase Commit pattern using a freeze_token.

Phase 1: The Freeze

When a user requests a generation, we don't deduct the balance. We freeze it.

// Phase 1: Start Generation
const freezeToken = await CreditTransactionService.freezeCredits({
  userUuid,
  amount: cost,
  reason: 'Generation Pending',
});

// Store the token with the job
await GenerationModel.create({
  ...,
  freeze_token: freezeToken
});

Phase 2: The Commit (or Rollback)

Depending on the outcome, we resolve the frozen state.

On Success:

// Phase 2a: Job Succeeded
await CreditTransactionService.unfreezeCredits({
  freezeToken,
  action: 'confirm', // Actually deducts the balance
  reason: 'Generation completed',
});

On Failure:

// Phase 2b: Job Failed
await CreditTransactionService.unfreezeCredits({
  freezeToken,
  action: 'rollback', // Releases the freeze, balance returns to original
  reason: 'Generation failed',
});

This ensures that a user with 10 credits can't start five 5-credit jobs simultaneously. The first two requests freeze the entire balance, and the remaining ones fail with "Insufficient Funds" before they ever hit the GPU.
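
The reason parallel requests can't overspend comes down to how the freeze is written. Here is a minimal sketch of an atomic freeze using a plain conditional UPDATE (raw SQL via pg for clarity; the credits and credit_freezes tables and their columns are illustrative, not our actual Drizzle schema).

// Atomic freeze sketch. Table and column names are illustrative; the real
// schema lives in Drizzle, but the conditional UPDATE is the important part.
import { Pool } from 'pg';
import { randomUUID } from 'node:crypto';

const pool = new Pool();

export async function freezeCredits(params: { userUuid: string; amount: number }): Promise<string> {
  const { userUuid, amount } = params;
  const freezeToken = randomUUID();
  const client = await pool.connect();

  try {
    await client.query('BEGIN');

    // The WHERE clause is the whole trick: the row only updates if the
    // *available* balance (balance - frozen) still covers the amount.
    // Concurrent requests serialize on the row lock, so the loser sees
    // rowCount === 0 and never reaches the GPU.
    const result = await client.query(
      `UPDATE credits
          SET frozen = frozen + $1
        WHERE user_uuid = $2
          AND balance - frozen >= $1`,
      [amount, userUuid],
    );

    if (result.rowCount === 0) {
      throw new Error('Insufficient credits');
    }

    // Record the freeze so 'confirm' can deduct it and 'rollback' can release it later.
    await client.query(
      `INSERT INTO credit_freezes (token, user_uuid, amount, status)
       VALUES ($1, $2, $3, 'frozen')`,
      [freezeToken, userUuid, amount],
    );

    await client.query('COMMIT');
    return freezeToken;
  } catch (err) {
    await client.query('ROLLBACK');
    throw err;
  } finally {
    client.release();
  }
}

In this sketch, 'confirm' would decrement balance and frozen together, while 'rollback' only decrements frozen, which matches the "actually deducts" and "balance returns to original" behavior described above.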

Conclusion

Building Nano Banana Pro wasn't just about wrapping an API. It was about building a resilient system that handles the specific quirks of AI workloads—long latencies, large files, and high failure rates.

The Hybrid Transfer Strategy alone saved us significant infrastructure costs, while the Freeze/Consume pattern eliminated race conditions in our billing logic.


If you're interested in the specific Drizzle schemas or the R2 integration code, let me know in the comments!
