When building an AI SaaS like Nano Banana Pro, you quickly run into two major engineering headaches that simple tutorials don't cover:
- The "Hot Potato" Asset Problem: AI APIs return temporary URLs that expire. You need to persist them to your own storage (S3/R2) without bankrupting your backend bandwidth.
- The "Credit Race Condition": How do you deduct credits for a job that takes 30 seconds to complete and might fail?
Here is how we solved these problems in our GenerationService.
1. The Hybrid "Frontend-First" Media Pipeline
Most AI providers (like Fal, Replicate, etc.) return a temporary URL to the generated image. If you don't save it, it's gone in an hour.
The Naive Approach:
Backend receives the webhook -> Backend downloads the file (10MB) -> Backend uploads to S3 -> Backend saves URL.
Result: Your server's bandwidth bill explodes, and your Node.js process spends its time buffering and piping 10MB files instead of serving requests.
Our Solution: Optimistic Frontend Transfer with Backend Fallback
We implemented a hybrid strategy in GenerationService.ts.
Step A: The Frontend Try
When the generation finishes, we send the temporary URL to the client. The frontend immediately attempts to upload it to our R2 storage using a presigned URL. This offloads 90% of the bandwidth cost to the user's browser.
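In practice, the client-side hop looks roughly like the sketch below. The /api/media/presign and /api/media/confirm endpoints and their field names are placeholders for illustration, not our exact routes:

// Hypothetical client-side transfer; endpoint paths and field names are illustrative
async function transferToR2(tempUrl: string, mediaFileId: string): Promise<void> {
  // 1. Download the image from the provider's temporary URL (the user's bandwidth, not ours)
  const imageResponse = await fetch(tempUrl);
  if (!imageResponse.ok) throw new Error(`Provider fetch failed: ${imageResponse.status}`);
  const blob = await imageResponse.blob();

  // 2. Ask our backend for a presigned R2 upload URL
  const presignResponse = await fetch('/api/media/presign', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ mediaFileId, contentType: blob.type }),
  });
  const { uploadUrl, cdnUrl } = await presignResponse.json();

  // 3. PUT the file directly into R2 via the presigned URL
  await fetch(uploadUrl, {
    method: 'PUT',
    body: blob,
    headers: { 'Content-Type': blob.type },
  });

  // 4. Tell the backend the transfer succeeded so it can swap cdn_url to our domain
  await fetch('/api/media/confirm', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ mediaFileId, cdnUrl }),
  });
}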
Step B: The Backend Safety Net
But what if the user closes the tab? Or their network fails?
We can't lose the user's paid generation.
In GenerationService.completeGeneration, we schedule a delayed check:
// lib/services/generation.ts (Simplified)
private static scheduleTransferCheck(mediaFileId: string, originalUrl: string) {
  // Wait 60 seconds to give the frontend a chance to finish its upload
  setTimeout(async () => {
    const mediaFile = await MediaFileModel.findByUuid(mediaFileId);
    if (!mediaFile) return;

    // Check if the URL is still the external provider's URL
    const isStillExternal = !mediaFile.cdn_url.includes(process.env.STORAGE_DOMAIN!);

    if (isStillExternal) {
      console.log('Frontend transfer failed, executing backend fallback...');
      // Backend takes over: Download -> Upload to R2 -> Update DB
      await FileTransferService.transferFromUrl(originalUrl, ...);
    }
  }, 60 * 1000);
}
This "lazy check" ensures that we only pay for bandwidth when absolutely necessary (the fallback scenario), while guaranteeing data persistence.
2. Transactional Credit System (Freeze & Consume)
Deducting credits before generation is bad UX (if it fails, you have to refund).
Deducting after is risky (user might spend the same credits twice in parallel requests).
We implemented a Two-Phase Commit pattern using a freeze_token.
Phase 1: The Freeze
When a user requests a generation, we don't deduct the balance. We freeze it.
// Phase 1: Start Generation
const freezeToken = await CreditTransactionService.freezeCredits({
  userUuid,
  amount: cost,
  reason: 'Generation Pending',
});

// Store the token with the job
await GenerationModel.create({
  ...,
  freeze_token: freezeToken,
});
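What makes this safe under concurrency is that the freeze itself is atomic. Here is a minimal sketch of one way to implement freezeCredits against Postgres with node-postgres; the credits and credit_freezes tables and their columns are illustrative, not our actual schema:

// Hypothetical freezeCredits sketch; table and column names are illustrative
import { Pool } from 'pg';
import { randomUUID } from 'crypto';

const pool = new Pool({ connectionString: process.env.DATABASE_URL });

interface FreezeParams {
  userUuid: string;
  amount: number;
  reason: string;
}

export async function freezeCredits({ userUuid, amount, reason }: FreezeParams): Promise<string> {
  const client = await pool.connect();
  try {
    await client.query('BEGIN');

    // Atomically reserve the amount only if (balance - frozen) can cover it.
    // The guard in the WHERE clause is what stops two parallel requests
    // from freezing the same credits twice.
    const result = await client.query(
      `UPDATE credits
          SET frozen = frozen + $1
        WHERE user_uuid = $2
          AND balance - frozen >= $1`,
      [amount, userUuid]
    );
    if (result.rowCount === 0) throw new Error('Insufficient Funds');

    // Record the freeze so it can later be confirmed or rolled back
    const freezeToken = randomUUID();
    await client.query(
      `INSERT INTO credit_freezes (token, user_uuid, amount, reason, status)
       VALUES ($1, $2, $3, $4, 'frozen')`,
      [freezeToken, userUuid, amount, reason]
    );

    await client.query('COMMIT');
    return freezeToken;
  } catch (err) {
    await client.query('ROLLBACK');
    throw err;
  } finally {
    client.release();
  }
}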
Phase 2: The Commit (or Rollback)
Depending on the outcome, we resolve the frozen state.
On Success:
// Phase 2a: Job Succeeded
await CreditTransactionService.unfreezeCredits({
  freezeToken,
  action: 'confirm', // Actually deducts the balance
  reason: 'Generation completed',
});
On Failure:
// Phase 2b: Job Failed
await CreditTransactionService.unfreezeCredits({
  freezeToken,
  action: 'rollback', // Releases the freeze, balance returns to original
  reason: 'Generation failed',
});
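And a matching sketch of unfreezeCredits, continuing the same illustrative tables and pool from the freezeCredits sketch above: 'confirm' deducts the frozen amount from the balance for good, while 'rollback' simply releases the hold.

// Hypothetical unfreezeCredits sketch (same illustrative schema as above)
export async function unfreezeCredits(params: {
  freezeToken: string;
  action: 'confirm' | 'rollback';
  reason: string;
}): Promise<void> {
  const { freezeToken, action } = params; // reason would go to a ledger table, omitted here
  const client = await pool.connect();
  try {
    await client.query('BEGIN');

    // Claim the freeze record exactly once; FOR UPDATE blocks a concurrent resolver
    const freeze = await client.query(
      `SELECT user_uuid, amount FROM credit_freezes
        WHERE token = $1 AND status = 'frozen'
        FOR UPDATE`,
      [freezeToken]
    );
    if (freeze.rowCount === 0) throw new Error('Freeze not found or already resolved');
    const { user_uuid, amount } = freeze.rows[0];

    if (action === 'confirm') {
      // Deduct for real: remove the amount from both balance and frozen
      await client.query(
        `UPDATE credits SET balance = balance - $1, frozen = frozen - $1 WHERE user_uuid = $2`,
        [amount, user_uuid]
      );
    } else {
      // Rollback: just release the hold, the balance is untouched
      await client.query(
        `UPDATE credits SET frozen = frozen - $1 WHERE user_uuid = $2`,
        [amount, user_uuid]
      );
    }

    await client.query(
      `UPDATE credit_freezes SET status = $1 WHERE token = $2`,
      [action === 'confirm' ? 'consumed' : 'released', freezeToken]
    );

    await client.query('COMMIT');
  } catch (err) {
    await client.query('ROLLBACK');
    throw err;
  } finally {
    client.release();
  }
}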
This ensures that a user with 10 credits can't start five 5-credit jobs simultaneously. The first two will freeze the balance, and the subsequent requests will fail with "Insufficient Funds" before they even hit the GPU.
Conclusion
Building Nano Banana Pro wasn't just about wrapping an API. It was about building a resilient system that handles the specific quirks of AI workloads—long latencies, large files, and high failure rates.
The Hybrid Transfer Strategy alone saved us significant infrastructure costs, while the Freeze/Consume pattern eliminated race conditions in our billing logic.
If you're interested in the specific Drizzle schemas or the R2 integration code, let me know in the comments!