You know that feeling when you find a better tool, but the pricing page makes you close the tab?
I run RepoClip, a SaaS that generates promo videos from GitHub repos using AI. The pipeline analyzes source code with Gemini, generates scene images, adds narration, and renders a video — all automatically. Images sit at the center of output quality: every video has 6 scene images, and they're the first thing users see.
I'd been using FLUX.2 [dev] via Fal.ai for image generation since launch. It worked. Then I saw that Nano Banana 2 — Google's new image model — had landed on Fal.ai. I decided to test it with the exact same prompt.
The results were not close.
Same Prompt, Different Universe
I built a comparison script that runs the same prompt through both models across 4 visual styles (Tech, Realistic, Minimal, Vibrant). Here's the prompt:
"A digital dashboard showing interconnected data nodes and flowing information streams, representing an intelligent automation platform that connects multiple services and workflows seamlessly"
Tech Style
| FLUX.2 | Nano Banana 2 |
|---|---|
| ![]() | ![]() |
FLUX.2 gives you a flat infographic with a UI overlay. Nano Banana 2 produces a cinematic, three-dimensional data flow with depth and lighting that looks like concept art.
Realistic Style
| FLUX.2 | Nano Banana 2 |
|---|---|
| ![]() | ![]() |
FLUX.2 renders a network graph that looks like a screenshot from a monitoring tool. Nano Banana 2 creates a photorealistic control room scene — you can almost feel the screen glow.
Vibrant Style
| FLUX.2 | Nano Banana 2 |
|---|---|
| ![]() | ![]() |
This is where the gap is widest. FLUX.2 gives a cartoon explosion of color. Nano Banana 2 produces a bold, structured composition with neon circuit aesthetics that actually looks intentional.
The difference wasn't subtle. Every style, every prompt — Nano Banana 2 was producing images that looked like they belonged in a final product, not a prototype.
But then I checked the pricing.
The 6.7x Problem
| | FLUX.2 [dev] | Nano Banana 2 |
|---|---|---|
| Cost per image | ~$0.012 | ~$0.08 |
| Generation time | ~7.7s | ~31.3s |
That's a 6.7x increase in per-image cost and 4x slower generation. For a bootstrapped SaaS, these numbers don't inspire confidence.
My gut said "too expensive." But I'd learned from past experience that gut feelings about costs are usually wrong — you need actual numbers.
Running the Simulation
Instead of staring at a spreadsheet, I asked Claude Code to run a scenario:
Assumptions:
- 10 Starter users ($29/mo each) generating 50 videos total
- 3 Pro users ($79/mo each) generating 50 videos total
- 6 images per video
- Only revenue: subscription fees. Only cost: image generation API.
| | FLUX.2 | Nano Banana 2 |
|---|---|---|
| Images generated | 600 | 600 |
| Image cost | $7.20 | $48.00 |
| Monthly revenue | $527.00 | $527.00 |
| Profit | $519.80 | $479.00 |
| Margin | 98.6% | 90.9% |
The difference was $40.80/month. That's it.
Even in this conservative scenario — just 13 paying users, no free tier buffer, ignoring all other operational costs — the profit impact was under 8 percentage points. With real-world numbers where hosting, AI analysis, and TTS costs dominate the bill, image generation was a rounding error either way.
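The arithmetic behind that table fits in a few lines. Here's a sketch of the simulation, using only the assumptions listed above (user counts, prices, and per-image costs):

```typescript
// Scenario assumptions from above: 13 paying users, 100 videos, 6 images each.
const revenue = 10 * 29 + 3 * 79;            // $527/mo in subscriptions
const images = 100 * 6;                      // 600 images generated

const fluxCost = images * 0.012;             // $7.20
const nanoBananaCost = images * 0.08;        // $48.00

const fluxProfit = revenue - fluxCost;       // $519.80
const nanoBananaProfit = revenue - nanoBananaCost; // $479.00
const monthlyDelta = fluxProfit - nanoBananaProfit; // $40.80/month

console.log(
  `Margin: ${(100 * fluxProfit / revenue).toFixed(1)}% vs ` +
  `${(100 * nanoBananaProfit / revenue).toFixed(1)}%`
);
```

Same numbers as the table: a $40.80/month swing, or about $0.41 of extra image cost per video.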
The quality gap was obvious. The cost gap was negligible. Decision made.
Problem #1: File Size Explosion
The first thing I noticed after switching: Nano Banana 2's PNG outputs were massive.
| Format | File Size (same prompt) |
|---|---|
| FLUX.2 PNG | 978 KB |
| Nano Banana 2 PNG | 2,043 KB |
| Nano Banana 2 JPEG | 401 KB |
Nano Banana 2 PNGs were 2x larger than FLUX.2. For a video pipeline that downloads 6 images per generation and passes them to a Lambda renderer, this matters — both for speed and storage costs.
The fix was one line:
```typescript
const result = await fal.subscribe("fal-ai/nano-banana-2", {
  input: {
    prompt: enhancedPrompt,
    aspect_ratio: aspectRatio,
    resolution: "1K",
    output_format: "jpeg", // <-- this
    num_images: 1,
  },
});
```
Since these images are frames in a video (which itself uses lossy H.264 compression), lossless PNG was overkill from the start. JPEG at default quality cut the size by 80% with no visible difference in the final rendered video.
Problem #2: The 4x Slower Generation
Nano Banana 2 takes ~31 seconds per image versus FLUX.2's ~8 seconds. For 6 scenes, that's 186 seconds sequential — over 3 minutes just on images.
But I was already generating all scene images in parallel:
```typescript
const images = await Promise.all(
  scenes.map(scene => generateImage(scene.imagePrompt, aspectRatio, visualStyle))
);
```
With Promise.all, the wall-clock time is the slowest single image, not the sum. In practice, that's ~36 seconds — about 28 seconds more than before. Against a pipeline timeout of 15 minutes, this was a non-issue.
If you're calling AI APIs sequentially and wondering why things are slow, this is your sign to parallelize.
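The effect is easy to see with a toy model. These per-image latencies are illustrative numbers chosen to match the rough averages above, not real measurements:

```typescript
// Six per-image latencies in seconds (illustrative, not measured)
const latencies = [29, 31, 30, 36, 32, 28];

// Sequential calls: wall-clock time is the sum of every request
const sequentialSeconds = latencies.reduce((sum, t) => sum + t, 0); // 186

// Promise.all: wall-clock time collapses to the slowest single request
const parallelSeconds = Math.max(...latencies); // 36
```

Going from 186 seconds to 36 seconds is the entire difference between "the model is too slow" and "nobody notices."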
Problem #3: Brand Logo Hallucination
This one surprised me. Nano Banana 2 is significantly better at rendering recognizable imagery — and that's not always a good thing.
When the scene prompt contained words like "GitHub" or "Python," FLUX.2 would generate abstract tech art. Nano Banana 2 would render the actual Octocat logo or a realistic Python logo. For a product that generates promotional videos, having trademarked imagery appear in user content is a liability.
The fix was adding explicit exclusion instructions to every prompt:
```typescript
const enhancedPrompt = `${prompt}. ${stylePrompt}. No text, no UI elements, no screenshots, no logos, no brand imagery, no mascots.`;
```
The last three exclusions (no logos, no brand imagery, no mascots) were added specifically for Nano Banana 2. FLUX.2 never needed them because it wasn't capable enough to reproduce them.
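One way to keep those exclusions consistent across every call site is a small helper. This is a sketch, not RepoClip's actual code — the style fragments and the `enhancePrompt` name are illustrative:

```typescript
// Illustrative style fragments; the real style prompts aren't shown in this post.
const STYLE_PROMPTS: Record<string, string> = {
  tech: "sleek futuristic aesthetic, dark background, glowing accents",
  vibrant: "bold saturated colors, neon circuit motifs",
};

// The last three exclusions exist because Nano Banana 2 can render
// real logos and mascots; FLUX.2 never needed them.
const EXCLUSIONS =
  "No text, no UI elements, no screenshots, no logos, no brand imagery, no mascots.";

function enhancePrompt(prompt: string, visualStyle: string): string {
  const stylePrompt = STYLE_PROMPTS[visualStyle] ?? "";
  return `${prompt}. ${stylePrompt}. ${EXCLUSIONS}`;
}
```

With the exclusions defined once, a new model quirk means editing one constant instead of hunting through every prompt template.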
The Migration Diff
The actual code change was small. Here's the core of fal.ts before and after:
```diff
- const result = await fal.subscribe("fal-ai/flux-2", {
+ const result = await fal.subscribe("fal-ai/nano-banana-2", {
    input: {
      prompt: enhancedPrompt,
-     image_size: "landscape_16_9",
-     num_inference_steps: 28,
+     aspect_ratio: aspectRatio,
+     resolution: "1K",
+     output_format: "jpeg",
      num_images: 1,
    },
  });
```
Different model, different API shape, but the wrapper function signature stayed the same. Downstream code didn't change at all.
What I Learned
1. Always simulate before you panic. A 6.7x per-unit cost increase sounds terrifying in isolation. In the context of actual revenue and usage patterns, it was noise. Run the numbers before making decisions based on sticker shock.
2. Lossy formats are fine for intermediate assets. If your images are being consumed by a video encoder, compressed for web display, or otherwise transformed downstream, PNG is a waste of bandwidth. Match the format to the use case.
3. Better models bring new problems. FLUX.2 couldn't render brand logos, so I never had to worry about it. Nano Banana 2 can, so now I need explicit exclusion prompts. Capability upgrades aren't just free wins — they shift the problem space.
4. Parallelization absorbs latency. A 4x slower model barely matters when you're already running requests concurrently. Design for parallelism from the start and model speed becomes a secondary concern.
The Stack
- Framework: Next.js (App Router) + TypeScript
- Orchestration: Inngest
- Code Analysis: Gemini 2.5 Flash
- Image Generation: Nano Banana 2 via Fal.ai
- Narration: OpenAI TTS
- Video Rendering: Remotion Lambda
- Database/Auth/Storage: Supabase
- Deployment: Vercel
Try It
If you want to see what Nano Banana 2 produces in a real pipeline, try it on your own repo: repoclip.io
The free tier gives you 2 videos/month. Paste a GitHub URL and you'll have a narrated demo video in a few minutes.
I'd love to hear from the community:
- Have you switched AI models in production and been surprised by the cost impact?
- What's your approach to evaluating model upgrades — vibes, benchmarks, or simulations?
Drop a comment or find me on GitHub.
The comparison images in this post were generated with the exact same prompt, same visual style settings, same pipeline. The only variable was the model. Sometimes the upgrade really is worth it.