Dylan HUANG

Posted on Mar 11 • Edited on Mar 17 • Originally published at nanobanana2.com

I Built a Free AI Image Generator That Knows What's Happening Right Now

#ai #webdev #nextjs #cloudflare

Most AI image generators share the same blind spot: they don't know what's happening today.

Ask Midjourney to generate "Oscar 2026 Best Picture poster" — it hallucinates. Ask DALL-E for "cherry blossom forecast Japan 2026" — it guesses. These models are frozen at their training cutoff.

I built Nano Banana 2 to fix this. It uses Google's Gemini image models with Web Search Grounding, meaning the AI can query Google Search in real time during image generation.

What Web Search Grounding Actually Does

When you type a prompt on Nano Banana 2, the AI doesn't just rely on what it learned during training. It:

Parses your intent — figures out what real-world information would help
Searches Google — pulls relevant, current data
Generates with context — creates images informed by real-time knowledge

For example:

"Samsung Galaxy S26 Ultra in a coffee shop" → knows the actual phone design
"Super Bowl 2026 halftime show poster" → references the real performers
"trending anime style March 2026" → adapts to what's actually trending

This is fundamentally different from image generators that rely only on what they learned during training.

The Numbers (4 months in)

Since launching in November 2025:

Metric	Value
Registered users	4,700+
Activation rate	77%
Most used feature	Edit mode (80% of generations)
Revenue model	Freemium ($0.90 - $29.90/mo)
Traffic source	100% organic / Google Search
Paid ad spend	$0 (until last week)

The 77% activation rate surprised me. Many SaaS products struggle to reach 20-30%. I think it's because we give 5 free credits with no login required, so users see value before they commit anything.

Tech Stack Deep Dive

Frontend & Runtime

Next.js (App Router) → OpenNext → Cloudflare Workers

Why Cloudflare Workers instead of Vercel? Cost. At our scale, Vercel would cost 3-5x more. OpenNext bridges the gap. It comes with one major quirk and two clear benefits:

No Node.js native modules: Every SDK needs explicit HTTP client configuration. We learned this the hard way when Stripe's new Stripe() silently failed for 24 hours because it tried to use Node's http module. Fix: httpClient: Stripe.createFetchHttpClient().
Cold start optimization: Workers have ~5ms cold starts vs Lambda's ~200ms. Users notice.
Global edge deployment: Every request hits the nearest Cloudflare PoP. Our TTFB is under 50ms worldwide.

Database

Supabase (PostgreSQL) — chosen for:

Real-time subscriptions (used for generation status updates)
Row-level security (users can only see their own data)
REST API that works perfectly in edge runtimes (no connection pooling issues)
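
To make the REST point concrete, here is a minimal sketch of an insert through Supabase's auto-generated REST endpoint, using nothing but a fetch-style request. The table name, the row shape, and the helper name are hypothetical; the real schema isn't shown in this post.

```typescript
// Hypothetical row shape; the real schema is not shown in the post.
interface GenerationRow {
  user_id: string;
  prompt: string;
  status: "pending" | "done" | "failed";
}

interface RestRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Builds a fetch()-compatible request that inserts one row through
// the REST endpoint at <project-url>/rest/v1/<table>.
function buildInsertRequest(
  projectUrl: string,
  apiKey: string,      // project API key, sent in the `apikey` header
  bearerToken: string, // API key or a user's JWT (the latter when relying on row-level security)
  table: string,
  row: GenerationRow,
): RestRequest {
  const root = projectUrl.replace(/\/+$/, ""); // drop trailing slashes
  return {
    url: `${root}/rest/v1/${encodeURIComponent(table)}`,
    init: {
      method: "POST",
      headers: {
        apikey: apiKey,
        Authorization: `Bearer ${bearerToken}`,
        "Content-Type": "application/json",
        Prefer: "return=representation", // ask for the inserted row back
      },
      body: JSON.stringify(row),
    },
  };
}
```

Usage would look like `const { url, init } = buildInsertRequest(...); await fetch(url, init);`. Awaiting the call (or handing the promise to `ctx.waitUntil()` on Workers) matters, since a write that is still pending when the response is returned may never complete.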

AI Pipeline

User prompt → Gemini API (with grounding) → Image generation
                                          → Quality check
                                          → 4K upscaling (optional)

The magic is in the grounding configuration. We pass google_search_retrieval as a tool (the current Gemini API docs name this tool google_search for newer models), and Gemini decides when to search based on the prompt content.
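
As a rough illustration of passing a search tool to Gemini, the sketch below builds a `generateContent` request for the Gemini REST API. It is not the production code: the model ID is a placeholder, the helper name is hypothetical, and the tool key defaults to the name mentioned above, which may need to be `google_search` depending on the model.

```typescript
// Assumption: the tool key follows the name given in the text. Newer Gemini
// models may expect `google_search` instead; check the API docs for your model.
type SearchToolName = "google_search_retrieval" | "google_search";

interface GroundedRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildGroundedRequest(
  model: string, // placeholder; the post does not name the exact model ID
  apiKey: string,
  prompt: string,
  tool: SearchToolName = "google_search_retrieval",
): GroundedRequest {
  const body = {
    contents: [{ parts: [{ text: prompt }] }],
    // Empty tool config; any retrieval options are omitted in this sketch.
    tools: [{ [tool]: {} }],
  };
  return {
    url: `https://generativelanguage.googleapis.com/v1beta/models/${model}:generateContent`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "x-goog-api-key": apiKey,
      },
      body: JSON.stringify(body),
    },
  };
}
```

The result can be passed straight to `fetch(url, init)`, which is available in Workers without any Node-specific HTTP client.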

Image Generation Cost Structure

Google's Gemini image generation API has a free tier, but at our current volume we're in the paid tier. The per-image cost is low enough to maintain healthy margins with our pricing model, but it's not zero.

Our main costs: API usage (variable, scales with generations), hosting ($45/mo), database ($28/mo), email ($20/mo), plus Stripe's 3.4% transaction fee and video generation APIs for the ~1% of users who generate video.
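
To put the fixed part of those costs in perspective, here is a small back-of-the-envelope sketch. It uses only the figures quoted above; variable API costs are left out because no per-image figure is given, and any fixed per-transaction fee Stripe may charge is ignored. The function names are illustrative.

```typescript
// Fixed monthly costs taken from the paragraph above (USD).
const FIXED_COSTS_USD = { hosting: 45, database: 28, email: 20 };

// Percentage fee quoted in the text; any fixed per-transaction fee is ignored here.
const STRIPE_FEE_RATE = 0.034;

function monthlyFixedCostUsd(): number {
  return FIXED_COSTS_USD.hosting + FIXED_COSTS_USD.database + FIXED_COSTS_USD.email;
}

// Revenue left from one subscription payment after the percentage fee.
function netAfterStripeFee(priceUsd: number): number {
  return priceUsd * (1 - STRIPE_FEE_RATE);
}

// Number of subscriptions at a given price needed to cover fixed costs alone.
// Variable API costs are deliberately excluded from this estimate.
function subscriptionsToCoverFixedCosts(priceUsd: number): number {
  return Math.ceil(monthlyFixedCostUsd() / netAfterStripeFee(priceUsd));
}
```

Under these assumptions, the fixed costs add up to $93 per month, and four subscriptions at $29.90 are enough to cover them before API usage is counted.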

11-Language i18n

We support English, Chinese, Korean, Japanese, Portuguese, Russian, Indonesian, Arabic, German, French, and Spanish. This wasn't just a nice-to-have — our user base is spread across 23+ countries.

Implementation: next-intl with JSON message files per locale. The tricky part is SEO — each language gets its own URL prefix (/en/, /zh/, /ko/), which multiplies our sitemap from ~70 pages to 733.
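
The sitemap multiplication can be sketched as a simple cross product of locales and paths. This is an illustration rather than the site's actual generator: the locale codes beyond /en/, /zh/, and /ko/ are guesses based on the language list, and the base URL and function name are placeholders.

```typescript
// Locale codes are assumptions based on the eleven languages listed above;
// only /en/, /zh/ and /ko/ are confirmed by the text.
const LOCALES = ["en", "zh", "ko", "ja", "pt", "ru", "id", "ar", "de", "fr", "es"];

interface SitemapEntry {
  url: string;
  alternates: Record<string, string>; // locale -> URL, usable for hreflang links
}

// Expands each path into one entry per locale. Paths are expected to be ""
// for the home page or to start with "/" (for example "/pricing").
function buildSitemapEntries(
  baseUrl: string,
  paths: string[],
  locales: string[] = LOCALES,
): SitemapEntry[] {
  const root = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  const entries: SitemapEntry[] = [];
  paths.forEach((path) => {
    // Every localized version of a page shares the same alternates map.
    const alternates: Record<string, string> = {};
    locales.forEach((locale) => {
      alternates[locale] = `${root}/${locale}${path}`;
    });
    locales.forEach((locale) => {
      entries.push({ url: alternates[locale], alternates });
    });
  });
  return entries;
}
```

The entry count is always paths × locales, which is why a modest page list grows into several hundred URLs. In a Next.js App Router project, a list like this could feed `app/sitemap.ts`.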

Edit Mode: The Killer Feature

80% of our generations use Edit mode — upload a photo, describe what you want changed, and the AI applies it. No Photoshop skills needed.

Use cases we've seen:

E-commerce: Product photos in different backgrounds (our top paying customer is a Singapore bedding company)
Architecture: Visualizing buildings in different seasons/weather
Academic: Beautifying charts and figures for papers
Content creation: Quick social media assets

We didn't plan to be an editing tool. But users told us what they wanted by how they used the product.

What I'd Do Differently

Validate payment flow on every deploy: One bad deploy broke Stripe for 24 hours. We now have a webhook reconciliation cron that runs every 2 hours.
Email domain verification from day 1: Our welcome emails silently failed for 4 months because we hadn't verified the sender domain in Resend.
Be careful with ORMs in edge runtimes: in our setup, Drizzle ORM's async writes silently failed on Cloudflare Workers. We switched to direct REST API calls.
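
The webhook reconciliation idea from the first point boils down to a diff between what the payment provider says was paid and what the database recorded. The sketch below covers only that diffing step; fetching the two lists is left out, and every type and function name here is hypothetical.

```typescript
// Hypothetical shape of a paid event as reported by the payment provider.
interface PaidEvent {
  id: string; // provider-side payment or checkout ID
  userId: string;
  amountCents: number;
}

// Returns the paid events that the database never recorded, i.e. payments
// whose webhook was lost or failed (for example during a bad deploy).
function findUnreconciled(paid: PaidEvent[], recordedIds: string[]): PaidEvent[] {
  const seen: Record<string, boolean> = {};
  recordedIds.forEach((id) => {
    seen[id] = true;
  });
  // hasOwnProperty avoids false positives from inherited keys like "constructor".
  return paid.filter(
    (event) => !Object.prototype.hasOwnProperty.call(seen, event.id),
  );
}
```

Each event returned would then be replayed through the same code path a webhook normally triggers. On Cloudflare Workers, a check like this could run from a Cron Trigger with a schedule such as `0 */2 * * *` to match the 2-hour interval.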

Try It

Nano Banana 2 — 5 free credits, no login required. Generate your first AI image in 10 seconds.

The product is built by a solo developer. If you're interested in the technical details or have questions about building on Cloudflare Workers, I'm happy to chat in the comments.

If you found this useful, consider sharing it — it helps a solo developer keep building. 🙏
