Guillaume Vernade for Google AI

Posted on Mar 5 • Edited on Mar 20

Getting the most out of Nano-Banana 2: tips & prompt guide

#ai #gemini #nanobanana #promptengineering

Following our previous Developer Guide and Prompting Guide, this post covers the brand new capabilities of Nano-Banana 2 (a.k.a. "Gemini 3.1 Flash Image"), when you should (and shouldn't) use it, and how to prompt its newest features effectively.

Here's what you'll find in this article:

The Model Matrix: Nano-Banana 1 vs. 2 vs. Pro
The Game Changer: Visual Grounding with Google Search
New Parameters: Extreme Ratios & 512px Resolutions
Controlling "Thinking" Mode
Prompt Examples
What about apps?

1. The Model Matrix: Nano-Banana 1 vs. 2 vs. Pro

With three distinct models now in the Nano-Banana lineup, choosing the right engine for your specific workflow is crucial. Here is how the new Nano-Banana 2 fits into the ecosystem.

Nano-Banana 1 vs. Nano-Banana 2

Don't count Nano-Banana 1 out just yet. If you have an existing application or workflow that uses Nano-Banana 1 and it is handling your use cases perfectly, stick with it! There is no forced migration (yet...).

Nano-Banana 1 remains the absolute cheapest option and is still faster than Nano-Banana 2 since it's not a thinking model. However, for any new pipeline that requires more nuance, better prompt adherence, or the new Image Grounding features, Nano-Banana 2 is absolutely worth the slight bump in price.

You'll also avoid a future migration from NB1 to NB2, so for new work, start testing your prompts on the new model rather than the old one.

Pro-tip: Generate 512px images with NB2 to keep costs roughly on par with NB1.

Nano-Banana Pro vs. Nano-Banana 2

The biggest question for developers and creators right now is: Why use Pro if 2 is so good?

Think of Nano-Banana 2 (Gemini 3.1 Flash Image) as offering roughly 95% of Pro's capabilities at a fraction of the cost. For almost all new projects, Nano-Banana 2 should be your default. It handles text rendering, complex styles, and the new visual grounding exceptionally well.

You should only step up to Nano-Banana Pro when you hit a wall. If Nano-Banana 2 consistently fails a highly complex, multi-layered prompt, or struggles with extreme logical constraints, Pro remains the ultimate heavy lifter.

(Note: If you find specific edge cases where Pro consistently beats Nano-Banana 2, please drop them in the comments! We need to know what to improve.)

The Summary Matrix

Here is a quick reference guide to help you route your API calls:
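The routing advice from this section can also be written down as a small function. This is only a sketch derived from the prose above: the `Workload` fields and the returned labels are hypothetical names made up for illustration, not API model IDs, and the rules are a simplification.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of a pipeline, used only for this sketch."""
    existing_nb1_pipeline_works: bool = False  # NB1 already handles the use case
    needs_image_grounding: bool = False        # needs the new Image Grounding feature
    nb2_keeps_failing: bool = False            # NB2 consistently fails the prompt


def pick_model(w: Workload) -> str:
    """Return a placeholder label (not an API model ID) following the advice above."""
    if w.nb2_keeps_failing:
        # Step up to Pro only once NB2 has hit a wall.
        return "nano-banana-pro"
    if w.existing_nb1_pipeline_works and not w.needs_image_grounding:
        # No forced migration: keep what already works.
        return "nano-banana-1"
    # Default for new projects.
    return "nano-banana-2"
```

For example, `pick_model(Workload())` returns `"nano-banana-2"`, which matches the "default for new projects" recommendation.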

2. The Game Changer: Visual Grounding with Google Search

While Nano-Banana Pro introduced the ability to search the web for textual information, Nano-Banana 2 takes a massive leap forward: Image Grounding.

The model can now search the internet for images to learn what a real-world subject looks like before generating it. This is especially powerful when you need to depict particular locations, monuments, or biological species just as they appear in reality.

Best Practices:

Locations: Ask for specific churches, bridges, city squares, or niche buildings.
Nature: Ask for exact animal species, breeds, or insects.
Limitation to keep in mind: The model cannot search for people.

Example Prompts:

Specific Location Grounding:
"Generate a cinematic, golden-hour photograph of the main historical church in Voiron, France. Ensure the architectural details, the spire, the surrounding square, and the landscape (mountains) are accurate to reality." (swap in your own hometown)

Specific Species Grounding:
"Create a realistic picture of a machaon butterfly and a flambé one, and highlight their differences to show how to differentiate them."

If you want to know how to use this new image grounding tool in code, check the documentation or this Python colab from our cookbook.
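As a rough idea of what a grounded request might look like, here is a minimal sketch that only builds a JSON-style request body and does not call any API. It assumes grounding is enabled through the Gemini API's `google_search` tool entry and that image output is requested via `responseModalities`; whether image grounding needs additional fields is not covered in this post, so treat the shape as an assumption and check the documentation. `build_grounded_request` is a hypothetical helper name.

```python
import json


def build_grounded_request(prompt: str) -> dict:
    """Build a JSON-style body for an image request with Search grounding (sketch)."""
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        # Assumption: grounding is switched on with the `google_search` tool;
        # image grounding may require extra fields not shown here.
        "tools": [{"google_search": {}}],
        # Ask for an image (plus optional text) in the response.
        "generationConfig": {"responseModalities": ["TEXT", "IMAGE"]},
    }


if __name__ == "__main__":
    body = build_grounded_request(
        "Generate a cinematic, golden-hour photograph of the main "
        "historical church in Voiron, France."
    )
    print(json.dumps(body, indent=2))
```

Keeping the payload construction in one function makes it easy to compare grounded and ungrounded runs of the same prompt by dropping the `tools` entry.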

3. New Parameters: Extreme Ratios & 512px Resolutions

Nano-Banana 2 introduces several new parameters that give developers and creators tighter control over output formats and cost optimization.

The 512px Batch-to-Upscale Workflow

Nano-Banana 2 introduces the ability to generate images at 512-pixel resolutions. With these new resolutions, the generation is slightly faster and the cost is driven down to roughly the same price as Nano-Banana 1.

Pro-tip: If you are a developer looking to optimize your costs while maintaining high-end output, here is the golden workflow:

Use the Batch API (which gives a 50% discount) to generate dozens of variations of your prompt at 512px.

Review the grid and select the absolute best composition.

Ask Nano-Banana 2 to upscale that specific image to 1K, 2K, or 4K.
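To see why this workflow saves money, here is a back-of-the-envelope sketch. The only figure taken from this post is the 50% Batch API discount; the prices in the usage example are made-up units rather than real pricing, and the function names are hypothetical. It also assumes the final upscale is a single non-batch call.

```python
def batch_then_upscale_cost(n_drafts: int, price_512: float,
                            price_final: float,
                            batch_discount: float = 0.5) -> float:
    """Cost of n_drafts 512px batch images plus one final-resolution upscale."""
    drafts = n_drafts * price_512 * (1 - batch_discount)
    return drafts + price_final


def direct_cost(n_drafts: int, price_final: float) -> float:
    """Cost of generating every variation directly at the final resolution."""
    return n_drafts * price_final


if __name__ == "__main__":
    # Placeholder units: pretend a 512px image costs 1.0 and a 4K image 4.0.
    print(batch_then_upscale_cost(24, price_512=1.0, price_final=4.0))
    print(direct_cost(24, price_final=4.0))
```

With these placeholder numbers, 24 drafts plus one upscale cost 16.0 units, versus 96.0 units for 24 direct high-resolution generations. The actual ratio depends on real prices.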

Extreme Aspect Ratios (1:8 & 1:4)

Nano-Banana 2 also introduces extreme new aspect ratios, 1:8 and 1:4, available in both vertical and horizontal formats (so 8:1 and 4:1 as well). These are perfect for web banners, continuous scrolling assets, and comic book (BD) layouts.

Example Prompt:

Horizontal Comic Strip:
"Create a 4-panel horizontal comic strip (aspect ratio 4:1). The story follows a mischievous cat trying to steal a fish from a kitchen counter that ends with a twist. Use a vibrant, Franco-Belgian comic book style. Keep the cat's design consistent across all panels."
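When planning a strip like this, it helps to know what shape each panel ends up with. The sketch below is plain arithmetic on the ratios named in this section; `parse_ratio` and `panel_ratio` are hypothetical helper names, and it assumes equally sized panels with no gutters.

```python
from fractions import Fraction

# The extreme ratios mentioned above, in both orientations (width:height).
EXTREME_RATIOS = {"1:8", "8:1", "1:4", "4:1"}


def parse_ratio(ratio: str) -> Fraction:
    """Turn a 'W:H' string into a width/height fraction."""
    width, height = ratio.split(":")
    return Fraction(int(width), int(height))


def panel_ratio(canvas_ratio: str, panels: int, horizontal: bool = True) -> Fraction:
    """Width/height of one panel when the canvas is split into equal panels."""
    canvas = parse_ratio(canvas_ratio)
    # Side-by-side panels divide the width; stacked panels divide the height.
    return canvas / panels if horizontal else canvas * panels
```

For the prompt above, `panel_ratio("4:1", 4)` is `Fraction(1, 1)`, so each of the four panels is square, while an 8:1 canvas with four panels gives 2:1 panels.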

4. Controlling "Thinking" Mode

Like its predecessor, Nano-Banana 2 has a "Thinking" mode where it reasons about the prompt before generating. However, you can now toggle this feature ON or OFF.

My Recommendation: Keep it OFF by default.
For standard image generation, turning it off saves time and processing. You should only turn Thinking ON if:

The model is generating nonsensical results and needs help reasoning through the prompt.
You are generating highly complex infographics.
You are combining complex Image Grounding with spatial reasoning.

(Again, if you find amazing use-cases where turning "Thinking" ON completely changes the game, let me know in the comments!)
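If you route many requests programmatically, the recommendation above can be captured in a tiny helper. This is a sketch of the decision only: the function and argument names are hypothetical, and it does not show how the toggle is actually set in the API.

```python
def should_enable_thinking(*, nonsensical_results: bool = False,
                           complex_infographic: bool = False,
                           grounding_with_spatial_reasoning: bool = False) -> bool:
    """Default to OFF; turn Thinking ON only for the three cases listed above."""
    return (nonsensical_results
            or complex_infographic
            or grounding_with_spatial_reasoning)
```

A typical pattern would be to call it with all defaults first (Thinking OFF) and retry with `nonsensical_results=True` only when the first result is unusable.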

5. Prompt Examples

A Nano-Banana guide without some prompt examples would be like a meal without cheese, so here are my favorite ones at the moment:

Cartoon Portraits: Transform personal photos into stylized, high-fidelity 3D characters interacting with their real-life selves.

Prompt: Based strictly on the uploaded reference image, create a photorealistic scene featuring the real human standing next to a giant 3D animation-style version of themselves. Both must have identical facial structures, clothing, and poses. The real person is smiling naturally with their hand on the 3D character's shoulder. The 3D version is proportionally larger, anatomically identical but stylized, with expressive eyes and a playful smirk. Clean gray-blue studio background, cinematic lighting, crisp textures. (Note: Requires uploading an image).

Animation to Image: Upload animated stills and have the model reinterpret them as hyper-realistic, photographic images.

Prompt: Convert this uploaded animated still into an ultra-realistic, cinematic, and fully photorealistic scene. Transform the animated characters into real humans while perfectly preserving their original identities, facial structures, outfits, expressions, and overall likeness. (Note: Requires uploading an image).

(original image from https://archive.org/details/mobile_suit_gundam_coloring_book)

History on Maps: Generate hyper-realistic, Maps-style street view imagery that "reimagines" historical events (like the 800 AD crowning of Charlemagne) as if captured by modern 360-degree cameras.

Prompt: Generate a hyper-realistic image of the crowning of Charlemagne on December 25, 800 AD, perfectly replicating a Google Maps Street View capture. Show Pope Leo III placing the imperial crown on a kneeling Charlemagne inside Old St. Peter's Basilica. Include a 123-degree wide-angle barrel distortion, a semi-transparent Google Maps UI overlay (navigation compass, 2D map thumbnail, white directional chevron arrows floating over the stone floor), and a '© Google 800' watermark. Automatically blur the faces of Charlemagne, the Pope, and surrounding medieval nobles for privacy. Use warm, dim torchlight and candlelight filtering through the basilica, dramatic shadows, and high-ISO digital noise typical of a 360-degree camera struggling in a low-light interior.

Kindergarten Filter: Celebrate human imperfection and childhood nostalgia by generating intentionally messy, waxy crayon doodles on lined paper.

Prompt: A child's crayon drawing on white lined notebook paper of maple taffy on snow. Use chunky wax-crayon strokes, wobbly outlines, and bright bold colors that messily overflow the lines. Include visible heavy pressure marks, waxy smudges, and uneven scribble shading. Draw important elements disproportionately large with simple flat shapes, round friendly faces, dot eyes, and big curved smiles. Add a classic large yellow sun in the corner, puffy clouds, and zero realistic perspective. Joyful, naive art style.
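To reuse a prompt like this one across many subjects, you can turn it into a template. The sketch below copies the Kindergarten Filter prompt verbatim and makes only the subject a parameter; `build_kindergarten_prompt` is a hypothetical helper name, not part of any SDK.

```python
KINDERGARTEN_TEMPLATE = (
    "A child's crayon drawing on white lined notebook paper of {subject}. "
    "Use chunky wax-crayon strokes, wobbly outlines, and bright bold colors "
    "that messily overflow the lines. Include visible heavy pressure marks, "
    "waxy smudges, and uneven scribble shading. Draw important elements "
    "disproportionately large with simple flat shapes, round friendly faces, "
    "dot eyes, and big curved smiles. Add a classic large yellow sun in the "
    "corner, puffy clouds, and zero realistic perspective. "
    "Joyful, naive art style."
)


def build_kindergarten_prompt(subject: str) -> str:
    """Fill the Kindergarten Filter prompt with a new subject."""
    subject = subject.strip()
    if not subject:
        raise ValueError("subject must not be empty")
    return KINDERGARTEN_TEMPLATE.format(subject=subject)
```

Calling `build_kindergarten_prompt("maple taffy on snow")` reproduces the prompt above; any other subject keeps the style instructions unchanged.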

6. What about apps?

Now that you know the new capabilities of Nano-Banana 2, it's time to build!

Here are a few cool apps that you can use as a starting point:

Window seat: Generate photorealistic window views based on live weather and specific locations.
Pet passport adventure: Send your pet on a global adventure using Nano-Banana.
Global Kit Generator: Developer tool for scaling localized marketing assets.

Please share your best apps in the comments; it's always great to see how creative everybody is!

Top comments (5)

milam • Mar 9 • Edited

Thinking mode exists but the author recommends keeping it off unless you're doing something really complex. For most new projects, NB2 is the obvious default; Pro is only worth it if NB2 genuinely can't handle your use case.

Guillaume Vernade Google AI • Mar 9

Yes, unless you absolutely need thinking, I think it's safe to start without it first.
