<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Guillaume Vernade</title>
    <description>The latest articles on DEV Community by Guillaume Vernade (@giom_v).</description>
    <link>https://dev.to/giom_v</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3623117%2Ff8bd55b3-1b3d-4c3f-a763-db2cab3cc02d.jpeg</url>
      <title>DEV Community: Guillaume Vernade</title>
      <link>https://dev.to/giom_v</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/giom_v"/>
    <language>en</language>
    <item>
      <title>Vibe-coding in Google AI Studio: my tips to prompt better and create amazing apps</title>
      <dc:creator>Guillaume Vernade</dc:creator>
      <pubDate>Thu, 19 Mar 2026 17:48:19 +0000</pubDate>
      <link>https://dev.to/googleai/vibe-coding-in-google-ai-studio-my-tips-to-prompt-better-and-create-amazing-apps-3kcp</link>
      <guid>https://dev.to/googleai/vibe-coding-in-google-ai-studio-my-tips-to-prompt-better-and-create-amazing-apps-3kcp</guid>
      <description>&lt;p&gt;You might already know &lt;a href="https://ai.studio" rel="noopener noreferrer"&gt;&lt;strong&gt;Google AI Studio&lt;/strong&gt;&lt;/a&gt; as a sandbox to play with the Deepmind models and tinker with all their parameters. But did you know that you can also vibe-code webapps for free and publish them in a few clicks?&lt;/p&gt;

&lt;p&gt;Its &lt;a href="https://ai.studio/build" rel="noopener noreferrer"&gt;&lt;strong&gt;Build&lt;/strong&gt;&lt;/a&gt; section is a game-changer for "vibe coding" and generating functional applications without writing a single line of code. It allows you to rapidly build and iterate on ideas using the power of Gemini models, moving from simple concepts to fully deployed prototypes in minutes.&lt;/p&gt;

&lt;p&gt;Following my own experiments with the platform over the last year, this guide covers the core capabilities of AI Studio, how it compares to other tools, and how to prompt it effectively to build your apps.&lt;/p&gt;




&lt;p&gt;Here's what you'll find in this article:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;0. &lt;em&gt;Why use AI Studio?&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;1. &lt;em&gt;The App Gallery &amp;amp; Remixing&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;2. &lt;em&gt;Get started with Vibe Coding&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;3. &lt;em&gt;Create apps with databases&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;4. &lt;em&gt;My tips to better Vibe Code&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;5. &lt;em&gt;Publish your app&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;6. &lt;em&gt;AI Studio vs. Antigravity: When to use which?&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;7. &lt;em&gt;My favorite creations&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;h1&gt;
  
  
  0. Why use AI Studio? (Native Gemini and Privacy)
&lt;/h1&gt;

&lt;p&gt;Before diving into the "how," let's address the most common question: &lt;em&gt;Why use AI Studio over other popular AI app builders on the market?&lt;/em&gt; &lt;/p&gt;

&lt;p&gt;The first reason is AI Studio's &lt;strong&gt;native Gemini&lt;/strong&gt; integration. It creates apps that use the Gemini models with nothing to set up (as long as you stay in AI Studio), so both you and the folks you share your app with can use the free tier and enjoy Gemini-powered apps at no cost.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Note&lt;/em&gt;: Some advanced models require a paid API key, but there's always an alternative with a free tier.&lt;/p&gt;
&lt;/blockquote&gt;
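&lt;p&gt;To make "native Gemini" concrete, here's a minimal sketch of what a generated app's Gemini call roughly looks like. The &lt;code&gt;buildRequest&lt;/code&gt; helper and the model name are my own illustrative assumptions, not AI Studio output; the commented lines show the &lt;code&gt;@google/genai&lt;/code&gt; SDK call shape.&lt;/p&gt;

```typescript
// Hypothetical sketch of the request shape a generated app sends to Gemini.
// buildRequest is an illustrative helper (not an AI Studio API); the model
// name is an assumption -- any free-tier model works.
function buildRequest(prompt: string): { model: string; contents: string } {
  return {
    model: "gemini-2.5-flash", // assumed free-tier model name
    contents: prompt,
  };
}

// Inside an AI Studio app, the actual call with the @google/genai SDK looks
// roughly like this (the key is injected for you as process.env.API_KEY):
//
//   const ai = new GoogleGenAI({ apiKey: process.env.API_KEY });
//   const res = await ai.models.generateContent(buildRequest("Hello Gemini"));
//   console.log(res.text);

const request = buildRequest("Suggest three dinner ideas");
```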

&lt;p&gt;But the main differentiator is &lt;strong&gt;Privacy&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;On the free tiers of many competing platforms, all the applications you generate are public by default: anyone can see what you are working on. On AI Studio, your apps remain &lt;strong&gt;strictly private&lt;/strong&gt;. This is a huge advantage when you are prototyping personal ideas, working on sensitive client projects, or just want to experiment freely without worrying about public visibility. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvrn59agd19568fe9mwip.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvrn59agd19568fe9mwip.png" alt="Sharing app" width="800" height="602"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Sharing works like it does for any Google Drive file, which makes distributing your apps easy and lets people try them without having to create a new account.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Pro tip:&lt;/em&gt; As with any Drive file, you can set your apps to be accessible to whoever has the link. That's what I do when I post on LinkedIn (cf. the last section of this post for examples).&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In any case, even if you don't use AI Studio, my tips should still be relevant, as most vibe-coding agents work similarly.&lt;/p&gt;




&lt;h1&gt;
  
  
  1. The App Gallery &amp;amp; Remixing
&lt;/h1&gt;

&lt;p&gt;If you are new to vibe coding, the best way to understand how the code is generated is to explore the &lt;a href="https://aistudio.google.com/apps?source=showcase&amp;amp;showcaseTag=featured" rel="noopener noreferrer"&gt;App Gallery&lt;/a&gt; directly within AI Studio. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flbb2o4lbv6rq4fs097pv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flbb2o4lbv6rq4fs097pv.png" alt="App Gallery" width="800" height="552"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best Practices:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Explore:&lt;/strong&gt; Check out the impressive examples already built by the AI Studio team. Two of my personal favorites are &lt;a href="https://aistudio.google.com/apps/bundled/spatial-understanding" rel="noopener noreferrer"&gt;Spatial understanding&lt;/a&gt; and the &lt;a href="https://aistudio.google.com/apps/bundled/personalized_comics" rel="noopener noreferrer"&gt;Comic Book Creator&lt;/a&gt; (which needs a paid API key to use Nano-Banana Pro, but you can try remixing it to only use Nano-Banana's free tier). &lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Check the code:&lt;/strong&gt; For each app, you can click "code" in the top left corner to access all of the app's code and check how things are done (or more likely copy-paste it to an AI coding agent).
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb7ybxly8je982mgfw9gv.png" alt="Code" width="742" height="468"&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Remix:&lt;/strong&gt; When you like an app and just want to create your own flavor of it, click "remix" to create a copy of it that you'll own. It's an excellent way to start from an existing, working codebase and make it your own.
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fprmyr91zr1i76do2mnfe.png" alt="Remix" width="515" height="78"&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;h1&gt;
  
  
  2. Get started with Vibe Coding
&lt;/h1&gt;

&lt;p&gt;Ready to build your own? The principle is incredibly straightforward: open the &lt;a href="https://aistudio.google.com/apps" rel="noopener noreferrer"&gt;build&lt;/a&gt; page, write what you want the app to do in a prompt, hit enter, and watch the coding agent (similar to the &lt;a href="https://antigravity.google/" rel="noopener noreferrer"&gt;Antigravity&lt;/a&gt; one) generate the UI and logic.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fae8r3q0qes69e7iab79i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fae8r3q0qes69e7iab79i.png" alt=" " width="800" height="203"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Note:&lt;/em&gt; the generation can take quite some time (about 5 mins on average), so go get a coffee or read a blog post and come back after the coding agent has finished its job.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;When you get a working app, you can start adding new features by continuing to prompt in the code assistant chatbox on the left.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;New:&lt;/em&gt; One of the cool new additions in the past weeks is that the code assistant now works server-side, which means you can close the tab or change devices and it will continue to work for you. &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqwgqzs5qvx083phjt3w2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqwgqzs5qvx083phjt3w2.png" alt="Vibe code with your voice" width="800" height="430"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Depending on the case, you can also use these two buttons to give the model visual clues by drawing directly on the app, which is very convenient for UI feedback.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj5x7f1hjwomvfxv4f8sn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj5x7f1hjwomvfxv4f8sn.png" alt="Visual feedback" width="800" height="568"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Another option is to dictate what changes you need. It's very convenient when you want to add a new feature on-the-fly while on your phone, but I would not recommend it for very precise updates. &lt;/p&gt;




&lt;h1&gt;
  
  
  3. Create apps with databases
&lt;/h1&gt;

&lt;p&gt;As of &lt;a href="https://blog.google/innovation-and-ai/technology/developers-tools/full-stack-vibe-coding-google-ai-studio/" rel="noopener noreferrer"&gt;this week&lt;/a&gt;, you can also ask the coding agent to create apps that save things between sessions or users. You just need to ask it specifically to use a database:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxq8gre9qxxhmwyizkuzm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxq8gre9qxxhmwyizkuzm.png" alt="Create an app with a database" width="512" height="206"&gt;&lt;/a&gt;&lt;br&gt;
(&lt;em&gt;yes, I've been wanting to create my own grocery list app for a very long time&lt;/em&gt;)&lt;/p&gt;

&lt;p&gt;Just click "Enable" when asked and the magic will happen.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmpc0cur6h9gt2t4xqw3s.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmpc0cur6h9gt2t4xqw3s.png" alt=" " width="800" height="352"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Behind the scenes, it sets up the Firebase integration and a Firestore database to store your data. It also adds Google-account authentication to your app so it knows who's trying to access which data.&lt;/p&gt;

&lt;p&gt;You don't need to know how your database is structured; the coding agent will manage everything for you depending on what your app needs. You want each user to have their own grocery list? Boom, it's done! You now want them to be able to share lists? That's also done! Add labels to the items? Easy peasy.&lt;/p&gt;

&lt;p&gt;Your imagination is the limit! &lt;/p&gt;
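&lt;p&gt;If you're curious what "the agent manages everything" can look like, here's a hedged sketch. The collection layout and helpers are assumptions for illustration, not what AI Studio actually emits; the commented lines use the Firebase modular web API.&lt;/p&gt;

```typescript
// What the agent might wire up behind the scenes (the collection layout and
// helpers below are assumptions for illustration, not generated output):
//
//   import { getFirestore, collection, addDoc } from "firebase/firestore";
//   const db = getFirestore();
//   await addDoc(collection(db, "users", uid, "groceryItems"), newItem("Milk"));

// A plain item shape a per-user grocery list could store:
interface GroceryItem {
  name: string;
  quantity: number;
  labels: string[];
  done: boolean;
}

function newItem(name: string, quantity: number = 1): GroceryItem {
  return { name, quantity, labels: [], done: false };
}

const milk = newItem("Milk");
```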


&lt;h1&gt;
  
  
  4. My tips to better Vibe Code
&lt;/h1&gt;

&lt;p&gt;Nowadays, "vibe coding" has become a reflex for me. It is the absolute best way to prototype a user experience before potentially moving to a complex IDE. But if you're not careful, you can easily lose a lot of time trying to make the agent work efficiently.&lt;/p&gt;

&lt;p&gt;So here are my top tricks to get the most out of AI Studio (in no particular order). &lt;/p&gt;
&lt;h3&gt;
  
  
  Design your app before building it
&lt;/h3&gt;

&lt;p&gt;If you have opinions about what your app should look like (personally I usually don't, yolo), a good idea is to iterate on designs for it using something like &lt;a href="https://stitch.withgoogle.com/" rel="noopener noreferrer"&gt;Stitch&lt;/a&gt; (which uses Nano-Banana) and give the images to the coding agent so it knows what's expected.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fugatcj3g3fkafegw6ws3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fugatcj3g3fkafegw6ws3.png" alt="Stitch" width="800" height="466"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Save your progress so you can revert (and learn when to do it)
&lt;/h3&gt;

&lt;p&gt;AI makes mistakes. It might misunderstand your prompt or write code that breaks a previously working app. When this happens, you can ask it to "fix the error" and most of the time it works, but sometimes it doesn't.&lt;/p&gt;

&lt;p&gt;One very important skill to learn when vibe coding is when to try to fix things using AI, when to start anew, and when to go fix things yourself. &lt;/p&gt;

&lt;p&gt;My personal advice: if the agent can't figure out how to fix something after 2 rounds, stop insisting and go back to a previous version; otherwise you might end up spending an hour arguing with the AI for nothing. And when you think you're spending as much time explaining what you want as it would take to do it yourself (a good example is "change this time for another"), just do it yourself.&lt;/p&gt;

&lt;p&gt;Thankfully AI Studio makes it easy for you to go back to a previous version:&lt;/p&gt;
&lt;h4&gt;
  
  
  Checkpoints
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Checkpoints&lt;/strong&gt; are the built-in version history to instantly revert to the last working state. They are the most convenient way to go back to a previous working version.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flyajpt8701dzzkcw1iro.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flyajpt8701dzzkcw1iro.png" alt="Checkpoint" width="766" height="140"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Warning:&lt;/strong&gt; Be careful of one thing: you can revert the code, but not the database changes, so don't load a checkpoint from before a database update. (What I would do instead: load the checkpoint, copy the code, reload the more recent/broken version, and ask the assistant to fix it based on how it was before.)&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h4&gt;
  
  
  GitHub
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;GitHub&lt;/strong&gt; is what I would recommend for saving milestone versions: use it to save the state of your app when you reach a certain milestone, like finishing a new feature. You can enable it in a few clicks:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo6sh13kq469yi4bxebnl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo6sh13kq469yi4bxebnl.png" alt="Open settings" width="592" height="174"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk5kwfewicnl9i8kl8pq1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk5kwfewicnl9i8kl8pq1.png" alt="Sign in to GitHub" width="800" height="1082"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9luwpefa3mjf5lp6sep7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9luwpefa3mjf5lp6sep7.png" alt="Create Repo" width="800" height="675"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And then it will just be about describing your new feature and committing it to GitHub.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyw3apclih27rrohygzyv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyw3apclih27rrohygzyv.png" alt="Commit to GitHub" width="800" height="647"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;One current limitation is that the sync is one-way: it's a good way to save your work somewhere you can easily reuse it, but you can't update your code on GitHub and sync it back to AI Studio (yet).&lt;/p&gt;
&lt;h3&gt;
  
  
  Use Multi-Modal Prompting
&lt;/h3&gt;

&lt;p&gt;Stop relying purely on text. As I said before, AI Studio gives you other options:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Voice:&lt;/strong&gt; Incredibly practical for iterating quickly, especially if you are tweaking an app from your phone.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;The "Annotate App" Tool:&lt;/strong&gt; This is my absolute favorite feature for UI work. Take a screenshot of your app, draw directly on it ("Move this button here", "Remove this menu"), and send it. &lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Pro Tip:&lt;/em&gt; Always combine the annotated image with a clear text explanation to give the model maximum context.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;
  
  
  Split Your Files! (Avoid the Monolith)
&lt;/h3&gt;

&lt;p&gt;As your app grows, the model might start to "hallucinate", forget earlier features, or tangle the logic. This is almost always a structural issue.&lt;/p&gt;

&lt;p&gt;By default, the AI tends to cram everything into one massive &lt;code&gt;app.tsx&lt;/code&gt; file. Veto this immediately.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Golden Rule:&lt;/strong&gt; Tell the model from the very beginning to separate features into distinct files and components. &lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Why?&lt;/strong&gt; It drastically reduces errors and makes generation faster. It also lets you instantly spot when the AI is messing up (e.g., if you ask for a UI color change and it starts rewriting &lt;code&gt;auth-service.js&lt;/code&gt;, you know it lost the plot and you can stop it immediately). It will save you a lot of time when reviewing and at least gives you &lt;em&gt;at a glance&lt;/em&gt; confidence that the right part of the codebase was updated.&lt;/li&gt;
&lt;/ul&gt;
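&lt;p&gt;In practice, I bake that golden rule into my very first prompt. Here's a hypothetical example (the file names are just illustrations, not anything AI Studio requires):&lt;/p&gt;

```typescript
// A hypothetical opening prompt that enforces a per-feature file split from
// the start (file names are examples, not a requirement of AI Studio):
const structurePrompt = `
Build a grocery-list app. Structure requirements:
- one file per feature (e.g. list.tsx, labels.tsx, sharing.tsx);
- shared types in types.ts, configurable values in config.ts;
- App.tsx only wires the features together, no business logic.
`;
```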
&lt;h3&gt;
  
  
  Force the AI to Write Documentation
&lt;/h3&gt;

&lt;p&gt;To help the AI remember what the app is meant to do, have it maintain as much documentation as possible (from micro to macro):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Docstrings:&lt;/strong&gt; Always force the AI to document all functions: what they do, and what their inputs and outputs are. &lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;File documentation:&lt;/strong&gt; Since you're creating a file per feature, tell the AI to maintain some documentation at the top of them to detail what the feature is about, what use cases should be covered, etc.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Design.md:&lt;/strong&gt; Finally, ask it to maintain a design doc of the whole app at the root of it.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Why?&lt;/strong&gt; By having the AI repeat everything multiple times, you help both it (and potentially yourself) find where everything is done and what the expected behavior is. A bit like &lt;a href="https://en.wikipedia.org/wiki/Error_correction_code" rel="noopener noreferrer"&gt;error correction codes&lt;/a&gt;, having something written multiple times reduces the chances that it will be deleted by mistake.&lt;/li&gt;
&lt;/ul&gt;
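&lt;p&gt;To show the docstring style I ask for, here's a made-up example of a documented function (the function itself is illustrative, not generated code):&lt;/p&gt;

```typescript
/**
 * toggleItemDone -- flips the "done" state of one grocery item.
 *
 * Illustrates the docstring style the guidelines ask the agent to maintain:
 * purpose, inputs, and outputs documented for every function.
 *
 * @param items - the current list of items
 * @param name  - the name of the item to toggle
 * @returns a new array with the matching item's `done` flag flipped
 */
function toggleItemDone(
  items: { name: string; done: boolean }[],
  name: string
): { name: string; done: boolean }[] {
  return items.map((it) => (it.name === name ? { ...it, done: !it.done } : it));
}

const toggled = toggleItemDone([{ name: "Milk", done: false }], "Milk");
```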
&lt;h3&gt;
  
  
  Supercharge with System Instructions
&lt;/h3&gt;

&lt;p&gt;After some time you'll realize that you're always giving the same instructions to the coding agent and will get tired of repeating yourself. That's why AI Studio allows you to customize the underlying "System Instructions." Don't leave this blank! You can define your preferred tech stack, frameworks, coding style, and of course everything I mentioned before!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3g4fsuqp48ize2kxs9zo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3g4fsuqp48ize2kxs9zo.png" alt="Open settings" width="783" height="325"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fahf314jvprwwq926cj3x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fahf314jvprwwq926cj3x.png" alt="Set System Instructions" width="800" height="460"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Think of it as the onboarding package for your new junior developer: they need to know how you expect them to work, how to code, document, communicate, etc. You might not get it right the first time, but it's important to reflect on it and keep improving your package so that the next newcomers will be better onboarded and thus get productive faster.&lt;/p&gt;

&lt;p&gt;Here are the ones I'm always using, on top of more specialized instructions (like trusting me on model names and not changing them):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Coding/documenting guidelines&lt;/span&gt;
&lt;span class="p"&gt;
*&lt;/span&gt; Create a file per feature or related features, split as much as possible in different files;
&lt;span class="p"&gt;*&lt;/span&gt; add docstrings to all functions to explain what they do;
&lt;span class="p"&gt;*&lt;/span&gt; start each file with a long comment explaining in detail what the feature is about and the different use cases;
&lt;span class="p"&gt;*&lt;/span&gt; maintain a &lt;span class="sb"&gt;`Design.md`&lt;/span&gt; document at the root of the app that documents all the features of the app;
&lt;span class="p"&gt;*&lt;/span&gt; log as info all function calls (with their parameters) and log all genai calls with all their parameters (model used, prompt, config) and their outputs, just strip inline data;
&lt;span class="p"&gt;*&lt;/span&gt; group all configurable items (like model names) in a centralized file;
&lt;span class="p"&gt;*&lt;/span&gt; always create a way to test the scripts without altering the data;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You'll see that I also added some instructions about logging (as it always helps debugging) and dry runs, as these are both good practices, vibe coding or not.&lt;/p&gt;
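&lt;p&gt;As an illustration of that logging guideline, here's a tiny sketch of what the agent could generate; both helper names are hypothetical, not part of any SDK:&lt;/p&gt;

```typescript
// Minimal sketch of the logging guideline: log every genai call with its
// model, prompt, and config, stripping inline (base64) data first.
function stripInlineData(
  params: Record<string, unknown>
): Record<string, unknown> {
  const clone: Record<string, unknown> = { ...params };
  if ("inlineData" in clone) clone.inlineData = "[stripped]";
  return clone;
}

function logGenAICall(model: string, params: Record<string, unknown>): void {
  // Log model and sanitized parameters at info level.
  console.info("genai call", model, JSON.stringify(stripInlineData(params)));
}

const logged = stripInlineData({ prompt: "hi", inlineData: "AAAA..." });
```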

&lt;p&gt;Try them and tell me if that improved your vibe-coding experience!&lt;/p&gt;




&lt;h1&gt;
  
  
  5. Publish your app
&lt;/h1&gt;

&lt;p&gt;Once you're happy with your app and want to share it with the world (or maybe a subset of it), AI Studio offers you two ways to publish it:&lt;/p&gt;

&lt;h3&gt;
  
  
  Share it in AI Studio
&lt;/h3&gt;

&lt;p&gt;The easiest way is to just use AI Studio's sharing capability.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvrn59agd19568fe9mwip.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvrn59agd19568fe9mwip.png" alt="Sharing app" width="800" height="602"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can either decide to share the app with specific people or make it available to whoever has its link (that's what I do on LinkedIn, for example). &lt;/p&gt;

&lt;p&gt;One of the key benefits is that they will also get access to the code and be able to remix it if they want. For your less technical friends, you can also send a link that opens the app full screen and hides the coding agent.&lt;/p&gt;

&lt;p&gt;Another nice benefit is that if your app is using Gemini, your friends will use their free tier when using the app (or their API key if using a paid model), which means it won't cost you anything.&lt;/p&gt;

&lt;h3&gt;
  
  
  Publish the app on Cloud Run
&lt;/h3&gt;

&lt;p&gt;This is what you should do if you want to publish the app for real, to actual users. In a few clicks it will create a Cloud Run container, publish the app online, and give you a URL for anyone to access it.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6sjpb3pg72pzdrp8qzoa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6sjpb3pg72pzdrp8qzoa.png" alt="Publish" width="603" height="193"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi8q8lblzfszx5vt9vy3k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi8q8lblzfszx5vt9vy3k.png" alt="Publish app" width="800" height="726"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6k5lueeaffjmeuhapr3e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6k5lueeaffjmeuhapr3e.png" alt="App published" width="800" height="233"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You'll then be able to buy a domain and give it a proper URL, deploy in different regions, automatically scale, etc. But then you'll also be the one paying for usage, as it's your own app now.&lt;/p&gt;




&lt;h1&gt;
  
  
  6. AI Studio vs. Antigravity: When to use which?
&lt;/h1&gt;

&lt;p&gt;Since AI Studio uses a similar underlying coding agent as Google's &lt;a href="https://antigravity.google/" rel="noopener noreferrer"&gt;Antigravity&lt;/a&gt;, you might be wondering when to use which tool. Here is my rule of thumb:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use AI Studio when:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  You are prototyping a front-end UI or a lightweight full-stack application.&lt;/li&gt;
&lt;li&gt;  You want to genuinely "vibe code" using multimodal inputs (like drawing directly on the app's UI or using your voice).&lt;/li&gt;
&lt;li&gt;  You want to instantly share a working prototype with stakeholders or friends via a simple link, without managing hosting.&lt;/li&gt;
&lt;li&gt;  You want zero-setup, native access to the Gemini models to build AI features quickly.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Use Antigravity when:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  You are building a production-grade, complex application with deep backend infrastructure requirements.&lt;/li&gt;
&lt;li&gt;  You need fine-grained control over your dependencies, complex build steps, and deployment pipelines.&lt;/li&gt;
&lt;li&gt;  You are integrating the AI coding agent into an &lt;em&gt;existing&lt;/em&gt;, large-scale codebase rather than starting a project from a blank slate.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Think of AI Studio as your creative sketchbook for rapid iteration, and Antigravity as your full-fledged developer workshop.&lt;/p&gt;




&lt;h1&gt;
  
  
  7. My favorite creations
&lt;/h1&gt;

&lt;p&gt;Now that you have mastered the basics of vibe coding, the best way to learn is by doing. I didn't follow all these rules perfectly when I started, but making mistakes is how you refine your workflow!&lt;/p&gt;

&lt;p&gt;To show you what's possible, here are a few applications I vibe-coded entirely from scratch using most of those methods:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;[AI-powered resume]&lt;/strong&gt;: &lt;em&gt;An AI-powered resume. Don't just read it: ask Gemini questions about me (it knows anecdotes that aren't written down), have it tailor the resume to the role you have in mind, or even ask for an audio overview.&lt;/em&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fokt00odhpopc0l7v3w2j.png" alt="AI-powered resume" width="800" height="426"&gt;&lt;a href="https://aistudio.google.com/apps/drive/1VRVKZ8qFAG6Rgc1np3u8g5eBgbmI9094" rel="noopener noreferrer"&gt;Check it out here&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;[Talk coach]&lt;/strong&gt;: &lt;em&gt;A coach for your talks. Give it a recording or a YouTube link and it will tell you how to get even better.&lt;/em&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2frit2q7sdl3tbcdipck.png" alt="Talk coach" width="800" height="931"&gt;&lt;a href="https://aistudio.google.com/apps/drive/18XuOzEU1zuseoaPtXrdVUeTY3nNbM80u?appParams=value%253DcOp5rklR3jI" rel="noopener noreferrer"&gt;Check it out here&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;[FreshList]&lt;/strong&gt;: &lt;em&gt;A copy of the app I'm working on to simplify grocery shopping.&lt;/em&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcpl1xipyut59v2k1aqeo.png" alt="FreshList" width="800" height="957"&gt;&lt;a href="https://aistudio.google.com/apps/cee356a2-e448-4807-9d60-4dc6b734b969" rel="noopener noreferrer"&gt;Check it out here&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;See other examples in the &lt;a href="https://github.com/Giom-V/vibe-coding-challenge" rel="noopener noreferrer"&gt;repo&lt;/a&gt; I created when I thought I would have time to vibe-code an app per week.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>vibecoding</category>
      <category>gemini</category>
      <category>promptengineering</category>
    </item>
    <item>
      <title>Getting the most out of Nano-Banana 2: tips &amp; prompt guide</title>
      <dc:creator>Guillaume Vernade</dc:creator>
      <pubDate>Thu, 05 Mar 2026 22:28:22 +0000</pubDate>
      <link>https://dev.to/googleai/getting-the-most-out-of-nano-banana-2-502k</link>
      <guid>https://dev.to/googleai/getting-the-most-out-of-nano-banana-2-502k</guid>
      <description>&lt;p&gt;Following our previous &lt;a href="https://dev.to/googleai/introducing-nano-banana-pro-complete-developer-tutorial-5fc8"&gt;Developer Guide&lt;/a&gt; and &lt;a href="https://dev.to/googleai/nano-banana-pro-prompting-guide-strategies-1h9n"&gt;Prompting Guide&lt;/a&gt;, this post dives into the brand new capabilities of Nano-Banana 2 (aka. "Gemini 3.1 Flash Image"), when you should (and shouldn't) use it, and how to prompt its newest features effectively.&lt;/p&gt;




&lt;p&gt;Here's what you'll find in this article:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The Model Matrix: Nano-Banana 1 vs. 2 vs. Pro&lt;/li&gt;
&lt;li&gt;The Game Changer: Visual Grounding with Google Search&lt;/li&gt;
&lt;li&gt;New Parameters: Extreme Ratios &amp;amp; 512px Resolutions&lt;/li&gt;
&lt;li&gt;Controlling "Thinking" Mode&lt;/li&gt;
&lt;li&gt;Prompt Examples&lt;/li&gt;
&lt;li&gt;What about apps?&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  1. The Model Matrix: Nano-Banana 1 vs. 2 vs. Pro
&lt;/h2&gt;

&lt;p&gt;With three distinct models now in the Nano-Banana lineup, choosing the right engine for your specific workflow is crucial. Here is how the new Nano-Banana 2 fits into the ecosystem.&lt;/p&gt;

&lt;h3&gt;
  
  
  Nano-Banana 1 vs. Nano-Banana 2
&lt;/h3&gt;

&lt;p&gt;Don't count Nano-Banana 1 out just yet. If you have an existing application or workflow that uses Nano-Banana 1 and it is handling your use cases perfectly, stick with it! There is no forced migration (yet...). &lt;/p&gt;

&lt;p&gt;Nano-Banana 1 remains the absolute cheapest option and is still faster than Nano-Banana 2 since it's not a thinking model. However, for any &lt;em&gt;new&lt;/em&gt; pipeline that requires more nuance, better prompt adherence, or the new Image Grounding features, Nano-Banana 2 is absolutely worth the slight bump in price.&lt;/p&gt;

&lt;p&gt;You'll also spare yourself a future migration from NB1 to NB2, so start testing your prompts on the new model rather than the old one.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Pro-tip:&lt;/strong&gt; Create 512px images with NB2 to keep roughly the same price as NB1.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Nano-Banana Pro vs. Nano-Banana 2
&lt;/h3&gt;

&lt;p&gt;The biggest question for developers and creators right now is: &lt;em&gt;Why use Pro if 2 is so good?&lt;/em&gt; &lt;/p&gt;

&lt;p&gt;Think of Nano-Banana 2 (Gemini-3.1-Flash) as offering roughly 95% of Pro's capabilities at a fraction of the cost. &lt;strong&gt;For almost all new projects, Nano-Banana 2 should be your immediate default.&lt;/strong&gt; It handles text rendering, complex styles, and the new visual grounding exceptionally well. &lt;/p&gt;

&lt;p&gt;You should only step up to &lt;strong&gt;Nano-Banana Pro&lt;/strong&gt; when you hit a wall. If Nano-Banana 2 consistently fails a highly complex, multi-layered prompt, or struggles with extreme logical constraints, Pro remains the ultimate heavy lifter.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;(Note: If you find specific edge cases where Pro consistently beats Nano-Banana 2, please drop them in the comments! We need to know what to improve.)&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The Summary Matrix
&lt;/h3&gt;

&lt;p&gt;Here is a quick reference guide to help you route your API calls:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2q0ptspd84661nxjklwl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2q0ptspd84661nxjklwl.png" alt="Summary matrix" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  2. The Game Changer: Visual Grounding with Google Search
&lt;/h2&gt;

&lt;p&gt;While Nano-Banana Pro introduced the ability to search the web for &lt;em&gt;textual&lt;/em&gt; information, Nano-Banana 2 takes a massive leap forward: &lt;strong&gt;Image Grounding&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;The model can now search the internet for specific images to understand exactly what a real-world subject looks like before generating it. This is incredibly powerful when you need to represent specific locations, monuments, or highly specific biological species just as they appear in reality.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best Practices:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Locations:&lt;/strong&gt; Ask for specific churches, bridges, city squares, or niche buildings. &lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Nature:&lt;/strong&gt; Ask for exact animal species, breeds, or insects.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Limitation to keep in mind:&lt;/strong&gt; The model &lt;strong&gt;cannot&lt;/strong&gt; search for people.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example Prompts:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Specific Location Grounding:&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;"Generate a cinematic, golden-hour photograph of the main historical church in Voiron, France. Ensure the architectural details, the spire, the surrounding square, and the landscape (mountains) are accurate to reality."&lt;/em&gt; (change the city for your hometown)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fai3papaeh2weqp8uqq92.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fai3papaeh2weqp8uqq92.png" alt="Voiron's church" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Specific Species Grounding:&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;"Create a realistic picture of a machaon butterfly and a flambé one, and highlight their differences to show how to differentiate them."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0itpfky9o9qvej7cwmg4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0itpfky9o9qvej7cwmg4.png" alt="Machaon vs. Flambé butterflies" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you want to know how to use this new image grounding tool in code, check the &lt;a href="https://ai.google.dev/gemini-api/docs/image-generation#image-search" rel="noopener noreferrer"&gt;documentation&lt;/a&gt; or this &lt;a href="https://colab.sandbox.google.com/github/google-gemini/cookbook/blob/main/quickstarts/Get_Started_Nano_Banana.ipynb#scrollTo=kO5YvAcTFqgY" rel="noopener noreferrer"&gt;Python colab&lt;/a&gt; from our &lt;a href="https://github.com/google-gemini/cookbook/" rel="noopener noreferrer"&gt;cookbook&lt;/a&gt;.&lt;/p&gt;
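&lt;p&gt;As a minimal sketch of what an image-grounding request can look like with the Python SDK: the config below follows the search-grounding pattern from the documentation, but treat the model name and exact field values as assumptions and double-check them against the current docs.&lt;/p&gt;

```python
# Sketch of an image-grounding request. The google-genai SDK accepts plain
# dicts wherever typed config objects are expected, so this stays dependency-
# free until you actually make the call. The model name is an assumption.
GROUNDED_IMAGE_CONFIG = {
    "response_modalities": ["TEXT", "IMAGE"],
    "tools": [{"google_search": {}}],  # lets the model look up reference images
}

def generate_grounded_image(client, prompt, model="gemini-3.1-flash-image-preview"):
    """Send a generation request with Google Search grounding enabled.

    `client` is a google.genai.Client instance.
    """
    return client.models.generate_content(
        model=model,
        contents=prompt,
        config=GROUNDED_IMAGE_CONFIG,
    )
```

&lt;p&gt;The call itself works like any other generation request; the only change from a plain image prompt is the &lt;code&gt;tools&lt;/code&gt; entry enabling search.&lt;/p&gt;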




&lt;h2&gt;
  
  
  3. New Parameters: Extreme Ratios &amp;amp; 512px Resolutions
&lt;/h2&gt;

&lt;p&gt;Nano-Banana 2 introduces several new parameters that give developers and creators tighter control over output formats and cost optimization. &lt;/p&gt;

&lt;h3&gt;
  
  
  The 512px Batch-to-Upscale Workflow
&lt;/h3&gt;

&lt;p&gt;Nano-Banana 2 introduces the ability to generate images at &lt;strong&gt;512-pixel&lt;/strong&gt; resolutions. With these new resolutions, the generation is slightly faster and the cost is driven down to roughly the same price as Nano-Banana 1. &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Pro-tip:&lt;/strong&gt; If you are a developer looking to optimize your costs while maintaining high-end output, here is the golden workflow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; Use the Batch API (which gives a 50% discount) to generate dozens of variations of your prompt at 512px.&lt;/li&gt;
&lt;li&gt; Review the grid and select the absolute best composition.&lt;/li&gt;
&lt;li&gt; Ask Nano-Banana 2 to &lt;strong&gt;upscale&lt;/strong&gt; that specific image to 1K, 2K, or 4K.&lt;/li&gt;
&lt;/ol&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Extreme Aspect Ratios (1:8 &amp;amp; 1:4)
&lt;/h3&gt;

&lt;p&gt;Nano-Banana 2 also introduces extreme new aspect ratios—&lt;strong&gt;1:8 and 1:4&lt;/strong&gt;—available in both vertical and horizontal formats. These are perfect for web banners, continuous scrolling assets, and comic book (BD) layouts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example Prompt:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Horizontal Comic Strip:&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;"Create a 4-panel horizontal comic strip (aspect ratio 4:1). The story follows a mischievous cat trying to steal a fish from a kitchen counter that ends with a twist. Use a vibrant, Franco-Belgian comic book style. Keep the cat's design consistent across all panels."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcqnwpa4if7rl76dfan6b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcqnwpa4if7rl76dfan6b.png" alt="Comic Strip" width="800" height="198"&gt;&lt;/a&gt;&lt;/p&gt;
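&lt;p&gt;Both new parameters map to the Gemini API's image output config. Here is a small sketch of how they could be combined for the cheap-draft-then-upscale workflow; the field names follow the API's &lt;code&gt;image_config&lt;/code&gt;, but the exact accepted values (&lt;code&gt;"512"&lt;/code&gt;, &lt;code&gt;"4:1"&lt;/code&gt;, ...) are assumptions to verify against the docs.&lt;/p&gt;

```python
# Sketch of a generation config combining the new output parameters.
# Field names follow the Gemini API's image_config; the exact accepted
# value strings are assumptions -- check the documentation.
def make_image_config(aspect_ratio="4:1", image_size="512"):
    """Build a config dict for an extreme-ratio generation."""
    return {
        "response_modalities": ["IMAGE"],
        "image_config": {
            "aspect_ratio": aspect_ratio,  # e.g. "1:8", "1:4", "4:1", "8:1"
            "image_size": image_size,      # "512" for cheap drafts; "1K"/"2K"/"4K" for finals
        },
    }

draft = make_image_config()                 # cheap 512px strip for iteration
final = make_image_config(image_size="4K")  # upscale pass once you pick a winner
```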




&lt;h2&gt;
  
  
  4. Controlling "Thinking" Mode
&lt;/h2&gt;

&lt;p&gt;Like its predecessor, Nano-Banana 2 has a "Thinking" mode where it reasons about the prompt before generating. However, you can now toggle this feature &lt;strong&gt;ON&lt;/strong&gt; or &lt;strong&gt;OFF&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My Recommendation: Keep it OFF by default.&lt;/strong&gt; &lt;br&gt;
For standard image generation, turning it off saves time and processing. You should only turn Thinking ON if:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  The model is generating nonsensical results and needs help reasoning through the prompt.&lt;/li&gt;
&lt;li&gt;  You are generating highly complex infographics.&lt;/li&gt;
&lt;li&gt;  You are combining complex Image Grounding with spatial reasoning.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;(Again, if you find amazing use-cases where turning "Thinking" ON completely changes the game, let me know in the comments!)&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  5. Prompt Examples
&lt;/h2&gt;

&lt;p&gt;A nano-banana guide without some prompt examples would be like a meal &lt;em&gt;without&lt;/em&gt; cheese, so here are my favorite ones at the moment:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Cartoon Portraits:&lt;/strong&gt; Transform personal photos into stylized, high-fidelity 3D characters interacting with their real-life selves.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Prompt:&lt;/em&gt; &lt;code&gt;Based strictly on the uploaded reference image, create a photorealistic scene featuring the real human standing next to a giant 3D animation-style version of themselves. Both must have identical facial structures, clothing, and poses. The real person is smiling naturally with their hand on the 3D character's shoulder. The 3D version is proportionally larger, anatomically identical but stylized, with expressive eyes and a playful smirk. Clean gray-blue studio background, cinematic lighting, crisp textures.&lt;/code&gt; &lt;em&gt;(Note: Requires uploading an image).&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F03s435jdcdeorgxzce98.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F03s435jdcdeorgxzce98.png" alt="Cartoon portrait" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Animation to Image:&lt;/strong&gt; Upload animated stills and utilize the model to interpret those outlines into hyper-realistic, photographic images.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Prompt:&lt;/em&gt; &lt;code&gt;Convert this uploaded animated still into an ultra-realistic, cinematic, and fully photorealistic scene. Transform the animated characters into real humans while perfectly preserving their original identities, facial structures, outfits, expressions, and overall likeness.&lt;/code&gt; &lt;em&gt;(Note: Requires uploading an image).&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg5i669u3fvgb2tz2hpgk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg5i669u3fvgb2tz2hpgk.png" alt="Animation to image" width="800" height="536"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;(original image from &lt;a href="https://archive.org/details/mobile_suit_gundam_coloring_book" rel="noopener noreferrer"&gt;https://archive.org/details/mobile_suit_gundam_coloring_book&lt;/a&gt;)&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;History on Maps:&lt;/strong&gt;  Generate hyper-realistic, Maps-style street view imagery that "reimagines" historical events (like the 800 AD crowning of Charlemagne) as if captured by modern 360-degree cameras.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Prompt:&lt;/em&gt; &lt;code&gt;Generate a hyper-realistic image of the crowning of Charlemagne on December 25, 800 AD, perfectly replicating a Google Maps Street View capture. Show Pope Leo III placing the imperial crown on a kneeling Charlemagne inside Old St. Peter's Basilica. Include a 123-degree wide-angle barrel distortion, a semi-transparent Google Maps UI overlay (navigation compass, 2D map thumbnail, white directional chevron arrows floating over the stone floor), and a '© Google 800' watermark. Automatically blur the faces of Charlemagne, the Pope, and surrounding medieval nobles for privacy. Use warm, dim torchlight and candlelight filtering through the basilica, dramatic shadows, and high-ISO digital noise typical of a 360-degree camera struggling in a low-light interior.&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fykv4j7xp7eobj7bmd85c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fykv4j7xp7eobj7bmd85c.png" alt="Charlemagne Crowning" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Kindergarten Filter:&lt;/strong&gt; Celebrate human imperfection and childhood nostalgia by generating intentionally messy, waxy crayon doodles on lined paper.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Prompt:&lt;/em&gt; &lt;code&gt;A child's crayon drawing on white lined notebook paper of maple taffy on snow. Use chunky wax-crayon strokes, wobbly outlines, and bright bold colors that messily overflow the lines. Include visible heavy pressure marks, waxy smudges, and uneven scribble shading. Draw important elements disproportionately large with simple flat shapes, round friendly faces, dot eyes, and big curved smiles. Add a classic large yellow sun in the corner, puffy clouds, and zero realistic perspective. Joyful, naive art style.&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftk6bfkrodfmwj497psy9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftk6bfkrodfmwj497psy9.png" alt="Kindergarten drawing" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  6. What about apps?
&lt;/h2&gt;

&lt;p&gt;Now that you know the new capabilities of Nano-Banana 2, it's time to build!&lt;/p&gt;

&lt;p&gt;Here are a couple of cool apps that you can use as a starting point:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://aistudio.google.com/apps/bundled/window_seat" rel="noopener noreferrer"&gt;Window seat&lt;/a&gt;&lt;/strong&gt;: Generate photorealistic window views based on live weather and specific locations.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://aistudio.google.com/apps/bundled/pet_passport" rel="noopener noreferrer"&gt;Pet passport adventure&lt;/a&gt;&lt;/strong&gt;: Send your pet on a global adventure using Nano-Banana.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://aistudio.google.com/apps/bundled/global_kit_generator" rel="noopener noreferrer"&gt;Global Kit Generator&lt;/a&gt;&lt;/strong&gt;: Developer tool for scaling localized marketing assets.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Please share your best apps in the comments; it's always great to see how creative everybody is!&lt;/p&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4rpg8qpkkx5rxukofrat.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4rpg8qpkkx5rxukofrat.png" alt="Congratulations" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>gemini</category>
      <category>nanobanana</category>
      <category>promptengineering</category>
    </item>
    <item>
      <title>Lyria RealTime: The Developer’s Guide to Infinite Music Streaming</title>
      <dc:creator>Guillaume Vernade</dc:creator>
      <pubDate>Mon, 08 Dec 2025 21:31:01 +0000</pubDate>
      <link>https://dev.to/googleai/lyria-realtime-the-developers-guide-to-infinite-music-streaming-4m1h</link>
      <guid>https://dev.to/googleai/lyria-realtime-the-developers-guide-to-infinite-music-streaming-4m1h</guid>
      <description>&lt;p&gt;Do you love generating static songs with classic text-to-music models? Prepare to conduct a never-ending symphony. Introducing &lt;strong&gt;&lt;a href="https://deepmind.google/models/lyria/lyria-realtime/" rel="noopener noreferrer"&gt;Lyria RealTime&lt;/a&gt;&lt;/strong&gt;, Google DeepMind’s experimental model that doesn't just generate music—it &lt;em&gt;jams&lt;/em&gt; with you, as it did during the Toro y Moi I/O pre-show:&lt;/p&gt;

&lt;p&gt;

  &lt;iframe src="https://www.youtube.com/embed/thAhd82XnMc"&gt;
  &lt;/iframe&gt;


&lt;/p&gt;

&lt;p&gt;While traditional music generation models work like a jukebox (input prompt -&amp;gt; wait -&amp;gt; get song), Lyria RealTime operates on the principle of &lt;strong&gt;"Music as a Verb."&lt;/strong&gt; It creates a persistent, bidirectional streaming connection that produces a continuous 48kHz stereo stream. You can steer, warp, and morph the audio in the moment, making it the first generative model truly designed for interactive experiences.&lt;/p&gt;

&lt;p&gt;And the best part? Right now the model is &lt;strong&gt;free to use&lt;/strong&gt;!&lt;/p&gt;

&lt;p&gt;Here's a quick summary of what you'll learn in this guide:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe91qwp3ixq4cqnpqmuvk.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe91qwp3ixq4cqnpqmuvk.jpeg" alt="Lyria RealTime infographics"&gt;&lt;/a&gt;&lt;/p&gt;



&lt;p&gt;This guide will walk you through building with Lyria RealTime using the Gemini API.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;This guide will cover:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;How Lyria RealTime Works (The "Goldfish Memory" Architecture)&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Project Setup&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Basic Streaming (The "Hello World" of Music)&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Steering the Stream (Weighted Prompts)&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Advanced Configuration (BPM, Density, &amp;amp; Scale)&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Blueprints for the Future: Advanced Use Cases&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Prompting Strategies &amp;amp; Best Practices&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Where to play with Lyria RealTime&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Jump straight to the last section if you want to play with Lyria RealTime right away, for example as a &lt;a href="https://aistudio.google.com/apps/bundled/promptdj-midi" rel="noopener noreferrer"&gt;DJ&lt;/a&gt;, driving a &lt;a href="https://aistudio.google.com/apps/bundled/spacedj" rel="noopener noreferrer"&gt;spaceship&lt;/a&gt;, or using your &lt;a href="https://aistudio.google.com/apps/bundled/lyria_camera" rel="noopener noreferrer"&gt;camera&lt;/a&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: for an interactive version of this post, check out the &lt;a href="https://colab.research.google.com/github/google-gemini/cookbook/blob/main/quickstarts/Get_started_LyriaRealTime.ipynb" rel="noopener noreferrer"&gt;Python cookbook&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;


&lt;h2&gt;
  
  
  1) How Lyria RealTime Works
&lt;/h2&gt;

&lt;p&gt;Lyria RealTime uses a low-latency WebSocket connection to maintain a live communication channel with the model. Unlike offline models that plan a whole song structure (Intro-Verse-Chorus), Lyria operates on a &lt;strong&gt;chunk-based autoregression&lt;/strong&gt; system.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb2uub6nyc7bnp6e61saa.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb2uub6nyc7bnp6e61saa.gif" alt="How Lyria RealTime Works"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It generates audio in 2-second chunks, looking back for a few seconds of context to maintain the rhythmic "groove" while looking forward at your current controls to decide the style. This means the model doesn't "compose songs" in the traditional sense; it navigates musical states.&lt;/p&gt;
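&lt;p&gt;To make the stream format concrete, here is a tiny standard-library sketch for reasoning about the raw chunks you receive (48kHz, 16-bit stereo PCM, as described above); the 2-second figure is the chunk cadence the model targets.&lt;/p&gt;

```python
import array

# Stream format per the description above: 48kHz stereo, 16-bit PCM.
SAMPLE_RATE = 48_000
CHANNELS = 2
BYTES_PER_SAMPLE = 2  # 16-bit

def chunk_duration(raw: bytes) -> float:
    """Seconds of audio contained in one raw chunk."""
    frames = len(raw) // (CHANNELS * BYTES_PER_SAMPLE)
    return frames / SAMPLE_RATE

def decode(raw: bytes) -> array.array:
    """Decode 16-bit PCM bytes into interleaved L/R samples.

    Assumes a little-endian host, which matches the wire format.
    """
    samples = array.array("h")  # signed 16-bit
    samples.frombytes(raw)
    return samples

# A 2-second chunk at this format is 48,000 * 2 channels * 2 bytes * 2 s = 384,000 bytes.
silence = bytes(SAMPLE_RATE * CHANNELS * BYTES_PER_SAMPLE * 2)
print(chunk_duration(silence))  # 2.0
```

&lt;p&gt;Feeding each decoded chunk straight into your audio output as it arrives is what keeps the stream gapless.&lt;/p&gt;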


&lt;h2&gt;
  
  
  2) Project Setup
&lt;/h2&gt;

&lt;p&gt;To follow this guide, you will need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  An API key from Google AI Studio (it can be a free one).&lt;/li&gt;
&lt;li&gt;  The Google Gen AI SDK.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Install the SDK:&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;Python&lt;/strong&gt; (&lt;code&gt;3.12+&lt;/code&gt; recommended):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="s2"&gt;"google-genai&amp;gt;=1.52.0"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;JavaScript / TypeScript:&lt;/strong&gt;&lt;br&gt;
You'll need at least version 1.30 of the &lt;a href="https://googleapis.github.io/js-genai/" rel="noopener noreferrer"&gt;JS/TS SDK&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @google/genai
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: The following examples use the Python SDK for demonstration. For JS/TS code samples, check the &lt;a href="https://aistudio.google.com/apps/bundled/promptdj-midi?showAssistant=true&amp;amp;showCode=true" rel="noopener noreferrer"&gt;AI Studio apps&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  3) Basic Streaming
&lt;/h2&gt;

&lt;p&gt;To start a session, you connect to the model (&lt;code&gt;models/lyria-realtime-exp&lt;/code&gt;), send an initial configuration, and start the stream. The interaction loop is asynchronous: you send commands, and the server continuously yields raw audio chunks.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;[Note: Ensure you are using the &lt;code&gt;v1alpha&lt;/code&gt; API version for experimental models like Lyria]&lt;/em&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;genai&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.genai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;types&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;genai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;http_options&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;api_version&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;v1alpha&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;receive_audio&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Background task to process incoming audio chunks.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;receive&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;server_content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;audio_chunks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="c1"&gt;# 'data' is raw 16-bit PCM audio at 48kHz
&lt;/span&gt;                    &lt;span class="n"&gt;audio_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;server_content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;audio_chunks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;
                    &lt;span class="c1"&gt;# Add your audio playback logic here!
&lt;/span&gt;            &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;**-&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;with &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;aio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;live&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;music&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;models/lyria-realtime-exp&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;TaskGroup&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;tg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# 1. Start listening for audio
&lt;/span&gt;        &lt;span class="n"&gt;tg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;receive_audio&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

        &lt;span class="c1"&gt;# 2. Send initial musical concept
&lt;/span&gt;        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_weighted_prompts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;prompts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;WeightedPrompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;elevator music&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;weight&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# 3. Set the vibe (BPM, Temperature)
&lt;/span&gt;        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_music_generation_config&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;LiveMusicGenerationConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bpm&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;90&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# 4. Drop the beat
&lt;/span&gt;        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;play&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="c1"&gt;# Keep the session alive
&lt;/span&gt;        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Congratulations, you've got some elevator music!&lt;/p&gt;

&lt;p&gt;Not impressed? That's just the beginning, dear padawan; now comes the cool part.&lt;/p&gt;
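&lt;p&gt;One more practical note before moving on: the &lt;code&gt;receive_audio&lt;/code&gt; task above leaves playback up to you. If you don't have an audio stack wired up yet, an easy way to sanity-check the stream is to collect the chunks and dump them to a WAV file with Python's standard-library &lt;code&gt;wave&lt;/code&gt; module. This is just a sketch: it assumes the 16-bit PCM at 48kHz noted in the comment above, plus 2 channels (stereo), which is an assumption on my part.&lt;/p&gt;

```python
import wave

def write_pcm_to_wav(path, pcm_chunks, rate=48_000, channels=2):
    """Dump raw 16-bit PCM chunks (as received from the session) to a WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 16-bit samples = 2 bytes each
        wav.setframerate(rate)
        for chunk in pcm_chunks:
            wav.writeframes(chunk)
```

&lt;p&gt;Collect the &lt;code&gt;audio_data&lt;/code&gt; chunks in a list inside &lt;code&gt;receive_audio&lt;/code&gt;, then call &lt;code&gt;write_pcm_to_wav("lyria.wav", chunks)&lt;/code&gt; once the session ends.&lt;/p&gt;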




&lt;h2&gt;
  
  
  4) Steering the Stream (Weighted Prompts)
&lt;/h2&gt;

&lt;p&gt;This is where the magic happens. Unlike static generation, you can send new &lt;code&gt;WeightedPrompt&lt;/code&gt; messages &lt;em&gt;while the music is playing&lt;/em&gt; to smoothly transition the genre, instruments, or mood.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;weight&lt;/code&gt; parameter is your fader. A weight of &lt;code&gt;1.0&lt;/code&gt; is standard, but you can use multiple prompts to blend influences.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example: Morphing from Piano to Live Performance&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.genai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;types&lt;/span&gt;

&lt;span class="c1"&gt;# Send this while the loop is running to shift the style
&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_weighted_prompts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;prompts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="c1"&gt;# Keep the piano strong
&lt;/span&gt;        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Piano&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;weight&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;2.0&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="c1"&gt;# Add a subtle meditative layer
&lt;/span&gt;        &lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;WeightedPrompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Meditation&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;weight&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="c1"&gt;# Push the 'Live' feeling
&lt;/span&gt;        &lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;WeightedPrompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Live Performance&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;weight&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; As the model generates chunk after chunk, changes can take a few seconds (usually around 2s) to be reflected in the music.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h4&gt;
  
  
  Pro Tip: Cross-fading
&lt;/h4&gt;

&lt;p&gt;Drastic prompt changes can be abrupt. For professional results, implement client-side cross-fading by sending intermediate weight values rapidly (e.g., every 500ms) to "morph" the music smoothly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example: The "Morph" Function&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.genai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;types&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;cross_fade&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;old_prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;new_prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;duration&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;2.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;steps&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Smoothly morphs from one musical idea to another.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;step_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;duration&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;steps&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;steps&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# Calculate the blend ratio (alpha goes from 0.0 to 1.0)
&lt;/span&gt;        &lt;span class="n"&gt;alpha&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;steps&lt;/span&gt;

        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_weighted_prompts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;prompts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="c1"&gt;# Fade out the old
&lt;/span&gt;                &lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;WeightedPrompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;old_prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;weight&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="c1"&gt;# Fade in the new
&lt;/span&gt;                &lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;WeightedPrompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;new_prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;weight&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;step_time&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Usage in your main loop:
# Morph from 'Ambient' to 'Techno' over 5 seconds
&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;cross_fade&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Ambient Drone&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Hard Techno&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;duration&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;5.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note that this code sample assumes all your prompts have a weight of 1, which might not be the case. &lt;/p&gt;
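&lt;p&gt;To handle arbitrary starting weights, interpolate from each prompt's actual current weight instead. Here's a small helper (a sketch, not part of the SDK) that blends two &lt;code&gt;{text: weight}&lt;/code&gt; maps; prompts absent from one side simply fade from (or to) zero:&lt;/p&gt;

```python
def blend_weights(start, end, alpha):
    """Linearly interpolate between two {prompt_text: weight} dicts.

    alpha=0.0 returns the starting weights, alpha=1.0 the ending ones.
    Prompts missing on one side are treated as weight 0.0, so they
    fade in from (or out to) silence."""
    texts = set(start) | set(end)
    return {
        t: (1 - alpha) * start.get(t, 0.0) + alpha * end.get(t, 0.0)
        for t in texts
    }
```

&lt;p&gt;Each entry can then be wrapped in a &lt;code&gt;types.WeightedPrompt&lt;/code&gt; before calling &lt;code&gt;session.set_weighted_prompts&lt;/code&gt;.&lt;/p&gt;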




&lt;h2&gt;
  
  
  5) Advanced Configuration (The Knobs)
&lt;/h2&gt;

&lt;p&gt;Lyria RealTime exposes parametric controls that change the structure of the music. If you aren't a musician, think of these controls as the physics of the audio world:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Density&lt;/strong&gt; (0.0 - 1.0): Think of this as "Busyness."

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Low&lt;/em&gt; (0.1): A lonely drummer playing once every few seconds. Sparse.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;High&lt;/em&gt; (0.9): A chaotic orchestra where everyone plays at once. Intense.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Brightness&lt;/strong&gt; (0.0 - 1.0): Think of this as "Muffled vs. Crisp."

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Low&lt;/em&gt; (0.1): Listening to music from outside a club, through a wall. Dark and bass-heavy.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;High&lt;/em&gt; (0.9): Listening through high-end headphones. Sharp, clear, and treble-heavy.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;BPM&lt;/strong&gt; (60 - 200): The heartbeat of the track (Beats Per Minute).&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Scale&lt;/strong&gt;: The "Mood." It forces the music into a specific set of notes (Key/Mode).&lt;/li&gt;

&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Important&lt;/strong&gt;: While density and brightness can be changed smoothly on the fly, changing the BPM or Scale is a fundamental structural shift. You must call &lt;code&gt;reset_context()&lt;/code&gt; for these changes to take effect. This will clear the model's "short-term memory," causing a hard cut in the audio.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Example: The "Hard Drop"&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Changing structural parameters requires a context reset
&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_music_generation_config&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;LiveMusicGenerationConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;bpm&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;140&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
        &lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Scale&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;C_MAJOR_A_MINOR&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# Force happy/neutral mood
&lt;/span&gt;    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# This command is mandatory for BPM/Scale changes to apply!
&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;reset_context&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  6) Blueprints for the Future: Advanced Use Cases
&lt;/h2&gt;

&lt;p&gt;We’ve covered basic streaming, but Lyria’s parametric controls allow for applications that connect the physical world to the audio stream. Here are four ideas to get you started.&lt;/p&gt;

&lt;h3&gt;
  
  
  Use Case A: The "Biometric Beat" (Fitness &amp;amp; Health)
&lt;/h3&gt;

&lt;p&gt;Most fitness apps use static playlists that rarely match your actual pace. Because Lyria allows for real-time &lt;code&gt;bpm&lt;/code&gt; and &lt;code&gt;density&lt;/code&gt; control, you can build a music engine that is biologically coupled to the user.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Heart Rate Monitor (HRM) -&amp;gt; BPM:&lt;/strong&gt; Map the user's heart rate directly to the track's tempo.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Accelerometer -&amp;gt; Density:&lt;/strong&gt; If the user is sprinting (high variance in movement), increase &lt;code&gt;density&lt;/code&gt; to &lt;code&gt;1.0&lt;/code&gt; to add percussion and complexity. If they stop to rest, drop &lt;code&gt;density&lt;/code&gt; to &lt;code&gt;0.2&lt;/code&gt; for an ambient breakdown.&lt;/li&gt;
&lt;/ul&gt;
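&lt;p&gt;As a sketch of that mapping (the thresholds and ranges here are my own illustrative choices, not official recommendations):&lt;/p&gt;

```python
def fitness_params(heart_rate_bpm, movement_variance):
    """Map sensor readings to Lyria controls.

    - Heart rate is clamped into Lyria's supported BPM range (60-200).
    - Movement variance (0.0 = resting, 1.0 = sprinting) maps to density,
      kept above a floor so the music never goes fully silent.
    """
    bpm = max(60, min(200, int(heart_rate_bpm)))
    density = 0.2 + 0.8 * max(0.0, min(1.0, movement_variance))
    return {"bpm": bpm, "density": round(density, 2)}
```

&lt;p&gt;Feed the result into &lt;code&gt;set_music_generation_config&lt;/code&gt; on every sensor update (remembering the BPM caveat about &lt;code&gt;reset_context()&lt;/code&gt; from the previous section).&lt;/p&gt;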

&lt;h3&gt;
  
  
  Use Case B: The "Democratic DJ" (Social Streaming)
&lt;/h3&gt;

&lt;p&gt;Since &lt;code&gt;WeightedPrompts&lt;/code&gt; accept float values, you can build a collaborative radio station for Twitch streams or Discord bots where the audience votes on the genre. Instead of a winner-take-all system, Lyria can blend the votes.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Input:&lt;/strong&gt; 100 users vote. 60 vote "&lt;em&gt;Cyberpunk&lt;/em&gt;", 30 vote "&lt;em&gt;Jazz&lt;/em&gt;", 10 vote "&lt;em&gt;Reggae&lt;/em&gt;".&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Normalization:&lt;/strong&gt; Convert votes to weights (0.6, 0.3, 0.1).&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Result:&lt;/strong&gt; The model generates a dominant Cyberpunk track with clear Jazz harmonies and a subtle Reggae backbeat, and updates it over time as the votes change.&lt;/li&gt;
&lt;/ul&gt;
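&lt;p&gt;The normalization step is trivial, but worth making explicit (a sketch):&lt;/p&gt;

```python
def votes_to_weights(votes):
    """Turn raw vote counts into prompt weights that sum to 1.0."""
    total = sum(votes.values())
    if total == 0:
        return {}
    return {genre: count / total for genre, count in votes.items()}
```

&lt;p&gt;Re-run this on every voting round and push the result through &lt;code&gt;set_weighted_prompts&lt;/code&gt; to keep the stream tracking the room.&lt;/p&gt;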

&lt;h3&gt;
  
  
  Use Case C: "Focus Flow" (Productivity)
&lt;/h3&gt;

&lt;p&gt;Deep work requires different audio textures than brainstorming. You can map Lyria's &lt;code&gt;brightness&lt;/code&gt; and &lt;code&gt;guidance&lt;/code&gt; parameters to a Pomodoro timer to guide the user's cognitive state.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Deep Work Phase:&lt;/strong&gt; Low &lt;code&gt;brightness&lt;/code&gt; (darker, warmer sounds), Low &lt;code&gt;density&lt;/code&gt; (minimal distractions), High &lt;code&gt;guidance&lt;/code&gt; (repetitive, predictable).&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Break Phase:&lt;/strong&gt; High &lt;code&gt;brightness&lt;/code&gt; (energetic, crisp), High &lt;code&gt;density&lt;/code&gt;, Low &lt;code&gt;guidance&lt;/code&gt; (creative, surprising).&lt;/li&gt;
&lt;/ul&gt;
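&lt;p&gt;One way to wire this to a Pomodoro timer (the parameter values below are illustrative guesses, not tuned recommendations):&lt;/p&gt;

```python
# Presets for each phase of the cycle; the exact values are placeholders.
FOCUS_PRESETS = {
    "deep_work": {"brightness": 0.2, "density": 0.3, "guidance": 5.0},
    "break":     {"brightness": 0.8, "density": 0.8, "guidance": 2.0},
}

def preset_for(minutes_into_cycle, work_minutes=25):
    """Return the config for the current point in a work/break cycle."""
    phase = "deep_work" if minutes_into_cycle < work_minutes else "break"
    return FOCUS_PRESETS[phase]
```

&lt;p&gt;A timer loop would call this every minute and only send a new config when the phase actually flips.&lt;/p&gt;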

&lt;h3&gt;
  
  
  Use Case D: "Realtime Game music" (Gaming)
&lt;/h3&gt;

&lt;p&gt;Coming from the gaming industry, I couldn't help thinking of a gaming idea for Lyria RealTime. You could have Lyria create the game's music in real time based on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;The game's own style:&lt;/strong&gt; a set of prompts that defines the game and its overall ambiance,&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;The environment:&lt;/strong&gt; different prompts depending on whether you're in a busy city, in a forest or sailing the Greek seas,&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;The player's actions:&lt;/strong&gt; if they're fighting, add the "epic" prompt; if they're investigating instead, swap it for the "mysterious" one,&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;The player's current condition:&lt;/strong&gt; you could change the BPM and the weight of a "danger" prompt depending on the player's health bar: the lower it is, the more stressful the music becomes.&lt;/li&gt;
&lt;/ul&gt;
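&lt;p&gt;Putting those four signals together, a game loop could rebuild its prompt list every few seconds, along these lines (the prompt texts and weights are placeholders, and the list-of-dicts shape mirrors the earlier examples):&lt;/p&gt;

```python
def game_prompts(style_prompts, environment, action, health_pct):
    """Build a weighted prompt list from the current game state.

    The 'danger' weight grows as the health bar (0.0-1.0) empties."""
    prompts = [{"text": p, "weight": 1.0} for p in style_prompts]
    prompts.append({"text": environment, "weight": 1.0})
    prompts.append({"text": "epic" if action == "fighting" else "mysterious",
                    "weight": 1.0})
    danger = round(1.0 - health_pct, 2)
    if danger > 0:
        prompts.append({"text": "danger", "weight": danger})
    return prompts
```

&lt;p&gt;Combined with the cross-fade helper from earlier, transitions between game states would stay musical instead of jumping abruptly.&lt;/p&gt;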




&lt;h2&gt;
  
  
  7) Prompting Strategies &amp;amp; Best Practices
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The Prompt Formula:&lt;/strong&gt;&lt;br&gt;
Through testing, a reliable formula has emerged: &lt;strong&gt;[Genre Anchor] + [Instrumentation] + [Atmosphere]&lt;/strong&gt;...&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Instruments:&lt;/strong&gt; &lt;em&gt;303 Acid Bass, Buchla Synths, Hang Drum, TR-909 Drum Machine&lt;/em&gt;...&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Genres:&lt;/strong&gt; &lt;em&gt;Acid Jazz, Bengal Baul, Glitch Hop, Shoegaze, Vaporwave&lt;/em&gt;...&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Moods:&lt;/strong&gt; &lt;em&gt;Crunchy Distortion, Ethereal Ambience, Ominous Drone, Swirling Phasers&lt;/em&gt;...&lt;/li&gt;
&lt;/ul&gt;
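&lt;p&gt;If you want to generate variations programmatically, the formula maps directly to simple string assembly (the category lists below just reuse the examples above):&lt;/p&gt;

```python
import random

GENRES = ["Acid Jazz", "Bengal Baul", "Glitch Hop", "Shoegaze", "Vaporwave"]
INSTRUMENTS = ["303 Acid Bass", "Buchla Synths", "Hang Drum", "TR-909 Drum Machine"]
MOODS = ["Crunchy Distortion", "Ethereal Ambience", "Ominous Drone", "Swirling Phasers"]

def formula_prompt(rng=random):
    """[Genre Anchor] + [Instrumentation] + [Atmosphere], as one prompt string."""
    return ", ".join([rng.choice(GENRES), rng.choice(INSTRUMENTS), rng.choice(MOODS)])
```

&lt;p&gt;You can also send the three parts as separate &lt;code&gt;WeightedPrompt&lt;/code&gt;s if you want to fade each influence independently.&lt;/p&gt;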

&lt;p&gt;&lt;strong&gt;Developer Best Practices:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Buffer Your Audio:&lt;/strong&gt; Because this is real-time streaming over the network, implement client-side audio buffering (2-3 chunks) to handle network jitter and ensure smooth playback.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;The "Settling" Period:&lt;/strong&gt; When you start a stream or reset context, the model needs about 5-10 seconds to "settle" into a stable groove.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Safety Filters:&lt;/strong&gt; The model checks prompts against safety filters. Avoid asking for specific copyrighted artists ("Style of Taylor Swift"); instead, deconstruct their sound into descriptors ("Pop, female vocals, acoustic guitar").&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Instrumental Only:&lt;/strong&gt; The model is only instrumental. While you can set &lt;code&gt;music_generation_mode&lt;/code&gt; to &lt;code&gt;VOCALIZATION&lt;/code&gt;, it produces vocal-like textures (oohs/aahs), not coherent lyrics.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Session duration limit:&lt;/strong&gt; Sessions are currently limited to 10 minutes, but you can simply start a new one afterwards.&lt;/li&gt;
&lt;/ul&gt;
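&lt;p&gt;For the buffering tip, a minimal client-side jitter buffer can be as simple as a deque that delays playback until a few chunks have arrived (a sketch; the threshold is an assumption you'd tune for your network):&lt;/p&gt;

```python
from collections import deque

class JitterBuffer:
    """Hold a few chunks before playback starts so brief network
    stalls don't cause audible gaps."""

    def __init__(self, min_chunks=3):
        self.min_chunks = min_chunks
        self.chunks = deque()
        self.started = False

    def push(self, chunk):
        self.chunks.append(chunk)
        if len(self.chunks) >= self.min_chunks:
            self.started = True  # enough runway: playback may begin

    def pop(self):
        """Return the next chunk to play, or None while still pre-buffering."""
        if not self.started or not self.chunks:
            return None
        return self.chunks.popleft()
```

&lt;p&gt;Your &lt;code&gt;receive_audio&lt;/code&gt; task would &lt;code&gt;push&lt;/code&gt; each incoming chunk, while the playback loop calls &lt;code&gt;pop&lt;/code&gt; at the audio device's pace.&lt;/p&gt;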

&lt;p&gt;More details and prompt ideas in Lyria RealTime's &lt;a href="https://ai.google.dev/gemini-api/docs/music-generation" rel="noopener noreferrer"&gt;documentation&lt;/a&gt;.&lt;/p&gt;


&lt;h2&gt;
  
  
  8) Ready to Jam? Choose your preferred way to play with Lyria RealTime
&lt;/h2&gt;

&lt;p&gt;One of the easiest places to try it is AI Studio, where a couple of cool apps are available for you to play with, and to vibe-customize to your needs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;&lt;a href="https://aistudio.google.com/apps/bundled/promptdj" rel="noopener noreferrer"&gt;Prompt DJ&lt;/a&gt;&lt;/strong&gt;, &lt;strong&gt;&lt;a href="https://aistudio.google.com/apps/bundled/promptdj-midi" rel="noopener noreferrer"&gt;MIDI DJ&lt;/a&gt;&lt;/strong&gt; and &lt;strong&gt;&lt;a href="https://labs.google/fx/tools/music-fx-dj" rel="noopener noreferrer"&gt;MusicFX&lt;/a&gt;&lt;/strong&gt; (US only) let you add and mix multiple prompts in real time:&lt;/li&gt;
&lt;/ul&gt;


&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;&lt;a href="https://aistudio.google.com/apps/bundled/spacedj" rel="noopener noreferrer"&gt;Space DJ&lt;/a&gt;&lt;/strong&gt; lets you navigate the universe of music genres with a spacecraft! I personally love navigating around the &lt;em&gt;italo-disco&lt;/em&gt; and &lt;em&gt;euro-france&lt;/em&gt; planets.&lt;/li&gt;
&lt;/ul&gt;


&lt;ul&gt;
&lt;li&gt;   &lt;strong&gt;&lt;a href="https://aistudio.google.com/apps/bundled/lyria_camera" rel="noopener noreferrer"&gt;Lyria Camera&lt;/a&gt;&lt;/strong&gt; creates music in real time based on what it sees. I'd love to have that connected to my dashcam!&lt;/li&gt;
&lt;/ul&gt;


&lt;ul&gt;
&lt;li&gt;&lt;p&gt;The &lt;strong&gt;&lt;a href="https://magenta.withgoogle.com/demos" rel="noopener noreferrer"&gt;Magenta website&lt;/a&gt;&lt;/strong&gt; also features a lot of cool demos. It's also a great place to get more details on Deepmind's music generation models.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Finally, &lt;strong&gt;check the &lt;a href="https://github.com/Giom-V/magic-mirror" rel="noopener noreferrer"&gt;magical mirror&lt;/a&gt; demo&lt;/strong&gt; I made that uses Lyria to create background music matching what the mirror says (Gemini generates the prompts on the fly):&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;

  &lt;iframe src="https://www.youtube.com/embed/Z7FUh8B4im4"&gt;
  &lt;/iframe&gt;


&lt;/p&gt;

&lt;p&gt;And now the floor is yours: what will you create using Lyria RealTime?&lt;/p&gt;

&lt;h4&gt;
  
  
  Resources:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://ai.google.dev/gemini-api/docs/music-generation" rel="noopener noreferrer"&gt;Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Magenta &lt;a href="http://magenta.withgoogle.com/" rel="noopener noreferrer"&gt;website&lt;/a&gt; and &lt;a href="https://magenta.withgoogle.com/blog" rel="noopener noreferrer"&gt;blog&lt;/a&gt; for the latest news on the music generation models.&lt;/li&gt;
&lt;li&gt;AI Studio &lt;a href="https://aistudio.google.com/apps?source=showcase&amp;amp;showcaseTag=gen-media" rel="noopener noreferrer"&gt;gen-media apps&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvk7aa727myswxdfemkja.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvk7aa727myswxdfemkja.png" alt="Congratulations"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>gemini</category>
      <category>lyria</category>
      <category>tutorial</category>
      <category>music</category>
    </item>
    <item>
      <title>Nano-Banana Pro: Prompting Guide &amp; Strategies</title>
      <dc:creator>Guillaume Vernade</dc:creator>
      <pubDate>Thu, 27 Nov 2025 18:04:07 +0000</pubDate>
      <link>https://dev.to/googleai/nano-banana-pro-prompting-guide-strategies-1h9n</link>
      <guid>https://dev.to/googleai/nano-banana-pro-prompting-guide-strategies-1h9n</guid>
      <description>&lt;p&gt;&lt;strong&gt;Nano-Banana Pro&lt;/strong&gt; is a significant leap forward from previous generation models, moving from "fun" image generation to "functional" professional asset production. It excels in &lt;strong&gt;text rendering, character consistency, visual synthesis, world knowledge (Search), and high-resolution (4K) output.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Following the &lt;a href="https://dev.to/googleai/introducing-nano-banana-pro-complete-developer-tutorial-5fc8"&gt;developer guide&lt;/a&gt; on how to get started with &lt;a href="//ai.studio"&gt;AI Studio&lt;/a&gt; and the API, this guide covers the core capabilities and how to prompt them effectively.&lt;/p&gt;




&lt;p&gt;Here's what you'll find in this article:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;0. The Golden Rules of Prompting&lt;/li&gt;
&lt;li&gt;1. Text Rendering, Infographics &amp;amp; Visual Synthesis&lt;/li&gt;
&lt;li&gt;2. Character Consistency &amp;amp; Viral Thumbnails&lt;/li&gt;
&lt;li&gt;3. Grounding with Google Search&lt;/li&gt;
&lt;li&gt;4. Advanced Editing, Restoration &amp;amp; Colorization&lt;/li&gt;
&lt;li&gt;5. Dimensional Translation (2D ↔ 3D)&lt;/li&gt;
&lt;li&gt;6. High-Resolution &amp;amp; Textures&lt;/li&gt;
&lt;li&gt;7. Thinking &amp;amp; Reasoning&lt;/li&gt;
&lt;li&gt;8. One-Shot Storyboarding &amp;amp; Concept Art&lt;/li&gt;
&lt;li&gt;9. Structural Control &amp;amp; Layout Guidance&lt;/li&gt;
&lt;li&gt;10. What's Next?&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🛑 Section 0: The Golden Rules of Prompting
&lt;/h2&gt;

&lt;p&gt;Nano-Banana Pro is a "Thinking" model. It doesn't just match keywords; it understands intent, physics, and composition. To get the best results, stop using "tag soups" (e.g., &lt;code&gt;dog, park, 4k, realistic&lt;/code&gt;) and start acting like a Creative Director.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Edit, Don't Re-roll&lt;/strong&gt;&lt;br&gt;
The model is exceptionally good at understanding conversational edits. If an image is 80% correct, do not generate a new one from scratch. Instead, simply ask for the specific change you need.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;em&gt;Example:&lt;/em&gt; "That's great, but change the lighting to sunset and make the text neon blue."&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;2. Use Natural Language &amp;amp; Full Sentences&lt;/strong&gt;&lt;br&gt;
Talk to the model as if you were briefing a human artist. Use proper grammar and descriptive adjectives.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  ❌ &lt;strong&gt;Bad:&lt;/strong&gt; "Cool car, neon, city, night, 8k."&lt;/li&gt;
&lt;li&gt;  ✅ &lt;strong&gt;Good:&lt;/strong&gt; "A cinematic wide shot of a futuristic sports car speeding through a rainy Tokyo street at night. The neon signs reflect off the wet pavement and the car's metallic chassis."&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;3. Be Specific and Descriptive&lt;/strong&gt;&lt;br&gt;
Vague prompts yield generic results. Define the subject, the setting, the lighting, and the mood.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Subject:&lt;/strong&gt; Instead of "a woman," say "a sophisticated elderly woman wearing a vintage Chanel-style suit."&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Materiality:&lt;/strong&gt; Describe textures. "Matte finish," "brushed steel," "soft velvet," "crumpled paper."&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;4. Provide Context (The "Why" or "For whom")&lt;/strong&gt;&lt;br&gt;
Because the model "thinks," giving it context helps it make logical artistic decisions.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;em&gt;Example:&lt;/em&gt; "Create an image of a sandwich &lt;strong&gt;for a Brazilian high-end gourmet cookbook&lt;/strong&gt;." (The model will infer professional plating, shallow depth of field, and perfect lighting).&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  1. Text Rendering, Infographics &amp;amp; Visual Synthesis
&lt;/h2&gt;

&lt;p&gt;Nano-Banana Pro has SOTA capabilities for rendering legible, stylized text and synthesizing complex information into visual formats.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best Practices:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Compression:&lt;/strong&gt; Ask the model to "compress" dense text or PDFs into visual aids.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Style:&lt;/strong&gt; Specify if you want a "polished editorial," a "technical diagram," or a "hand-drawn whiteboard" look.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Quotes:&lt;/strong&gt; Clearly specify the text you want in quotes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example Prompts:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Earnings Report Infographic (Data Ingestion):&lt;/strong&gt;&lt;br&gt;
[Input PDF of Google's latest &lt;a href="https://s206.q4cdn.com/479360582/files/doc_news/2025/Oct/29/attachments/2025q3-alphabet-earnings-release.pdf" rel="noopener noreferrer"&gt;earnings report&lt;/a&gt;]&lt;br&gt;
"Generate a clean, modern infographic summarizing the key financial highlights from this earnings report. Include charts for 'Revenue Growth' and 'Net Income', and highlight the CEO's key quote in a stylized pull-quote box."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe4pg6n5f3udltijhcm77.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe4pg6n5f3udltijhcm77.png" alt="Earnings Report Infographic" width="800" height="436"&gt;&lt;/a&gt; &lt;a href="https://aistudio.google.com/prompts/new_chat?prompt=Generate%20a%20clean%2C%20modern%20infographic%20summarizing%20the%20key%20financial%20highlights%20from%20this%20earnings%20report.%20Include%20charts%20for%20%27Revenue%20Growth%27%20and%20%27Net%20Income%27%2C%20and%20highlight%20the%20CEO%27s%20key%20quote%20in%20a%20stylized%20pull-quote%20box.&amp;amp;model=gemini-3-pro-image-preview" rel="noopener noreferrer"&gt;Try it in AI Studio&lt;/a&gt; &lt;em&gt;(Note: Requires uploading a PDF)&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Retro Infographic:&lt;/strong&gt;&lt;br&gt;
"Make a retro, 1950s-style infographic about the history of the American diner. Include distinct sections for 'The Food,' 'The Jukebox,' and 'The Decor.' Ensure all text is legible and stylized to match the period."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyo8vewspjc6lrro025z5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyo8vewspjc6lrro025z5.png" alt="Retro Infographic" width="800" height="436"&gt;&lt;/a&gt;&lt;a href="https://aistudio.google.com/prompts/new_chat?prompt=Make%20a%20retro%2C%201950s-style%20infographic%20about%20the%20history%20of%20the%20American%20diner.%20Include%20distinct%20sections%20for%20%27The%20Food%2C%27%20%27The%20Jukebox%2C%27%20and%20%27The%20Decor.%27%20Ensure%20all%20text%20is%20legible%20and%20stylized%20to%20match%20the%20period.&amp;amp;model=gemini-3-pro-image-preview" rel="noopener noreferrer"&gt;Try it in AI Studio&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Technical Diagram:&lt;/strong&gt;&lt;br&gt;
"Create an orthographic blueprint that describes this building in plan, elevation, and section. Label the 'North Elevation' and 'Main Entrance' clearly in technical architectural font. Format 16:9."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffk7q8vqyctplwufbwdsj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffk7q8vqyctplwufbwdsj.png" alt="Technical Diagram" width="800" height="446"&gt;&lt;/a&gt;&lt;a href="https://aistudio.google.com/prompts/new_chat?prompt=Create%20an%20orthographic%20blueprint%20that%20describes%20this%20building%20in%20plan%2C%20elevation%2C%20and%20section.%20Label%20the%20%27North%20Elevation%27%20and%20%27Main%20Entrance%27%20clearly%20in%20technical%20architectural%20font.%20Format%2016%3A9.&amp;amp;model=gemini-3-pro-image-preview" rel="noopener noreferrer"&gt;Try it in AI Studio&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Whiteboard Summary (Educational):&lt;/strong&gt;&lt;br&gt;
"Summarize the concept of 'Transformer Neural Network Architecture' as a hand-drawn whiteboard diagram suitable for a university lecture. Use different colored markers for the Encoder and Decoder blocks, and include legible labels for 'Self-Attention' and 'Feed Forward'."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwx1jrqoda2bdwp03ac3o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwx1jrqoda2bdwp03ac3o.png" alt="Whiteboard Summary" width="800" height="436"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://aistudio.google.com/prompts/new_chat?prompt=Summarize%20the%20concept%20of%20%27Transformer%20Neural%20Network%20Architecture%27%20as%20a%20hand-drawn%20whiteboard%20diagram%20suitable%20for%20a%20university%20lecture.%20Use%20different%20colored%20markers%20for%20the%20Encoder%20and%20Decoder%20blocks%2C%20and%20include%20legible%20labels%20for%20%27Self-Attention%27%20and%20%27Feed%20Forward%27.&amp;amp;model=gemini-3-pro-image-preview" rel="noopener noreferrer"&gt;Try it in AI Studio&lt;/a&gt;&lt;/p&gt;
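By the way, all the "Try it in AI Studio" links in this post follow the same simple pattern: the prompt is URL-encoded into a `new_chat` link with the model preselected. Here is a small stdlib-only helper that reproduces it (a sketch based on the links above, not a documented API; the URL format may change):

```python
from urllib.parse import urlencode

AI_STUDIO_BASE = "https://aistudio.google.com/prompts/new_chat"

def ai_studio_link(prompt, model="gemini-3-pro-image-preview"):
    """Build a shareable "Try it in AI Studio" deep link.

    Mirrors the link pattern used in this post. Note: urlencode encodes
    spaces as "+" rather than "%20"; both decode to a space in a query string.
    """
    return f"{AI_STUDIO_BASE}?{urlencode({'prompt': prompt, 'model': model})}"

link = ai_studio_link("Make a retro, 1950s-style infographic about the American diner.")
```

Handy if you want to share a whole library of prompts with colleagues as one-click links.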




&lt;h2&gt;
  2. Character Consistency &amp;amp; Viral Thumbnails
&lt;/h2&gt;

&lt;p&gt;Nano-Banana Pro supports &lt;strong&gt;up to 14 reference images&lt;/strong&gt; (6 with high fidelity). This allows for "Identity Locking"—placing a specific person or character into new scenarios without facial distortion.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best Practices:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Identity Locking:&lt;/strong&gt; Explicitly state: "Keep the person's facial features exactly the same as Image 1."&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Expression/Action:&lt;/strong&gt; Describe the &lt;em&gt;change&lt;/em&gt; in emotion or pose while maintaining the identity.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Viral Composition:&lt;/strong&gt; Combine subjects with bold graphics and text in a single pass.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example Prompts:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;The "Viral Thumbnail" (Identity + Text + Graphics):&lt;/strong&gt;&lt;br&gt;
"Design a viral video thumbnail using the person from Image 1. &lt;strong&gt;Face Consistency:&lt;/strong&gt; Keep the person's facial features exactly the same as Image 1, but change their expression to look excited and surprised. &lt;strong&gt;Action:&lt;/strong&gt; Pose the person on the left side, pointing their finger towards the right side of the frame. &lt;strong&gt;Subject:&lt;/strong&gt; On the right side, place a high-quality image of a delicious avocado toast. &lt;strong&gt;Graphics:&lt;/strong&gt; Add a bold yellow arrow connecting the person's finger to the toast. &lt;strong&gt;Text:&lt;/strong&gt; Overlay massive, pop-style text in the middle: '3分钟搞定!' (Done in 3 mins!). Use a thick white outline and drop shadow. &lt;strong&gt;Background:&lt;/strong&gt; A blurred, bright kitchen background. High saturation and contrast."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxj70ws3c9zt35ix9kb0k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxj70ws3c9zt35ix9kb0k.png" alt="Viral Thumbnail" width="800" height="436"&gt;&lt;/a&gt;&lt;a href="https://aistudio.google.com/prompts/new_chat?prompt=Design%20a%20viral%20video%20thumbnail%20using%20the%20person%20from%20Image%201.%20Face%20Consistency%3A%20Keep%20the%20person%27s%20facial%20features%20exactly%20the%20same%20as%20Image%201%2C%20but%20change%20their%20expression%20to%20look%20excited%20and%20surprised.%20Action%3A%20Pose%20the%20person%20on%20the%20left%20side%2C%20pointing%20their%20finger%20towards%20the%20right%20side%20of%20the%20frame.%20Subject%3A%20On%20the%20right%20side%2C%20place%20a%20high-quality%20image%20of%20a%20delicious%20avocado%20toast.%20Graphics%3A%20Add%20a%20bold%20yellow%20arrow%20connecting%20the%20person%27s%20finger%20to%20the%20toast.%20Text%3A%20Overlay%20massive%2C%20pop-style%20text%20in%20the%20middle%3A%20%273%E5%88%86%E9%92%9F%E6%90%9E%E5%AE%9A!%27%20%28Done%20in%203%20mins!%29.%20Use%20a%20thick%20white%20outline%20and%20drop%20shadow.%20Background%3A%20A%20blurred%2C%20bright%20kitchen%20background.%20High%20saturation%20and%20contrast.&amp;amp;model=gemini-3-pro-image-preview" rel="noopener noreferrer"&gt;Try it in AI Studio&lt;/a&gt; &lt;em&gt;(Note: Requires uploading a reference image)&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;The "Fluffy Friends" Scenario (Group Consistency):&lt;/strong&gt;&lt;br&gt;
[Input 3 images of different plush creatures]&lt;br&gt;
"Create a funny 10-part story with these 3 fluffy friends going on a tropical vacation. The story is thrilling throughout with emotional highs and lows and ends in a happy moment. &lt;strong&gt;Keep the attire and identity consistent for all 3 characters&lt;/strong&gt;, but their expressions and angles should vary throughout all 10 images. Make sure to only have one of each character in each image."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Futqhw1hi7997u4pftee6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Futqhw1hi7997u4pftee6.png" alt="The " width="800" height="1433"&gt;&lt;/a&gt;&lt;a href="https://aistudio.google.com/prompts/new_chat?prompt=Create%20a%20funny%2010-part%20story%20with%20these%203%20fluffy%20friends%20going%20on%20a%20tropical%20vacation.%20The%20story%20is%20thrilling%20throughout%20with%20emotional%20highs%20and%20lows%20and%20ends%20in%20a%20happy%20moment.%20Keep%20the%20attire%20and%20identity%20consistent%20for%20all%203%20characters%2C%20but%20their%20expressions%20and%20angles%20should%20vary%20throughout%20all%2010%20images.%20Make%20sure%20to%20only%20have%20one%20of%20each%20character%20in%20each%20image.&amp;amp;model=gemini-3-pro-image-preview" rel="noopener noreferrer"&gt;Try it in AI Studio&lt;/a&gt; &lt;em&gt;(Note: Requires uploading reference images)&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Brand Asset Generation:&lt;/strong&gt;&lt;br&gt;
[Input 1 image of a product]&lt;br&gt;
"Create 9 stunning fashion shots as if they’re from an award-winning fashion editorial. Use this reference as the brand style but add nuance and variety to the range so they convey a professional design touch. Please generate nine images, one at a time."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvdi6ut3gimx6gglj08ku.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvdi6ut3gimx6gglj08ku.png" alt="Brand Asset Generation" width="800" height="1422"&gt;&lt;/a&gt;&lt;a href="https://aistudio.google.com/prompts/new_chat?prompt=Create%209%20stunning%20fashion%20shots%20as%20if%20they%E2%80%99re%20from%20an%20award-winning%20fashion%20editorial.%20Use%20this%20reference%20as%20the%20brand%20style%20but%20add%20nuance%20and%20variety%20to%20the%20range%20so%20they%20convey%20a%20professional%20design%20touch.%20Please%20generate%20nine%20images%2C%20one%20at%20a%20time.&amp;amp;model=gemini-3-pro-image-preview" rel="noopener noreferrer"&gt;Try it in AI Studio&lt;/a&gt; &lt;em&gt;(Note: Requires uploading a reference image)&lt;/em&gt;&lt;/p&gt;
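When automating multi-reference workflows, it helps to respect the 14-image ceiling (with the first 6 treated at high fidelity) before you ever hit the model. A tiny illustrative helper (my own sketch, not an SDK call; the "first 6 are high fidelity" ordering is an assumption for illustration):

```python
MAX_REFERENCE_IMAGES = 14  # total reference images the model accepts
MAX_HIGH_FIDELITY = 6      # subset preserved with high fidelity

def plan_references(image_paths):
    """Validate a reference set and split it into high-fidelity slots
    and the remainder, per the limits described above.
    """
    if len(image_paths) > MAX_REFERENCE_IMAGES:
        raise ValueError(
            f"Too many reference images: {len(image_paths)} > {MAX_REFERENCE_IMAGES}"
        )
    return image_paths[:MAX_HIGH_FIDELITY], image_paths[MAX_HIGH_FIDELITY:]

high_fidelity, standard = plan_references([f"ref_{i}.png" for i in range(8)])
```

In practice, put the images whose identity matters most (faces, logos) in the high-fidelity slots.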




&lt;h2&gt;
  3. Grounding with Google Search
&lt;/h2&gt;

&lt;p&gt;Nano-Banana Pro uses Google Search to generate imagery based on real-time data, current events, or factual verification, reducing hallucinations on timely topics.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best Practices:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Ask for visualizations of dynamic data (weather, stocks, news).&lt;/li&gt;
&lt;li&gt;  The model will "Think" (reason) about the search results before generating the image.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example Prompts:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Real-Time Data Visualization:&lt;/strong&gt;&lt;br&gt;
"Visualize the current stock value of the main tech companies and the current trends. For each add some explanation on what happened recently which could explain that trend."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6trm4bcm20isbse3lqtw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6trm4bcm20isbse3lqtw.png" alt="Real-Time Data Visualization" width="800" height="436"&gt;&lt;/a&gt;&lt;a href="https://aistudio.google.com/prompts/new_chat?prompt=Visualize%20the%20current%20stock%20value%20of%20the%20main%20tech%20companies%20and%20the%20current%20trends.%20For%20each%20add%20some%20explanation%20on%20what%20happened%20recently%20which%20could%20explain%20that%20trend.&amp;amp;model=gemini-3-pro-image-preview" rel="noopener noreferrer"&gt;Try it in AI Studio&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Event Visualization:&lt;/strong&gt;&lt;br&gt;
"Generate an infographic of the best times to visit the U.S. National Parks in 2025 based on current travel trends."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqb7cbgxiym5fg6c2warh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqb7cbgxiym5fg6c2warh.png" alt="Event Visualization" width="800" height="446"&gt;&lt;/a&gt;&lt;a href="https://aistudio.google.com/prompts/new_chat?prompt=Generate%20an%20infographic%20of%20the%20best%20times%20to%20visit%20the%20U.S.%20National%20Parks%20in%202025%20based%20on%20current%20travel%20trends.&amp;amp;model=gemini-3-pro-image-preview" rel="noopener noreferrer"&gt;Try it in AI Studio&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Just be careful: it tends to mix a lot of information together, so as always with AI, double-check everything!&lt;/p&gt;




&lt;h2&gt;
  4. Advanced Editing, Restoration &amp;amp; Colorization
&lt;/h2&gt;

&lt;p&gt;The model excels at complex edits via conversational prompting. This includes "In-painting" (removing/adding objects), "Restoration" (fixing old photos), "Colorization" (Manga/B&amp;amp;W photos), and "Style Swapping."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best Practices:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Semantic Instructions:&lt;/strong&gt; You do not need to manually mask; simply tell the model what to change naturally.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Physics Understanding:&lt;/strong&gt; You can ask for complex changes like "fill this glass with liquid" to test physics generation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example Prompts:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Object Removal &amp;amp; In-painting:&lt;/strong&gt;&lt;br&gt;
"Remove the tourists from the background of this photo and fill the space with logical textures (cobblestones and storefronts) that match the surrounding environment."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa10yh8njebl5nssy8ht2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa10yh8njebl5nssy8ht2.png" alt="Object Removal &amp;amp; In-painting" width="800" height="800"&gt;&lt;/a&gt;&lt;a href="https://aistudio.google.com/prompts/new_chat?prompt=Remove%20the%20tourists%20from%20the%20background%20of%20this%20photo%20and%20fill%20the%20space%20with%20logical%20textures%20(cobblestones%20and%20storefronts)%20that%20match%20the%20surrounding%20environment.&amp;amp;model=gemini-3-pro-image-preview" rel="noopener noreferrer"&gt;Try it in AI Studio&lt;/a&gt; &lt;em&gt;(Note: Requires uploading a photo)&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Manga/Comic Colorization:&lt;/strong&gt;&lt;br&gt;
[Input black and white manga panel]&lt;br&gt;
"Colorize this manga panel. Use a vibrant anime style palette. Ensure the lighting effects on the energy beams are glowing neon blue and the character's outfit is consistent with their official colors."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzrcg33qn8gccuknkh7iq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzrcg33qn8gccuknkh7iq.png" alt="Manga/Comic Colorization" width="800" height="597"&gt;&lt;/a&gt;&lt;a href="https://aistudio.google.com/prompts/new_chat?prompt=Colorize%20this%20manga%20panel.%20Use%20a%20vibrant%20anime%20style%20palette.%20Ensure%20the%20lighting%20effects%20on%20the%20energy%20beams%20are%20glowing%20neon%20blue%20and%20the%20character%27s%20outfit%20is%20consistent%20with%20their%20official%20colors.&amp;amp;model=gemini-3-pro-image-preview" rel="noopener noreferrer"&gt;Try it in AI Studio&lt;/a&gt; &lt;em&gt;(Note: Requires uploading an image)&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Localization (Text Translation + Cultural Adaptation):&lt;/strong&gt;&lt;br&gt;
[Input image of a London bus stop ad]&lt;br&gt;
"Take this concept and localize it to a Tokyo setting, including translating the tagline into Japanese. Change the background to a bustling Shibuya street at night."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnmlu3njxs595bpf6jo6i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnmlu3njxs595bpf6jo6i.png" alt="Localization " width="687" height="1024"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://aistudio.google.com/prompts/new_chat?prompt=Take%20this%20concept%20and%20localize%20it%20to%20a%20Tokyo%20setting%2C%20including%20translating%20the%20tagline%20into%20Japanese.%20Change%20the%20background%20to%20a%20bustling%20Shibuya%20street%20at%20night.&amp;amp;model=gemini-3-pro-image-preview" rel="noopener noreferrer"&gt;Try it in AI Studio&lt;/a&gt; &lt;em&gt;(Note: Requires uploading an image)&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Lighting/Seasonal Control:&lt;/strong&gt;&lt;br&gt;
[Input image of a house in summer]&lt;br&gt;
"Turn this scene into winter time. Keep the house architecture exactly the same, but add snow to the roof and yard, and change the lighting to a cold, overcast afternoon."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7t7fbnjyr62zvrwfhi27.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7t7fbnjyr62zvrwfhi27.png" alt="Lighting/Seasonal Control" width="687" height="1024"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://aistudio.google.com/prompts/new_chat?prompt=Turn%20this%20scene%20into%20winter%20time.%20Keep%20the%20house%20architecture%20exactly%20the%20same%2C%20but%20add%20snow%20to%20the%20roof%20and%20yard%2C%20and%20change%20the%20lighting%20to%20a%20cold%2C%20overcast%20afternoon.&amp;amp;model=gemini-3-pro-image-preview" rel="noopener noreferrer"&gt;Try it in AI Studio&lt;/a&gt; &lt;em&gt;(Note: Requires uploading an image)&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  5. Dimensional Translation (2D ↔ 3D)
&lt;/h2&gt;

&lt;p&gt;A powerful new capability is translating 2D schematics into 3D visualizations, or vice versa. This is ideal for interior designers, architects, and meme creators.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example Prompts:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;2D Floor Plan to 3D Interior Design Board:&lt;/strong&gt;&lt;br&gt;
"Based on the uploaded 2D floor plan, generate a professional interior design presentation board in a single image. &lt;strong&gt;Layout:&lt;/strong&gt; A collage with one large main image at the top (wide-angle perspective of the living area), and three smaller images below (Master Bedroom, Home Office, and a 3D top-down floor plan). &lt;strong&gt;Style:&lt;/strong&gt; Apply a Modern Minimalist style with warm oak wood flooring and off-white walls across ALL images. &lt;strong&gt;Quality:&lt;/strong&gt; Photorealistic rendering, soft natural lighting."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1lv4uptgdjnumcao1w16.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1lv4uptgdjnumcao1w16.png" alt="Interior Design" width="800" height="436"&gt;&lt;/a&gt;&lt;a href="https://aistudio.google.com/prompts/new_chat?prompt=Based%20on%20the%20uploaded%202D%20floor%20plan%2C%20generate%20a%20professional%20interior%20design%20presentation%20board%20in%20a%20single%20image.%20Layout%3A%20A%20collage%20with%20one%20large%20main%20image%20at%20the%20top%20(wide-angle%20perspective%20of%20the%20living%20area)%2C%20and%20three%20smaller%20images%20below%20(Master%20Bedroom%2C%20Home%20Office%2C%20and%20a%203D%20top-down%20floor%20plan).%20Style%3A%20Apply%20a%20Modern%20Minimalist%20style%20with%20warm%20oak%20wood%20flooring%20and%20off-white%20walls%20across%20ALL%20images.%20Quality%3A%20Photorealistic%20rendering%2C%20soft%20natural%20lighting.&amp;amp;model=gemini-3-pro-image-preview" rel="noopener noreferrer"&gt;Try it in AI Studio&lt;/a&gt; &lt;em&gt;(Note: Requires uploading a floor plan)&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;2D to 3D Meme Conversion:&lt;/strong&gt;&lt;br&gt;
"Turn the 'This is Fine' dog meme into a photorealistic 3D render. Keep the composition identical but make the dog look like a plush toy and the fire look like realistic flames."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdo83a3gzt1h287p5v4zf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdo83a3gzt1h287p5v4zf.png" alt="Meme Conversion" width="800" height="446"&gt;&lt;/a&gt;&lt;a href="https://aistudio.google.com/prompts/new_chat?prompt=Turn%20the%20%27This%20is%20Fine%27%20dog%20meme%20into%20a%20photorealistic%203D%20render.%20Keep%20the%20composition%20identical%20but%20make%20the%20dog%20look%20like%20a%20plush%20toy%20and%20the%20fire%20look%20like%20realistic%20flames.&amp;amp;model=gemini-3-pro-image-preview" rel="noopener noreferrer"&gt;Try it in AI Studio&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  6. High-Resolution &amp;amp; Textures
&lt;/h2&gt;

&lt;p&gt;Nano-Banana Pro supports native 1K to 4K image generation. This is particularly useful for detailed textures or large-format prints.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best Practices:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Explicitly request high resolutions (2K or 4K) if your API or interface allows it.&lt;/li&gt;
&lt;li&gt;  Describe high-fidelity details (imperfections, surface textures).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example Prompts:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;4K Texture Generation:&lt;/strong&gt;&lt;br&gt;
"Harness native high-fidelity output to craft a breathtaking, atmospheric environment of a mossy forest floor. Command complex lighting effects and delicate textures, ensuring every strand of moss and beam of light is rendered in pixel-perfect resolution suitable for a 4K wallpaper."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8ecke4m4ow0ukgddy164.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8ecke4m4ow0ukgddy164.png" alt="4K Texture Generation" width="800" height="446"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://aistudio.google.com/prompts/new_chat?prompt=Harness%20native%20high-fidelity%20output%20to%20craft%20a%20breathtaking%2C%20atmospheric%20environment%20of%20a%20mossy%20forest%20floor.%20Command%20complex%20lighting%20effects%20and%20delicate%20textures%2C%20ensuring%20every%20strand%20of%20moss%20and%20beam%20of%20light%20is%20rendered%20in%20pixel-perfect%20resolution%20suitable%20for%20a%204K%20wallpaper.&amp;amp;model=gemini-3-pro-image-preview" rel="noopener noreferrer"&gt;Try it in AI Studio&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Complex Logic (Thinking Mode):&lt;/strong&gt;&lt;br&gt;
"Create a hyper-realistic infographic of a gourmet cheeseburger, deconstructed to show the texture of the toasted brioche bun, the seared crust of the patty, and the glistening melt of the cheese. Label each layer with its flavor profile."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1ogz1rel54z35crs8s26.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1ogz1rel54z35crs8s26.png" alt="Complex Logic" width="800" height="446"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://aistudio.google.com/prompts/new_chat?prompt=Create%20a%20hyper-realistic%20infographic%20of%20a%20gourmet%20cheeseburger%2C%20deconstructed%20to%20show%20the%20texture%20of%20the%20toasted%20brioche%20bun%2C%20the%20seared%20crust%20of%20the%20patty%2C%20and%20the%20glistening%20melt%20of%20the%20cheese.%20Label%20each%20layer%20with%20its%20flavor%20profile.&amp;amp;model=gemini-3-pro-image-preview" rel="noopener noreferrer"&gt;Try it in AI Studio&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  7. Thinking &amp;amp; Reasoning
&lt;/h2&gt;

&lt;p&gt;Nano-Banana Pro defaults to a "Thinking" process where it generates interim thought images (which are not billed) to refine composition before rendering the final output. This enables data analysis and visual problem-solving.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example Prompts:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Solve Equations:&lt;/strong&gt;&lt;br&gt;
"Solve log_{x^2+1}(x^4-1)=2 in C on a white board. Show the steps clearly."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwoy90sxms1jg6oj16h49.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwoy90sxms1jg6oj16h49.jpg" alt="Solve Equations" width="800" height="436"&gt;&lt;/a&gt;&lt;a href="https://aistudio.google.com/prompts/new_chat?prompt=Solve%20log_%7Bx%2B1%7D(x%5E2%2B1)%3D2%20in%20C%20on%20a%20white%20board.%20Show%20the%20steps%20clearly.&amp;amp;model=gemini-3-pro-image-preview" rel="noopener noreferrer"&gt;Try it in AI Studio&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Visual Reasoning:&lt;/strong&gt;&lt;br&gt;
"Analyze this image of a room and generate a 'before' image that shows what the room might have looked like during construction, showing the framing and unfinished drywall."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe8pq0z5wyn3ajxp8lrb6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe8pq0z5wyn3ajxp8lrb6.png" alt="Visual Reasoning" width="800" height="800"&gt;&lt;/a&gt;&lt;a href="https://aistudio.google.com/prompts/new_chat?prompt=Analyze%20this%20image%20of%20a%20room%20and%20generate%20a%20%27before%27%20image%20that%20shows%20what%20the%20room%20might%20have%20looked%20like%20during%20construction%2C%20showing%20the%20framing%20and%20unfinished%20drywall.&amp;amp;model=gemini-3-pro-image-preview" rel="noopener noreferrer"&gt;Try it in AI Studio&lt;/a&gt; &lt;em&gt;(Note: Requires uploading an image)&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  8. One-Shot Storyboarding &amp;amp; Concept Art
&lt;/h2&gt;

&lt;p&gt;You can generate sequential art or storyboards without a grid, ensuring a cohesive narrative flow in a single session. This is also popular for "Movie Concept Art" (e.g., fake leaks of upcoming films).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example Prompt:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Create an addictively intriguing 9-part story with 9 images featuring a woman and man in an award-winning luxury luggage commercial. The story should have emotional highs and lows, ending on an elegant shot of the woman with the logo. &lt;strong&gt;The identity of the woman and man and their attire must stay consistent throughout&lt;/strong&gt; but they can and should be seen from different angles and distances. Please generate images one at a time. Make sure every image is in a 16:9 landscape format."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fheq0q4omitqbauym8cg7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fheq0q4omitqbauym8cg7.png" alt="One-Shot Storyboarding" width="800" height="446"&gt;&lt;/a&gt;&lt;a href="https://aistudio.google.com/prompts/new_chat?prompt=Create%20an%20addictively%20intriguing%209-part%20story%20with%209%20images%20featuring%20a%20woman%20and%20man%20in%20an%20award-winning%20luxury%20luggage%20commercial.%20The%20story%20should%20have%20emotional%20highs%20and%20lows%2C%20ending%20on%20an%20elegant%20shot%20of%20the%20woman%20with%20the%20logo.%20The%20identity%20of%20the%20woman%20and%20man%20and%20their%20attire%20must%20stay%20consistent%20throughout%20but%20they%20can%20and%20should%20be%20seen%20from%20different%20angles%20and%20distances.%20Please%20generate%20images%20one%20at%20a%20time.%20Make%20sure%20every%20image%20is%20in%20a%2016%3A9%20landscape%20format.&amp;amp;model=gemini-3-pro-image-preview" rel="noopener noreferrer"&gt;Try it in AI Studio&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  9. Structural Control &amp;amp; Layout Guidance
&lt;/h2&gt;

&lt;p&gt;Input images aren't limited to character references or subjects to edit. You can use them to strictly control the &lt;strong&gt;composition and layout&lt;/strong&gt; of the final output. This is a game-changer for designers who need to turn a napkin sketch, a wireframe, or a specific grid layout into a polished asset.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best Practices:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Drafts &amp;amp; Sketches:&lt;/strong&gt; Upload a hand-drawn sketch to define exactly where the text and object should sit.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Wireframes:&lt;/strong&gt; Use screenshots of existing layouts or wireframes to generate high-fidelity UI mockups.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Grids:&lt;/strong&gt; Use grid images to force the model to generate assets for tile-based games or LED displays.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example Prompts:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Sketch to Final Ad:&lt;/strong&gt;&lt;br&gt;
"Create an ad for a [product] following this sketch."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F93lhzmeat6ta2lkwicvo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F93lhzmeat6ta2lkwicvo.png" alt="Sketch to Final Ad" width="800" height="446"&gt;&lt;/a&gt;&lt;a href="https://aistudio.google.com/prompts/new_chat?prompt=Create%20a%20high-end%20magazine%20advertisement%20for%20a%20luxury%20perfume%20brand%20called%20%27Nebula%27%20based%20on%20this%20hand-drawn%20sketch.%20Keep%20the%20exact%20layout%20of%20the%20bottle%20and%20text%20placement%2C%20but%20render%20it%20in%20a%20photorealistic%20style%20with%20a%20galaxy-themed%20background.&amp;amp;model=gemini-3-pro-image-preview" rel="noopener noreferrer"&gt;Try it in AI Studio&lt;/a&gt; &lt;em&gt;(Note: Requires uploading a sketch)&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;UI Mockup from Wireframe:&lt;/strong&gt;&lt;br&gt;
"Create a mock-up for a [product] following these guidelines."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn5dxh2w65y41x01d3x1r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn5dxh2w65y41x01d3x1r.png" alt="UI Mockup from Wireframe" width="800" height="446"&gt;&lt;/a&gt;&lt;a href="https://aistudio.google.com/prompts/new_chat?prompt=Generate%20a%20photorealistic%20UI%20mockup%20for%20a%20fitness%20tracking%20app%20based%20on%20this%20wireframe.%20Replace%20the%20placeholder%20boxes%20with%20high-quality%20images%20of%20runners%20and%20data%20visualization%20charts%2C%20but%20strictly%20adhere%20to%20the%20button%20placement%20and%20grid%20structure.&amp;amp;model=gemini-3-pro-image-preview" rel="noopener noreferrer"&gt;Try it in AI Studio&lt;/a&gt; &lt;em&gt;(Note: Requires uploading a wireframe)&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Pixel Art &amp;amp; LED Displays:&lt;/strong&gt;&lt;br&gt;
"Generate a pixel art sprite of a unicorn that fits perfectly into this 64x64 grid image. Use high contrast colors."&lt;br&gt;
&lt;em&gt;(Tip: Developers can then programmatically extract the center color of each cell to drive a connected 64x64 LED matrix display).&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvpr4xhguji825rae3udw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvpr4xhguji825rae3udw.png" alt="Pixel Art" width="800" height="800"&gt;&lt;/a&gt;&lt;a href="https://aistudio.google.com/prompts/new_chat?prompt=Generate%20a%20pixel%20art%20sprite%20of%20a%20unicorn%20that%20fits%20perfectly%20into%20this%2064x64%20grid%20image.%20Use%20high%20contrast%20colors.&amp;amp;model=gemini-3-pro-image-preview" rel="noopener noreferrer"&gt;Try it in AI Studio&lt;/a&gt; &lt;em&gt;(Note: Requires uploading a grid image)&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Sprites:&lt;/strong&gt;&lt;br&gt;
"Sprite sheet of a woman doing a backflip on a drone, 3x3 grid, sequence, frame by frame animation, square aspect ratio. Follow the structure of the attached reference image exactly."&lt;br&gt;
&lt;em&gt;(Tip: You can then extract each cell and make a GIF.)&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0kafc8px17sbjzpiz744.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0kafc8px17sbjzpiz744.png" alt="Sprite sheet" width="800" height="800"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flnq1xjj2f89jmbsazcef.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flnq1xjj2f89jmbsazcef.gif" alt="Sprite" width="340" height="340"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://colab.sandbox.google.com/github/google-gemini/cookbook/blob/main/quickstarts/Get_Started_Nano_Banana.ipynb#scrollTo=xuQeyK-teUf1" rel="noopener noreferrer"&gt;Try it in Colab&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  10. What's Next?
&lt;/h2&gt;

&lt;p&gt;Now that you have mastered the basics of prompting, here is how you can start building:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Experiment in the UI:&lt;/strong&gt; &lt;a href="https://aistudio.google.com/" rel="noopener noreferrer"&gt;Google AI Studio&lt;/a&gt; is the fastest way to test prompts and parameters.&lt;/li&gt;
&lt;li&gt;  Check out the really cool &lt;strong&gt;Nano Banana-powered apps&lt;/strong&gt; in the &lt;a href="https://aistudio.google.com/apps?source=showcase&amp;amp;showcaseTag=nano-banana" rel="noopener noreferrer"&gt;App Gallery&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Vibe-code your dream app&lt;/strong&gt;: Transform your best prompt into an app that you can easily share with your friends in &lt;a href="https://aistudio.google.com/apps" rel="noopener noreferrer"&gt;AI Studio Build&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Build Applications:&lt;/strong&gt; Ready to code? Check out the &lt;a href="https://dev.to/googleai/introducing-nano-banana-pro-complete-developer-tutorial-5fc8"&gt;developer guide&lt;/a&gt; or the &lt;a href="https://colab.sandbox.google.com/github/google-gemini/cookbook/blob/main/quickstarts/Get_Started_Nano_Banana.ipynb#nano-banana-pro" rel="noopener noreferrer"&gt;Gemini API Cookbook&lt;/a&gt; for guides and code snippets.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Technical Deep Dive:&lt;/strong&gt; Read the full &lt;a href="https://ai.google.dev/gemini-api/docs" rel="noopener noreferrer"&gt;Gemini API Documentation&lt;/a&gt; for details on rate limits, pricing, and integration.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fej883k4hdvceoweevlo8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fej883k4hdvceoweevlo8.png" alt="Banana master" width="800" height="339"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>gemini</category>
      <category>nanobanana</category>
      <category>promptengineering</category>
    </item>
    <item>
      <title>Introducing Nano Banana Pro: Complete Developer Tutorial</title>
      <dc:creator>Guillaume Vernade</dc:creator>
      <pubDate>Fri, 21 Nov 2025 21:30:41 +0000</pubDate>
      <link>https://dev.to/googleai/introducing-nano-banana-pro-complete-developer-tutorial-5fc8</link>
      <guid>https://dev.to/googleai/introducing-nano-banana-pro-complete-developer-tutorial-5fc8</guid>
      <description>&lt;p&gt;You loved &lt;a href="https://dev.to/googleai/how-to-build-with-nano-banana-complete-developer-tutorial-646"&gt;Nano-Banana&lt;/a&gt;? Created figurine images of all your friends and ghost faces behind all your foes? Here now comes the not-so-nano "&lt;a href="https://deepmind.google/models/gemini-image/pro/" rel="noopener noreferrer"&gt;Gemini 3 Pro Image&lt;/a&gt;" model, that you will all prefer calling &lt;strong&gt;Nano Banana Pro&lt;/strong&gt;!&lt;/p&gt;

&lt;p&gt;While the Flash model (Nano Banana) brought speed and affordability, the Pro version introduces "thinking" capabilities, search grounding, and high-fidelity 4K output. It's time to go bananas with complex creative tasks!&lt;/p&gt;

&lt;p&gt;This guide will walk you through the advanced features of Nano Banana Pro using the &lt;a href="https://ai.google.dev/" rel="noopener noreferrer"&gt;Gemini Developer API&lt;/a&gt;. For prompting advice, the &lt;a href="https://dev.to/googleai/nano-banana-pro-prompting-guide-strategies-1h9n"&gt;prompting guide&lt;/a&gt; should be your next read.&lt;/p&gt;

&lt;p&gt;This guide will cover:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; Using Nano Banana Pro in Google AI Studio&lt;/li&gt;
&lt;li&gt; Project setup&lt;/li&gt;
&lt;li&gt; Initialize the Client&lt;/li&gt;
&lt;li&gt; Basic Generation (The Classics)&lt;/li&gt;
&lt;li&gt; The "Thinking" Process&lt;/li&gt;
&lt;li&gt; Search Grounding&lt;/li&gt;
&lt;li&gt; High-Resolution 4K Generation&lt;/li&gt;
&lt;li&gt; Multilingual Capabilities&lt;/li&gt;
&lt;li&gt; Advanced Image Mixing&lt;/li&gt;
&lt;li&gt; Pro-Exclusive Demos&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: for an interactive version of this post, check out the &lt;a href="https://colab.sandbox.google.com/github/google-gemini/cookbook/blob/main/quickstarts/Get_Started_Nano_Banana.ipynb" rel="noopener noreferrer"&gt;Python cookbook&lt;/a&gt; or AI Studio's &lt;a href="https://ai.studio/apps/bundled/get_started_image_out?fullscreenApplet=true" rel="noopener noreferrer"&gt;JavaScript notebook&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  1) Using Nano Banana Pro in Google AI Studio
&lt;/h2&gt;

&lt;p&gt;While end-users can access Nano Banana Pro in the &lt;a href="https://gemini.google.com/" rel="noopener noreferrer"&gt;Gemini app&lt;/a&gt;, the best environment for developers to prototype and test prompts is &lt;a href="https://aistudio.google.com/banana-pro" rel="noopener noreferrer"&gt;Google AI Studio&lt;/a&gt;. AI Studio is a playground to experiment with all available AI models before writing any code, and it's also the entry point for building with the Gemini API.&lt;/p&gt;

&lt;p&gt;To get started, go to &lt;a href="https://aistudio.google.com/banana-pro" rel="noopener noreferrer"&gt;aistudio.google.com&lt;/a&gt;, sign in with your Google account, and select &lt;strong&gt;Nano Banana Pro&lt;/strong&gt; (Gemini 3 Pro Image) from the model picker.&lt;/p&gt;

&lt;p&gt;Unlike Nano Banana, the Pro version &lt;strong&gt;doesn't have a free tier&lt;/strong&gt;, which means you need to select an API key with billing enabled (see the "Project setup" section below).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fidqeymcwrtjlk416q2ag.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fidqeymcwrtjlk416q2ag.png" alt="Get started with Nano Banana Pro on AI Studio" width="800" height="456"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Tip&lt;/strong&gt;: You can also vibe code Nano Banana web apps directly in AI Studio at &lt;a href="//ai.studio/apps"&gt;ai.studio/apps&lt;/a&gt;, or explore the code and remix one of the &lt;a href="https://aistudio.google.com/apps?source=showcase&amp;amp;showcaseTag=nano-banana" rel="noopener noreferrer"&gt;existing apps&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  2) Project setup
&lt;/h2&gt;

&lt;p&gt;To follow this guide, you will need the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  An API key from &lt;a href="https://aistudio.google.com/" rel="noopener noreferrer"&gt;Google AI Studio&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;  Billing set up for your project.&lt;/li&gt;
&lt;li&gt;  The Google Gen AI SDK for &lt;a href="https://github.com/googleapis/python-genai" rel="noopener noreferrer"&gt;Python&lt;/a&gt; or &lt;a href="https://github.com/googleapis/js-genai" rel="noopener noreferrer"&gt;JavaScript/TypeScript&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're already a hardcore Gemini API user with all of that, great! Just skip this section and move on to the next one. Otherwise, here's how to get started:&lt;/p&gt;

&lt;h3&gt;
  
  
  Step A: Get your API Key
&lt;/h3&gt;

&lt;p&gt;When you first log in on AI Studio, a Google Cloud project and an API key should be automatically created.&lt;/p&gt;

&lt;p&gt;Open the &lt;a href="https://aistudio.google.com/api-keys" rel="noopener noreferrer"&gt;API key management screen&lt;/a&gt; and click on the "copy" icon to copy your API key.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkwglctk5jiunm7gdi207.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkwglctk5jiunm7gdi207.png" alt="Copy your API key" width="800" height="431"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Step B: Enable Billing
&lt;/h3&gt;

&lt;p&gt;Since Nano Banana Pro doesn't have a free tier, you must enable billing on your Google Cloud project.&lt;/p&gt;

&lt;p&gt;In the &lt;a href="https://aistudio.google.com/projects" rel="noopener noreferrer"&gt;API key management screen&lt;/a&gt;, click &lt;strong&gt;Set up billing&lt;/strong&gt; next to your project and follow the on-screen instructions.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn7wi4ygjj4utb8lrntsl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn7wi4ygjj4utb8lrntsl.png" alt="Set up billing" width="800" height="435"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  How much does Nano Banana Pro cost?
&lt;/h4&gt;

&lt;p&gt;Image generation with Nano Banana Pro is more expensive than the Flash version, especially for 4K images. At the time this post is published, a 1K or 2K image costs &lt;strong&gt;$0.134&lt;/strong&gt;, while a 4K one costs &lt;strong&gt;$0.24&lt;/strong&gt; (plus the token cost of the input and the text output).&lt;/p&gt;

&lt;p&gt;Check the &lt;a href="https://ai.google.dev/gemini-api/docs/pricing#gemini-3-pro-image-preview" rel="noopener noreferrer"&gt;pricing&lt;/a&gt; in the documentation for the latest details.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Pro tip:&lt;/strong&gt; To save 50% on your generation costs, you can use the &lt;a href="https://ai.google.dev/gemini-api/docs/batch-api?batch=file#image-generation" rel="noopener noreferrer"&gt;Batch API&lt;/a&gt;. In exchange, you may have to wait up to 24 hours for your images.&lt;/p&gt;
&lt;/blockquote&gt;
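&lt;p&gt;Using the per-image prices quoted above (accurate at publication time; check the pricing page for current numbers), a quick back-of-the-envelope estimate shows how the batch discount adds up:&lt;/p&gt;

```python
# Prices quoted in this post: $0.134 per 1K/2K image, $0.24 per 4K image
# (input and text-output token costs excluded for simplicity).
PRICE_2K = 0.134
PRICE_4K = 0.24
BATCH_DISCOUNT = 0.5  # the Batch API halves generation costs

n = 100  # hypothetical number of images
print(f"{n} 4K images, interactive: ${n * PRICE_4K:.2f}")                   # $24.00
print(f"{n} 4K images, batched:     ${n * PRICE_4K * BATCH_DISCOUNT:.2f}")  # $12.00
```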

&lt;h3&gt;
  
  
  Step C: Install the SDK
&lt;/h3&gt;

&lt;p&gt;Choose the SDK for your preferred language.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Python:&lt;/strong&gt;&lt;br&gt;
You'll need Python 3 and at least version 1.52 of the &lt;a href="https://googleapis.github.io/python-genai/" rel="noopener noreferrer"&gt;Python SDK&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-U&lt;/span&gt; &lt;span class="s2"&gt;"google-genai&amp;gt;=1.52.0"&lt;/span&gt;
&lt;span class="c"&gt;# Install the Pillow library for image manipulation&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;Pillow
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;JavaScript / TypeScript:&lt;/strong&gt;&lt;br&gt;
You'll need at least version 1.30 of the &lt;a href="https://googleapis.github.io/js-genai/" rel="noopener noreferrer"&gt;JS/TS SDK&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @google/genai
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: The following examples use the Python SDK for demonstration. Equivalent code snippets to use Nano Banana in JavaScript are provided in this &lt;strong&gt;&lt;a href="https://ai.studio/apps/bundled/get_started_image_out?fullscreenApplet=true" rel="noopener noreferrer"&gt;JS Notebook&lt;/a&gt;&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  3) Initialize the Client
&lt;/h2&gt;

&lt;p&gt;To use the Pro model, specify the &lt;strong&gt;&lt;code&gt;gemini-3-pro-image-preview&lt;/code&gt;&lt;/strong&gt; model ID.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;genai&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.genai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;types&lt;/span&gt;

&lt;span class="c1"&gt;# Initialize the client
&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;genai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YOUR_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Set the model ID
&lt;/span&gt;&lt;span class="n"&gt;PRO_MODEL_ID&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-3-pro-image-preview&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  4) Basic Generation (The Classics)
&lt;/h2&gt;

&lt;p&gt;Before we get into the fancy stuff, let's look at a standard generation. You can control the output using &lt;code&gt;response_modalities&lt;/code&gt; (to get text and images or only images) and &lt;code&gt;aspect_ratio&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Create a photorealistic image of a siamese cat with a green left eye and a blue right one&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;aspect_ratio&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;16:9&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="c1"&gt;# "1:1","2:3","3:2","3:4","4:3","4:5","5:4","9:16","16:9" or "21:9"
&lt;/span&gt;
&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate_content&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;PRO_MODEL_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;contents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;GenerateContentConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;response_modalities&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Text&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Image&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# Or just ['Image']
&lt;/span&gt;        &lt;span class="n"&gt;image_config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;ImageConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;aspect_ratio&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;aspect_ratio&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Save the image
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;part&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;part&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;as_image&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cat.png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqrqb18ptavkufaqav7o5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqrqb18ptavkufaqav7o5.png" alt="Siamese cat" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Chat mode is also an option (it's actually what I would recommend for multi-turn editing). Check the 8th example, "Polyglot Banana", to see it in action.&lt;/p&gt;




&lt;h2&gt;
  
  
  5) The "Thinking" Process (It's alive!)
&lt;/h2&gt;

&lt;p&gt;Nano Banana Pro isn't just drawing; it's &lt;em&gt;thinking&lt;/em&gt;. This means it can reason through your most complex, twisted prompts before generating an image. And the best part? You can peek into its brain!&lt;/p&gt;

&lt;p&gt;To enable this, set &lt;code&gt;include_thoughts=True&lt;/code&gt; in the &lt;code&gt;thinking_config&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Create an unusual but realistic image that might go viral&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;aspect_ratio&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;16:9&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate_content&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;PRO_MODEL_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;contents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;GenerateContentConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;response_modalities&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Text&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Image&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;image_config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;ImageConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;aspect_ratio&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;aspect_ratio&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;thinking_config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;ThinkingConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;include_thoughts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt; &lt;span class="c1"&gt;# Enable thoughts
&lt;/span&gt;        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Save the image and thoughts
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;part&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;part&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;thought&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Thought: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;part&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;part&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;as_image&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;viral.png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And you should get something like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Imagining Llama Commuters&lt;/span&gt;

I'm focusing on the llamas now. The goal is to capture them as
daily commuters on a bustling bus in La Paz, Bolivia. My plan
involves a vintage bus crammed with amused passengers. The image
will highlight details like one llama looking out the window,
another interacting with a passenger, all while people take
photos.

[IMAGE]

&lt;span class="gu"&gt;## Visualizing the Concept&lt;/span&gt;

I'm now fully immersed in the requested scenario. My primary
focus is on the "unusual yet realistic" aspects. The scene is
starting to take shape with the key elements established.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmi6pn0b3tuqnm37o85n8.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmi6pn0b3tuqnm37o85n8.jpg" alt="Viral image" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This transparency helps you understand how the model interpreted your request. It's like having a conversation with your artist!&lt;/p&gt;
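&lt;p&gt;If you want to handle the reasoning separately from the final output in your own code, a small helper can split the parts. This is a minimal sketch using a mock &lt;code&gt;Part&lt;/code&gt; stand-in (the real objects come from the SDK response and also carry image data); only the &lt;code&gt;thought&lt;/code&gt; flag and &lt;code&gt;text&lt;/code&gt; field shown above are assumed:&lt;/p&gt;

```python
from dataclasses import dataclass

@dataclass
class Part:
    """Mock stand-in for a response part; the real SDK object also carries image data."""
    text: str = ""
    thought: bool = False

def split_thoughts(parts):
    """Separate the model's reasoning traces from the final answer parts."""
    thoughts = [p.text for p in parts if p.thought]
    outputs = [p.text for p in parts if not p.thought]
    return thoughts, outputs

thoughts, outputs = split_thoughts([
    Part("Imagining llama commuters...", thought=True),
    Part("Here is your image!"),
])
print(thoughts)  # ['Imagining llama commuters...']
```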




&lt;h2&gt;
  
  
  6) Search Grounding (Real-time magic)
&lt;/h2&gt;

&lt;p&gt;One of the most game-changing features is &lt;strong&gt;Search Grounding&lt;/strong&gt;. Nano Banana Pro isn't stuck in the past; it can access real-time data from Google Search to generate accurate, up-to-date images. Want the weather? You got it.&lt;/p&gt;

&lt;p&gt;For example, you can ask it to visualize the &lt;em&gt;current&lt;/em&gt; weather forecast:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Visualize the current weather forecast for the next 5 days in Tokyo as a clean, modern weather chart. add a visual on what i should wear each day&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate_content&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;PRO_MODEL_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;contents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;GenerateContentConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;response_modalities&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Text&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Image&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;image_config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;ImageConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;aspect_ratio&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;16:9&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;google_search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{}}]&lt;/span&gt; &lt;span class="c1"&gt;# Enable Google Search
&lt;/span&gt;    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Save the image
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;part&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;part&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;as_image&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;weather.png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Display sources (you must always do that)
&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;candidates&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;grounding_metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;search_entry_point&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rendered_content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbgg57xg6mvtzk521zslw.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbgg57xg6mvtzk521zslw.jpg" alt="Weather in Tokyo" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  7) Go Big or Go Home: 4K Generation
&lt;/h2&gt;

&lt;p&gt;Need print-quality images? Nano Banana Pro supports 4K resolution. Because sometimes, bigger &lt;em&gt;is&lt;/em&gt; better.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A photo of an oak tree experiencing every season&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;resolution&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;4K&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="c1"&gt;# Options: "1K", "2K", "4K", be careful lower case do not work.
&lt;/span&gt;
&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate_content&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;PRO_MODEL_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;contents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;GenerateContentConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;response_modalities&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Text&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Image&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;image_config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;ImageConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;aspect_ratio&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1:1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;image_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;resolution&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7d7g9bjm3w1mtu5oa9pr.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7d7g9bjm3w1mtu5oa9pr.jpg" alt="Oak tree experiencing all seasons" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Note: 4K generation comes at a higher cost, so use it wisely!&lt;/em&gt;&lt;/p&gt;
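&lt;p&gt;Since the API rejects lowercase values like &lt;code&gt;"4k"&lt;/code&gt;, it can be worth normalizing and validating the resolution string before building your &lt;code&gt;ImageConfig&lt;/code&gt;. A small sketch (the valid set is an assumption based on the options listed in the snippet above):&lt;/p&gt;

```python
VALID_IMAGE_SIZES = {"1K", "2K", "4K"}  # options from the snippet above; lowercase is rejected

def normalize_image_size(size: str) -> str:
    """Uppercase and validate an image_size value before putting it in ImageConfig."""
    normalized = size.strip().upper()
    if normalized not in VALID_IMAGE_SIZES:
        raise ValueError(f"image_size must be one of {sorted(VALID_IMAGE_SIZES)}, got {size!r}")
    return normalized

print(normalize_image_size("4k"))  # prints "4K"
```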




&lt;h2&gt;
  
  
  8) Polyglot Banana (Multilingual Capabilities)
&lt;/h2&gt;

&lt;p&gt;The model can generate and even translate text within images across over a dozen languages. It's basically a universal translator for your eyes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Generate an infographic in Spanish
&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Make an infographic explaining Einstein&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s theory of General Relativity suitable for a 6th grader in Spanish&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;GenerateContentConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;image_config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;ImageConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;aspect_ratio&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;16:9&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Save the image
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;part&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;part&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;as_image&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;relativity.png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvw5sj8d1e33wdu0kdxbc.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvw5sj8d1e33wdu0kdxbc.jpg" alt="General Relativity in Spanish" width="800" height="446"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Translate it to Japanese
&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Translate this infographic in Japanese, keeping everything else the same&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Save the image
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;part&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;part&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;as_image&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;relativity_JP.png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1z8v17jqcxi9rp4vafdo.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1z8v17jqcxi9rp4vafdo.jpg" alt="General Relativity in Japanese" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  9) Mix it up! (Advanced Image Mixing)
&lt;/h2&gt;

&lt;p&gt;While the Flash model can mix up to 3 images, the Pro model can handle up to &lt;strong&gt;14 images&lt;/strong&gt;! That's a whole party in one prompt. Perfect for creating complex collages or showing off your entire product line.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Mix multiple images
&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate_content&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;PRO_MODEL_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;contents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;An office group photo of these people, they are making funny faces.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;PIL&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;John.png&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;PIL&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Jane.png&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="c1"&gt;# ... add up to 14 images
&lt;/span&gt;    &lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Save the image
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;part&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;part&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;as_image&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;group_picture.png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5gpi2avw16veccv9leyv.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5gpi2avw16veccv9leyv.jpeg" alt="Mix multiple images" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Note: If you want very high fidelity for your characters, limit yourself to 5 input images, which is already more than enough for a party night!&lt;/em&gt;&lt;/p&gt;
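&lt;p&gt;To avoid a failed request when assembling large collages, you can enforce the input limit client-side before opening any files. A sketch under the limits stated above (file paths stand in for the &lt;code&gt;PIL&lt;/code&gt; images you would actually pass):&lt;/p&gt;

```python
MAX_INPUT_IMAGES = 14  # Pro model limit mentioned above; Flash handles up to 3

def build_contents(prompt: str, image_paths: list, limit: int = MAX_INPUT_IMAGES) -> list:
    """Assemble a contents list, refusing more input images than the model accepts.

    In real code, replace each path with PIL.Image.open(path).
    """
    if len(image_paths) > limit:
        raise ValueError(f"Got {len(image_paths)} images, but the limit is {limit}")
    return [prompt, *image_paths]

print(build_contents("A group photo of these people", ["John.png", "Jane.png"]))
```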




&lt;h2&gt;
  
  
  10) Show off time! (Pro-Exclusive Demos)
&lt;/h2&gt;

&lt;p&gt;Here are some examples of what's possible only with Nano Banana Pro. Prepare to be amazed:&lt;/p&gt;

&lt;h3&gt;
  
  
  Personalized Pixel Art (Search Grounding)
&lt;/h3&gt;

&lt;p&gt;&lt;em&gt;Prompt: "Search the web then generate an image of isometric perspective, detailed pixel art that shows the career of Guillaume Vernade"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This uses search grounding to find specific information about a person and visualizes it in a specific style.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcor096x1iq8jlct5nvl1.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcor096x1iq8jlct5nvl1.jpg" alt="Guillaume Vernade career" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Complex Text Integration
&lt;/h3&gt;

&lt;p&gt;&lt;em&gt;Prompt: "Show me an infographic about how sonnets work, using a sonnet about bananas written in it, along with a lengthy literary analysis of the poem. Good vintage aesthetics"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The model can generate coherent, lengthy text and integrate it perfectly into a complex layout.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp0xg8dnb2x567xtmzpl7.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp0xg8dnb2x567xtmzpl7.jpg" alt="Complex Text Integration" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  High-Fidelity Mockups
&lt;/h3&gt;

&lt;p&gt;&lt;em&gt;Prompt: "A photo of a program for the Broadway show about TCG players on a nice theater seat, it's professional and well made, glossy, we can see the cover and a page showing a photo of the stage."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Create photorealistic mockups of print materials with accurate lighting and texture.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh0xb4n9fn0rvlt6s5ljj.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh0xb4n9fn0rvlt6s5ljj.jpg" alt="High-Fidelity Mockups" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  11) Best Practices and prompting tips for Nano Banana and Nano Banana Pro
&lt;/h2&gt;

&lt;p&gt;To achieve the best results with the Nano Banana models, follow these prompting guidelines:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Be Hyper-Specific:&lt;/strong&gt; The more detail you provide about subjects, colors, lighting, and composition, the more control you have over the output.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Provide Context and Intent:&lt;/strong&gt; Explain the purpose or desired mood of the image. The model's understanding of context will influence its creative choices.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Iterate and Refine:&lt;/strong&gt; Don't expect perfection on the first try. Use the model's conversational ability to make incremental changes and refine your image.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use Step-by-Step Instructions:&lt;/strong&gt; For complex scenes, break your prompt into a series of clear, sequential instructions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use Positive Framing:&lt;/strong&gt; Instead of negative prompts like "no cars," describe the desired scene positively: "an empty, deserted street with no signs of traffic."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Control the Camera:&lt;/strong&gt; Use photographic and cinematic terms to direct the composition, such as "wide-angle shot", "macro shot", or "low-angle perspective".&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use search grounding to your advantage:&lt;/strong&gt; When you know you want the model to use real-time or real-world data, be explicit about it. "Search the web for Olympique Lyonnais' last games and make an infographic" will work better than just "an infographic of OL's last games" (which should still work, but don't take chances).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use the &lt;a href="https://ai.google.dev/gemini-api/docs/batch-api?batch=file#image-generation" rel="noopener noreferrer"&gt;Batch API&lt;/a&gt; to lower your costs and get more quota:&lt;/strong&gt; The Batch API lets you send small or very large batches of requests together. They might take up to 24 hours to be processed, but in exchange you save 50% on your generation costs, and the quotas are higher too!&lt;/li&gt;
&lt;/ul&gt;
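&lt;p&gt;The Batch API tip above can be sketched as follows: build a JSONL payload for a file-based batch of image prompts. The &lt;code&gt;"key"&lt;/code&gt;/&lt;code&gt;"request"&lt;/code&gt; record shape here is an assumption to illustrate the idea; double-check the exact schema against the Batch API documentation linked above before using it:&lt;/p&gt;

```python
import json

def to_batch_jsonl(prompts):
    """Serialize a list of prompts into JSONL lines for a file-based batch job.

    Record shape is illustrative; verify against the Batch API docs.
    """
    lines = []
    for i, prompt in enumerate(prompts):
        record = {
            "key": f"request-{i}",  # lets you match results back to prompts later
            "request": {"contents": [{"parts": [{"text": prompt}]}]},
        }
        lines.append(json.dumps(record))
    return "\n".join(lines)

print(to_batch_jsonl(["A banana in space", "A banana on Broadway"]))
```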

&lt;p&gt;For a deeper dive into best practices, check the &lt;a href="https://ai.google.dev/gemini-api/docs/image-generation#prompt-guide" rel="noopener noreferrer"&gt;prompting guide&lt;/a&gt; in the documentation and the &lt;a href="https://developers.googleblog.com/en/how-to-prompt-gemini-2-5-flash-image-generation-for-the-best-results/" rel="noopener noreferrer"&gt;prompting best practices&lt;/a&gt; for Nano Banana published on the official blog.&lt;/p&gt;




&lt;h2&gt;
  
  
  Wrap up
&lt;/h2&gt;

&lt;p&gt;Nano Banana Pro (Gemini 3 Pro Image) opens up a new frontier for AI image generation. With its ability to think, search, and render in 4K, it's a tool for serious creators (and serious fun).&lt;/p&gt;

&lt;p&gt;Ready to try it out? Check my &lt;a href="https://dev.to/googleai/nano-banana-pro-prompting-guide-strategies-1h9n"&gt;prompt guide&lt;/a&gt;, head over to &lt;a href="https://aistudio.google.com/" rel="noopener noreferrer"&gt;Google AI Studio&lt;/a&gt;, try or customize our &lt;a href="https://aistudio.google.com/apps?source=showcase&amp;amp;showcaseTag=nano-banana" rel="noopener noreferrer"&gt;Apps&lt;/a&gt; or check out the &lt;a href="https://colab.sandbox.google.com/github/google-gemini/cookbook/blob/main/quickstarts/Get_Started_Nano_Banana.ipynb" rel="noopener noreferrer"&gt;cookbook&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcng2t1e9qmky91on2k7h.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcng2t1e9qmky91on2k7h.jpeg" alt="Congratulation" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>gemini</category>
      <category>nanobanana</category>
      <category>ai</category>
      <category>tutorial</category>
    </item>
  </channel>
</rss>
