
Hey DEV community! 👋
We're the team behind Google AI Studio and the Gemini API at Google DeepMind.
We'll be answering your questions live on Aug...
Curious if there are upcoming releases for Gemini CLI. In my tests it's excellent at whole-repo analysis and strategy, but it often stumbles in execution (tools break and it loops).
Are any major releases planned? What kind, and on what timeline?
And will there be multi-agent support?
Hey there! Am glad to hear that you've been using and loving the Gemini CLI (us, too!).
This update is via the Google Cloud folks who are building out the CLI:
Please, how do I get to work for Google?
Vector Databases and VR Question:
Do you foresee AI/vector databases being gamified in such a way that we can throw on a VR headset and 'swim' through the vector database, so to speak, sort of as a fun way to explore and retrieve data?
I'd like to try that, sounds fun.
Thanks.
I love the idea of being "immersed" in your data, and to use 3D space as a path to spot unexpected relationships in datasets! In addition to the recommendations from other folks on this thread, you might also be interested in checking out the Embeddings Projector, as a fun way to view and manipulate data in 3D space.
That's awesome!
Are these mostly used for demo or are they useful for practitioners?
Wow
Excellent idea!
But the main challenge is: how do you display 512+ dimensions of embeddings in 3D VR space?
Perhaps through interactive projections or using additional channels (color, sound, vibration).
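A minimal sketch of that kind of projection, assuming scikit-learn and random stand-in vectors rather than a real vector database, just to show how 512 dimensions could become (x, y, z) positions for a VR scene:

```python
import numpy as np
from sklearn.decomposition import PCA

# Stand-in for real 512-dimensional embeddings pulled from a vector database.
embeddings = np.random.rand(1_000, 512)

# Project down to 3 components so each vector gets an (x, y, z) position.
coords = PCA(n_components=3).fit_transform(embeddings)

# coords[:, 0], coords[:, 1], coords[:, 2] could drive object positions in a
# Unity / three.js / VR scene, with per-point metadata attached for the
# "select a molecule to see its details" interaction described above.
```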
Hi Prema, thanks for your response.
I'm assuming it would be approached by taking the overall (x, y, z) of each individual vector and assigning it some set volume in space, with some padding, so a user could navigate through the 'cracks'.
It would essentially be like swimming through a gas, but the molecules are ginormous so they are visible to the user... big enough that a user could select each one to see the details.
But small enough to sneak by each one as they gently nudge out of the way and then return to their normal position.
I think this could be done several ways. In my experience, tools that come to mind right away are Blender and three.js! Haha.
Could even have a temperature map overlay, so a user could 'jump in' and explore search results based on their custom query and see how closely they are related. Or perhaps a pattern overlay, to be accommodating for more users?
You know what, this would be really awesome for music exploration.
Former game dev here. Blender is 3D modeling software, not really ideal for your use case. I just wanted to say that if you have a big idea like this, you're often better off trying to make it yourself.
There are a couple of game engines that are free to use such as Unreal and Unity that provide VR support as well as plenty of online resources.
I would recommend Unity for this due to a combination of community support regarding tutorials, and it using C# as its primary coding language. Most AI is pretty good at writing C# scripts (as long as you keep them modular), so you don't need to be a master programmer.
You might even enjoy learning how to use the game engine. In regards to visuals, you would also want to learn Blender for the 3D assets.
I don't foresee Google making anything like this as it's very niche and they prefer broad strokes, not to mention they had a pretty massive failure in the game industry and likely aren't looking to try again (Stadia).
Thanks for your wide-lensed feedback. I have used Unreal Engine a bit, but not to any major extent. Any reason you would use Unity over Unreal for something like this? Based on your answer, it sounds like my original question is, at the very least, possible.
I can sort of picture your concept in my head. Unreal is a lot more complex, for me at least, in regards to setting up a system like that, because your options are their visual Blueprints or C++, and the engine itself is pretty heavy on resources. Unity is lighter, and I think scripting a system like that would be much easier in C# as long as you can optimize it.
You could probably just instantiate new nodes as you're going along and cull anything that's out of view. Since it's VR it's going to be a bit heftier to run, so the smaller engine would likely be more stable for the average person :)
My discord is on my profile if you want to discuss it more over there. I don't really have any other socials lol
When will it be possible to vibe code with Google Apps Script? Thanks
Thanks for the question!
You can already use the Gemini APIs and Gemini in AI Studio to generate Apps Script code, which you can then pull into Google Workspace products (like Sheets). The Google Cloud team also has a few codelabs showing how to use the Gemini APIs with Apps Script (example).
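As a rough illustration of that workflow (a sketch with the google-genai Python SDK; the prompt and model id here are just examples, not an official recipe):

```python
from google import genai

client = genai.Client(api_key="YOUR_GEMINI_API_KEY")

prompt = (
    "Write a Google Apps Script function that reads column A of the active "
    "Sheet and writes the number of non-empty rows into cell B1."
)

# Ask Gemini to draft the Apps Script code...
response = client.models.generate_content(model="gemini-2.5-flash", contents=prompt)

# ...then paste the generated code into the Apps Script editor attached to your Sheet.
print(response.text)
```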
Can we have a virtual hackathon solely focused on building AI apps in ai.studio?
I love that! We're planning to run more hackathons later this year and I'll make sure to forward that idea!
Yes!
On DEV!
I wish to work for Google as a C++ developer: scraplinkecomarket.netlify.app
My work is with HTML, CSS, and JS.
Stay tuned here - may or may not have something coming soon!
I really have a keen interest in drug development and personalized medicine using AI. My master's thesis was on finding suitable drug candidates for PSP using graph neural networks and other AI techniques. I also used DeepMind's AlphaFold2 in it. I learnt everything for it by myself through online resources. But I feel overwhelmed with the vast number of online resources, and they are not that helpful for making a proper plan with tangible results to get better in the domain. So, if I want to one day work at DeepMind and be part of novel drug discovery, what are the steps I need to take?
It's great to hear that you're interested in AI for drug discovery! Google DeepMind, Isomorphic Labs, and our colleagues in Google Research are all investing very heavily in AI for health and the medical domain.
The skill sets that you would need would depend on the role that you would be interested in taking - for example, engineering, product, research, marketing, and more are all role profiles that we hire for in our AI for health orgs. For each of those focus areas, I would recommend that you continue building your expertise in AI and in the medical / life sciences, and make sure to share your work visibly - either via GitHub for open-source and software projects, or by publishing the research that you've been pursuing.
I'd also recommend building on or evaluating some of the open models that Google has released in the healthcare space, like TxGemma and MedGemma. Good luck, and am looking forward to seeing what you build!
I wish to work for Google as a C++ Developer.
I am an AI engineer by profession. So are there any specific guidelines I can follow to attain a position at DeepMind in the drug research group? How do I get an interview call, and what should I prepare?
What would it take to intern as a devrel for DeepMind?
We regularly have engineering and product internship roles available at Google and at Google DeepMind! I recommend checking out our careers pages, and searching for "internship".
If youโre interested in a career as a developer relations engineer, I would recommend building in the open - contributing to open-source projects, sharing your work publicly (on social media, and on GitHub) and investing in supporting your local and online developer communities. Many DevRel folks start their careers as software engineers, and then gradually move to a more community-facing role.
On this subject, how do you think the idea of internships will evolve in the future? There's so much written about how AI is particularly affecting entry-level jobs. What do you think needs to change for employers to be able to best support this kind of work?
How does 'Search-grounded' mode work under the hood? Are citations confidence-weighted and deduplicated? Can we constrain freshness windows, force certain domains, or provide our own corpus for grounding?
The secret sauce is the same as Google Search because the tool relies on the Google Search Index. Currently, the groundingMetadata does not expose a direct confidence score for each citation. The presence of a citation indicates the model found that source relevant for generating a specific part of the response. In terms of deduping, the system generally attempts to provide unique and relevant sources. While you might see citations from different pages on the same domain if they each contribute distinct information, the goal is to provide a concise set of the most useful sources rather than a long list of redundant links.
For bring-your-own-search scenarios, try using function calling with RAG flows.
In terms of working under the hood, the first thing the tool will do is analyze your query. For example, a prompt like "Who won the F1 race last weekend?" will trigger a search, while "Write a poem about the ocean" likely won't. The model then formulates one or more search queries based on your prompt to find the most relevant information from the Google Search Index. The most relevant snippets and information from the search results are fed into the model's context window along with your prompt. The model uses this retrieved information as its source of truth to generate a "grounded" response. The API returns the response along with groundingMetadata. This metadata includes the source URLs for the information used, to build citation links back to the original content for verification.
We are working on a filter to constrain to date ranges. You cannot force certain domains (use URL Context for that), but you can exclude some domains from search. The "Bring your own search" option is available through Vertex.
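For reference, enabling the Google Search tool looks roughly like this with the google-genai Python SDK (a sketch; check the grounding docs for the authoritative shape of groundingMetadata):

```python
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_GEMINI_API_KEY")

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Who won the F1 race last weekend?",
    config=types.GenerateContentConfig(
        tools=[types.Tool(google_search=types.GoogleSearch())]
    ),
)

print(response.text)
# Citation sources used to ground the answer:
print(response.candidates[0].grounding_metadata)
```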
How influenced are you by the work done at other companies (e.g., OpenAI releasing GPT-5 recently, etc.)?
It's always inspiring to see the recent surge in AI development, both in the modeling and the product space!
At Google, we ensure many different closed (ex: Anthropic) and open models are available to our customers on Google Cloud via the Vertex AI Model Garden. We also support many of the research labs via both our open machine learning frameworks (JAX) and hardware (TPUs and GPUs) for training on GCP, and have been excited to see many startups and enterprises adopt the Gemini and Gemma models.
Our DevX team has also been hard at work adding or improving support for the Gemini APIs and Gemma models into developer tools (like Roo Code, Cline, Cursor, Windsurf, etc.) and frameworks (LangGraph, n8n, Unsloth, etc.). More to come, we all go further when we're working together as one community.
What advice do you have for someone who is considering signing up to a CS Bootcamp vs. going all-in on building with AI tools?
Great question, and I know a lot of folks have this top-of-mind.
For programs like a CS Bootcamp or attending a university, I'd say the biggest value that you're really getting is the in-person community. Many educational structures are still catching up to the state of the art in AI and in building product-grade software systems, so the coursework you'd be completing might not be aligned with the latest model and product releases - and those features / models are changing at least weekly, if not daily, which makes it a challenge for educators to keep their curriculum up-to-the-minute.
To build up expertise and the skill set for working with AI systems, I would strongly suggest that you just start building: find a problem that really bugs you, use AI to automate it, and then share your work visibly externally -- via GitHub and social media. This is a really useful way to get product feedback, and to get inspired! There are also frequently AI hackathons happening, either in-person or online (ex: the Major League Hacking events list and DevPost are great places to look).
You can also check out DEV Challenges!
Hi. I can't turn my Gemini-powered voice assistant into a real product because the input price for audio on Gemini 2.5 Flash is $1.00. Any plans to make audio input pricing more like text input? (My assistant also pays for Google Cloud TTS.) Who should I contact for this and to request higher usage limits?
While we don't have any immediate plans to reduce the audio input price, we're always looking for ways to make our models more accessible.
Feedback like yours directly influences our future plans, so thank you for raising this.
For higher usage limits, the best next step is to fill out the request form here: docs.google.com/forms/d/e/1FAIpQLS...
What was it like at Google when Chatgpt launched?
Are there first-class APIs for plan-and-execute (task graphs, subgoals, retries) and multi-agent coordination? How do you sandbox tool scopes per agent and persist/restore agent state across sessions?
For complex AI agents, Google's Agent Development Kit (ADK) offers first-class APIs for advanced features.
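A minimal sketch of what that looks like, following the ADK Python quickstart pattern (the tool, names, and instruction below are made up for illustration):

```python
from google.adk.agents import Agent

def get_ticket_status(ticket_id: str) -> dict:
    """Hypothetical tool: look up a support ticket (replace with a real backend call)."""
    return {"ticket_id": ticket_id, "status": "open"}

# The agent decides when to call the tool, and agents like this can be
# composed with sub-agents for multi-agent coordination.
root_agent = Agent(
    name="support_agent",
    model="gemini-2.5-flash",
    description="Answers questions about support tickets.",
    instruction="Use get_ticket_status when the user asks about a ticket.",
    tools=[get_ticket_status],
)
```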
Is Imagen legacy at this point?
We do not consider the Imagen model family to be legacy. Imagen is still a specialized model that is lower latency, has different pricing options and is recommended for photorealistic images, sharper clarity, improved spelling and typography. You can use Google AI Studio to play with both Imagen and Gemini 2.5 Flash Image models and compare the results for your specific use case.
I've prepped for the GenAI Leaders exam, taken the classes, studied the docs. I want to believe Gemini has the answers. But so far, I'm just not seeing it.
I've been an early adopter of Google products forever, and until Gemini, I'd never turned off a Google beta. This one? I hated it. When the model showed up in Copilot, it was barely functional for weeks. Fine, growing pains. But just this past weekend I gave it another shot and spent nearly two hours training and testing a Gemini Gem. It was underwhelming at best.
Here's where I'm stuck: GitHub has the developers and Microsoft has the business. OpenAI has the public (and plenty of devs). Claude has its own cult following because it's that trustworthy. Even Kiro, out of nowhere, has the latest in planning and developer flow.
I've read all the promises about making Gemini accessible to millions of developers. Accessibility is great, sure, but my toaster is accessible, and that doesn't make me crave toast.
So my questions are:
Thank you for the feedback!
The Gemini App is a consumer product; that team is working hard on things like personalization, integrations with Google Workspace and Google Maps, Gems for workflow automations, and more. They have also been investing in deep integrations with Google Search, like AI Mode - and built-in citations for its responses, grounded in up-to-date results. Subscribers to Gemini Pro and Gemini Ultra also get premium access to some of Google Labs' exciting new products, like Jules, Flow, and Whisk.
These use cases for the Gemini App are more focused on consumers, while the Gemini APIs and AI Studio (where the folks on this AMA sit!) are more focused on developer and information worker use cases.
Thanks for the detailed breakdown! Sorry I couldn't catch y'all live.
To clarify, I wasn't asking about Gemini the app; I meant Gemini the model. Do you think it can establish itself as a recognizable brand the way other companies have with their solutions? If you're still game to answer, we can definitely frame it around developers as the main user group, since that seems most relevant to your day-to-day.
I have a Bachelor's degree in Computer Science; it's been almost 5 years since I graduated, at the start of the COVID pandemic. I haven't had a "real" job working at a tech company YET (I hope). I've just been doing freelance web development work since then. My concern and question is: do you think I still have some hope of getting into the tech industry? It's been a bit scary, especially now with all the AI surge and hearing of job losses... I could use some motivation, though I won't give up! hehe. Many thanks!
(this took some courage for me to open up about)
Hi there, thank you so much for your courage in sharing this. It's a completely understandable feeling, especially with how chaotic the market feels right now. But I do think that with AI-generated code, the demand for writing software is only increasing, and we will always need engineers who can solve real problems, architect systems, and guide those AI tools. Companies definitely want to see that you can leverage AI to ship fast now, so you're in a great position to learn and showcase that. I would fully embrace your freelance journey as your unique strength: keep building, share your projects in public, and keep applying :)
Thank you so much for taking the time to reply and to give me that hope! I will remember you, Google, and the DEV co. when I achieve my goals! :D Good things are coming your way.
With 2M-token contexts, what are the semantics of memory garbage-collection (pinning, TTLs, eviction heuristics)? Can callers attach per-segment priorities and get visibility into which chunks were actually attended to?
I have been using the Gemini API for all my personal projects. Thanks to your generous free tier, it helped me get from idea to app.
So what are the next plans for Google AI Studio? Recently I have seen so many changes: the IDX editor became Firebase Studio, and now AI Studio. Are there any plans to create a platform to tweak the model itself, something like that, to make it more specific to our use case?
Thanks for testing out AI Studio and the Gemini APIs - especially the Build feature in AIS!
As you've probably seen first hand, AI Studio's Build feature gives you the ability to create and deploy apps, quickly and securely, via deep Google Cloud platform product integrations like Cloud Run and Logging. More production-grade features are soon to come, please keep sending your requests!
If you'd like to fine-tune / customize the Gemini models, or smaller open models (like our Gemma 3 model family), please make sure to check out the fine-tuning capabilities in the Vertex AI Model Garden. For most Gemini 2.5 use cases, we've seen folks have success with composing the APIs with things like vector databases for retrieval and prompt engineering - no fine-tuning required.
OpenAI also released open models, while large models from China seem to be performing better on benchmarks.
Where do you see the main focus going on your side: more small open models, or still large closed models?
I also tested Gemma 3 270M fine-tuning and production usage; it's really useful for lots of use cases.
Am so glad to hear that you've been enjoying the Gemma open models, please keep the feedback coming! :)
Google DeepMind is significantly investing in both our Gemini models (available as APIs) and our Gemma open model family, the latest releases being Gemma 3 and Gemma 3n. We also have open models in other domains, such as music generation (ex: Magenta). Stay tuned for more releases, both closed and open-source.
Does the Gemini API support schema-constrained decoding (e.g., JSON Schema) with hard guarantees that the stream is always valid JSON? If the model deviates mid-stream, can we enable automatic repair/retry, and can we surface token-level error locations for debugging?
Yes, our structured output feature is designed for exactly this. It guarantees that the model's output will be valid JSON that conforms to the schema you provide.
You can check out the documentation for it here: ai.google.dev/gemini-api/docs/stru...
Regarding automatic repair and surfacing token-level errors, those specific capabilities aren't available just yet.
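For example, with the google-genai Python SDK and a Pydantic model as the schema (a sketch; see the structured output docs linked above for the full feature set):

```python
from pydantic import BaseModel
from google import genai
from google.genai import types

class Recipe(BaseModel):
    name: str
    ingredients: list[str]

client = genai.Client(api_key="YOUR_GEMINI_API_KEY")

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Give me a simple cookie recipe.",
    config=types.GenerateContentConfig(
        response_mime_type="application/json",
        response_schema=Recipe,  # output is constrained to this schema
    ),
)

# The response text is valid JSON matching the schema, so parsing is safe.
recipe = Recipe.model_validate_json(response.text)
print(recipe.name, recipe.ingredients)
```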
Are you able to say the direction you're going with working on those capabilities?
Do you think we are reaching the scaling limits for AI, and that AGI is not possible with the current setup?
I'd like to add to this.
If scaling really is seemingly endless (per Zuck's take), what's DeepMind's contingency for when raw parameter growth stops giving useful returns? Do you already have a "post-transformer" plan?
How are non-text-based models like video/etc. similar and different from what I understand to be a text-based LLM?
That's a great question! It highlights a fascinating area of AI research. LLMs are focused on understanding and generating human-like text; other non-text-based models have distinct differences in how they process and represent information.
Hello, and very interesting question. All the models are based on neural network foundations and learn from being exposed to vast amounts of data, with the key goal of learning meaningful representations of the input data. For an LLM this might be a numerical vector that captures the meaning of a specific word or sentence. For an image model it might be a vector that describes objects, textures, or scenes in the image. This allows them to successfully generate new content. Just like LLMs, these models are also generative - whether they generate images or videos or compose music. The concepts of pre-training and fine-tuning for a specific use case are also similar with these models.
The differences are in data structure and modality: text for LLMs, and 2D images or frames for image and video models. LLMs process tokenized words, while image or video models process pixels or frames. GenMedia models are more computationally costly. LLMs lean into semantic understanding, while genmedia leans more into perceptual understanding of frames, patterns, etc. The future is MULTIMODAL.
What type of developers besides academic AI specialists have roles in an AI product team?
Good q! Beyond academic AI specialists and researchers, a successful AI product team is built by a variety of developers who operationalize the core model: data engineers, MLEs, backend/frontend/full-stack engineers, MLOps, site reliability engineers, and also customer-facing engineers / developer experience engineers.
If you go to our careers page and filter for engineering & research, you'll get a good idea :) deepmind.google/about/careers/?cat...
Do you find Google has explicitly changed many of its hiring practices with the advent of AI? Seems like a lot of the role titles I would expect are there, but are the expectations, pace, etc. different from when you were earlier in your career?
Who's the funniest person on the team?
I consider myself to be pretty funny, but do not know if that sentiment is shared across the team.

+1, am voting for Alisa! Bonus points for the Nano-Banana reference 🤣
This is great
bahaha
Ha
How do you see Gemini being used in robotics?
Absolutely love this question, and think that robotics is one of the most exciting use cases for our Gemini Live models, as well as fine-tuned versions of Gemma. The TL;DR is that we're using the Gemini APIs for things like live conversations (real-time dialogue with robots); triggering tools and on-device models as function calls; and planning robotic actions.
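To make the "tools as function calls" part concrete, here's a hedged sketch using automatic function calling in the google-genai Python SDK (the robot tool below is purely hypothetical):

```python
from google import genai
from google.genai import types

def move_gripper(x: float, y: float, z: float) -> dict:
    """Hypothetical robot tool: move the gripper to a target position in meters."""
    return {"status": "ok", "position": [x, y, z]}

client = genai.Client(api_key="YOUR_GEMINI_API_KEY")

# Passing a plain Python function as a tool lets the SDK handle the
# function-calling round trip automatically.
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Move the gripper 10 cm above the block at (0.2, 0.1, 0.0).",
    config=types.GenerateContentConfig(tools=[move_gripper]),
)
print(response.text)
```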
You can learn more about the embodied intelligence work that the DeepMind robotics team is doing with the Gemini APIs and fine-tuned open models (like Gemma) at the links below:
When I think of DeepMind, I think of AlphaGo and other more research-y areas.
How does the organization balance "research" vs. "available for production" offerings?
In addition to driving truly groundbreaking research (ex: AlphaEvolve, AlphaStar, AlphaGo, etc.), DeepMind has expanded its scope in the last ~year to also own many product experiences - the Gemini APIs, AI Studio, and the Gemini App (just to name a few).
We're also increasingly seeing the line between "research" and "product" blur: many new model capabilities inspire products and features, and users' explorations can inspire new frontier model capabilities and improvements. As a person who has worked both in a pure-research org and in pure-product orgs, this blend of the two makes a ton of sense, and leads to much better product experiences for all of our customers.
Regarding a Veo API, and I apologise, I haven't followed this project much, but are you going to make it possible for people to be able to download and run this locally or integrate it with offline stacks at all?
Also, do you have any plans to make any sort of open-source text-to-speech models? I have yet to find something viable for my use case that doesn't sound like a cursed Speak & Spell or cost a fortune in tokens through something like ElevenLabs.
Hi! Great questions! Re: running Veo on-prem: we don't have any concrete plans for that right now. Our priority is delivering a robust and scalable experience through the cloud-based Gemini API. That said, we are in contact with our on-prem teams to better understand the demand for this kind of solution - we know they are working hard to bring Flash and Pro to Google Distributed Cloud, for example.
Re TTS: We definitely know that high-quality open-source TTS models are in demand, and we do understand the need for them. I recommend keeping an eye on what the Gemma model team has been releasing, though we do not have concrete plans we can share at this time.
Thanks for the response! I did end up finding an open-source TTS that works offline, which fits my needs for now. That said, I completely understand the business need for subscriptions. The challenge is that when everyone goes monthly, users hit fatigue: most of us end up picking which tools to keep and which ones to cut when finances get tight.
Personally, I'd happily pay a one-time license fee (say ~$200 CAD) or a fair upgrade price down the road. It's a model a lot of us genuinely miss, and I think it's a way to capture a user base that avoids ongoing subscriptions entirely. In my view, leaving that option off the table risks missing a meaningful segment of developers.
Will the Veo API expose keyframes, camera paths (dolly/orbit), shot length, and temporal consistency controls? Is video-in/video-out editing (masking, inpainting, style transfer) on the roadmap for programmatic workflows?
Yes, we are planning on additional controls for Veo 3, and camera controls have been most in demand. What are the controls important for your use cases?
For streaming vs non-streaming requests, what p50/p95 latencies should we plan for across Flash vs Pro-class models? Any difference between HTTP/2 vs gRPC endpoints, and do you recommend persistent connections for best tail latency?
Thank you for the question!
Google is constantly investing in improving the latency of our Gemini models, so the response time for your inference calls is expected to change. As a way to plan, I would recommend experimenting with different models and taking a look at the model cards associated with each of our models - for example, Gemini 2.5 Flash (with Thinking turned off) is much faster than Gemini 2.5 Flash (with Thinking) or Gemini 2.5 Pro, while still showing strong performance on model evals.
For especially latency-sensitive use cases, I would recommend testing out Gemini 2.5 Flash Lite, as well as some of our smaller, open on-device models (like Gemma 3 270M).
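As a concrete example of the "Thinking turned off" configuration mentioned above, here's a sketch with the google-genai Python SDK (a thinking budget of 0 disables thinking on Gemini 2.5 Flash; exact support varies by model, so treat this as illustrative):

```python
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_GEMINI_API_KEY")

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Summarize this support ticket in one sentence: ...",
    config=types.GenerateContentConfig(
        # thinking_budget=0 turns off thinking for latency-sensitive calls.
        thinking_config=types.ThinkingConfig(thinking_budget=0),
    ),
)
print(response.text)
```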
Will the Gemini CLI support a local emulator for tool/function schemas so we can unit-test prompts offline (goldens, snapshot diffs), and then replay the same transcripts against the cloud for parity checks?
This is an awesome idea! While we don't support this at the moment, we're happy to consider it! Gemini CLI is fully open-source, so if you file a feature request with a detailed explanation of how you would expect it to work, we can then prioritize it. If you are super keen, you can even contribute the feature yourself :)
Is there server-side prompt+tool caching or response memoization we can opt into? If so, what's the cache key (model/version, tools, system prompt, attachments, etc.), TTL, and invalidation behavior after model updates?
With extended context enabled, what chunking/windowing strategies do you recommend to keep retrieval precise (e.g., hierarchical summaries, segment embeddings)? Any guidance on time-to-first-token and throughput impacts at 2M tokens?
Any plans for unified, modality-agnostic pricing (normalized 'token equivalents') and hard monthly spend caps with circuit breakers? For audio/video, can we pre-quote cost from media duration/shape before we run?
We definitely recognize the desire for a simpler, unified pricing metric. It's a key area of exploration for us. The current model prices each modality (text, image, audio) by its most direct computational unit, but we are actively investigating how to best abstract this into a more streamlined model for the future.
Monthly spend caps: While the Gemini API itself doesn't have a built-in spend cap, you can implement an effective circuit breaker today using the broader Google Cloud platform and setting up your billing budget.
Regarding pre-quoting costs, the pricing for audio and video is based on duration (e.g., per second or per minute). This means you can easily pre-calculate the cost. Before you make the API call, simply get the duration of your media file and multiply it by the rate listed on our pricing page. This allows you to build cost estimates directly into your application's workflow for full transparency. We are working on improving our billing dashboards to make things easier in the future though!
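A back-of-the-envelope sketch of that pre-quote (the rate below is a placeholder, not a real price; always pull the current numbers from the official pricing page):

```python
# Placeholder rate in USD per second of audio input -- substitute the value
# from the Gemini API pricing page for the model you plan to call.
AUDIO_INPUT_RATE_PER_SECOND = 0.0000275

def estimate_audio_cost(duration_seconds: float) -> float:
    """Pre-quote the audio input cost before making the API call."""
    return duration_seconds * AUDIO_INPUT_RATE_PER_SECOND

# Example: a 95-second clip.
print(f"Estimated cost: ${estimate_audio_cost(95):.6f}")
```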
Can you talk about the role of AI Benchmarks in model development? Are companies intentionally (or unintentionally) designing the models to excel at these benchmarks even if that sacrifices overall usefulness in a broader context?
Our top priority is to design models that work well for users' real-life use cases. We track that "usefulness" through a variety of signals, one of which is AI benchmarks (both private and public), so we track them to the extent that they are useful in providing us with that signal and in communicating it externally.
Can't really speculate on what other companies are optimizing for.
Thank you!
Thank you
Is there a low-latency 'RT' API that supports audio-in/audio-out with barge-in, partial tool calls, and server events/WebRTC? What is the expected end-to-end latency budget for full duplex speech interactions?
The Live API is exactly what you're looking for. It's our bidirectional streaming API designed for real-time, conversational AI and is powered by our latest native audio models.
This means it handles audio-in and audio-out directly, and can even process real-time video.
As for latency, we're targeting an end-to-end budget of 700ms. We're aiming for this to keep the interactions feeling natural, much like the normal pauses in a human conversation. We'd love for you to give it a try!
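A minimal connection sketch with the google-genai Python SDK (the model id and config here are illustrative; see the Live API docs for current model names and the full audio setup):

```python
import asyncio
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_GEMINI_API_KEY")

async def main():
    # TEXT keeps the sketch simple; switch to AUDIO for speech-in/speech-out.
    config = {"response_modalities": ["TEXT"]}
    async with client.aio.live.connect(
        model="gemini-2.0-flash-live-001",  # illustrative Live API model id
        config=config,
    ) as session:
        await session.send_client_content(
            turns=types.Content(role="user", parts=[types.Part(text="Hello!")]),
            turn_complete=True,
        )
        # Stream the model's reply as it arrives.
        async for message in session.receive():
            if message.text:
                print(message.text, end="")

asyncio.run(main())
```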
Do you publish first-party eval suites and regression dashboards with per-release deltas for coding, reasoning, safety, and multimodal? Can we pin a model+patch level and receive advance deprecation notices with auto-evals on our private test sets?
Can developers tune safety thresholds per category (e.g., self-harm, medical, IP) and attach allow/deny lists or regex/DSL guardrails that run before/after model output? Any 'safe sampling' modes that reduce refusal rates without policy violations?
The content safety filters are available for developers and are turned off by default. On the Gemini API, we do not provide an ability to tune any other safety thresholds at the moment because they are directly tied to our usage policies.
For larger enterprise use cases, Vertex AI does offer an option to turn off some filters for trusted and vetted customers that we know will continue to adhere to safety policies. Customers can usually apply using a form and there is a committee that reviews all requests and a team that manages allowlisting.
By and large, how does your team think about AI slop, dead internet theory, misinformation, etc.?
Can we enforce zero-retention and region-locked processing (e.g., EU-only) per project? What compliance envelopes (SOC 2, ISO, HIPAA-adjacent) are supported, and how do we audit that our traffic stayed in-region?
The Gemini API does not offer this yet, but for region-locking and zero-retention you can use Vertex. If you build with our GenAI SDKs you can easily migrate.
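As a sketch of that migration with the google-genai SDK (the project ID and region below are placeholders), the change is mostly in how the client is constructed:

```python
from google import genai

# Gemini Developer API client:
client = genai.Client(api_key="YOUR_GEMINI_API_KEY")

# Same SDK, pointed at Vertex AI with a pinned EU region instead:
vertex_client = genai.Client(
    vertexai=True,
    project="your-gcp-project",  # placeholder project ID
    location="europe-west4",     # placeholder EU region
)

# The generate_content calls themselves stay the same for both clients.
```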
Some helpful links:
Imagine a wildfire response team with spotty connectivity using body-cams and drones. They need offline scene understanding (smoke/plume segmentation), on-device speech translation for evacuees, and syncing to cloud when a satellite link appears. What would a Gemini-based architecture look like (model choices, on-device vs cloud split, failover), and what reliability/validation metrics would you commit to for life-critical use?
Do you offer multilingual embeddings with consistent cross-lingual alignment? What dimensions/similarity metrics are recommended, and how stable are vector spaces across model updates (backward compatibility guarantees)?
Do you offer multilingual embeddings with consistent cross-lingual alignment?
Yes. You may find some public evals in the tech report and the blog post:
What dimensions/similarity metrics are recommended?
We report evals and different dimensions in our tech report (see link above), and we use dot product.
How stable are vector spaces across model updates (backward compatibility guarantees)?
Vector spaces are not backward compatible.
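As a small illustration of the dot-product scoring mentioned above (a sketch with the google-genai Python SDK; the embedding model id is an assumption, so check the docs for the current name):

```python
import numpy as np
from google import genai

client = genai.Client(api_key="YOUR_GEMINI_API_KEY")

result = client.models.embed_content(
    model="gemini-embedding-001",  # check the docs for the current model id
    contents=["The weather is lovely today", "Il fait très beau aujourd'hui"],
)

vectors = [np.array(e.values) for e in result.embeddings]
# Cross-lingual similarity scored with a dot product, as recommended above.
print(float(np.dot(vectors[0], vectors[1])))
```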
Can we set a seed that yields deterministic outputs across regions and hardware targets? Are logprobs/logits available for all models, and do you publish drift/change logs so we can reproduce results after minor model revisions?
The Gemini API supports a seed and logprobs; you'll find more info in the API reference docs:
ai.google.dev/api/generate-content
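A quick sketch of both knobs with the google-genai Python SDK (field names per the API reference above; logprobs support can vary by model, so treat this as illustrative):

```python
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_GEMINI_API_KEY")

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Name three prime numbers.",
    config=types.GenerateContentConfig(
        seed=42,                 # best-effort reproducibility across runs
        response_logprobs=True,  # return log probabilities for chosen tokens
        logprobs=3,              # also return top-3 candidate tokens per step
    ),
)

print(response.text)
print(response.candidates[0].logprobs_result)
```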
For more advanced model skew and drift monitoring you can switch to Vertex and use Vertex model monitoring:
cloud.google.com/vertex-ai/docs/mo...
What are the semantics for parallel tool calls (fan-out/fan-in), backpressure, and partial tool result streaming? Can we cap concurrency per request and get cost/latency attribution per tool invocation?
What fine-tuning paths exist today (SFT, LoRA/PEFT, preference tuning like DPO)? Is there a built-in eval harness in AI Studio to A/B base vs tuned models with statistical significance and dataset versioning?
Hey there, thank you for the question! The options you have for fine-tuning depend on both model (ex: Gemini vs. Gemma) and platform (ex: AI Studio vs. the Vertex AI platform). We recently dropped fine-tuning support for the Gemini APIs in AI Studio, and we do not have a built-in eval harness via AIS; however, there is a built-in eval framework in Vertex AI that might be useful for your use cases.
Related: the Gemini APIs team is partnering with open-source evals providers, like Promptfoo, to have more robust support for the Gemini APIs and all of its features; and with open-source fine-tuning frameworks (like Unsloth and Hugging Face) to offer fine-tuning support for our Gemma open model family.
I am using the 'Gemini 2.5 Flash' model in a translation app project, and everything works well. However, when I switch to the 'Gemini 2.5 Flash-Lite' model to save on costs, the latter model tends to repeat phrases (it translates the same phrase more than once, creating duplications) and is not as accurate as the other model. I hope you can fix this in the next version.
Can you help breakdown the general terms of "AGI" vs. "Super AGI" vs. "Super Intelligence" vs. "Recursive Self Improvement" vs. "Singularity" ?
I feel like there's a lot of crossover and ambiguity.
Also, any predictions on what the future looks like in this regard?
Thanks. The Veo AI tool for videos is perhaps the best right now. They've done a great job.
Semi-random: I have friends in my personal life who criticize my use of AI for environmental reasons (water usage, etc.). My sense is that most of those statistics are overblown, especially in context of other regular activities.
But my question is: do you have folks in your personal life who are in any way critical of your line of work? How do you deal with that?
Do you support mask-based edits, reference-style adapters, or LoRA-style image adapters? Will outputs ship with C2PA provenance by default, and can developers opt in/out at the request level?
We do not support mask-based edits, reference-style adapters, or LoRAs - but I recommend giving Gemini 2.5 Flash Image a try in AI Studio to see if the model is capable of addressing your use cases.
Our models are compliant with provenance requirements, and we use SynthID as a digital watermark.
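For reference, generating an image with Gemini 2.5 Flash Image via the google-genai Python SDK looks roughly like this (the model id is an assumption on my part; check AI Studio for the current one):

```python
from google import genai

client = genai.Client(api_key="YOUR_GEMINI_API_KEY")

response = client.models.generate_content(
    model="gemini-2.5-flash-image-preview",  # illustrative model id
    contents="A photorealistic close-up of a nano banana on a picnic table.",
)

# Image parts come back as inline data alongside any text parts.
for part in response.candidates[0].content.parts:
    if part.inline_data is not None:
        with open("banana.png", "wb") as f:
            f.write(part.inline_data.data)
```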
Are you allowed to use non-Google models in your day-to-day work at Google? Like can you code with Anthropic LLMs or is that frowned upon?
Other models like Anthropic's are accessible via Vertex, and exceptions may be made for specific tools if you request them, but we now primarily use Gemini 2.5 Pro for coding.
Thanks, makes sense. Do y'all get access to early-release models for in-house usage?
Yes, that's one benefit of being on our team.
What's your org structure like, and how has it changed over time?
Google AI Studio started as a Google Labs project, but our API was so successful with developers that we moved to Google Cloud before finally landing in Google DeepMind. I love that we are able to be closer to research and this move really sped up our ability to get the newest models to developers.
We are a small, scrappy team that has a tight-knit relationship between PM, model teams, eng, DevRel, and GTM teams. Because there are so few of us, we have to work as a single unit to bring new models to the world quickly, but also safely and successfully. I love the pace we work at, though I'm definitely taking a little vacation after #nanobanana
We want to do a partnership with them. Would you like to be our partner? But we will not pay.
How can we use the Gemini APIs in our websites, and what is the limit of the free Gemini API?
Can Lyria export multitrack stems, tempo maps, and/or MIDI for downstream DAWs? What are the licensing/indemnity terms for commercial use, and are there guardrails around named-artist style emulation?
The current experimental version is built for real-time streaming, so it doesn't support exporting multitrack stems or MIDI for DAWs just yet. However, we are hard at work on our next generation of music models and APIs, so definitely stay tuned!
Regarding artist styles, we do have guardrails in place. To avoid generating pieces that are too similar to copyrighted content, the model is designed to filter out prompts that explicitly mention artists by name.
Do you offer a compatibility shim for common OpenAI patterns (JSON mode, tool schema, vision/video inputs) and a guide on behavioral differences to avoid edge-case regressions during migration?
Thanks for the question! Yes, we do have an OpenAI compatibility layer that supports most of those patterns; see here: ai.google.dev/gemini-api/docs/openai.
While we actively extend its features, it may still have limitations. We recommend checking the docs and joining the Gemini developer forum for more questions around that.
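The gist of the compatibility layer is pointing the OpenAI SDK at the Gemini endpoint (a sketch; see the docs link above for which features are supported):

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_GEMINI_API_KEY",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)

response = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[{"role": "user", "content": "Explain JSON mode in one sentence."}],
)
print(response.choices[0].message.content)
```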
How much work does AI do in your day-to-day and how has that changed over time?
Can we have GeminiApp in Google Apps Script soon?
Thanks for the question!
You can already use the Gemini APIs and Gemini in AI Studio to generate Apps Script code, which you can then pull into Google Workspace products (like Sheets). The Google Cloud team also has a few codelabs showing how to use the Gemini APIs with Apps Script (example).
How do you handle temporal reasoning (e.g., stale facts, embargoed knowledge, future-dated content)? Is there a notion of time-aware decoding or decay that prevents confident answers about outdated sources?
When models are hot-patched, will you publish reproducible โmodel-diffโ artifacts (weight deltas, eval deltas) and support shadow deployments so teams can canary traffic with automatic rollback on regression?
What's your roadmap for robustness against multimodal adversarial attacks (audio perturbations, image patches, typographic illusions, UI overlays) and how can developers fuzz these systematically pre-launch?
Great question. Think of our approach as a four-part defense plan. First, we ensure our training data is clean and secure. Second, we design the model's core architecture to be inherently skeptical, especially where text, images, and audio mix. Third, we constantly put the model through a "sparring gym" with adversarial training, teaching it to recognize and ignore attacks. Finally, our own teams are always trying to break it (red teaming) before it ever gets to you. To help with reactive safety, we are constantly shipping new classifiers to help us catch any bad actors across different areas.
For developers, the best strategy is to think like an attacker. Before you launch, try to fool your own app. Systematically fuzz it by adding subtle noise to audio commands, sticking weird patches on images, hiding tiny text prompts in pictures, or creating fake UI overlays. Automating these kinds of tests is the most effective way to find and patch up these multimodal weak spots before they cause real trouble.
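As one concrete example of that kind of fuzzing, here's a tiny sketch that perturbs an audio command with white noise at a target SNR before replaying it against your app (numpy only; the SNR values and pass/fail thresholds are up to you):

```python
import numpy as np

def add_white_noise(audio: np.ndarray, snr_db: float = 20.0) -> np.ndarray:
    """Return a copy of a mono audio signal perturbed with white noise at snr_db."""
    signal_power = np.mean(audio ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = np.random.normal(0.0, np.sqrt(noise_power), size=audio.shape)
    return audio + noise

# Replay noisy variants of each voice command through your pipeline and check
# that transcriptions / tool calls stay within expected bounds.
clean_command = np.random.randn(16_000)  # stand-in for 1 s of 16 kHz audio
for snr in (30.0, 20.0, 10.0):
    fuzzed = add_white_noise(clean_command, snr)
```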
What guidance (and product features) do you have for education that embraces AI: authentic-assessment patterns, provenance/watermarking for student work, and guardrails that promote learning over shortcutting?
For enterprise privacy, can we bring our own KMS/HSM with envelope encryption, receive zero-knowledge audit proofs for no-retention processing, and run privacy red-teaming that yields machine-readable findings?
You can implement most of these enterprise privacy controls with Google's AI and cloud services:
Will AI Studio support a true third-party ecosystem (tools, datasets, inspectors) with versioning, revenue share, code signing, and policy reviewโso developers can monetize capabilities safely inside other teams' apps?
Seemingly every time I open my feed on X, Reddit, DEV, etc., I'm hit with a flurry of posts that make me feel like I'm being "left behind" by the latest and greatest in AI Land.
How do you deal with the dizzying rate of change?
You're not alone here; even we can't keep up with everything. I recommend following a few selected newsletters.
For me, news.smol.ai/ and X help me keep up to date at a high level, and then I dive deeper into topics that interest me and/or are connected to my work.
When did you decide you wanted to work in this field? (For anyone on the team)
My "aha" moment really started back when I was a video game producer, seeing how we could build complex, interactive systems to help players be more effective and creative. My core passion has always been making people more productive, and I realized Gen AI is the ultimate tool to supercharge that for everyone.
When I worked in EdTech, I saw AI's potential to democratize education and make learning deeply personal and engaging, which always leads to better outcomes.
Getting to work on this team and focus on safety has been the most rewarding part, as it feels like I'm helping make this powerful technology a positive and safe force for the world.
Will you integrate formal methods (Z3/SMT, Coq/Lean proof hints) so codegen can emit verifiable contracts and proofs, not just tests, and surface failed proof obligations back to the prompt?
Can Gemini expose a calibrated abstention mode that explicitly says โI don't know,โ returns confidence intervals, and emits pointers for how a caller could reduce uncertainty?
That sounds like a Reddit AMA (Ask Me Anything) announcement. It means the Google DeepMind team is inviting people to ask them questions about their projects, like Gemini (Google's advanced AI model), Google AI Studio (a platform for building with Gemini), and other research or tools they're developing.
How are accessibility features (screen readers, captions, image alt-text generation, haptics) designed and tested for developers and end users with diverse disabilitiesโand can we programmatically assert accessibility budgets in AI Studio projects?
Could you ship end-to-end "fact-chain" explanations (input → retrieval → intermediate steps → output) for non-trivial answers across text, code, image, and audio, with a public spec for how those traces are generated and verified?
Hello Google DeepMind team!
As AI applications scale from MVP to serving thousands of users, performance optimization becomes critical. What strategies does your team suggest for optimizing latency and throughput when deploying Gemini API in production environments with high concurrency?
Are there specific caching strategies, request batching techniques, or infrastructure patterns that work particularly well with your API? Also curious about monitoring and debugging best practices for production Gemini integrations.
Hey team! Really impressed with Gemini's multi-modal capabilities.
How does Google DeepMind approach balancing model complexity and performance in Gemini when supporting multi-modal inputs like video, images, and text simultaneously? Are there specific optimization techniques or architectural decisions that help maintain response speed while processing diverse data types?
Curious about this from both a technical implementation perspective and developer experience standpoint.
Hi Google DeepMind team! 👋
As developers increasingly adopt structured approaches to AI integration (like JSON prompting for better consistency), I'm curious about your perspective on API design patterns that maximize reliability.
What are the best practices your team recommends for integrating Gemini API into existing developer workflows to ensure scalability and reliability, especially for multi-modal AI applications? Are there specific patterns or architectural approaches you've seen work particularly well in production environments?
Also excited to try the extended context features in AI Studio - thanks for making these tools accessible to the developer community!
Is AI the way to the future, or is it dangerous for us, like Universal Paperclips? Maybe one day...
A dev from Burgundy, France
Why do most metrics and surveys say that AI is making us only 20% more productive? Even if we don't have physical AI and agents are trapped in software, the fact that they can generate and critique code at the speed of light, coupled with their humongous knowledge base and context window, should mean orders-of-magnitude greater productivity gains. Besides, why is the AI age not showing up in economic data, i.e. GDP? What do you think are the bottlenecks to unlocking the economic and productivity benefits of AI? Why isn't it living up to expectations?
AI studio
I think liking comments is not enough @jess
Now you are not liking comments, and why is no one replying to my comments? @jess @vivjair @kate_olszewska @alisa_fortin @dynamicwebpaige
Would the Gemma 3 270M model be sufficient for training a DevOps-specific LLM?
What's one capability Gemini absolutely can't do today but should by 2026?
Please, how do I work for you?
What do you think: is frontend a good way to make money?