<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Google AI</title>
    <description>The latest articles on DEV Community by Google AI (@googleai).</description>
    <link>https://dev.to/googleai</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Forganization%2Fprofile_image%2F11026%2F386b14d3-cc9a-4270-aba0-3e41cdfb9d85.jpg</url>
      <title>DEV Community: Google AI</title>
      <link>https://dev.to/googleai</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/googleai"/>
    <language>en</language>
    <item>
      <title>Gemini 3.1 Flash-Lite is now generally available on Gemini Enterprise Agent Platform</title>
      <dc:creator>Gemini Enterprise Team</dc:creator>
      <pubDate>Fri, 15 May 2026 13:13:00 +0000</pubDate>
      <link>https://dev.to/googleai/gemini-31-flash-lite-is-now-generally-available-on-gemini-enterprise-agent-platform-2pcg</link>
      <guid>https://dev.to/googleai/gemini-31-flash-lite-is-now-generally-available-on-gemini-enterprise-agent-platform-2pcg</guid>
      <description>&lt;p&gt;Today, we're thrilled to announce that Gemini 3.1 Flash-Lite, our fastest and most cost-efficient Gemini 3 series model yet, is now &lt;strong&gt;generally available&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Designed for ultra-low latency, high-volume tasks, and unmatched cost-efficiency, Flash-Lite is already transforming how applications are built at scale. Fast, iterative, and scalable, it joins our comprehensive suite of Pro and Flash models to provide the exact combination of intelligence, speed, and cost required for the most demanding production deployments.&lt;/p&gt;

&lt;p&gt;Developers and enterprises have noted that the model provides the precision required for agentic tasks like tool calling and orchestration, coupled with the cost-efficiency needed to run automated pipelines at scale.&lt;/p&gt;

&lt;p&gt;Here's a look at how some of them have been driving value.&lt;/p&gt;

&lt;h2&gt;
  
  
  Software development and engineering
&lt;/h2&gt;

&lt;p&gt;Engineering teams require models that can keep pace with real-time coding environments. With the GA of Gemini 3.1 Flash-Lite, developers are unlocking the instant responsiveness necessary for complex code completion, seamless UX design, and agentic developer tools.&lt;/p&gt;


&lt;div class="crayons-card c-embed"&gt;

  &lt;br&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2zyq0i2bx6py36dsf6oz.jpg" alt="JetBrains" width="800" height="200"&gt;&lt;br&gt;

&lt;/div&gt;


&lt;p&gt;"Integrating Gemini 3.1 Flash-Lite has transformed the responsiveness of our IDE AI assistant &amp;amp; Junie agent. The balance of high intelligence and minimal latency makes it the perfect model for real-time developer support." &lt;strong&gt;&lt;em&gt;— Vladislav Tankov, Director of AI at JetBrains&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Customer experience and high-volume service
&lt;/h2&gt;

&lt;p&gt;For enterprise customer service operations, handling massive volumes of interactions requires models that can scale affordably without sacrificing reasoning capabilities.&lt;/p&gt;


&lt;div class="crayons-card c-embed"&gt;

  &lt;br&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6h6yytibmlpimep4t7q4.jpg" alt="Gladly" width="800" height="200"&gt;&lt;br&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://www.gladly.ai/" rel="noopener noreferrer"&gt;Gladly&lt;/a&gt; runs customer service for some of the most demanding retail brands in the world. The core of its text-channel AI agent runs on Flash-Lite. By handling millions of customer-facing calls each week across channels like SMS, WhatsApp, and Instagram, they achieved roughly 60% lower costs than comparable thinking-tier models on the same token mix.&lt;/p&gt;

&lt;p&gt;The model powers every step of the agent lifecycle — from selecting tools and classifying playbooks to deciding when to escalate to a human — all while maintaining a p95 latency of around 1.8s for full reply generation and sub-second p95 for classifiers and tool calls, alongside a ~99.6% success rate under heavy concurrent load.&lt;/p&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/oxsR1L-lBdE?start=3"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  Creative pipelines and gaming
&lt;/h2&gt;

&lt;p&gt;In the fast-paced creative and gaming industries, multimodal capabilities and ultra-low latency are essential for keeping users engaged and content pipelines flowing. Flash-Lite is empowering platforms to process rich media and generate hyper-personalized environments.&lt;/p&gt;


&lt;div class="crayons-card c-embed"&gt;

  &lt;br&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6i04ucfcgt6fp896xuda.jpg" alt="Astrocade" width="800" height="200"&gt;&lt;br&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://www.astrocade.com/" rel="noopener noreferrer"&gt;Astrocade&lt;/a&gt; lets anyone create games by describing what they want in natural language. They integrated Flash-Lite to serve a rapidly growing global user base.&lt;/p&gt;

&lt;p&gt;For every incoming game request, Flash-Lite performs a multimodal safety check — analyzing both text and images — before the building agents even start their work. It further supports their global community through inline comment translation, allowing players in different countries to "riff" on the same game. And as part of their asset generation pipeline, it helps refine the final prompts to ensure consistently high-quality thumbnails.&lt;/p&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/XxU1lf1-Sp0?start=15"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;


&lt;div class="crayons-card c-embed"&gt;

  &lt;br&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqn5gsy4oam6x1npnb3q7.jpg" alt="krea.ai" width="800" height="200"&gt;&lt;br&gt;

&lt;/div&gt;


&lt;p&gt;The creative platform &lt;a href="https://krea.ai" rel="noopener noreferrer"&gt;krea.ai&lt;/a&gt; has also seen positive results by using Flash-Lite as a prompt enhancer in their Nodes tool. By taking a user's rough idea and expanding it into a full image-generation prompt, the model provides a level of detail that is "weirdly creative" for its price point.&lt;/p&gt;

&lt;p&gt;These outputs move the needle on image production, providing a level of reliability and scale that was previously cost-prohibitive for sophisticated prompt engineering.&lt;/p&gt;

&lt;h2&gt;
  
  
  Financial services and data operations
&lt;/h2&gt;

&lt;p&gt;In the world of finance and enterprise product development, efficiency is just as critical as accuracy. Gemini 3.1 Flash-Lite gives financial analysts and product managers the ideal balance of intelligence, low latency, and cost-effectiveness to run modeling and latency-sensitive applications.&lt;/p&gt;


&lt;div class="crayons-card c-embed"&gt;

  &lt;br&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpfzj5c1i5lyt2a64cig4.jpg" alt="OffDeal" width="800" height="200"&gt;&lt;br&gt;

&lt;/div&gt;


&lt;p&gt;&lt;a href="https://offdeal.io/" rel="noopener noreferrer"&gt;OffDeal&lt;/a&gt; uses Flash-Lite to power "Archie," an AI agent that investment bankers use for real-time research, data lookups, and task execution during Zoom calls. In these scenarios, bankers often need to surface financials mid-conversation. OffDeal found that Flash-Lite was the only model capable of meeting the response times needed for genuinely instant answers without forcing a tradeoff on quality.&lt;/p&gt;

&lt;p&gt;Beyond live calls, they also use Flash-Lite as a triage layer for inbound and outbound email traffic. By answering structured questions about each message in parallel — such as whether an email is an automated response or related to an active deal — Flash-Lite determines which downstream AI agents get invoked and with what context.&lt;/p&gt;
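&lt;p&gt;OffDeal's actual prompts and routing code aren't public, but the fan-out pattern they describe can be sketched roughly as follows; the question wording, routing labels, and the &lt;code&gt;ask_model&lt;/code&gt; stub (a stand-in for a structured model call with a JSON response schema) are all hypothetical:&lt;/p&gt;

```python
import asyncio

# Hypothetical triage questions; OffDeal's real prompt set is not public.
TRIAGE_QUESTIONS = {
    "is_automated": "Is this email an automated response (out-of-office, bounce)?",
    "is_active_deal": "Does this email relate to an active deal?",
}

async def ask_model(question: str, email_body: str) -> bool:
    """Stand-in for a structured Flash-Lite call returning a boolean answer.
    A real pipeline would issue a model request with a JSON response schema."""
    await asyncio.sleep(0)  # placeholder for network latency
    if "out of office" in email_body.lower():
        return question.startswith("Is this email an automated")
    return "deal" in email_body.lower() and "active deal" in question

async def triage(email_body: str) -> str:
    # Fan out all structured questions concurrently, then route on the answers.
    answers = dict(zip(
        TRIAGE_QUESTIONS,
        await asyncio.gather(*(ask_model(q, email_body) for q in TRIAGE_QUESTIONS.values())),
    ))
    if answers["is_automated"]:
        return "drop"            # no downstream agent needed
    if answers["is_active_deal"]:
        return "deal_agent"      # invoke the agent that carries deal context
    return "general_inbox_agent"

print(asyncio.run(triage("Quick update on the active deal terms")))  # deal_agent
```

Because the questions are independent, each answer costs only one small classification call, and the slowest question bounds the latency rather than their sum.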


&lt;div class="crayons-card c-embed"&gt;

  &lt;br&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffofz0ih5rk8x9b3mn01c.jpg" alt="Ramp" width="800" height="200"&gt;&lt;br&gt;

&lt;/div&gt;


&lt;p&gt;For high-volume, latency-sensitive workflows on the financial operations platform Ramp, Flash-Lite has become a key component:&lt;/p&gt;

&lt;p&gt;"Gemini is a core part of the model stack we use across applications at Ramp. &lt;a href="https://builders.ramp.com/post/financial-benchmarks" rel="noopener noreferrer"&gt;As indicated in our benchmarks&lt;/a&gt;, we see Gemini lead the pareto fronts in terms of costs, latency and intelligence — providing a great tradeoff between the three and making it well-suited for latency sensitive applications. Gemini 3.1 Flash-Lite has been especially valuable, powering many of our highest-volume, latency-sensitive features without compromising on quality." &lt;strong&gt;&lt;em&gt;— Anton Biryukov, Applied AI Engineer, Ramp&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;


&lt;div class="crayons-card c-embed"&gt;

  &lt;br&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F17jmslabhxyuxzmp6erg.jpg" alt="AlphaSense" width="800" height="200"&gt;&lt;br&gt;

&lt;/div&gt;


&lt;p&gt;Market intelligence platform AlphaSense integrates Flash-Lite to deliver data insights:&lt;/p&gt;

&lt;p&gt;"Gemini 3.1 Flash-Lite provides great balance of speed, cost and performance, allowing AlphaSense to scale our advanced data processing and deliver high-quality intelligence across every layer of our data stack." &lt;strong&gt;&lt;em&gt;— Chris Ackerson, Senior Vice President of Product, AlphaSense&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Get started
&lt;/h2&gt;

&lt;p&gt;Read the docs for &lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/models/gemini/3-1-flash-lite" rel="noopener noreferrer"&gt;Gemini 3.1 Flash-Lite&lt;/a&gt; and learn about our latest &lt;a href="https://cloud.google.com/gemini-enterprise-agent-platform/generative-ai/pricing" rel="noopener noreferrer"&gt;pricing structure&lt;/a&gt;. Learn more about the Gemini Enterprise &lt;a href="http://cloud.google.com/products/gemini-enterprise-agent-platform" rel="noopener noreferrer"&gt;Agent Platform&lt;/a&gt;, the new standard for enterprise agent development.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>google</category>
      <category>gemini</category>
    </item>
    <item>
      <title>Building "Sweets Vault" - a multimodal Gemini Agent with physical hardware integration</title>
      <dc:creator>Remigiusz Samborski</dc:creator>
      <pubDate>Fri, 15 May 2026 12:43:35 +0000</pubDate>
      <link>https://dev.to/googleai/building-sweets-vault-a-multimodal-gemini-agent-with-physical-hardware-integration-1nmh</link>
      <guid>https://dev.to/googleai/building-sweets-vault-a-multimodal-gemini-agent-with-physical-hardware-integration-1nmh</guid>
      <description>&lt;p&gt;Motivating seven-year-olds to complete their daily reading and handwriting practice is a classic parenting challenge. Traditional rewards work for a while, but they lack interactivity and require constant manual verification.&lt;/p&gt;

&lt;p&gt;As a developer, I like to solve such challenges with automation. After putting some thought into it, I came up with the &lt;strong&gt;Sweets Vault&lt;/strong&gt; idea: an interactive agent powered by &lt;a href="https://github.com/google-gemini/adk" rel="noopener noreferrer"&gt;Google's Agent Development Kit (ADK)&lt;/a&gt; and the &lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/models/start?utm_campaign=CDR_0x87fa8d40_default_b512067144&amp;amp;utm_medium=external&amp;amp;utm_source=blog" rel="noopener noreferrer"&gt;Gemini API&lt;/a&gt;. The system acts as a cheerful guardian that talks to children, visually inspects their workbooks via uploaded images, tests their reading comprehension, and triggers a hardware lock to open a drawer full of sweets upon successful completion.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwgwj7pwtinpglk4sxu9u.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwgwj7pwtinpglk4sxu9u.jpg" alt="Sweets Vault — Gamifying Education with AI"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this guide, I will walk you through the architecture and implementation of this solution. You will learn how to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Structure a multimodal agent&lt;/strong&gt; using the Agent Development Kit (ADK).
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Implement visual and verbal verification&lt;/strong&gt; using Gemini's multimodal capabilities.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Manage state&lt;/strong&gt; across multiple conversation turns and tools.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Connect agent tool calls to physical hardware&lt;/strong&gt; interfaces.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Develop and run locally&lt;/strong&gt; to access the physical hardware.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you’d like to jump directly to the code, visit the &lt;a href="https://github.com/rsamborski/sweets-vault" rel="noopener noreferrer"&gt;GitHub repository&lt;/a&gt;. All the code is available there for your exploration.&lt;/p&gt;

&lt;h3&gt;
  
  
  System architecture overview
&lt;/h3&gt;

&lt;p&gt;The diagram below presents the high-level architecture of the solution:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnpw6up08rgh79qjbf6kh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnpw6up08rgh79qjbf6kh.png" alt="Architecture diagram"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The core components include:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Gemini API&lt;/strong&gt;: Handles reasoning, multimodal homework validation and tool calls.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ADK Agent &amp;amp; Tools&lt;/strong&gt;: Encapsulates the system instructions, state management, and callable Python functions.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hardware Interface&lt;/strong&gt;: Translates tool execution into physical actions (unlocking specific drawer IDs).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The system is designed so that the agent runs on a local machine (in my case, a mini PC running Ubuntu) to allow direct hardware access:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Magnetic drawers controlled via an &lt;a href="https://www.adafruit.com/product/2264" rel="noopener noreferrer"&gt;FT232H&lt;/a&gt; USB-to-GPIO converter
&lt;/li&gt;
&lt;li&gt;LED Matrix controlled via a REST API running on a &lt;a href="https://www.raspberrypi.com/" rel="noopener noreferrer"&gt;Raspberry Pi&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Initially, I planned to control the LED Matrix using a second FT232H controller, but due to a lack of library support I ended up using an intermediary Raspberry Pi. This approach has its benefits: for example, the LED Matrix can be located anywhere at home within Wi-Fi range 😀&lt;/p&gt;
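&lt;p&gt;To make the hardware layer concrete, here is a minimal sketch of how an agent tool can drive both devices. The class name, method names, and pulse timing are my assumptions, and the actual GPIO (FT232H via Adafruit Blinka) and HTTP (Raspberry Pi REST) calls are injected as callables, so the control logic is shown without the device-specific code from the repo:&lt;/p&gt;

```python
import time
from typing import Callable

class HardwareInterface:
    """Minimal sketch of the hardware layer described above. The GPIO and HTTP
    operations are injected so the logic runs without an FT232H or a Raspberry
    Pi on hand; the repository's real implementation differs in detail."""

    def __init__(self,
                 set_pin: Callable[[int, bool], None],
                 post_matrix: Callable[[str, dict], None],
                 pulse_seconds: float = 0.5):
        self.set_pin = set_pin          # e.g. wraps digitalio on the FT232H
        self.post_matrix = post_matrix  # e.g. wraps an HTTP POST to the Pi
        self.pulse_seconds = pulse_seconds

    def unlock_drawer(self, drawer_id: int) -> None:
        # Energize the magnetic lock's pin briefly, then release it.
        self.set_pin(drawer_id, True)
        time.sleep(self.pulse_seconds)
        self.set_pin(drawer_id, False)

    def celebrate(self, message: str) -> None:
        # Ask the Raspberry Pi's REST service to display a message.
        self.post_matrix("/display", {"text": message})
```

Injecting the transports also makes the layer easy to unit-test: pass in recording lambdas and assert on the sequence of pin toggles and HTTP posts.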

&lt;h3&gt;
  
  
  Root agent logic
&lt;/h3&gt;

&lt;p&gt;To kick-start the agent development, I leveraged the &lt;a href="https://googlecloudplatform.github.io/agent-starter-pack/" rel="noopener noreferrer"&gt;&lt;code&gt;agent-starter-pack&lt;/code&gt; templates&lt;/a&gt;. The starter pack provides a production-ready foundation with FastAPI, frontend UI integration, and built-in observability.&lt;/p&gt;

&lt;p&gt;The heart of the Sweets Vault is located in &lt;a href="https://github.com/rsamborski/sweets-vault/blob/main/agent/app/agent.py" rel="noopener noreferrer"&gt;&lt;code&gt;agent/app/agent.py&lt;/code&gt;&lt;/a&gt;. I start by configuring the environment and initializing &lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/models?utm_campaign=CDR_0x87fa8d40_default_b512067144&amp;amp;utm_medium=external&amp;amp;utm_source=blog" rel="noopener noreferrer"&gt;Gemini Enterprise Agent Platform&lt;/a&gt; (formerly Vertex AI). I also define the specific tasks required for our users (Mary and James):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nf"&gt;load_dotenv&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;project_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;GOOGLE_CLOUD_PROJECT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;location&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;GOOGLE_CLOUD_LOCATION&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;us-central1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;GOOGLE_CLOUD_PROJECT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;project_id&lt;/span&gt;
&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;GOOGLE_CLOUD_LOCATION&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;location&lt;/span&gt;
&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;GOOGLE_GENAI_USE_VERTEXAI&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;True&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="c1"&gt;# Initialize Vertex AI
&lt;/span&gt;&lt;span class="n"&gt;vertexai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;init&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;project&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;project_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;location&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;location&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As a native Polish speaker, I want the agent to work both in Polish (for the sake of my kids) and English (for demo purposes). This is handled by the &lt;code&gt;AGENT_LANGUAGE&lt;/code&gt; environment variable:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;AGENT_LANGUAGE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AGENT_LANGUAGE&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;en&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The actual agent (&lt;code&gt;root_agent&lt;/code&gt;) is created at the bottom of the same file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;root_agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;root_agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;Gemini&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-2.5-flash&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;retry_options&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;HttpRetryOptions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;attempts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;instruction&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;load_prompt_from_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sweet-vault-agent-&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;AGENT_LANGUAGE&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.txt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;get_progress&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;complete_task&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;unlock_drawer&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The prompt is language-specific and pulled from a file with a language suffix (&lt;code&gt;en&lt;/code&gt; or &lt;code&gt;pl&lt;/code&gt;).&lt;/p&gt;
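&lt;p&gt;The &lt;code&gt;load_prompt_from_file&lt;/code&gt; helper itself isn't shown above. A plausible minimal version (the repository's actual implementation may differ) reads the language-specific file and falls back to the English variant:&lt;/p&gt;

```python
import os
from pathlib import Path

AGENT_LANGUAGE = os.getenv("AGENT_LANGUAGE", "en")

def load_prompt_from_file(filename: str, prompt_dir: Path = Path("prompts")) -> str:
    """Read a system prompt from disk, falling back to the English variant
    when a language-specific file (e.g. sweet-vault-agent-pl.txt) is absent."""
    path = prompt_dir / filename
    if not path.exists():
        # "sweet-vault-agent-pl.txt" -> "sweet-vault-agent" + "-en.txt"
        stem = filename.rsplit("-", 1)[0]
        path = prompt_dir / f"{stem}-en.txt"
    return path.read_text(encoding="utf-8")
```

The fallback means a missing translation degrades gracefully to English instead of crashing the agent at startup.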

&lt;h3&gt;
  
  
  Handling state
&lt;/h3&gt;

&lt;p&gt;A common failure mode in conversational AI is lost context or hallucinated task completion. To prevent this, we implement strict state management using &lt;code&gt;ToolContext&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Instead of relying on the model's memory, the agent reads and writes explicit completion flags to its &lt;a href="https://adk.dev/sessions/state/" rel="noopener noreferrer"&gt;session state&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_get_task_status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tool_context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ToolContext&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Retrieves the completion status for a specific task from the flat state.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;state_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user_tasks_&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;user_key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;_&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;tool_context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state_key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_set_task_status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;is_done&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tool_context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ToolContext&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Saves the completion status for a specific task and ensures all user/task 
    combinations are explicitly represented in the flat tool_context.state.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="c1"&gt;# First, update the specific target task in the current tool state
&lt;/span&gt;    &lt;span class="n"&gt;target_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user_tasks_&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;user_key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;_&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;tool_context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;target_key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;is_done&lt;/span&gt;

    &lt;span class="c1"&gt;# Now, ensure every possible combination for all known users exists in the flat state.
&lt;/span&gt;    &lt;span class="n"&gt;all_sync_updates&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;user_names&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;u_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t_id&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;TASKS_CONFIG&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user_tasks_&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;u_key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;_&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;t_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="c1"&gt;# If the key isn't already in the current state, default it to False.
&lt;/span&gt;            &lt;span class="c1"&gt;# Otherwise, keep its existing value.
&lt;/span&gt;            &lt;span class="n"&gt;all_sync_updates&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tool_context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Apply all values back to the flat state
&lt;/span&gt;    &lt;span class="n"&gt;tool_context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;all_sync_updates&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Synchronized all task state values. Updated &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;target_key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; to &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;is_done&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key learning:&lt;/strong&gt; When building the system I initially tried storing session state elements as a &lt;a href="https://www.w3schools.com/PYTHON/python_dictionaries_nested.asp" rel="noopener noreferrer"&gt;nested dictionary&lt;/a&gt;, but at the time of writing this is not supported. The workaround was to use a flat structure with keys combining the &lt;code&gt;user_key&lt;/code&gt; and &lt;code&gt;task_id&lt;/code&gt;, which works well for my use case. However, this pattern might not scale to a complex system with many users and tasks, in which case serialization or an external database could be a better option.&lt;/p&gt;
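
&lt;p&gt;To illustrate the workaround: the flat keys emulate a nested &lt;code&gt;{user: {task: done}}&lt;/code&gt; structure. A minimal sketch, using a plain dictionary in place of the real session state (names and values are illustrative only):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;```python
# Emulating a nested {user: {task: bool}} mapping with flat keys.
# A plain dict stands in for tool_context.state here (illustrative only).

def task_key(user_key: str, task_id: str) -> str:
    """Build the flat state key for a given user and task."""
    return f"user_tasks_{user_key}_{task_id}"

state = {}
state[task_key("mary", "A")] = True   # Mary finished task A
state[task_key("mary", "B")] = False  # task B still pending

# Reading back with a safe default, as the helpers above do:
assert state.get(task_key("mary", "A"), False) is True
assert state.get(task_key("james", "A"), False) is False  # unknown users default to False
```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;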

&lt;h3&gt;
  
  
  Agent tools
&lt;/h3&gt;

&lt;p&gt;I provided the agent with three specific tools: checking progress, marking tasks complete, and unlocking the drawer.&lt;/p&gt;

&lt;h4&gt;
  
  
  Checking progress
&lt;/h4&gt;

&lt;p&gt;The &lt;code&gt;get_progress&lt;/code&gt; function retrieves and formats a checklist of a specific user's tasks, indicating whether each task is marked as completed or pending based on the application's current session state.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_progress&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tool_context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ToolContext&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Check the progress of tasks for a specific user.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;user_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;user_name&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;status_msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Progress for &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;user_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;desc&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;TASKS_CONFIG&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;is_done&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;_get_task_status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tool_context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;state_str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;✅ DONE&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;is_done&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;❌ PENDING&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="n"&gt;status_msg&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;- [&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;] &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;desc&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;state_str&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;status_msg&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Marking task as complete
&lt;/h4&gt;

&lt;p&gt;The &lt;code&gt;complete_task&lt;/code&gt; tool acts as a gatekeeper. It checks if all tasks are finished before informing the model that it is authorized to unlock the drawer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;complete_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tool_context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ToolContext&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Mark a task as completed for a user.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;user_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;user_name&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# Mark task as complete
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;task_id&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;TASKS_CONFIG&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;_set_task_status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tool_context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error: Task ID &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; not found.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="c1"&gt;# Check if ALL tasks are complete
&lt;/span&gt;    &lt;span class="n"&gt;all_complete&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;
    &lt;span class="n"&gt;remaining&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t_id&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;TASKS_CONFIG&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nf"&gt;_get_task_status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;t_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tool_context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;all_complete&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;
            &lt;span class="n"&gt;remaining&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;all_complete&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SUCCESS: All tasks completed for &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;user_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;! &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You may now unlock the drawer.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# If not all complete, show progress
&lt;/span&gt;    &lt;span class="nf"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Task &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; marked as DONE. &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Remaining tasks: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;remaining&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice how descriptive the return values are. They are written this way intentionally, giving the agent enough information to communicate with the user, provide feedback, and motivate them to complete the remaining tasks.&lt;/p&gt;

&lt;h4&gt;
  
  
  Integrating physical hardware
&lt;/h4&gt;

&lt;p&gt;When the model receives the success confirmation, it calls the &lt;code&gt;unlock_drawer&lt;/code&gt; tool. This interfaces directly with our hardware relay logic to update the LED display and pop open the assigned drawer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Initialize the HW interface and lock the drawers
&lt;/span&gt;&lt;span class="n"&gt;user_names&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Maria&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Jan&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;AGENT_LANGUAGE&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pl&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Mary&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;James&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;hw_interface&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;HardwareInterface&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_names&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;unlock_drawer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Unlock a drawer by its ID.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;id&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="n"&gt;hw_interface&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;unlock_drawer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Drawer &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; unlocked for &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;user_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Drawer not found&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;HardwareInterface&lt;/code&gt; (defined in &lt;a href="https://github.com/rsamborski/sweets-vault/blob/main/agent/app/app_utils/hw_interface.py" rel="noopener noreferrer"&gt;agent/app/app_utils/hw_interface.py&lt;/a&gt;) actively communicates with the &lt;a href="https://github.com/rsamborski/sweets-vault/tree/main/led-matrix-api" rel="noopener noreferrer"&gt;LED Matrix API&lt;/a&gt; on the Raspberry Pi to display whether each drawer is currently locked or unlocked.&lt;/p&gt;

&lt;p&gt;While the code to control the physical drawer magnets is fully functional and tested (located in &lt;a href="https://github.com/rsamborski/sweets-vault/blob/main/hardware/drawers.py" rel="noopener noreferrer"&gt;drawers.py&lt;/a&gt;), it is not yet integrated into the main &lt;code&gt;HardwareInterface&lt;/code&gt;. This integration is simply on hold until the magnets are physically mounted to the drawer box.&lt;/p&gt;
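
&lt;p&gt;For a sense of what that communication can look like: updating the LED display could be a single HTTP POST to the Raspberry Pi. The host, path, and payload below are assumptions for illustration, not the actual LED Matrix API:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;```python
# Hypothetical sketch of talking to an LED matrix HTTP endpoint.
# The base URL, path, and payload shape are assumptions, not the real API.
import json
from urllib import request

LED_API_BASE = "http://raspberrypi.local:8080"  # assumed address

def build_led_request(drawer_id: int, locked: bool) -> request.Request:
    """Prepare a POST telling the display whether a drawer is locked."""
    payload = json.dumps({"drawer": drawer_id, "locked": locked}).encode()
    return request.Request(
        f"{LED_API_BASE}/display",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_led_request(0, locked=False)  # drawer 0 unlocked
assert req.get_full_url() == "http://raspberrypi.local:8080/display"
assert req.get_method() == "POST"
```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Sending it with &lt;code&gt;urllib.request.urlopen(req)&lt;/code&gt; would perform the actual call; the sketch only builds the request so it stays runnable offline.&lt;/p&gt;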

&lt;h3&gt;
  
  
  Agent prompts
&lt;/h3&gt;

&lt;p&gt;Tools alone are not enough; the model requires precise instructions on &lt;em&gt;how&lt;/em&gt; to verify the work. In &lt;a href="https://github.com/rsamborski/sweets-vault/tree/main/agent/app/prompts" rel="noopener noreferrer"&gt;&lt;code&gt;agent/app/prompts&lt;/code&gt;&lt;/a&gt; I defined a strict multi-step verification protocol in both English and Polish. Here is the English prompt:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You are a friendly, cheerful, and helpful AI assistant, the guardian of the "Sweets Vault." Your task is to verify tasks performed by children in order to grant a sweet reward.

### MAIN RULES:
1. **LANGUAGE**: You speak ONLY AND EXCLUSIVELY IN ENGLISH.
2. **USERS**:
   - **Mary** (girl, 7 years old) -&amp;gt; Assigned drawer ID: **0**
   - **James** (boy, 7 years old) -&amp;gt; Assigned drawer ID: **1**
   - **Parent** (man, 42 years old) -&amp;gt; May test the system by saying, for example, "I'm pretending to be Mary." Treat him exactly like the child he is claiming to be.
3. **PERSONALITY**: You are enthusiastic, warm, and supportive. Use exclamation marks and a joyful tone.

### TASK VERIFICATION PROCESS:
1. **STATE IDENTIFICATION**: When a child starts a conversation, ALWAYS first use the `get_progress(user_name)` tool to check what needs to be done.
2. **REPORTING**: The child reports completing a task (A or B).
3. **VERIFICATION**: Conduct a rigorous verification (camera/questions) as described below.
4. **CREDITING**: If verification is successful, use the `complete_task(user_name, task_id)` tool.
   - Read the tool's response carefully!
   - ONLY IF the response is "SUCCESS: All tasks completed...", then use `unlock_drawer`.
   - If the response shows "Remaining tasks," inform the child what they still need to do.

**Task A: Reading a page of a book**
*   **Verification 1**: Ask the child to show the read page to the camera. Confirm that you see it. Don't expose any details that can help answer the question in the next step (i.e. avoid sharing details of what exactly you can see).
*   **Verification 2**: Ask a simple follow-up question about the read text. The child must answer it.
*   **Task ID**: "A"

**Task B: Calligraphy (writing letters in workbooks)**
*   **Verification 1**: Ask to show the completed page in the workbooks to the camera. 
*   **Verification 2**: Confirm that the task has been performed. Make sure the picture contains hand-written letters (usually with a pencil).
*   If the page only contains examples, ask the child to complete missing parts.
*   **Task ID**: "B"

### SUCCESS AND REWARD:
IF the `complete_task` tool returns "SUCCESS", run `unlock_drawer(id)`.
Then **CELEBRATE!** Use phrases like: "Yippee!", "Hurray!", "Bravo!", "You're a champion!", "The sweets are yours!". Make some "noise."

### FAILURE:
If verification fails (e.g., the child doesn't show the page or answers incorrectly), gently and encouragingly ask for improvement or a retry. Do not open the drawer.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This prompt structure ensures the agent does its due diligence, preventing kids from simply holding up a blank page or skipping the reading comprehension check.&lt;/p&gt;

&lt;h3&gt;
  
  
  Demo
&lt;/h3&gt;

&lt;p&gt;You can see a demonstration of the working system in the video below:&lt;/p&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/TMStQw0Fthk"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;By combining the Gemini API, the Agent Development Kit, and a simple hardware relay, you can build highly interactive, physically grounded AI Agents. The Sweets Vault demonstrates how multimodal verification and structured tool calling solve practical, real-world problems with a dose of fun.&lt;/p&gt;

&lt;p&gt;Explore more at:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/rsamborski/sweets-vault" rel="noopener noreferrer"&gt;Sweets Vault code repository&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://adk.dev/" rel="noopener noreferrer"&gt;Agent Development Kit (ADK)&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/models?utm_campaign=CDR_0x87fa8d40_default_b512067144&amp;amp;utm_medium=external&amp;amp;utm_source=blog" rel="noopener noreferrer"&gt;Gemini Enterprise Agent Platform&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Future plans
&lt;/h3&gt;

&lt;p&gt;The current implementation uses &lt;a href="https://docs.cloud.google.com/gemini-enterprise-agent-platform/models/gemini/2-5-flash?utm_campaign=CDR_0x87fa8d40_default_b512067144&amp;amp;utm_medium=external&amp;amp;utm_source=blog" rel="noopener noreferrer"&gt;Gemini Flash&lt;/a&gt;, which offers high performance, multimodality, and tool-calling capabilities. However, it accepts only text input and produces only text output. In the near future I plan to experiment with the &lt;a href="https://docs.cloud.google.com/vertex-ai/generative-ai/docs/live-api?utm_campaign=CDR_0x87fa8d40_default_b512067144&amp;amp;utm_medium=external&amp;amp;utm_source=blog" rel="noopener noreferrer"&gt;Gemini Live API&lt;/a&gt;, which enables voice, video, and text as input and conversational audio as output.&lt;/p&gt;

&lt;p&gt;I am also going to finish the physical locking mechanism with electromagnets. Stay tuned for updates.&lt;/p&gt;

&lt;h3&gt;
  
  
  Thanks for reading
&lt;/h3&gt;

&lt;p&gt;Thank you for reading. I hope this blog inspires you to bring your own creative AI and hardware projects to life. If you found this article helpful, please consider following me here and giving it a clap 👏 to help others discover it.&lt;/p&gt;

&lt;p&gt;I am always eager to connect with fellow developers and AI enthusiasts, so feel free to follow me on &lt;a href="https://www.linkedin.com/in/remigiusz-samborski/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;, &lt;a href="http://x.com/RemikSamborski" rel="noopener noreferrer"&gt;X&lt;/a&gt; or &lt;a href="https://bsky.app/profile/rsamborski.bsky.social" rel="noopener noreferrer"&gt;Bluesky&lt;/a&gt;. Your feedback is incredibly valuable, so please do not hesitate to leave a comment with your thoughts, questions, or your own experiences building multimodal agents!&lt;/p&gt;

</description>
      <category>ai</category>
      <category>gemini</category>
      <category>agents</category>
    </item>
    <item>
      <title>Agent Factory Recap: How Gemma 4 Taught Itself Physics</title>
      <dc:creator>Shir Meir Lador</dc:creator>
      <pubDate>Thu, 14 May 2026 14:10:49 +0000</pubDate>
      <link>https://dev.to/googleai/agent-factory-recap-how-gemma-4-taught-itself-physics-17e6</link>
      <guid>https://dev.to/googleai/agent-factory-recap-how-gemma-4-taught-itself-physics-17e6</guid>
      <description>&lt;p&gt;In this episode of The Agent Factory, Vlad Kolesnikov and I sat down with Omar Sanseviero from the Developer Experience team at Google DeepMind. We explored the groundbreaking release of Gemma 4: a new family of open models designed to bring high-level intelligence and agentic capabilities directly to consumer hardware and mobile devices. Since the launch last month, Gemma 4 had &lt;strong&gt;over 50 million downloads!&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This post guides you through the key ideas from our conversation. Use it to quickly recap topics or dive deeper into specific segments with links and timestamps.&lt;/p&gt;

&lt;h2&gt;
  
  
  Gemma 4 - What is it?
&lt;/h2&gt;

&lt;p&gt;Gemma 4 is the latest generation of open models from Google DeepMind, built on the same foundational research as Gemini 3. The family is designed to deliver exceptional "intelligence per parameter" across a range of deployment scenarios, from mobile phones to powerful workstations. The Gemma 4 model family now spans three distinct architectures:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Small Sizes (E2B &amp;amp; E4B):&lt;/strong&gt; Optimized for ultra-mobile, edge, and browser deployment (such as Pixel or Chrome).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dense (31B):&lt;/strong&gt; A powerful 31-billion parameter model that provides server-grade performance for local execution on consumer GPUs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mixture-of-Experts (26B MoE):&lt;/strong&gt; A highly efficient architecture designed for high-throughput tasks and advanced reasoning.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With the shift to an &lt;strong&gt;Apache 2 license&lt;/strong&gt;, these models provide developers and startups with the flexibility to build, modify, and commercialize applications while maintaining full control over their infrastructure.&lt;/p&gt;

&lt;h2&gt;
  
  
  Omar Sanseviero on how Gemma 4 changes the landscape for agent developers
&lt;/h2&gt;

&lt;p&gt;Timestamp: &lt;a href="https://www.youtube.com/watch?v=ST9mJuTnFqU&amp;amp;list=PLIivdWyY5sqLXR1eSkiM5bE6pFlXC-OSs&amp;amp;index=1&amp;amp;t=100s" rel="noopener noreferrer"&gt;1:40&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Omar highlighted that Gemma 4 brings "very high intelligence per parameter," making it possible to run agentic workflows entirely offline. We saw examples of multiple Gemma instances running locally to generate SVGs (&lt;a href="https://www.youtube.com/watch?v=ST9mJuTnFqU&amp;amp;list=PLIivdWyY5sqLXR1eSkiM5bE6pFlXC-OSs&amp;amp;index=1&amp;amp;t=113s" rel="noopener noreferrer"&gt;1:53&lt;/a&gt;) and an Android-based agent picking specific skills, like playing the piano, to complete tasks (&lt;a href="https://www.youtube.com/watch?v=ST9mJuTnFqU&amp;amp;list=PLIivdWyY5sqLXR1eSkiM5bE6pFlXC-OSs&amp;amp;index=1&amp;amp;t=165s" rel="noopener noreferrer"&gt;2:45&lt;/a&gt;). As Omar noted, "This means that you can run very powerful things with very little hardware overhead...even in the phone that you have in your pocket."&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fre8na48kuuq04m8asknf.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fre8na48kuuq04m8asknf.jpg" alt="Gemma 4 demo screenshot" width="800" height="458"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Factory Floor
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Building a Local Food Tour Agent
&lt;/h3&gt;

&lt;p&gt;Timestamp: &lt;a href="https://www.youtube.com/watch?v=ST9mJuTnFqU&amp;amp;list=PLIivdWyY5sqLXR1eSkiM5bE6pFlXC-OSs&amp;amp;index=1&amp;amp;t=329s" rel="noopener noreferrer"&gt;5:29&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We showcased a food tour agent powered by Gemma 4 using the Agent Development Kit (ADK) and a Google Maps MCP server. We demonstrated how a local model can handle complex, multi-step reasoning tasks.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The agent identified the best ramen spots in Seattle under a $30 budget.&lt;/li&gt;
&lt;li&gt;It verified that the locations were within walking distance of each other.&lt;/li&gt;
&lt;li&gt;It processed search results to provide specific tips on what to order and what to avoid.&lt;/li&gt;
&lt;/ul&gt;
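
&lt;p&gt;The episode doesn't show the agent's code, but the kind of multi-step filtering it chains together can be sketched in plain Python. All names, prices, and coordinates below are made up for illustration:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;```python
# Illustrative only: budget filtering followed by a pairwise walkability check,
# mimicking the reasoning steps the agent performed. Data is invented.

RAMEN_SPOTS = [
    {"name": "Spot A", "price": 18, "pos": (0.0, 0.0)},
    {"name": "Spot B", "price": 24, "pos": (0.3, 0.4)},
    {"name": "Spot C", "price": 45, "pos": (0.1, 0.1)},
]

def within_budget(spots: list, budget: float) -> list:
    """Keep only spots whose price fits the budget."""
    return [s for s in spots if budget >= s["price"]]

def walkable(a: dict, b: dict, max_dist: float = 1.0) -> bool:
    """True when two spots are within max_dist of each other (straight line)."""
    (x1, y1), (x2, y2) = a["pos"], b["pos"]
    return max_dist >= ((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5

affordable = within_budget(RAMEN_SPOTS, 30)    # Spot C is over budget
assert [s["name"] for s in affordable] == ["Spot A", "Spot B"]
assert walkable(affordable[0], affordable[1])  # 0.5 apart, within range
```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;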

&lt;h3&gt;
  
  
  Autonomous Python Code Execution
&lt;/h3&gt;

&lt;p&gt;Timestamp: &lt;a href="https://www.youtube.com/watch?v=ST9mJuTnFqU&amp;amp;list=PLIivdWyY5sqLXR1eSkiM5bE6pFlXC-OSs&amp;amp;index=1&amp;amp;t=483s" rel="noopener noreferrer"&gt;8:03&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this demo, we pushed Gemma 4's coding capabilities to the limit by asking it to express itself through animation. Using a sandbox execution environment, the model performed the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Wrote Python code using the Matplotlib library.&lt;/li&gt;
&lt;li&gt;Attempted to build a physics engine to simulate a bouncing ball.&lt;/li&gt;
&lt;li&gt;Self-corrected when the initial execution environment lacked certain CPU features, finding an alternative path to successfully generate the animation.&lt;/li&gt;
&lt;li&gt;Demonstrated a deep understanding of real-world physics and gravity through code.&lt;/li&gt;
&lt;/ul&gt;
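
&lt;p&gt;The generated code itself isn't reproduced in the episode, but the heart of such a simulation is a few lines of Euler integration. A standalone, illustrative sketch:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;```python
# Illustrative only: Euler integration of a ball under gravity with a
# damped bounce off the floor (not the code generated in the demo).

GRAVITY = -9.81    # m/s^2
DT = 0.01          # time step, seconds
RESTITUTION = 0.8  # fraction of speed kept after each bounce

def step(y: float, vy: float) -> tuple:
    """Advance the ball one time step; bounce when it hits the floor at y = 0."""
    vy += GRAVITY * DT
    y += vy * DT
    if 0 > y:  # fell below the floor: reflect and damp the velocity
        y, vy = 0.0, -vy * RESTITUTION
    return y, vy

y, vy = 1.0, 0.0   # drop from one metre, at rest
bounces = 0
for _ in range(1000):        # simulate ten seconds
    y, vy = step(y, vy)
    if y == 0.0 and vy > 0:  # a bounce just happened
        bounces += 1

assert y >= 0.0    # the ball never sinks below the floor
assert bounces >= 1
```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;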

&lt;h3&gt;
  
  
  The Shift to Apache 2 Licensing
&lt;/h3&gt;

&lt;p&gt;Timestamp: &lt;a href="https://www.youtube.com/watch?v=ST9mJuTnFqU&amp;amp;list=PLIivdWyY5sqLXR1eSkiM5bE6pFlXC-OSs&amp;amp;index=1&amp;amp;t=245s" rel="noopener noreferrer"&gt;4:05&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A major theme of the conversation was the community-driven decision to move Gemma 4 to an Apache 2 license. This change provides developers and startups with maximum flexibility to build, modify, and commercialize applications. Omar emphasized that this was a direct response to developer feedback, aiming to unlock a new wave of innovation in the open models ecosystem.&lt;/p&gt;

&lt;h2&gt;
  
  
  Developer Q&amp;amp;A
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Architectural Decisions and Mixture of Experts (MoE)
&lt;/h3&gt;

&lt;p&gt;Timestamp: &lt;a href="https://www.youtube.com/watch?v=ST9mJuTnFqU&amp;amp;list=PLIivdWyY5sqLXR1eSkiM5bE6pFlXC-OSs&amp;amp;index=1&amp;amp;t=1043s" rel="noopener noreferrer"&gt;17:23&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Omar explained the technical shifts that make Gemma 4 so efficient. For the first time, the Gemma family includes a Mixture of Experts (MoE) architecture, which optimizes for extremely low latency in production. Additionally, the smaller E2B and E4B models utilize per-layer embeddings to remain "cheap" to run on GPUs. For vision tasks, the model now supports variable aspect ratios, allowing it to understand images of various sizes more accurately than previous fixed-resolution versions.&lt;/p&gt;

&lt;h3&gt;
  
  
  Comparing Gemma to Gemini
&lt;/h3&gt;

&lt;p&gt;Timestamp: &lt;a href="https://www.youtube.com/watch?v=ST9mJuTnFqU&amp;amp;list=PLIivdWyY5sqLXR1eSkiM5bE6pFlXC-OSs&amp;amp;index=1&amp;amp;t=1191s" rel="noopener noreferrer"&gt;19:51&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When asked how Gemma stacks up against its larger sibling, Gemini, Omar clarified that they serve different purposes. While Gemini excels at massive-scale tasks and deep "world knowledge" due to its size, Gemma is the "best open model that can run on a single consumer GPU." It is specifically optimized for instruction following, coding, and agentic use cases where local deployment or fine-tuning is required.&lt;/p&gt;

&lt;h3&gt;
  
  
  Fine-Tuning for Specialized Industries
&lt;/h3&gt;

&lt;p&gt;Timestamp: &lt;a href="https://www.youtube.com/watch?v=ST9mJuTnFqU&amp;amp;list=PLIivdWyY5sqLXR1eSkiM5bE6pFlXC-OSs&amp;amp;index=1&amp;amp;t=1271s" rel="noopener noreferrer"&gt;21:10&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The conversation touched on the importance of "Sovereign AI" and privacy. Because Gemma is an open model, developers in regulated industries, like healthcare or finance, can &lt;a href="https://dev.to/googleai/fine-tuning-gemma-4-with-cloud-run-jobs-serverless-gpus-nvidia-rtx-6000-pro-for-pet-breed-45ib"&gt;fine-tune the model on their private data&lt;/a&gt; and deploy it within their own air-gapped infrastructure. This gives developers full control over their data and the model's specialized expertise.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Gemma 4 marks a turning point for agentic development, proving that you don't always need a massive cloud cluster to build something smart. Whether it's running a physics simulation on a laptop or a travel guide on a phone, the barrier to entry for high-performance AI has never been lower. We are entering an era where the "conductor" of the AI orchestra can be any developer with a single GPU and a great idea.&lt;/p&gt;

&lt;h2&gt;
  
  
  Your turn to build
&lt;/h2&gt;

&lt;p&gt;Now that you've seen what Gemma 4 can do, it's time to start building. Check out the resources in our show notes, &lt;a href="https://goo.gle/3OinTFh" rel="noopener noreferrer"&gt;the food tour agent&lt;/a&gt;, &lt;a href="https://goo.gle/4dBDNEY" rel="noopener noreferrer"&gt;the coding agent&lt;/a&gt;, explore the &lt;a href="https://adk.dev/agents/models/google-gemma/" rel="noopener noreferrer"&gt;ADK support&lt;/a&gt;, and try running &lt;a href="https://blog.google/innovation-and-ai/technology/developers-tools/gemma-4/" rel="noopener noreferrer"&gt;Gemma 4&lt;/a&gt; on your local machine or on &lt;a href="https://docs.cloud.google.com/run/docs/run-gemma-on-cloud-run" rel="noopener noreferrer"&gt;Cloud Run&lt;/a&gt;. We can't wait to see what agents you create!&lt;/p&gt;

&lt;p&gt;Watch more of The Agent Factory → &lt;a href="https://www.youtube.com/watch?v=qBOvM7SiDa4&amp;amp;list=PLIivdWyY5sqLXR1eSkiM5bE6pFlXC-OSs&amp;amp;index=1" rel="noopener noreferrer"&gt;Reinforcement learning &amp;amp; fine-tuning on TP...&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Subscribe to Google Cloud Tech → &lt;a href="https://goo.gle/GoogleCloudTech" rel="noopener noreferrer"&gt;https://goo.gle/GoogleCloudTech&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Connect with us
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Shir Meir Lador → &lt;a href="https://www.linkedin.com/in/shirmeirlador/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;, &lt;a href="https://x.com/shirmeir86" rel="noopener noreferrer"&gt;X&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Vlad Kolesnikov → &lt;a href="http://www.linkedin.com/in/vkolesnikov/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;, &lt;a href="https://x.com/vladkol" rel="noopener noreferrer"&gt;X&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Omar Sanseviero → &lt;a href="https://www.linkedin.com/in/omarsanseviero/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;, &lt;a href="https://x.com/osanseviero" rel="noopener noreferrer"&gt;X&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>gemma</category>
      <category>agents</category>
    </item>
    <item>
      <title>Google Cloud x NVIDIA Meet Up - 5/20/26 @ 5:30pm, Mountain View, CA</title>
      <dc:creator>Jen Harvey</dc:creator>
      <pubDate>Wed, 13 May 2026 22:34:18 +0000</pubDate>
      <link>https://dev.to/googleai/google-cloud-x-nvidia-meet-up-52026-530pm-mountain-view-ca-9ea</link>
      <guid>https://dev.to/googleai/google-cloud-x-nvidia-meet-up-52026-530pm-mountain-view-ca-9ea</guid>
<description>&lt;p&gt;Register: &lt;a href="https://luma.com/GoogleNVIDIA" rel="noopener noreferrer"&gt;https://luma.com/GoogleNVIDIA&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Join Google Cloud and NVIDIA in Mountain View, CA for an evening dedicated to the developers, builders and visionaries shaping the next era of technology.&lt;/p&gt;

&lt;p&gt;This is your chance to see the synergy of Google’s cloud ecosystem and NVIDIA’s cutting-edge hardware in action.&lt;/p&gt;

&lt;p&gt;We’ll have food and drink with open demos centered on Gemma, Cloud Run with GPUs, ADK, Nemotron, and Physical AI.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What to Expect&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Live Interactive Demos: Experience the power of the partnership firsthand. Explore real-world applications and see how you can solve complex challenges with Google Cloud and NVIDIA products.&lt;/li&gt;
&lt;li&gt;Networking: Connect with peers, GDEs, and leaders from both Google and NVIDIA to discuss implementation strategies, bottlenecks, and the future of the stack.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Whether you're coming to I/O or not, if you're actively building then this event is for you! &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Timing&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;5:30 PM | Arrive, Check-in &amp;amp; Connect&lt;/p&gt;

&lt;p&gt;6:00 PM | Opening Remarks – The vision of Google Cloud and NVIDIA; celebrating one year of the joint developer community.&lt;/p&gt;

&lt;p&gt;6:15 PM | Demo Overview – Preview of the demo showcase.&lt;/p&gt;

&lt;p&gt;6:30-8:00 PM | Demo Showcase &amp;amp; Networking – Deep conversations over drinks, food and demos.&lt;/p&gt;

</description>
      <category>cloud</category>
      <category>eventsinyourcity</category>
      <category>ai</category>
    </item>
    <item>
      <title>Hacking perfectly square AI videos with Veo 3.1 and NanoBanana 2</title>
      <dc:creator>Paige Bailey</dc:creator>
      <pubDate>Tue, 12 May 2026 17:08:46 +0000</pubDate>
      <link>https://dev.to/googleai/hacking-perfectly-square-ai-videos-with-veo-31-and-nanobanana-2-5cpn</link>
      <guid>https://dev.to/googleai/hacking-perfectly-square-ai-videos-with-veo-31-and-nanobanana-2-5cpn</guid>
      <description>&lt;p&gt;If you’ve been playing around with AI video generation lately, you already know the struggle: the tech is insanely cool, but sometimes getting it to output &lt;em&gt;exactly&lt;/em&gt; the format you want feels like trying to center a &lt;code&gt;&amp;lt;div&amp;gt;&lt;/code&gt; in 2014. &lt;/p&gt;

&lt;p&gt;Recently, I needed to generate a perfectly looping, high-quality &lt;strong&gt;square (1:1) video&lt;/strong&gt; with audio using Google's new video models. The problem? Native aspect ratio support can sometimes be finicky depending on the model tier, and cropping a generated 16:9 or 9:16 video often ruins the framing or hallucinates weird artifacts at the edges.&lt;/p&gt;

&lt;p&gt;So, I had to let it cook. I came up with a slightly hacky but reliable workaround using &lt;strong&gt;NanoBanana 2&lt;/strong&gt;, &lt;strong&gt;Veo 3.1 Lite&lt;/strong&gt;, and our old reliable friend, &lt;strong&gt;FFmpeg&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;Here is the ultimate pipeline to get flawless square AI videos:&lt;/p&gt;

&lt;h3&gt;
  
  
  TL;DR
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Start with a square image concept.&lt;/li&gt;
&lt;li&gt;Ask NanoBanana 2 to convert it to a 9:16 aspect ratio by literally just padding the top and bottom with black bars. &lt;/li&gt;
&lt;li&gt;Feed that phone-format 9:16 image into Veo 3.1 Lite as your start and end frames to force a loop. &lt;/li&gt;
&lt;li&gt;Run a quick Python script using &lt;code&gt;ffmpeg&lt;/code&gt; to slice off the black bars. &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Boom. Perfect square video. Perfect audio sync. And no weird edge hallucinations. Here’s how to automate this flow using Python. 🐍&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fucp1nk4vu7wel4hinj55.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fucp1nk4vu7wel4hinj55.png" alt=" " width="768" height="1376"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  Step 1: Generating the "phone format" 9:16 frames with NanoBanana 2
&lt;/h3&gt;

&lt;p&gt;First, we need to generate our 9:16 image with the black bars baked in. Using the new &lt;a href="https://github.com/googleapis/python-genai" rel="noopener noreferrer"&gt;Gemini API SDK&lt;/a&gt;, we can prompt NanoBanana 2 to do the heavy lifting for us.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;genai&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.genai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;types&lt;/span&gt;

&lt;span class="c1"&gt;# Initialize your client
&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;genai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YOUR_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate_padded_frame&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;output_filename&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;🎨 Generating padded 9:16 image with NanoBanana 2...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# We explicitly tell NanoBanana 2 to give us a 9:16 image 
&lt;/span&gt;    &lt;span class="c1"&gt;# where the subject is a square in the middle, padded by black bars.
&lt;/span&gt;    &lt;span class="n"&gt;hacked_prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;. Keep the main subject perfectly square in the center, and pad the top and bottom with solid black bars to make the overall aspect ratio 9:16.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate_images&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;nanobanana-2&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# Our trusty image model
&lt;/span&gt;        &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;hacked_prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;GenerateImagesConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;number_of_images&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;aspect_ratio&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;9:16&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;output_mime_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image/jpeg&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Save the output
&lt;/span&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;generated_image&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;generated_images&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;generated_image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;
        &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output_filename&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;✅ Saved to &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;output_filename&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Generate our start/end frame
&lt;/span&gt;&lt;span class="nf"&gt;generate_padded_frame&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A majestic pink flamingo standing in a serene pond&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;flamingo_padded.jpg&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: Generating the video with Veo 3.1 Lite
&lt;/h3&gt;

&lt;p&gt;Now that we have our 9:16 image with black bars (&lt;code&gt;flamingo_padded.jpg&lt;/code&gt;), we pass it to Veo 3.1 Lite. By using the same image as the visual prompt, we ensure the video maintains those exact black bars throughout the generation process. &lt;/p&gt;

&lt;p&gt;&lt;em&gt;(Note: In the Veo web UI, you can set this as the start and end frame for a perfect loop. Here is the API equivalent for generating the video from your image).&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate_video&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;video_prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;output_filename&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;🎬 Uploading frame and prompting Veo 3.1 Lite...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Upload the padded image to the Gemini API
&lt;/span&gt;    &lt;span class="n"&gt;initial_frame&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;files&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;upload&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;image_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Wait for the file to be processed
&lt;/span&gt;    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="n"&gt;initial_frame&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PROCESSING&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;flush&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;initial_frame&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;files&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;initial_frame&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Call Veo 3.1 Lite
&lt;/span&gt;    &lt;span class="c1"&gt;# We ask it to animate the subject but keep the black bars untouched
&lt;/span&gt;    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate_content&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;veo-3.1-lite&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;contents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="n"&gt;initial_frame&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;video_prompt&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;. The flamingo moves slightly, but the black bars at the top and bottom must remain exactly the same.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Save the generated video bytes
&lt;/span&gt;    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output_filename&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;wb&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="c1"&gt;# Handling depends on raw bytes returned
&lt;/span&gt;    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;✅ Video generated and saved as &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;output_filename&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;generate_video&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;flamingo_padded.jpg&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Cinematic shot of a flamingo looking around&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;raw_veo_output.mp4&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: The &lt;code&gt;ffmpeg&lt;/code&gt; post-processing
&lt;/h3&gt;

&lt;p&gt;Now we have a beautiful video of a flamingo, but it's a 9:16 file with annoying black bars at the top and bottom. &lt;/p&gt;

&lt;p&gt;We &lt;em&gt;could&lt;/em&gt; crop this frame-by-frame using Python libraries like MoviePy, but honestly? &lt;code&gt;ffmpeg&lt;/code&gt; via the &lt;code&gt;subprocess&lt;/code&gt; module is infinitely faster, uses way less memory, and most importantly: &lt;strong&gt;it perfectly preserves the audio stream&lt;/strong&gt; without degrading it through re-encoding.&lt;/p&gt;

&lt;p&gt;Since the video is 9:16, cropping it to &lt;code&gt;iw:iw&lt;/code&gt; (input width x input width) creates a perfect 1:1 square. FFmpeg is smart enough to center the crop automatically, perfectly slicing off the top and bottom black bars.&lt;br&gt;
&lt;/p&gt;
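&lt;p&gt;The centering that &lt;code&gt;crop=iw:iw&lt;/code&gt; does implicitly is just arithmetic: when no offsets are given, FFmpeg's crop filter defaults to &lt;code&gt;x=(in_w-out_w)/2&lt;/code&gt; and &lt;code&gt;y=(in_h-out_h)/2&lt;/code&gt;. A tiny illustrative helper (the function name is mine, not FFmpeg's):&lt;/p&gt;

```python
def centered_crop_offsets(in_w: int, in_h: int, out_w: int, out_h: int) -> tuple[int, int]:
    """Mirror FFmpeg's default crop placement: center the output rectangle in the input."""
    return (in_w - out_w) // 2, (in_h - out_h) // 2

# A 1080x1920 Veo output cropped to a 1080x1080 square:
print(centered_crop_offsets(1080, 1920, 1080, 1080))  # (0, 420): exactly the bar height
```

&lt;p&gt;So the crop starts 420px down from the top, which is exactly where the black bar ends.&lt;/p&gt;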

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;subprocess&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;crop_to_square&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_video&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;output_video&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;✂️ Cropping out the black bars with FFmpeg...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;command&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ffmpeg&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;-y&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                 &lt;span class="c1"&gt;# Overwrite output if it exists
&lt;/span&gt;        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;-i&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;input_video&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;    &lt;span class="c1"&gt;# Input file
&lt;/span&gt;        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;-vf&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;crop=iw:iw&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# Video Filter: Crop to width x width (automatically centered!)
&lt;/span&gt;        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;-c:a&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;copy&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;       &lt;span class="c1"&gt;# Copy the audio as-is (chef's kiss for performance)
&lt;/span&gt;        &lt;span class="n"&gt;output_video&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;subprocess&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;check&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stdout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;subprocess&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DEVNULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;subprocess&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DEVNULL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;🔥 Success! Perfectly square video saved to &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;output_video&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;subprocess&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CalledProcessError&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;💀 FFmpeg failed: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Run the final crop
&lt;/span&gt;&lt;span class="nf"&gt;crop_to_square&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;raw_veo_output.mp4&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;final_square_flamingo.mp4&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;iframe class="tweet-embed" id="tweet-2054247282841985125-771" src="https://platform.twitter.com/embed/Tweet.html?id=2054247282841985125"&gt;
&lt;/iframe&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Why this workaround actually... works
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Framing control:&lt;/strong&gt; When you force the AI to outpaint black bars first, &lt;em&gt;you&lt;/em&gt; control the framing of the main subject. You aren't relying on the video model to guess what to keep in the center.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Audio preservation:&lt;/strong&gt; The &lt;code&gt;'-c:a', 'copy'&lt;/code&gt; flag in FFmpeg ensures you don't lose any audio fidelity when manipulating the video file.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zero hallucinations:&lt;/strong&gt; Because the video model is explicitly told to keep the black bars, it doesn't waste compute trying to generate weird background details at the extreme top and bottom edges. &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Sometimes the best engineering solutions are just stacking simple tools together in a trench coat. 🧥&lt;/p&gt;

&lt;p&gt;Have you found any other weird/genius hacks for wrangling AI video generation APIs? Drop them in the comments, I’d love to test them out! &lt;/p&gt;

&lt;p&gt;&lt;em&gt;(P.S. Make sure you have &lt;code&gt;ffmpeg&lt;/code&gt; already installed on your machine before running the Python script, or it will yell at you).&lt;/em&gt;&lt;/p&gt;
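&lt;p&gt;If you'd rather have the script fail fast than get yelled at mid-pipeline, a small preflight check works well. This is a sketch; the install hints in the message are just examples for common platforms:&lt;/p&gt;

```python
import shutil

def ffmpeg_available() -> bool:
    """Check whether the ffmpeg binary is on PATH before running the pipeline."""
    if shutil.which("ffmpeg") is None:
        print("ffmpeg not found. Install it first, e.g. 'brew install ffmpeg' or 'apt install ffmpeg'.")
        return False
    return True

print(ffmpeg_available())
```

&lt;p&gt;Call it once at the top of your script and bail out early if it returns &lt;code&gt;False&lt;/code&gt;.&lt;/p&gt;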

</description>
      <category>python</category>
      <category>ai</category>
      <category>videogen</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Deploying a Multi-Agent System with Terraform and Cloud Run</title>
      <dc:creator>Shir Meir Lador</dc:creator>
      <pubDate>Thu, 07 May 2026 21:04:12 +0000</pubDate>
      <link>https://dev.to/googleai/deploying-a-multi-agent-system-with-terraform-and-cloud-run-2a9c</link>
      <guid>https://dev.to/googleai/deploying-a-multi-agent-system-with-terraform-and-cloud-run-2a9c</guid>
      <description>&lt;p&gt;In support of our mission to accelerate the developer journey on Google Cloud, we built Dev Signal: a multi-agent system designed to transform raw community signals into reliable technical guidance by automating the path from discovery to expert creation.&lt;/p&gt;

&lt;p&gt;In the first three parts of this series, we laid the essential groundwork by establishing its core capabilities and local verification process:&lt;/p&gt;

&lt;p&gt;In &lt;a href="https://dev.to/googleai/building-capabilities-for-a-multi-agent-system-with-google-adk-mcp-and-cloud-run-ab9"&gt;part 1&lt;/a&gt;, we standardized the agent's capabilities through the Model Context Protocol (MCP), connecting it to Reddit for trend discovery and Google Cloud Docs for technical grounding. In &lt;a href="https://dev.to/googleai/architect-a-personalized-multi-agent-system-with-long-term-memory-3o15"&gt;part 2&lt;/a&gt;, we built a multi-agent architecture and integrated the Vertex AI memory bank to allow the system to learn and persist user preferences across different conversations. In &lt;a href="https://dev.to/googleai/local-testing-of-a-multi-agent-system-with-memory-37mm"&gt;part 3&lt;/a&gt;, we verified the full end-to-end lifecycle locally using a dedicated test runner to ensure that research, content creation, and cloud-based memory retrieval were perfectly synchronized.&lt;/p&gt;

&lt;p&gt;If you'd like to dive straight into the code, you can clone the repository &lt;a href="https://github.com/GoogleCloudPlatform/devrel-demos/tree/main/ai-ml/dev-signal" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Deployment to Cloud Run and the Path to Production
&lt;/h2&gt;

&lt;p&gt;To help you transition from this local prototype to a production service, this final part focuses on building the production backbone of your agent using the foundational deployment patterns provided by the &lt;a href="https://github.com/GoogleCloudPlatform/agent-starter-pack" rel="noopener noreferrer"&gt;Agent Starter Pack&lt;/a&gt;. We will implement the essential structural components required for monitoring, data integrity, and long-term state management in the cloud. You will learn to implement the application server and helper utilities needed for a production-ready deployment before provisioning secure, reproducible infrastructure with Terraform.&lt;/p&gt;

&lt;p&gt;While the Dockerfile packages your agent's code and its specialized dependencies, such as Node.js for the Reddit MCP tool, Terraform is used to build the platform it lives on. Terraform automates the creation of your Artifact Registry, least-privilege service accounts, and Secret Manager integrations to ensure your API keys remain protected.&lt;/p&gt;

&lt;p&gt;By the end of this part, you will have a standardized application framework deployed on Google Cloud Run and a roadmap for graduating your prototype through continuous evaluation, CI/CD and advanced observability.&lt;/p&gt;

&lt;h2&gt;
  
  
  Production Utilities and Server: Building the System's Body
&lt;/h2&gt;

&lt;p&gt;In this section, you implement the structural components required for monitoring and long-term state management in the cloud.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The Application Server:&lt;/strong&gt; Initializing the FastAPI server and establishing a vital connection to the Vertex AI memory bank.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Implementing Telemetry:&lt;/strong&gt; Enabling 'Agent Traces' for visibility into internal reasoning.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  The Application Server
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;fast_api_app.py&lt;/code&gt; file serves as the vital entry point for your agent, transforming the core logic into a production FastAPI server that acts as the "body" of your system. When deploying to Cloud Run, this server is essential because it provides the necessary web interface to listen for incoming HTTP requests and dispatch them to the agent for processing. Beyond basic serving, its most critical role is establishing a connection to the Vertex AI memory bank by defining a &lt;code&gt;MEMORY_URI&lt;/code&gt;, which allows the ADK framework to persist and retrieve user preferences across different production sessions. Additionally, the application server initializes production-grade telemetry for real-time monitoring.&lt;/p&gt;

&lt;p&gt;Go back to the &lt;code&gt;dev_signal_agent&lt;/code&gt; folder:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; ..
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Paste the following code in &lt;code&gt;dev_signal_agent/fast_api_app.py&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;fastapi&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;FastAPI&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.adk.cli.fast_api&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;get_fast_api_app&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.cloud&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;cloud_logging&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;vertexai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;agent_engines&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dev_signal_agent.app_utils.env&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;init_environment&lt;/span&gt;

&lt;span class="c1"&gt;# --- Initialization &amp;amp; Secure Secret Retrieval ---
# We now unpack the SECRETS dictionary returned by our updated env.py
&lt;/span&gt;&lt;span class="n"&gt;PROJECT_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;MODEL_LOC&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SERVICE_LOC&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SECRETS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;init_environment&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;logger&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cloud_logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Access sensitive credentials from the SECRETS dictionary
# These keys stay in memory and are NOT injected into os.environ
&lt;/span&gt;&lt;span class="n"&gt;REDDIT_CLIENT_ID&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SECRETS&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;REDDIT_CLIENT_ID&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;REDDIT_CLIENT_SECRET&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SECRETS&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;REDDIT_CLIENT_SECRET&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;REDDIT_USER_AGENT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SECRETS&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;REDDIT_USER_AGENT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;DK_API_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SECRETS&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;DK_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# --- Configuration &amp;amp; Sessions ---
&lt;/span&gt;&lt;span class="n"&gt;AGENT_DIR&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dirname&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dirname&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;abspath&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__file__&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;span class="c1"&gt;# Non-sensitive configuration uses environment variables
&lt;/span&gt;&lt;span class="n"&gt;BUCKET&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AI_ASSETS_BUCKET&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;USE_IN_MEMORY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;USE_IN_MEMORY_SESSION&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;true&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# --- MEMORY BANK CONNECTION ---
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_get_memory_bank_uri&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;USE_IN_MEMORY&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
    &lt;span class="c1"&gt;# We use 'dev_signal_agent' as the display name for the Vertex AI memory bank
&lt;/span&gt;    &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AGENT_ENGINE_MEMORY_BANK_NAME&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dev_signal_agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;existing&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agent_engines&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;filter&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;display_name=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;ae&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;existing&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;existing&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="n"&gt;agent_engines&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;display_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;uri&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;agentengine://&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;ae&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;resource_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;DEBUG: Connecting to Memory Bank: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; (display_name=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;uri&lt;/span&gt;

&lt;span class="n"&gt;SESSION_URI&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;MEMORY_URI&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;_get_memory_bank_uri&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# --- Initialize FastAPI with ADK ---
&lt;/span&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;FastAPI&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_fast_api_app&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;agents_dir&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;AGENT_DIR&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;web&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;artifact_service_uri&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gs://&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;BUCKET&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;BUCKET&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;allow_origins&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ALLOW_ORIGINS&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ALLOW_ORIGINS&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;session_service_uri&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;SESSION_URI&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;memory_service_uri&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;MEMORY_URI&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# &amp;lt;--- Connects the Memory Bank
&lt;/span&gt;    &lt;span class="n"&gt;otel_to_cloud&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# &amp;lt;--- Enables production telemetry
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;uvicorn&lt;/span&gt;
    &lt;span class="c1"&gt;# Standard Cloud Run port is 8080
&lt;/span&gt;    &lt;span class="n"&gt;uvicorn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;0.0.0.0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;8080&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Implementing Telemetry
&lt;/h3&gt;

&lt;p&gt;In a production environment, visibility into your agent's reasoning is critical. We leverage the built-in observability features of the Google ADK by setting the &lt;code&gt;otel_to_cloud=True&lt;/code&gt; flag in our application server. This single parameter handles the majority of the instrumentation automatically, exporting "Agent Traces" directly to the Google Cloud Console. These traces provide a "visual waterfall" of the agent's operation, including individual agent thought processes, LLM invocations, and MCP tool calls.&lt;/p&gt;

&lt;h4&gt;
  
  
  Monitoring vs. Targeted Evaluation
&lt;/h4&gt;

&lt;p&gt;It is essential to understand that production tracing is subject to sampling to balance performance and cost. Because Cloud Run captures only a subset of requests, not every individual user interaction will be visible.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;System Traces (Monitoring):&lt;/strong&gt; Used to analyze behavior "at large," such as identifying latency bottlenecks or system timeouts.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reasoning Traces (Evaluation):&lt;/strong&gt; High-quality evaluation mandates targeted trace capture. This means calling the agent specifically for a test case where you know you will evaluate that particular request in full detail.&lt;/li&gt;
&lt;/ul&gt;
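&lt;p&gt;To capture a targeted trace, call the agent deliberately with a single known test case so that specific request can be inspected in full. The sketch below assumes the ADK server's &lt;code&gt;/run&lt;/code&gt; endpoint and request shape; verify both against your ADK version before relying on it:&lt;/p&gt;

```python
import json
import urllib.request


def build_run_payload(prompt: str, session_id: str, user_id: str = "eval-user") -> dict:
    """Request body for the ADK /run endpoint (shape may vary by ADK version)."""
    return {
        "app_name": "dev_signal_agent",
        "user_id": user_id,
        "session_id": session_id,
        "new_message": {"role": "user", "parts": [{"text": prompt}]},
    }


def run_eval_case(base_url: str, prompt: str, session_id: str):
    """Send one known test case so this exact request is traced end to end."""
    req = urllib.request.Request(
        f"{base_url.rstrip('/')}/run",
        data=json.dumps(build_run_payload(prompt, session_id)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.load(resp)
```

&lt;p&gt;Using a fixed &lt;code&gt;session_id&lt;/code&gt; per test case makes the resulting trace easy to locate in the Trace Explorer.&lt;/p&gt;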

&lt;h4&gt;
  
  
  Viewing the Trace
&lt;/h4&gt;

&lt;p&gt;To see your traces, navigate to the Trace Explorer in the Google Cloud Console and filter for your service (e.g., &lt;code&gt;dev-signal&lt;/code&gt;). Clicking a specific Trace ID opens a Gantt chart that allows you to distinguish between cognitive reasoning failures (wrong decisions) and physical system issues (timeouts).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhbe3brcww32j0igfr7zh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhbe3brcww32j0igfr7zh.png" alt="Trace Explorer view" width="800" height="354"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For advanced configurations, refer to the following documentation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.cloud.google.com/run/docs/trace#trace_sampling_rate?utm_campaign=CDR_0x91b1edb5_default_b485268863&amp;amp;utm_medium=external&amp;amp;utm_source=blog" rel="noopener noreferrer"&gt;Cloud Run Trace Sampling&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.cloud.google.com/stackdriver/docs/instrumentation/ai-agent-adk#configure?utm_campaign=CDR_0x91b1edb5_default_b485268863&amp;amp;utm_medium=external&amp;amp;utm_source=blog" rel="noopener noreferrer"&gt;Configuring ADK Telemetry&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.cloud.google.com/trace/docs/collect-view-multimodal-prompts-responses?utm_campaign=CDR_0x91b1edb5_default_b485268863&amp;amp;utm_medium=external&amp;amp;utm_source=blog" rel="noopener noreferrer"&gt;Multimodal Trace Capture&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://google.github.io/adk-docs/integrations/bigquery-agent-analytics/" rel="noopener noreferrer"&gt;BigQuery Agent Analytics Integration&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Infrastructure as Code: Provisioning Secure Cloud Resources
&lt;/h2&gt;

&lt;p&gt;We follow the infrastructure-as-code patterns from the &lt;a href="https://github.com/GoogleCloudPlatform/agent-starter-pack" rel="noopener noreferrer"&gt;Agent Starter Pack&lt;/a&gt;'s security-first design, which automate the creation of least-privilege service accounts and robust secret management in seconds.&lt;/p&gt;

&lt;p&gt;Using Terraform ensures that your entire Google Cloud environment - from IAM roles to Secret Manager versions - is defined in reproducible, secure code. We break our infrastructure into the following logical blocks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Resources &amp;amp; Variables&lt;/strong&gt;: Define the specific project, region, and sensitive API secrets used by the agent.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Core Infrastructure&lt;/strong&gt;: Enable essential APIs and provision a private Artifact Registry to host your agent's container images.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Identity &amp;amp; Access Management (IAM)&lt;/strong&gt;: Configure specialized Service Accounts that strictly follow the Principle of Least Privilege to ensure your system remains secure.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Secret Management&lt;/strong&gt;: Securely ingest API credentials into Google Secret Manager for protected runtime access.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cloud Run Configuration&lt;/strong&gt;: Define the container environment, resource limits, and automated secret injection for the final deployment.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To begin provisioning, return to the root folder of your project (&lt;code&gt;dev-signal&lt;/code&gt;) and create the necessary deployment directories:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; ..
&lt;span class="nb"&gt;mkdir &lt;/span&gt;deployment
&lt;span class="nb"&gt;cd &lt;/span&gt;deployment
&lt;span class="nb"&gt;mkdir &lt;/span&gt;terraform
&lt;span class="nb"&gt;cd &lt;/span&gt;terraform
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Terraform Resources and Variables
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;variables.tf&lt;/code&gt; file defines the configurable parameters for your deployment, allowing you to customize the infrastructure without altering the underlying logic. It includes variables for the &lt;code&gt;project_id&lt;/code&gt;, the deployment &lt;code&gt;region&lt;/code&gt; (defaulting to &lt;code&gt;us-central1&lt;/code&gt;), and the &lt;code&gt;service_name&lt;/code&gt; for your Cloud Run instance. Furthermore, it defines a &lt;code&gt;secrets&lt;/code&gt; map used to securely ingest sensitive API credentials—such as Reddit and Developer Knowledge keys—into Google Secret Manager for runtime access. This modular approach ensures your production environment remains reproducible, secure, and adaptable across different projects.&lt;/p&gt;

&lt;p&gt;Paste the following code into &lt;code&gt;deployment/terraform/variables.tf&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;variable&lt;/span&gt; &lt;span class="s2"&gt;"project_id"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"The Google Cloud Project ID"&lt;/span&gt;
  &lt;span class="nx"&gt;type&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;string&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="nx"&gt;variable&lt;/span&gt; &lt;span class="s2"&gt;"region"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"The Google Cloud region to deploy to"&lt;/span&gt;
  &lt;span class="nx"&gt;type&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;string&lt;/span&gt;
  &lt;span class="nx"&gt;default&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"us-central1"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="nx"&gt;variable&lt;/span&gt; &lt;span class="s2"&gt;"service_name"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"The name of the Cloud Run service"&lt;/span&gt;
  &lt;span class="nx"&gt;type&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;string&lt;/span&gt;
  &lt;span class="nx"&gt;default&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"dev-signal"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="nx"&gt;variable&lt;/span&gt; &lt;span class="s2"&gt;"secrets"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"A map of secret names and their values (e.g., REDDIT_CLIENT_ID, DK_API_KEY)"&lt;/span&gt;
  &lt;span class="nx"&gt;type&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="nx"&gt;default&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="nx"&gt;variable&lt;/span&gt; &lt;span class="s2"&gt;"ai_assets_bucket"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"The GCS bucket for storing AI assets"&lt;/span&gt;
  &lt;span class="nx"&gt;type&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;string&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
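&lt;p&gt;At plan time, you can supply these variables through a &lt;code&gt;terraform.tfvars&lt;/code&gt; file or &lt;code&gt;TF_VAR_&lt;/code&gt; environment variables. A minimal, illustrative example follows; every value is a placeholder, and real credentials should never be committed to version control:&lt;/p&gt;

```hcl
# terraform.tfvars -- illustrative placeholders only; keep out of version control
project_id       = "my-gcp-project"
ai_assets_bucket = "my-gcp-project-ai-assets"

secrets = {
  REDDIT_CLIENT_ID     = "placeholder"
  REDDIT_CLIENT_SECRET = "placeholder"
  REDDIT_USER_AGENT    = "dev-signal-agent/1.0"
  DK_API_KEY           = "placeholder"
}
```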



&lt;h3&gt;
  
  
  Core Infrastructure Logic
&lt;/h3&gt;

&lt;p&gt;We define our infrastructure in logical blocks. Here is what each part does:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Enable APIs&lt;/strong&gt;: Ensures the project has the necessary services active (Cloud Run, Vertex AI, etc.). We set &lt;code&gt;disable_on_destroy = false&lt;/code&gt; so these APIs stay enabled even if the Terraform-managed resources are destroyed, avoiding disruption to other workloads in the project.&lt;/p&gt;

&lt;p&gt;Paste the following code into &lt;code&gt;deployment/terraform/main.tf&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"google_project_service"&lt;/span&gt; &lt;span class="s2"&gt;"services"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;project&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;project_id&lt;/span&gt;
  &lt;span class="nx"&gt;for_each&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;toset&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
    &lt;span class="s2"&gt;"run.googleapis.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;"artifactregistry.googleapis.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;"cloudbuild.googleapis.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;"aiplatform.googleapis.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;"secretmanager.googleapis.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s2"&gt;"logging.googleapis.com"&lt;/span&gt;
  &lt;span class="p"&gt;])&lt;/span&gt;
  &lt;span class="nx"&gt;service&lt;/span&gt;            &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;each&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;
  &lt;span class="nx"&gt;disable_on_destroy&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;2. Artifact Registry&lt;/strong&gt;: Creates a private Docker registry to store our agent's container images.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"google_artifact_registry_repository"&lt;/span&gt; &lt;span class="s2"&gt;"repo"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;region&lt;/span&gt;
  &lt;span class="nx"&gt;project&lt;/span&gt;       &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;project_id&lt;/span&gt;
  &lt;span class="nx"&gt;repository_id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"dev-signal-repo"&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Docker repository for Dev Signal Agent"&lt;/span&gt;
  &lt;span class="nx"&gt;format&lt;/span&gt;        &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"DOCKER"&lt;/span&gt;
  &lt;span class="nx"&gt;depends_on&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;google_project_service&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;services&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;3. Service Account &amp;amp; IAM: Adhering to the Principle of Least Privilege&lt;/strong&gt; - This is a critical security step. We avoid using the default compute service account and instead provision a dedicated user-managed service account (&lt;code&gt;dev-signal-sa&lt;/code&gt;). By designating this as the Cloud Run service identity, we can grant it only the minimum necessary permissions—specifically &lt;code&gt;roles/aiplatform.user&lt;/code&gt;, &lt;code&gt;roles/logging.logWriter&lt;/code&gt;, and &lt;code&gt;roles/storage.objectAdmin&lt;/code&gt;. This granular access control gives the agent exactly the permissions it needs to interact with Vertex AI and Cloud Storage without over-granting access to other sensitive cloud resources, significantly reducing the potential impact of a compromised account. Learn more about &lt;a href="https://docs.cloud.google.com/iam/docs/best-practices-service-accounts?content_ref=because%20a%20service%20account%20is%20a%20principal%20you%20must%20limit%20its%20privileges%20to%20reduce%20the%20potential%20harm%20that%20can%20be%20done%20by%20a%20compromised%20service%20account&amp;amp;utm_campaign=CDR_0x91b1edb5_default_b485268863&amp;amp;utm_medium=external&amp;amp;utm_source=blog" rel="noopener noreferrer"&gt;best practices for using service accounts securely&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"google_service_account"&lt;/span&gt; &lt;span class="s2"&gt;"agent_sa"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;project&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;project_id&lt;/span&gt;
  &lt;span class="nx"&gt;account_id&lt;/span&gt;   &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"${var.service_name}-sa"&lt;/span&gt;
  &lt;span class="nx"&gt;display_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"Dev Signal Agent Service Account"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
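&lt;p&gt;The role grants described above are separate resources from the service account itself. If your &lt;code&gt;main.tf&lt;/code&gt; does not already define them, a minimal sketch of the three bindings looks like this:&lt;/p&gt;

```hcl
# Bind only the three roles the agent actually needs (least privilege)
resource "google_project_iam_member" "agent_sa_roles" {
  for_each = toset([
    "roles/aiplatform.user",
    "roles/logging.logWriter",
    "roles/storage.objectAdmin",
  ])
  project = var.project_id
  role    = each.key
  member  = "serviceAccount:${google_service_account.agent_sa.email}"
}
```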



&lt;p&gt;&lt;strong&gt;4. Secret Management&lt;/strong&gt;: This handles your API keys securely. It creates secrets in Google Secret Manager and gives the agent's Service Account permission to access them at runtime.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"google_secret_manager_secret"&lt;/span&gt; &lt;span class="s2"&gt;"agent_secrets"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;project&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;project_id&lt;/span&gt;
  &lt;span class="nx"&gt;for_each&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;toset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;secrets&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
  &lt;span class="nx"&gt;secret_id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;each&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;
  &lt;span class="nx"&gt;replication&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;auto&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="nx"&gt;depends_on&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;google_project_service&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;services&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"google_secret_manager_secret_version"&lt;/span&gt; &lt;span class="s2"&gt;"agent_secrets_version"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;for_each&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;toset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;secrets&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
  &lt;span class="nx"&gt;secret&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;google_secret_manager_secret&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;agent_secrets&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;each&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;
  &lt;span class="nx"&gt;secret_data&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;secrets&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;each&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"google_secret_manager_secret_iam_member"&lt;/span&gt; &lt;span class="s2"&gt;"secret_accessor"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;project&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;project_id&lt;/span&gt;
  &lt;span class="nx"&gt;for_each&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;toset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;secrets&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
  &lt;span class="nx"&gt;secret_id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;google_secret_manager_secret&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;agent_secrets&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;each&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;
  &lt;span class="nx"&gt;role&lt;/span&gt;      &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"roles/secretmanager.secretAccessor"&lt;/span&gt;
  &lt;span class="nx"&gt;member&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"serviceAccount:${google_service_account.agent_sa.email}"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;5. Cloud Run Configuration:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Security Best Practice:&lt;/strong&gt; To satisfy production security standards, our &lt;code&gt;main.tf&lt;/code&gt; grants the Service Account the &lt;code&gt;secretmanager.secretAccessor&lt;/code&gt; role. Our Python application then uses the &lt;a href="https://docs.cloud.google.com/secret-manager/docs/best-practices#coding-practices" rel="noopener noreferrer"&gt;Secret Manager SDK&lt;/a&gt; to pull these credentials directly into local memory at runtime, ensuring they never touch the container's environment configuration.&lt;br&gt;
&lt;/p&gt;
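As a sketch of that runtime pattern (the function and variable names here are illustrative, not the project's actual module), the application can build a secret's full resource name and read its payload into process memory with the Secret Manager client:

```python
def secret_version_path(project_id, secret_id, version="latest"):
    """Build the full resource name Secret Manager expects."""
    return f"projects/{project_id}/secrets/{secret_id}/versions/{version}"


def fetch_secret(project_id, secret_id):
    """Read a secret payload into local memory at runtime.

    The value lives only in this process; nothing is written to the
    container's environment configuration.
    """
    # Imported lazily so the path helper stays usable without the package.
    from google.cloud import secretmanager

    client = secretmanager.SecretManagerServiceClient()
    name = secret_version_path(project_id, secret_id)
    response = client.access_secret_version(request={"name": name})
    return response.payload.data.decode("utf-8")
```

Because the Service Account already holds `roles/secretmanager.secretAccessor`, no key files are involved; the client picks up the Cloud Run service's ambient credentials.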

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="c1"&gt;# 6. Cloud Run Service Deployment&lt;/span&gt;
&lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="s2"&gt;"google_cloud_run_v2_service"&lt;/span&gt; &lt;span class="s2"&gt;"default"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;project&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;project_id&lt;/span&gt;
  &lt;span class="nx"&gt;name&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;service_name&lt;/span&gt;
  &lt;span class="nx"&gt;location&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;region&lt;/span&gt;
  &lt;span class="nx"&gt;ingress&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"INGRESS_TRAFFIC_ALL"&lt;/span&gt;

  &lt;span class="nx"&gt;template&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;service_account&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;google_service_account&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;agent_sa&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;email&lt;/span&gt;

    &lt;span class="nx"&gt;containers&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;image&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"us-docker.pkg.dev/cloudrun/container/hello"&lt;/span&gt; &lt;span class="c1"&gt;# Placeholder until first build&lt;/span&gt;

      &lt;span class="nx"&gt;env&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;name&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"GOOGLE_CLOUD_PROJECT"&lt;/span&gt;
        &lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;project_id&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="nx"&gt;env&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;name&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"GOOGLE_CLOUD_LOCATION"&lt;/span&gt;
        &lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"global"&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="nx"&gt;env&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;name&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"GOOGLE_GENAI_USE_VERTEXAI"&lt;/span&gt;
        &lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"True"&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="nx"&gt;env&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;name&lt;/span&gt;  &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"AI_ASSETS_BUCKET"&lt;/span&gt;
        &lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;var&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ai_assets_bucket&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;

      &lt;span class="nx"&gt;resources&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;limits&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="nx"&gt;cpu&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"1"&lt;/span&gt;
          &lt;span class="nx"&gt;memory&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"2Gi"&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;traffic&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;type&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"TRAFFIC_TARGET_ALLOCATION_TYPE_LATEST"&lt;/span&gt;
    &lt;span class="nx"&gt;percent&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Provision the Infrastructure
&lt;/h3&gt;

&lt;p&gt;Before we can deploy our code, we need to provision the Google Cloud infrastructure we just defined.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Initialize Terraform&lt;/strong&gt;: This downloads the necessary provider plugins. Run this in the &lt;code&gt;deployment/terraform&lt;/code&gt; folder:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;terraform init
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Create a Variables File&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;Paste this code in &lt;code&gt;deployment/terraform/terraform.tfvars&lt;/code&gt; and update it with your project details and secrets.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;project_id&lt;/span&gt;       &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"your-project-id"&lt;/span&gt;
&lt;span class="nx"&gt;region&lt;/span&gt;           &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"us-central1"&lt;/span&gt;
&lt;span class="nx"&gt;service_name&lt;/span&gt;     &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"dev-signal"&lt;/span&gt;
&lt;span class="nx"&gt;ai_assets_bucket&lt;/span&gt; &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"your-bucket-name"&lt;/span&gt;
&lt;span class="nx"&gt;secrets&lt;/span&gt; &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;REDDIT_CLIENT_ID&lt;/span&gt;     &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"your_client_id"&lt;/span&gt;
  &lt;span class="nx"&gt;REDDIT_CLIENT_SECRET&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"your_client_secret"&lt;/span&gt;
  &lt;span class="nx"&gt;REDDIT_USER_AGENT&lt;/span&gt;    &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"your_user_agent"&lt;/span&gt;
  &lt;span class="nx"&gt;DK_API_KEY&lt;/span&gt;           &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"your_dk_api_key"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Plan Configuration&lt;/strong&gt;: This allows you to review the changes before they are applied. Run this in the &lt;code&gt;deployment/terraform&lt;/code&gt; folder:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;terraform plan &lt;span class="nt"&gt;-out&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;plan.tfplan
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Apply Configuration&lt;/strong&gt;: Once you have reviewed the plan and confirmed it does what you want, run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;terraform apply plan.tfplan
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Deployment: Containerization and the Cloud Build Pipeline
&lt;/h2&gt;

&lt;p&gt;In this final stage of the build process, we package our agent's "body" and "brain" into a portable, production-ready container. This ensures that every component - from our Python logic to the Node.js environment required for the Reddit MCP tool - is bundled together with its exact dependencies.&lt;/p&gt;

&lt;p&gt;We utilize a &lt;strong&gt;Dockerfile&lt;/strong&gt; to define this environment and a &lt;strong&gt;Makefile&lt;/strong&gt; to orchestrate the deployment pipeline. When you trigger the deployment, &lt;a href="https://console.cloud.google.com/cloud-build/builds" rel="noopener noreferrer"&gt;Google Cloud Build&lt;/a&gt; takes your local source code, builds the container image according to the Dockerfile, and stores it in the private Artifact Registry created earlier by Terraform. Finally, the pipeline automatically updates your Cloud Run service to serve traffic using this fresh image, completing the journey from local code to a live, secure cloud workload.&lt;/p&gt;

&lt;p&gt;Paste this code in &lt;code&gt;dev-signal/Dockerfile&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; python:3.12-slim&lt;/span&gt;

&lt;span class="c"&gt;# Install Node.js and npm for MCP tools (like reddit-mcp)&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;apt-get update &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; apt-get &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    curl &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://deb.nodesource.com/setup_20.x | bash - &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; apt-get &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; nodejs &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; reddit-mcp &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; apt-get clean &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-rf&lt;/span&gt; /var/lib/apt/lists/&lt;span class="k"&gt;*&lt;/span&gt;

&lt;span class="k"&gt;RUN &lt;/span&gt;pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--no-cache-dir&lt;/span&gt; &lt;span class="nv"&gt;uv&lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;0.8.13

&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; /code&lt;/span&gt;

&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; ./pyproject.toml ./README.md ./uv.lock* ./&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; ./dev_signal_agent ./dev_signal_agent&lt;/span&gt;

&lt;span class="k"&gt;RUN &lt;/span&gt;uv &lt;span class="nb"&gt;sync&lt;/span&gt; &lt;span class="nt"&gt;--frozen&lt;/span&gt;

&lt;span class="k"&gt;EXPOSE&lt;/span&gt;&lt;span class="s"&gt; 8080&lt;/span&gt;

&lt;span class="k"&gt;CMD&lt;/span&gt;&lt;span class="s"&gt; ["uv", "run", "uvicorn", "dev_signal_agent.fast_api_app:app", "--host", "0.0.0.0", "--port", "8080"]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;strong&gt;Makefile&lt;/strong&gt; automates the build and deployment.&lt;/p&gt;

&lt;p&gt;Paste this code in &lt;code&gt;dev-signal/Makefile&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight make"&gt;&lt;code&gt;&lt;span class="nv"&gt;PROJECT_ID&lt;/span&gt; &lt;span class="o"&gt;?=&lt;/span&gt; &lt;span class="p"&gt;$(&lt;/span&gt;shell gcloud config get-value project&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;REGION&lt;/span&gt;     &lt;span class="o"&gt;?=&lt;/span&gt; us-central1
&lt;span class="nv"&gt;IMAGE_REPO&lt;/span&gt; &lt;span class="o"&gt;?=&lt;/span&gt; dev-signal-repo
&lt;span class="nv"&gt;IMAGE&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="p"&gt;$(&lt;/span&gt;REGION&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="nt"&gt;-docker&lt;/span&gt;.pkg.dev/&lt;span class="p"&gt;$(&lt;/span&gt;PROJECT_ID&lt;span class="p"&gt;)&lt;/span&gt;/&lt;span class="p"&gt;$(&lt;/span&gt;IMAGE_REPO&lt;span class="p"&gt;)&lt;/span&gt;/agent:latest

&lt;span class="c"&gt;# Deploy via Cloud Build &amp;amp; Container
&lt;/span&gt;&lt;span class="nl"&gt;docker-deploy&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="p"&gt;@&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"? Building and deploying to &lt;/span&gt;&lt;span class="p"&gt;$(&lt;/span&gt;&lt;span class="s2"&gt;PROJECT_ID&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="s2"&gt; via Cloud Build..."&lt;/span&gt;
    gcloud builds submit &lt;span class="nt"&gt;--tag&lt;/span&gt; &lt;span class="p"&gt;$(&lt;/span&gt;IMAGE&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nt"&gt;--project&lt;/span&gt; &lt;span class="p"&gt;$(&lt;/span&gt;PROJECT_ID&lt;span class="p"&gt;)&lt;/span&gt; .
    gcloud run services update dev-signal &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;--image&lt;/span&gt; &lt;span class="p"&gt;$(&lt;/span&gt;IMAGE&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;--region&lt;/span&gt; &lt;span class="p"&gt;$(&lt;/span&gt;REGION&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;--project&lt;/span&gt; &lt;span class="p"&gt;$(&lt;/span&gt;PROJECT_ID&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nt"&gt;--labels&lt;/span&gt; dev-tutorial&lt;span class="o"&gt;=&lt;/span&gt;dev-signal-agent
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Deploy Application
&lt;/h3&gt;

&lt;p&gt;Now that our infrastructure is ready, we can build and deploy the application code.&lt;/p&gt;

&lt;p&gt;Run the following command from the root of your project:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;make docker-deploy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What happens when you run this?&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Build&lt;/strong&gt;: Google Cloud Build takes your local code and the &lt;code&gt;Dockerfile&lt;/code&gt;, builds a container image, and stores it in the Artifact Registry.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deploy&lt;/strong&gt;: It updates the Cloud Run service defined in Terraform to use this new image.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;When the deployment completes, you should get a message like this:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Service [dev-signal] revision [dev-signal...] has been deployed and is serving 100 percent of traffic.&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Service URL: https://dev-signal-...-.us-central1.run.app&lt;/code&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Verification: Accessing and Testing Your Deployed Agent
&lt;/h2&gt;

&lt;p&gt;Since production services are private by default, this section covers how to grant permissions and access the agent securely.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Managing IAM Permissions:&lt;/strong&gt; Granting the necessary &lt;code&gt;run.invoker&lt;/code&gt; role to authorized users.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Secure Access via Cloud Run Proxy:&lt;/strong&gt; Using the &lt;code&gt;gcloud&lt;/code&gt; proxy to interact with your live service.&lt;/p&gt;

&lt;h3&gt;
  
  
  Granting User Permissions
&lt;/h3&gt;

&lt;p&gt;Before you can invoke the service, you must grant your Google account the &lt;code&gt;roles/run.invoker&lt;/code&gt; role for this specific service. Run the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gcloud run services add-iam-policy-binding dev-signal &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--member&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"user:&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;gcloud config get-value account&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--role&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"roles/run.invoker"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--region&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;us-central1 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--project&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;gcloud config get-value project&lt;span class="si"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Launch the Proxy
&lt;/h3&gt;

&lt;p&gt;Now, access your private service securely via the proxy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gcloud run services proxy dev-signal &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--region&lt;/span&gt; us-central1 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--project&lt;/span&gt; &lt;span class="si"&gt;$(&lt;/span&gt;gcloud config get-value project&lt;span class="si"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Visit &lt;strong&gt;http://localhost:8080&lt;/strong&gt; to chat with your deployed agent! See a possible test scenario in &lt;a href="https://cloud.google.com/blog/topics/developers-practitioners/create-expert-content-local-testing-of-a-multi-agent-system-with-memory" rel="noopener noreferrer"&gt;part 3&lt;/a&gt; of the series.&lt;/p&gt;
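Before opening the browser, you can sanity-check the tunnel from a second terminal. A minimal sketch (the exact routes depend on how `fast_api_app` wires the agent, so this only confirms that the proxied service answers at all):

```python
import urllib.error
import urllib.request


def check_proxy(url="http://localhost:8080/", timeout=5):
    """Return the HTTP status the proxied service answers with."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # Any HTTP response, even a 404, proves the proxy is forwarding.
        return exc.code
    except urllib.error.URLError:
        # Connection refused: the proxy is not running yet.
        return None
```

A `None` result means the proxy is not listening yet; any HTTP status means requests are reaching your Cloud Run service.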

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;Congratulations! You have successfully built &lt;strong&gt;Dev Signal&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What we covered:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;a href="https://dev.to/googleai/building-capabilities-for-a-multi-agent-system-with-google-adk-mcp-and-cloud-run-ab9"&gt;&lt;strong&gt;Tooling (MCP)&lt;/strong&gt;&lt;/a&gt;: You connected your agent to &lt;strong&gt;Reddit&lt;/strong&gt;, &lt;strong&gt;Google Docs&lt;/strong&gt;, and a &lt;strong&gt;Local Image Generator&lt;/strong&gt; using the Model Context Protocol.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/googleai/architect-a-personalized-multi-agent-system-with-long-term-memory-3o15"&gt;&lt;strong&gt;Architecture&lt;/strong&gt;&lt;/a&gt;: You implemented a &lt;strong&gt;Root Orchestrator&lt;/strong&gt; managing specialized agents (Scanner, Expert, Drafter).&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/googleai/local-testing-of-a-multi-agent-system-with-memory-37mm"&gt;&lt;strong&gt;Memory&lt;/strong&gt;&lt;/a&gt;: You integrated &lt;strong&gt;Vertex AI memory bank&lt;/strong&gt; to give your agent long-term persistence across sessions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Production&lt;/strong&gt;: You deployed the entire stack to &lt;strong&gt;Google Cloud Run&lt;/strong&gt; using &lt;strong&gt;Terraform&lt;/strong&gt; for secure, reproducible infrastructure.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;You now have a solid foundation for building sophisticated, stateful AI applications on Google Cloud.&lt;/p&gt;

</description>
      <category>googlecloud</category>
      <category>terraform</category>
      <category>ai</category>
      <category>agents</category>
    </item>
    <item>
      <title>Local Testing of a Multi-Agent System with Memory</title>
      <dc:creator>Shir Meir Lador</dc:creator>
      <pubDate>Thu, 07 May 2026 21:03:03 +0000</pubDate>
      <link>https://dev.to/googleai/local-testing-of-a-multi-agent-system-with-memory-37mm</link>
      <guid>https://dev.to/googleai/local-testing-of-a-multi-agent-system-with-memory-37mm</guid>
      <description>&lt;p&gt;In support of our mission to accelerate the developer journey on Google Cloud, we built Dev Signal: a multi-agent system designed to transform raw community signals into reliable technical guidance by automating the path from discovery to expert creation.&lt;/p&gt;

&lt;p&gt;In &lt;a href="https://dev.to/googleai/building-capabilities-for-a-multi-agent-system-with-google-adk-mcp-and-cloud-run-ab9"&gt;part 1&lt;/a&gt; and &lt;a href="https://dev.to/googleai/architect-a-personalized-multi-agent-system-with-long-term-memory-3o15"&gt;part 2&lt;/a&gt; of this series, we established the essential groundwork by standardizing the core capabilities through the Model Context Protocol (MCP) and constructing a multi-agent architecture integrated with the Vertex AI memory bank to provide long-term intelligence and persistence. Now, we'll explore how to test your multi-agent system locally!&lt;/p&gt;

&lt;p&gt;If you'd like to dive straight into the code and explore it at your own pace, you can clone the repository &lt;a href="https://github.com/GoogleCloudPlatform/devrel-demos/tree/main/ai-ml/dev-signal" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Testing the Agent Locally
&lt;/h2&gt;

&lt;p&gt;Before transitioning your agentic system to Google Cloud Run, it is essential to ensure that its specialized components work seamlessly together on your workstation. This testing phase allows you to validate trend discovery, technical grounding, and creative drafting within a local feedback loop, saving time and resources during the development process.&lt;/p&gt;

&lt;p&gt;In this section, you will configure your local secrets, implement environment-aware utilities, and use a dedicated test runner to verify that Dev Signal can correctly retrieve user preferences from the Vertex AI memory bank in the cloud. This local verification ensures that your agent's "brain" and "hands" are properly synchronized before moving to deployment.&lt;/p&gt;

&lt;h2&gt;
  
  
  Environment Setup
&lt;/h2&gt;

&lt;p&gt;Create a &lt;code&gt;.env&lt;/code&gt; file in your project root. These variables are used for local development and will be replaced by Terraform/Secret Manager in production.&lt;/p&gt;

&lt;p&gt;Paste this code in &lt;code&gt;dev-signal/.env&lt;/code&gt; and update with your own details.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: &lt;code&gt;GOOGLE_CLOUD_LOCATION&lt;/code&gt; is set as &lt;code&gt;global&lt;/code&gt; because that is where &lt;code&gt;gemini-3-flash-preview&lt;/code&gt; is supported. We will use &lt;code&gt;GOOGLE_CLOUD_LOCATION&lt;/code&gt; for the model location.&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Google Cloud Configuration
GOOGLE_CLOUD_PROJECT=your-project-id
GOOGLE_CLOUD_LOCATION=global
GOOGLE_CLOUD_REGION=us-central1
GOOGLE_GENAI_USE_VERTEXAI=True
AI_ASSETS_BUCKET=your_bucket_name

# Reddit API Credentials
REDDIT_CLIENT_ID=your_client_id
REDDIT_CLIENT_SECRET=your_client_secret
REDDIT_USER_AGENT=my-agent/0.1

# Developer Knowledge API Key
DK_API_KEY=your_api_key
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
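The file is plain KEY=VALUE pairs. As a rough sketch of what a loader like python-dotenv does with it (the real library also handles quoting, multi-line values, and interpolation), including the precedence the agent relies on, where a value already present in the process environment wins over the file:

```python
import os


def parse_env_file(text):
    """Parse KEY=VALUE lines, skipping blanks and comments (simplified)."""
    pairs = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        pairs[key.strip()] = value.strip()
    return pairs


sample = """
# Google Cloud Configuration
GOOGLE_CLOUD_PROJECT=your-project-id
GOOGLE_CLOUD_LOCATION=global
"""

for key, value in parse_env_file(sample).items():
    # setdefault mirrors load_dotenv's default behavior: an existing
    # environment variable is never overwritten by the file.
    os.environ.setdefault(key, value)
```

This is why the same code runs unchanged in production: there, Terraform and Secret Manager populate the environment first, so the `.env` values are simply never consulted.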



&lt;h2&gt;
  
  
  Helper Utilities
&lt;/h2&gt;

&lt;p&gt;Create a new directory for your application utils:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;dev_signal_agent
&lt;span class="nb"&gt;mkdir &lt;/span&gt;app_utils
&lt;span class="nb"&gt;cd &lt;/span&gt;app_utils
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Environment Configuration
&lt;/h3&gt;

&lt;p&gt;This module standardizes how the agent discovers the active Google Cloud Project and Region, ensuring a seamless transition between development environments. Using &lt;code&gt;load_dotenv()&lt;/code&gt;, the script first checks for local configurations before falling back to &lt;code&gt;google.auth.default()&lt;/code&gt; or environment variables to retrieve the Project ID. This automated approach ensures your agent is properly authenticated and grounded in the correct cloud context without requiring manual configuration changes.&lt;/p&gt;

&lt;p&gt;Beyond basic project discovery, the script provides a robust &lt;strong&gt;Secret Management&lt;/strong&gt; layer. It attempts to resolve sensitive credentials, such as Reddit API keys, first from the local environment (for rapid development) and then dynamically from the &lt;a href="https://docs.cloud.google.com/secret-manager/docs/reference/rest" rel="noopener noreferrer"&gt;&lt;strong&gt;Google Cloud Secret Manager API&lt;/strong&gt;&lt;/a&gt; for production security. By returning these as a dictionary rather than injecting them into environment variables, the module maintains a clean security posture.&lt;/p&gt;

&lt;p&gt;The script further calibrates the environment by distinguishing between global and regional requirements for different AI services. It specifically assigns the "global" location for models to access cutting-edge preview features while designating a regional location, such as &lt;code&gt;us-central1&lt;/code&gt;, for infrastructure like the Vertex AI Agent Engine.&lt;/p&gt;

&lt;p&gt;Paste this code in &lt;code&gt;dev_signal_agent/app_utils/env.py&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;google.auth&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;vertexai&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.cloud&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;secretmanager&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dotenv&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;load_dotenv&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_fetch_secrets&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;project_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Fetch secrets from Secret Manager and return them as a dictionary.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;secrets_to_fetch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;REDDIT_CLIENT_ID&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;REDDIT_CLIENT_SECRET&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;REDDIT_USER_AGENT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;DK_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;fetched_secrets&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;

    &lt;span class="c1"&gt;# First, check local environment (for local development via .env)
&lt;/span&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;secrets_to_fetch&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;val&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;fetched_secrets&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;val&lt;/span&gt;

    &lt;span class="c1"&gt;# If keys are missing (common in production), fetch from Secret Manager API
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fetched_secrets&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;secrets_to_fetch&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;secretmanager&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;SecretManagerServiceClient&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;secret_id&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;secrets_to_fetch&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;secret_id&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;fetched_secrets&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;projects/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;project_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/secrets/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;secret_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/versions/latest&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;access_secret_version&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
                    &lt;span class="n"&gt;fetched_secrets&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;secret_id&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;UTF-8&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Warning: Could not fetch &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;secret_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; from Secret Manager: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;fetched_secrets&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;init_environment&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Consolidated environment discovery.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="nf"&gt;load_dotenv&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;project_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;google&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;auth&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;default&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;project_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;GOOGLE_CLOUD_PROJECT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;model_location&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;GOOGLE_CLOUD_LOCATION&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;global&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;service_location&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;GOOGLE_CLOUD_REGION&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;us-central1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;secrets&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;project_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;vertexai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;init&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;project&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;project_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;location&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;service_location&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;secrets&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;_fetch_secrets&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;project_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;project_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model_location&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;service_location&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;secrets&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
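&lt;p&gt;The environment-first, Secret-Manager-fallback pattern inside &lt;code&gt;_fetch_secrets&lt;/code&gt; can be sketched in isolation. The &lt;code&gt;resolve_secrets&lt;/code&gt; helper and the simulated store below are illustrative assumptions for this post, not part of the tutorial code: the dictionary lookup simply stands in for the Secret Manager client.&lt;/p&gt;

```python
import os

def resolve_secrets(names, fallback_lookup):
    """Resolve each name from the local environment first (the .env path),
    then fall back to `fallback_lookup` for anything still missing
    (standing in for the Secret Manager client)."""
    resolved = {n: os.environ[n] for n in names if os.environ.get(n)}
    for n in names:
        if n not in resolved:
            try:
                resolved[n] = fallback_lookup(n)
            except KeyError:
                print(f"Warning: could not resolve {n}")
    return resolved

# Simulated Secret Manager store and a local override (illustration only).
store = {
    "REDDIT_CLIENT_ID": "id-from-secret-manager",
    "DK_API_KEY": "key-from-secret-manager",
}
os.environ["REDDIT_CLIENT_ID"] = "id-from-env"  # local .env value wins

secrets = resolve_secrets(["REDDIT_CLIENT_ID", "DK_API_KEY"], store.__getitem__)
print(secrets["REDDIT_CLIENT_ID"])  # the local environment value
print(secrets["DK_API_KEY"])        # fetched via the fallback
```

&lt;p&gt;Checking the environment first keeps local development cheap (no API calls, no cloud credentials needed) while production deployments, which ship without a &lt;code&gt;.env&lt;/code&gt; file, transparently fall through to Secret Manager.&lt;/p&gt;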



&lt;h2&gt;
  
  
  Local Testing Script
&lt;/h2&gt;

&lt;p&gt;The Google ADK comes with a built-in Web UI that is excellent for visualizing agent logic and tool composition. &lt;/p&gt;

&lt;p&gt;You can launch it by running the following command in the project root:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv run adk web
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;However, the default Web UI cannot exercise the long-term memory integration described in this tutorial: it is not connected to a Vertex AI memory session and, out of the box, relies on in-memory services that do not persist data across sessions. We therefore use a dedicated &lt;code&gt;test_local.py&lt;/code&gt; script to explicitly initialize the &lt;code&gt;VertexAiMemoryBankService&lt;/code&gt;, so that even in a local environment your agent talks to the real cloud-based memory bank and you can validate preference persistence.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;test_local.py&lt;/code&gt; script:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Connects to the real &lt;a href="https://docs.cloud.google.com/agent-builder/agent-engine/overview" rel="noopener noreferrer"&gt;&lt;strong&gt;Vertex AI Agent Engine&lt;/strong&gt;&lt;/a&gt; in the cloud for memory storage.&lt;/li&gt;
&lt;li&gt;Uses an in-memory session service for local chat history (so you can wipe it easily).&lt;/li&gt;
&lt;li&gt;Runs a chat loop where you can talk to your agent.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Go back to the root folder &lt;code&gt;dev-signal&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; ../..
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Paste this code in &lt;code&gt;dev-signal/test_local.py&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;google.auth&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;vertexai&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;uuid&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dotenv&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;load_dotenv&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.adk.runners&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Runner&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.adk.memory.vertex_ai_memory_bank_service&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;VertexAiMemoryBankService&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.adk.sessions&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;InMemorySessionService&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;vertexai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;agent_engines&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.genai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;types&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dev_signal_agent.agent&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;root_agent&lt;/span&gt;

&lt;span class="c1"&gt;# Load environment variables
&lt;/span&gt;&lt;span class="nf"&gt;load_dotenv&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="c1"&gt;# 1. Setup Configuration
&lt;/span&gt;    &lt;span class="n"&gt;project_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;GOOGLE_CLOUD_PROJECT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# Agent Engine (Memory) MUST use a regional endpoint
&lt;/span&gt;    &lt;span class="n"&gt;resource_location&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;us-central1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;agent_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dev-signal&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--- Initializing Vertex AI in &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;resource_location&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; ---&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;vertexai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;init&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;project&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;project_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;location&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;resource_location&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# 2. Find the Agent Engine Resource for Memory
&lt;/span&gt;    &lt;span class="n"&gt;existing_agents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agent_engines&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;filter&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;display_name=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;agent_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;existing_agents&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;agent_engine&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;existing_agents&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;agent_engine_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;agent_engine&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;resource_name&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;✅ Using persistent Memory Bank from Agent: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;agent_engine_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;❌ Error: Agent Engine &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;agent_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; not found. Please deploy with Terraform first.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt;

    &lt;span class="c1"&gt;# 3. Initialize Services
&lt;/span&gt;    &lt;span class="n"&gt;session_service&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;InMemorySessionService&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;memory_service&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;VertexAiMemoryBankService&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;project&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;project_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;location&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;resource_location&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;agent_engine_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;agent_engine_id&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# 4. Create a Runner
&lt;/span&gt;    &lt;span class="n"&gt;runner&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Runner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;root_agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;app_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dev-signal&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;session_service&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;session_service&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;memory_service&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;memory_service&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# 5. Run a Test Loop
&lt;/span&gt;    &lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;local-tester&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;--- TEST SCENARIO ---&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1. Start a session, tell the agent your preference (e.g., &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;write in rhymes&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;).&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2. Type &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;new&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; to start a FRESH session (local state wiped).&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;3. Ask for a blog post. The agent should retrieve your preference from the CLOUD memory.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;current_session_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;session-&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uuid&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;uuid4&lt;/span&gt;&lt;span class="p"&gt;())[&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;session_service&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_session&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;app_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dev-signal&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;current_session_id&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;--- Chat Session (ID: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;current_session_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;) ---&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;user_input&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;input&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;You: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;user_input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;exit&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;quit&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
            &lt;span class="k"&gt;break&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;user_input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;new&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;current_session_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;session-&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uuid&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;uuid4&lt;/span&gt;&lt;span class="p"&gt;())[&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;session_service&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_session&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;app_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dev-signal&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;current_session_id&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;--- Fresh Session Started (ID: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;current_session_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;) ---&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;(Local history is empty, retrieval must come from Memory Bank)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;continue&lt;/span&gt;

        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Agent is thinking...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;runner&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run_async&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;current_session_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;new_message&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Content&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Part&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;user_input&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;
        &lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;part&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;part&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Agent: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;part&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_function_calls&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
                &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;fc&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_function_calls&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
                    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;🛠️ Tool Call: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;fc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Running the Test
&lt;/h3&gt;

&lt;p&gt;First, ensure you have your Application Default Credentials set up:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gcloud auth application-default login
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then run the script:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv run test_local.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Test Scenario
&lt;/h2&gt;

&lt;p&gt;This scenario validates the full end-to-end lifecycle of the agent: from discovery and research to multimodal content creation and long-term memory retrieval.&lt;/p&gt;

&lt;h3&gt;
  
  
  Phase 1: Teaching &amp;amp; Multimodal Creation (Session 1)
&lt;/h3&gt;

&lt;p&gt;&lt;em&gt;Goal: Establish technical context and set a specific stylistic preference.&lt;/em&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Discovery
&lt;/h4&gt;

&lt;p&gt;Ask the agent to find trending Cloud Run topics.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Input&lt;/strong&gt;: &lt;code&gt;"Find high-engagement questions about AI agents on Cloud Run from the last 21 days."&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhfyh7wc97yzgc413xhwi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhfyh7wc97yzgc413xhwi.png" alt="Test 1 - Discovery" width="800" height="174"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhs895zdl1k3z309q3ya5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhs895zdl1k3z309q3ya5.png" alt="Test 2 - Discovery Results" width="800" height="485"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Research
&lt;/h4&gt;

&lt;p&gt;Instruct the agent to perform a deep dive on a specific result.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Input&lt;/strong&gt;: &lt;code&gt;"Use the GCP Expert to research topic #1."&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm3tdx62hxuv8lsumw07k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm3tdx62hxuv8lsumw07k.png" alt="Test 3 - Research" width="800" height="470"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Personalization
&lt;/h4&gt;

&lt;p&gt;Request a blog post and explicitly set your style preference.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Input&lt;/strong&gt;: &lt;code&gt;"Draft a blog post based on this research. From now on, I want all my technical blogs written in the style of a 90s Rap Song."&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Febf7ljsso38maqnetfxt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Febf7ljsso38maqnetfxt.png" alt="Test 4 - Personalization" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Image Generation
&lt;/h4&gt;

&lt;p&gt;Ask the agent to generate an image that illustrates the main ideas in the blog post using the Nano Banana Pro tool. The image is saved to your Cloud Storage bucket, and the agent returns a URL where you can view it, which will look like: &lt;code&gt;https://storage.mtls.cloud.google.com/...&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjczvc5cwzymc1qjyuou2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjczvc5cwzymc1qjyuou2.png" alt="Token Optimization / Image Generation" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;
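&lt;p&gt;If you prefer to download the generated image rather than view it in the browser, the returned URL can be mapped to a &lt;code&gt;gs://&lt;/code&gt; URI. The sketch below assumes the standard &lt;code&gt;https://storage.mtls.cloud.google.com/BUCKET/OBJECT&lt;/code&gt; path layout; the bucket and object names are hypothetical placeholders:&lt;/p&gt;

```python
from urllib.parse import urlparse, unquote

def console_url_to_gs_uri(url: str) -> str:
    """Map an authenticated Cloud Storage browser URL to a gs:// URI.

    Assumes the standard https://storage.mtls.cloud.google.com/BUCKET/OBJECT
    path layout (the same structure used by storage.cloud.google.com).
    """
    path = unquote(urlparse(url).path).lstrip("/")
    bucket, _, obj = path.partition("/")
    return f"gs://{bucket}/{obj}"

# Hypothetical bucket and object names for illustration:
uri = console_url_to_gs_uri(
    "https://storage.mtls.cloud.google.com/my-dev-signal-bucket/images/blog_hero.png"
)
print(uri)  # gs://my-dev-signal-bucket/images/blog_hero.png
```

&lt;p&gt;The resulting URI can then be passed to &lt;code&gt;gcloud storage cp&lt;/code&gt; to copy the image locally.&lt;/p&gt;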

&lt;h3&gt;
  
  
  Phase 2: Long-Term Memory Recall (Session 2)
&lt;/h3&gt;

&lt;p&gt;&lt;em&gt;Goal: Verify the agent recalls preferences across a completely fresh session.&lt;/em&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Type &lt;code&gt;new&lt;/code&gt; in the console to wipe local session history and start a fresh state.&lt;/li&gt;
&lt;li&gt;Retrieval: Inquire about your stored preferences to test the Vertex AI memory bank.&lt;br&gt;
&lt;em&gt;Input&lt;/em&gt;: &lt;code&gt;"What are my current topics of interest and what is my preferred blogging style?"&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Verification: Confirm the agent successfully retrieves your "AI Agents on Cloud Run" interest and "Rap" style from the cloud.&lt;/li&gt;
&lt;/ol&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsymbvlaad5owzrjc537x.png" alt="Test 5 - Memory Recall" width="800" height="269"&gt;

&lt;p&gt;&lt;strong&gt;Final Test&lt;/strong&gt;: Ask for a new blog on a different topic (e.g., "GKE Autopilot") and ensure it is automatically written as a rap song without being prompted.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;In this part of our series, we focused on verifying the agent's functionality in a local environment before proceeding to cloud deployment. By configuring local secrets and utilizing environment-aware utilities, we used a dedicated test runner to confirm that the core reasoning and tool logic are properly integrated. We successfully validated the full lifecycle, from Reddit discovery to expert content creation, and confirmed that the agent correctly retrieves preferences from the cloud-based Vertex AI memory bank even in completely fresh sessions.&lt;/p&gt;

&lt;p&gt;Ready to run the test scenario yourself? Clone the &lt;a href="https://github.com/GoogleCloudPlatform/devrel-demos/tree/main/ai-ml/dev-signal" rel="noopener noreferrer"&gt;repository&lt;/a&gt; and try the &lt;code&gt;test_local.py&lt;/code&gt; script to see 'Dev Signal' retrieve your preferences from the Vertex AI memory bank in real time. For a deeper dive into the underlying mechanics of memory orchestration, check out this &lt;a href="https://docs.cloud.google.com/agent-builder/agent-engine/memory-bank/quickstart-adk" rel="noopener noreferrer"&gt;quickstart guide&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://dev.to/googleai/deploying-a-multi-agent-system-with-terraform-and-cloud-run-2a9c"&gt;In the final part of this series,&lt;/a&gt; we will transition our prototype into a production service on Google Cloud Run using Terraform for secure infrastructure, and explore the roadmap to production excellence through continuous evaluation and security.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Special thanks to &lt;a href="https://www.linkedin.com/in/remigiusz-samborski/" rel="noopener noreferrer"&gt;Remigiusz Samborski&lt;/a&gt; for the helpful review and feedback on this article.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;For more content like this, follow me on &lt;a href="https://www.linkedin.com/in/shirmeirlador/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; and &lt;a href="https://x.com/shirmeir86" rel="noopener noreferrer"&gt;X&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>googlecloud</category>
      <category>agents</category>
      <category>python</category>
    </item>
    <item>
      <title>Architect A Personalized Multi-Agent System with Long-Term Memory</title>
      <dc:creator>Shir Meir Lador</dc:creator>
      <pubDate>Thu, 07 May 2026 21:01:06 +0000</pubDate>
      <link>https://dev.to/googleai/architect-a-personalized-multi-agent-system-with-long-term-memory-3o15</link>
      <guid>https://dev.to/googleai/architect-a-personalized-multi-agent-system-with-long-term-memory-3o15</guid>
      <description>&lt;p&gt;In support of our mission to accelerate the developer journey on Google Cloud, we built &lt;strong&gt;Dev Signal&lt;/strong&gt; — a multi-agent system designed to transform raw community signals into reliable technical guidance by automating the path from discovery to expert creation.&lt;/p&gt;

&lt;p&gt;In the &lt;a href="https://dev.to/googleai/building-capabilities-for-a-multi-agent-system-with-google-adk-mcp-and-cloud-run-ab9"&gt;first part&lt;/a&gt; of this series on &lt;strong&gt;Dev Signal&lt;/strong&gt;, we laid the essential groundwork for the system by establishing a project environment and equipping it with core capabilities through the Model Context Protocol (MCP). We standardized our external integrations, connecting to Reddit for trend discovery, Google Cloud Docs for technical grounding, and building a custom Nano Banana Pro MCP server for multimodal image generation. If you missed &lt;a href="https://dev.to/googleai/building-capabilities-for-a-multi-agent-system-with-google-adk-mcp-and-cloud-run-ab9"&gt;Part 1&lt;/a&gt; or want to explore the code directly, you can find the complete project implementation in our &lt;a href="https://github.com/GoogleCloudPlatform/devrel-demos/tree/main/ai-ml/dev-signal" rel="noopener noreferrer"&gt;GitHub repository&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Now, in Part 2, we focus on building the multi-agent architecture and integrating the &lt;a href="https://docs.cloud.google.com/agent-builder/agent-engine/memory-bank/overview" rel="noopener noreferrer"&gt;Vertex AI memory bank&lt;/a&gt; to personalize these capabilities. We will implement a Root Orchestrator that manages three specialist agents (the Reddit Scanner, GCP Expert, and Blog Drafter) to provide a seamless flow from trend discovery to expert content creation. We will also integrate a long-term memory layer that enables the agent to learn from your feedback and persist your stylistic preferences across different conversations. This ensures that Dev Signal doesn't just process data, but actually learns to match your professional voice over time.&lt;/p&gt;

&lt;h2&gt;
  
  
  Infrastructure and Model Setup
&lt;/h2&gt;

&lt;p&gt;First, we initialize the environment and the shared Gemini model.&lt;/p&gt;

&lt;p&gt;Paste this code in &lt;code&gt;dev_signal_agent/agent.py&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.adk.agents&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Agent&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.adk.apps&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;App&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.adk.models&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Gemini&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.adk.tools&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;google_search&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;AgentTool&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;load_memory_tool&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;preload_memory_tool&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.adk.tools.tool_context&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ToolContext&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.genai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;types&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dev_signal_agent.app_utils.env&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;init_environment&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dev_signal_agent.tools.mcp_config&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;get_reddit_mcp_toolset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;get_dk_mcp_toolset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;get_nano_banana_mcp_toolset&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;PROJECT_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;MODEL_LOC&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SERVICE_LOC&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SECRETS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;init_environment&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;shared_model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Gemini&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-3-flash-preview&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;vertexai&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;project&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;PROJECT_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;location&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;MODEL_LOC&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;retry_options&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;HttpRetryOptions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;attempts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Memory Ingestion Logic
&lt;/h2&gt;

&lt;p&gt;We want Dev Signal to do more than just follow instructions — we want it to learn from you. By capturing your preferences, such as specific technical interests on Reddit or a preferred blogging style, the agent can personalize its output for future use. To achieve this, we use the &lt;a href="https://docs.cloud.google.com/agent-builder/agent-engine/memory-bank/overview" rel="noopener noreferrer"&gt;&lt;strong&gt;Vertex AI memory bank&lt;/strong&gt;&lt;/a&gt; to persist session history across different conversations.&lt;/p&gt;

&lt;h3&gt;
  
  
  Long-term Memory
&lt;/h3&gt;

&lt;p&gt;We automate this through the &lt;code&gt;save_session_to_memory_callback&lt;/code&gt; function. This callback is configured to run automatically after every turn, ensuring that session details are captured and stored in the memory bank without manual intervention.&lt;/p&gt;

&lt;h4&gt;
  
  
  How Managed Memory Works:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Ingestion&lt;/strong&gt;: The &lt;code&gt;save_session_to_memory_callback&lt;/code&gt; sends the conversation data to Vertex AI.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Embedding&lt;/strong&gt;: Vertex AI converts the text into numerical vectors (embeddings) that capture the semantic meaning of your preferences.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Storage&lt;/strong&gt;: These vectors are stored in a managed index, enabling the agent to perform semantic searches and retrieve relevant history in future sessions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Retrieval&lt;/strong&gt;: The agent recalls this history using built-in ADK tools. The PreloadMemoryTool proactively brings in context at the start of an interaction, while the LoadMemoryTool allows the agent to fetch specific memories on an as-needed basis.&lt;/li&gt;
&lt;/ul&gt;
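
&lt;p&gt;The storage and retrieval steps above boil down to vector similarity rather than keyword matching. The toy sketch below uses hand-made 3-dimensional vectors in place of real Vertex AI embeddings to show the mechanic; the memory texts and numbers are illustrative only:&lt;/p&gt;

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: 1.0 means identical direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "memory bank": each memory is stored alongside its embedding vector.
memory_bank = [
    ("User prefers blogs written as 90s rap songs", [0.9, 0.1, 0.2]),
    ("User is interested in AI agents on Cloud Run", [0.2, 0.9, 0.1]),
    ("User asked about GKE Autopilot pricing", [0.1, 0.3, 0.9]),
]

def retrieve(query_embedding, top_k=1):
    # Rank stored memories by semantic closeness to the query embedding.
    ranked = sorted(
        memory_bank,
        key=lambda m: cosine_similarity(query_embedding, m[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:top_k]]

# A query like "what's my writing style?" would embed close to the first memory:
print(retrieve([0.85, 0.15, 0.25]))  # ['User prefers blogs written as 90s rap songs']
```

&lt;p&gt;In the managed service, the embedding model and index are handled for you; this is only the intuition behind why a fresh session can still find "blogging style" memories without exact keyword overlap.&lt;/p&gt;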

&lt;p&gt;Paste this code in &lt;code&gt;dev_signal_agent/agent.py&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;save_session_to_memory_callback&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Defensive callback to persist session history to the Vertex AI memory bank.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;callback_context&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# Check connection to Memory Service
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="nf"&gt;hasattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;_invocation_context&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_invocation_context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;memory_service&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Save the session!
&lt;/span&gt;        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_invocation_context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;memory_service&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_session_to_memory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_invocation_context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
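
&lt;p&gt;The defensive &lt;code&gt;*args, **kwargs&lt;/code&gt; signature lets the same callback work whether the context is passed positionally or by keyword, and the &lt;code&gt;hasattr&lt;/code&gt; guard makes it a no-op when no memory service is configured. The sketch below exercises all three paths with simple test doubles; the &lt;code&gt;Fake*&lt;/code&gt; classes stand in for ADK types and Vertex AI, they are not real APIs:&lt;/p&gt;

```python
import asyncio

class FakeMemoryService:
    """Test double that records sessions instead of calling Vertex AI."""
    def __init__(self):
        self.saved = []

    async def add_session_to_memory(self, session):
        self.saved.append(session)

class FakeInvocationContext:
    def __init__(self, memory_service, session):
        self.memory_service = memory_service
        self.session = session

class FakeCallbackContext:
    def __init__(self, invocation_context):
        self._invocation_context = invocation_context

# Same extraction logic as save_session_to_memory_callback above:
async def save_session_to_memory_callback(*args, **kwargs) -> None:
    ctx = kwargs.get("callback_context") or (args[0] if args else None)
    if ctx and hasattr(ctx, "_invocation_context") and ctx._invocation_context.memory_service:
        await ctx._invocation_context.memory_service.add_session_to_memory(
            ctx._invocation_context.session
        )

memory = FakeMemoryService()
ctx = FakeCallbackContext(FakeInvocationContext(memory, session="session-1"))

asyncio.run(save_session_to_memory_callback(ctx))                   # positional context
asyncio.run(save_session_to_memory_callback(callback_context=ctx))  # keyword context
asyncio.run(save_session_to_memory_callback())                      # no context: safely skipped
print(memory.saved)  # ['session-1', 'session-1']
```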



&lt;h3&gt;
  
  
  Short-term Memory
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;add_info_to_state&lt;/code&gt; function serves as the agent's short-term working memory, allowing the &lt;code&gt;gcp_expert&lt;/code&gt; to reliably hand off its detailed findings to the &lt;code&gt;blog_drafter&lt;/code&gt; within the same session. This working memory and the conversation transcript are managed by the Vertex AI Session Service to ensure that active context survives server restarts or transient failures.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The boundary between session-based state and long-term persistence&lt;/strong&gt; — It is important to note that while this service provides stability during an active interaction, this short-term memory does not persist between different sessions. Starting a fresh session ID effectively resets this working state, ensuring a clean slate for new tasks. Cross-session continuity, where the agent remembers your stylistic preferences or past feedback, is handled by the Vertex AI Memory Bank.&lt;/p&gt;

&lt;p&gt;Paste this code in &lt;code&gt;dev_signal_agent/agent.py&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;add_info_to_state&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tool_context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ToolContext&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;tool_context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;success&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Saved &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; to state.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
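
&lt;p&gt;To see the hand-off in action, the sketch below drives &lt;code&gt;add_info_to_state&lt;/code&gt; with a stand-in for ADK's &lt;code&gt;ToolContext&lt;/code&gt; (a plain object exposing a &lt;code&gt;state&lt;/code&gt; mapping, which is all the function touches); the key name and research text are illustrative:&lt;/p&gt;

```python
from types import SimpleNamespace

def add_info_to_state(tool_context, key: str, data: str) -> dict:
    # Same behavior as the tool above: write to the shared session state.
    tool_context.state[key] = data
    return {"status": "success", "message": f"Saved '{key}' to state."}

# Stand-in for ADK's ToolContext: only the `state` mapping is used here.
ctx = SimpleNamespace(state={})

# The gcp_expert writes its findings...
result = add_info_to_state(ctx, "research_notes", "Cloud Run supports GPU-backed agents.")
print(result["message"])  # Saved 'research_notes' to state.

# ...and the blog_drafter reads them back later in the same session.
print(ctx.state["research_notes"])  # Cloud Run supports GPU-backed agents.
```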



&lt;h2&gt;
  
  
  Specialist 1: Reddit Scanner (Discovery)
&lt;/h2&gt;

&lt;p&gt;The Reddit Scanner is our "Trend Spotter": it identifies high-engagement questions from the last 21 days (3 weeks) to ensure that all research findings remain both timely and relevant.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Memory Usage:&lt;/strong&gt; It leverages &lt;code&gt;load_memory&lt;/code&gt; to retrieve your past areas of interest and preferred topics from the Vertex AI memory bank. If relevant history exists, the agent prioritizes those specific topics in its search to provide a personalized discovery experience.&lt;/p&gt;

&lt;p&gt;Beyond simple retrieval, each sub-agent actively updates its memories by listening for new preferences and explicitly acknowledging them during the chat. This process captures relevant information in the session history, where an automated callback then persists it to the long-term Vertex AI memory bank for future use.&lt;/p&gt;

&lt;p&gt;This memory management is supported by two distinct retrieval patterns within the Google Agent Development Kit (ADK). The first is the &lt;code&gt;PreloadMemoryTool&lt;/code&gt;, which proactively brings in historical context at the beginning of every interaction to ensure the agent is fully briefed before addressing the current request. The second is the &lt;code&gt;LoadMemoryTool&lt;/code&gt;, which the agent uses on an as-needed basis, calling upon it only when it decides that deeper past knowledge would be beneficial for the current step in the workflow.&lt;/p&gt;

&lt;p&gt;Paste this code in &lt;code&gt;dev_signal_agent/agent.py&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Singleton toolsets
&lt;/span&gt;&lt;span class="n"&gt;reddit_mcp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_reddit_mcp_toolset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;client_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;SECRETS&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;REDDIT_CLIENT_ID&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;client_secret&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;SECRETS&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;REDDIT_CLIENT_SECRET&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;user_agent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;SECRETS&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;REDDIT_USER_AGENT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;reddit_scanner&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reddit_scanner&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;shared_model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;instruction&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
You are a Reddit research specialist. Your goal is to identify high-engagement questions
from the last 3 weeks on specific topics of interest, such as AI/agents on Cloud Run.

Follow these steps:
1. **MEMORY CHECK**: Use `load_memory` to retrieve the user&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s **past areas of interest** and **preferred topics**. Calibrate your search to align with these interests.
2. Use the Reddit MCP tools to search for relevant subreddits and posts.
3. Filter results for posts created within the last 21 days (3 weeks).
4. Analyze &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;high-engagement&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; based on upvote counts and the number of comments.
5. Recommend the most important and relevant questions for a technical audience.
6. **CRITICAL**: For each recommended question, provide a direct link to the original thread and a concise summary of the discussion.
7. **CAPTURE PREFERENCES**: Actively listen for user preferences, interests, or project details. Explicitly acknowledge them to ensure they are captured in the session history for future personalization.
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;reddit_mcp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;load_memory_tool&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;LoadMemoryTool&lt;/span&gt;&lt;span class="p"&gt;()],&lt;/span&gt;
    &lt;span class="n"&gt;after_agent_callback&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;save_session_to_memory_callback&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Specialist 2: GCP Expert (Grounding)
&lt;/h2&gt;

&lt;p&gt;The GCP Expert is our "Technical Authority". It triangulates facts by synthesizing official documentation from the Google Cloud Developer Knowledge MCP Server, community sentiment from Reddit, and broader context from Google Search.&lt;/p&gt;

&lt;p&gt;Paste this code in &lt;code&gt;dev_signal_agent/agent.py&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;dk_mcp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_dk_mcp_toolset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;SECRETS&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;DK_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="n"&gt;search_agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search_agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;shared_model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;instruction&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Execute Google Searches and return raw, structured results (Title, Link, Snippet).&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;google_search&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;gcp_expert&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gcp_expert&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;shared_model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;instruction&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
You are a Google Cloud Platform (GCP) documentation expert.
Your goal is to provide accurate, detailed, and cited answers to technical questions by synthesizing official documentation with community insights.

For EVERY technical question, you MUST perform a comprehensive research sweep using ALL available tools:
1. **Official Docs (Grounding)**: Use DeveloperKnowledge MCP (`search_documents`) to find the definitive technical facts.
2. **Social Media Research (Reddit)**: Use the Reddit MCP to research the question on social media. This allows you to find real-world user discussions, common pain points, or alternative solutions that might not be in official documentation.
3. **Broader Context (Web/Social)**: Use the `search_agent` tool to find recent technical blogs, social media discussions, or tutorials.

Synthesize your answer:
- Start with the official answer based on GCP docs.
- Add &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Social Media Insights&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; or &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Common Issues&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; sections derived from Reddit and Web Search findings.
- **CRITICAL**: After providing your answer, you MUST use the `add_info_to_state` tool to save your full technical response under the key: `technical_research_findings`.
- Cite your sources specifically at the end of your response, providing **direct links** (URLs) to the official documentation, blog posts, and Reddit threads used.
- **CAPTURE PREFERENCES**: Actively listen for user preferences, interests, or project details. Explicitly acknowledge them to ensure they are captured in the session history for future personalization.
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;dk_mcp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;AgentTool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;search_agent&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;reddit_mcp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;add_info_to_state&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;after_agent_callback&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;save_session_to_memory_callback&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
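&lt;p&gt;The instruction above requires the expert to save its findings with &lt;code&gt;add_info_to_state&lt;/code&gt;, a custom tool defined elsewhere in the repo. Conceptually, an ADK tool can be a plain Python function that writes to session state, which the &lt;code&gt;blog_drafter&lt;/code&gt; later reads via &lt;code&gt;{{ technical_research_findings }}&lt;/code&gt;. The sketch below is illustrative only: the stand-in context class replaces ADK's real &lt;code&gt;ToolContext&lt;/code&gt;, and the exact signature in the repo may differ.&lt;/p&gt;

```python
# Hedged sketch of the add_info_to_state tool. FakeToolContext is a
# stand-in for ADK's ToolContext, whose .state attribute behaves like
# a dict scoped to the current session.

class FakeToolContext:
    def __init__(self):
        self.state = {}

def add_info_to_state(key, value, tool_context):
    """Save research under a well-known key so a later agent in the
    same session (here, blog_drafter) can read it."""
    tool_context.state[key] = value
    return {"status": "saved", "key": key}

ctx = FakeToolContext()
result = add_info_to_state(
    "technical_research_findings",
    "Summary of official docs plus Reddit insights.",
    ctx,
)
print(result)  # {'status': 'saved', 'key': 'technical_research_findings'}
```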



&lt;h2&gt;
  
  
  Specialist 3: Blog Drafter (Creativity)
&lt;/h2&gt;

&lt;p&gt;The Blog Drafter is our Content Creator. It drafts the blog based on the expert's findings and offers to generate visuals.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Memory Usage:&lt;/strong&gt; It checks &lt;code&gt;load_memory&lt;/code&gt; for the user's &lt;strong&gt;preferred writing style&lt;/strong&gt; (e.g. "Witty", "Rap") stored in the &lt;strong&gt;Vertex AI memory bank&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Paste this code into &lt;code&gt;dev_signal_agent/agent.py&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;nano_mcp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_nano_banana_mcp_toolset&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;blog_drafter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;blog_drafter&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;shared_model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;instruction&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
You are a professional technical blogger specializing in Google Cloud Platform.
Your goal is to draft high-quality blog posts based on technical research provided by the GCP expert and reliable documentation.

You have access to the research findings from the gcp_expert agent here:
{{ technical_research_findings }}

Follow these steps:
1. **MEMORY CHECK**: Use `load_memory` to retrieve past blog posts, **areas of interest**, and user feedback on writing style. Adopt the user&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s preferred style and depth.
2. **REVIEW &amp;amp; GROUND**: Review the technical research findings provided above. **CRITICAL**: Use the `dk_mcp` (Developer Knowledge) tool to verify key facts, technical limitations, and API details. Ensure every claim in your blog is grounded in official documentation.
3. Draft a blog post that is engaging, accurate, and helpful for a technical audience.
4. Include code snippets or architectural diagrams if relevant.
5. Provide a &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Resources&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; section with links to the official documentation used.
6. Ensure the tone is professional yet accessible, while adhering to any style preferences found in memory.
7. **VISUALS**: After presenting the drafted blog post, explicitly ask the user: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Would you like me to generate an infographic-style header image to illustrate these key points?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; If they agree, use the `generate_image` tool (Nano Banana).
8. **CAPTURE PREFERENCES**: Actively listen for user preferences, interests, or project details. Explicitly acknowledge them to ensure they are captured in the session history for future personalization.
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;dk_mcp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;load_memory_tool&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;LoadMemoryTool&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;nano_mcp&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;after_agent_callback&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;save_session_to_memory_callback&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
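&lt;p&gt;Both specialists register &lt;code&gt;after_agent_callback=save_session_to_memory_callback&lt;/code&gt;. That callback is defined elsewhere in the repo; conceptually, once an agent finishes its turn, the completed session is handed to the configured memory service so the Vertex AI memory bank can distill long-term facts from it. Here is a minimal sketch of that flow, using stand-in classes rather than ADK's real callback context and memory service (the attribute names are assumptions):&lt;/p&gt;

```python
# Hedged sketch of save_session_to_memory_callback (defined elsewhere
# in the repo). After an agent's turn, the session is handed to the
# memory service, which in production is backed by the Vertex AI
# memory bank. The classes below are stand-ins; ADK's real
# CallbackContext and memory-service APIs may use different names.
import asyncio

class FakeMemoryService:
    def __init__(self):
        self.saved = []

    async def add_session_to_memory(self, session):
        # Production: ship the session to Vertex AI for fact extraction.
        self.saved.append(session)

class FakeCallbackContext:
    def __init__(self, session, memory_service):
        self.session = session
        self.memory_service = memory_service

async def save_session_to_memory_callback(ctx):
    if ctx.memory_service is not None:
        await ctx.memory_service.add_session_to_memory(ctx.session)

memory = FakeMemoryService()
callback_ctx = FakeCallbackContext({"id": "session-1"}, memory)
asyncio.run(save_session_to_memory_callback(callback_ctx))
print(len(memory.saved))  # 1
```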



&lt;h2&gt;
  
  
  The Root Orchestrator
&lt;/h2&gt;

&lt;p&gt;The root agent serves as the system's strategist, managing a team of specialist agents and orchestrating their actions based on the specific goals provided by the user. At the start of a conversation, the orchestrator retrieves memory to establish context by checking for the user's past areas of interest, preferred topics, or previous projects.&lt;/p&gt;

&lt;p&gt;Paste this code into &lt;code&gt;dev_signal_agent/agent.py&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;root_agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;root_orchestrator&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;shared_model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;instruction&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
You are a technical content strategist. You manage three specialists:
1. reddit_scanner: Finds trending questions and high-engagement topics on Reddit.
2. gcp_expert: Provides technical answers based on official GCP documentation.
3. blog_drafter: Writes professional blog posts based on technical research.

Your responsibilities:
- **MEMORY CHECK**: At the start of a conversation, use `load_memory` to check if the user has specific **areas of interest**, preferred topics, or past projects. Tailor your suggestions accordingly.
- **CAPTURE PREFERENCES**: Actively listen for user preferences, interests, or project details. Explicitly acknowledge them to ensure they are captured in the session history for future personalization.
- If the user wants to find trending topics or questions from Reddit, delegate to reddit_scanner.
- If the user has a technical question or wants to research a specific theme, delegate to gcp_expert.
- **CRITICAL**: After the gcp_expert provides an answer, you MUST ask the user:
  &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Would you like me to draft a technical blog post based on this answer?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;
- If the user agrees or asks to write a blog, delegate to blog_drafter.
- Be proactive in helping the user navigate from discovery (Reddit) to research (Docs) to content creation (Blog).
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;load_memory_tool&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;LoadMemoryTool&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;preload_memory_tool&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;PreloadMemoryTool&lt;/span&gt;&lt;span class="p"&gt;()],&lt;/span&gt;
    &lt;span class="n"&gt;after_agent_callback&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;save_session_to_memory_callback&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;sub_agents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;reddit_scanner&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;gcp_expert&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;blog_drafter&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;App&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;root_agent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;root_agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dev_signal_agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;In this part of our series, we built a multi-agent architecture and implemented a robust, dual-layered memory system. We established a Root Orchestrator managing three specialist agents: a Reddit Scanner for trend discovery, a GCP Expert for technical grounding, and a Blog Drafter for content creation. &lt;/p&gt;

&lt;p&gt;By utilizing short-term state to pass information reliably between specialists and integrating the Vertex AI memory bank for long-term persistence, we've enabled the agent to learn from your feedback and remember specific writing styles across different conversations.&lt;/p&gt;

&lt;p&gt;In &lt;a href="https://dev.to/googleai/local-testing-of-a-multi-agent-system-with-memory-37mm"&gt;Part 3&lt;/a&gt;, we will show you how to test the agent locally to verify these components on your workstation, before transitioning to a full production deployment on Google Cloud Run in Part 4. Can't wait for part 3? The full implementation is already available for you to explore on &lt;a href="https://github.com/GoogleCloudPlatform/devrel-demos/tree/main/ai-ml/dev-signal" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;To learn more about the underlying technology, explore the &lt;a href="https://docs.cloud.google.com/agent-builder/agent-engine/memory-bank/overview" rel="noopener noreferrer"&gt;Vertex AI Memory Bank overview&lt;/a&gt; or dive into the official &lt;a href="https://docs.cloud.google.com/agent-builder/agent-development-kit/overview" rel="noopener noreferrer"&gt;ADK Documentation&lt;/a&gt; to see how to orchestrate complex multi-agent workflows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Special thanks to &lt;a href="https://www.linkedin.com/in/remigiusz-samborski/" rel="noopener noreferrer"&gt;Remigiusz Samborski&lt;/a&gt; for the helpful review and feedback on this article.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For more content like this, follow me on &lt;a href="https://www.linkedin.com/in/shirmeirlador/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; and &lt;a href="https://x.com/shirmeir86?lang=en" rel="noopener noreferrer"&gt;X&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>googlecloud</category>
      <category>agents</category>
      <category>mcp</category>
    </item>
    <item>
      <title>Building Capabilities for a Multi-Agent System with Google ADK, MCP, and Cloud Run</title>
      <dc:creator>Shir Meir Lador</dc:creator>
      <pubDate>Thu, 07 May 2026 21:00:10 +0000</pubDate>
      <link>https://dev.to/googleai/building-capabilities-for-a-multi-agent-system-with-google-adk-mcp-and-cloud-run-ab9</link>
      <guid>https://dev.to/googleai/building-capabilities-for-a-multi-agent-system-with-google-adk-mcp-and-cloud-run-ab9</guid>
      <description>&lt;p&gt;My team's mission is to accelerate the developer journey from writing code to running secure AI workloads on Google Cloud. To help developers succeed, we focus on identifying their most pressing questions and building demos that provide straightforward, easy-to-implement solutions.&lt;/p&gt;

&lt;p&gt;Recently, I was struck with inspiration when the new &lt;a href="https://developers.google.com/knowledge/mcp" rel="noopener noreferrer"&gt;Developer Knowledge MCP server&lt;/a&gt; was released. It led me to build &lt;strong&gt;Dev Signal&lt;/strong&gt;—a multi-agent system designed with &lt;a href="https://github.com/google/adk-python" rel="noopener noreferrer"&gt;Google Agent Development Kit (ADK)&lt;/a&gt;—to identify technical questions from Reddit, research them using official documentation, and draft detailed technical blogs. &lt;strong&gt;Dev Signal&lt;/strong&gt; also provides custom visuals using &lt;a href="https://blog.google/innovation-and-ai/products/nano-banana-pro/" rel="noopener noreferrer"&gt;Nano Banana Pro&lt;/a&gt;. I even integrated a long-term &lt;a href="https://docs.cloud.google.com/agent-builder/agent-engine/memory-bank/overview" rel="noopener noreferrer"&gt;memory&lt;/a&gt; layer so the agent remembers my specific preferences and blogging style.&lt;/p&gt;

&lt;p&gt;By connecting my coding assistant, &lt;a href="https://docs.cloud.google.com/gemini/docs/codeassist/gemini-cli" rel="noopener noreferrer"&gt;Gemini CLI&lt;/a&gt;, to the developer knowledge MCP server, I built and deployed this entire system to &lt;a href="https://cloud.google.com/run/docs" rel="noopener noreferrer"&gt;Google Cloud Run&lt;/a&gt; in just two days.&lt;/p&gt;

&lt;p&gt;Whether you want to learn how to architect a complex multi-agent system with long-term memory, leverage local and remote MCP servers for tool standardization, or write detailed Terraform scripts for secure Cloud Run deployment, I'll show you how!&lt;/p&gt;

&lt;p&gt;If you'd rather dive straight into the code and explore it at your own pace, you can clone the repository &lt;a href="https://github.com/GoogleCloudPlatform/devrel-demos/tree/main/ai-ml/dev-signal" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/abZxJiXGrJs"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  What you'll learn
&lt;/h2&gt;

&lt;p&gt;In this four-part blog series, I'll walk you through the step-by-step process of how I brought this project to life. Each blog post captures the journey of building and deploying &lt;strong&gt;Dev Signal&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Part 1: Tools for building agent capabilities&lt;/strong&gt; – You'll begin by setting up your project environment and equipping your agent with tools using the Model Context Protocol (MCP). You'll learn how to connect to Reddit for trend discovery, Google Cloud docs for technical grounding, and a custom Nano Banana Pro tool for image generation.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/googleai/architect-a-personalized-multi-agent-system-with-long-term-memory-3o15"&gt;&lt;strong&gt;Part 2: The Multi-Agent Architecture with long term memory&lt;/strong&gt;&lt;/a&gt; – You'll build the "brain" of the system by implementing a root orchestrator and a team of specialized agents. You'll also integrate the &lt;a href="https://docs.cloud.google.com/agent-builder/agent-engine/memory-bank/overview" rel="noopener noreferrer"&gt;Vertex AI memory bank&lt;/a&gt;, enabling the agent to learn and persist your preferences across sessions.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/googleai/local-testing-of-a-multi-agent-system-with-memory-37mm"&gt;&lt;strong&gt;Part 3: Testing the agent Locally&lt;/strong&gt;&lt;/a&gt; – Before moving to the cloud, you'll synchronize the agent's components and verify its performance on your workstation. You'll use a dedicated test runner to simulate the full lifecycle of discovery, research, and multimodal creation, with a special focus on validating long-term memory persistence by connecting your local agent directly to the cloud-based Vertex AI memory bank.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/googleai/deploying-a-multi-agent-system-with-terraform-and-cloud-run-2a9c"&gt;&lt;strong&gt;Part 4: Deployment to Cloud Run and the Path to Production&lt;/strong&gt;&lt;/a&gt; – Finally, you'll deploy your service on Google Cloud Run using Terraform for reproducible infrastructure. You'll also discuss the next steps required for a high quality secure production system.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Getting started with Dev Signal
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Dev Signal&lt;/strong&gt; is an intelligent monitoring agent designed to filter noise and create value. It operates as follows:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Discovery&lt;/strong&gt;: Scouts Reddit for high-engagement technical questions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Grounding&lt;/strong&gt;: Researches answers using official Google Cloud documentation to ensure accuracy.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Creation&lt;/strong&gt;: Drafts professional technical blog posts based on its findings.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multimodal Generation&lt;/strong&gt;: Generates custom infographic headers for those posts.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Long-Term Memory&lt;/strong&gt;: Uses &lt;strong&gt;Vertex AI memory bank&lt;/strong&gt; to remember your feedback across different sessions.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;Before you begin, verify that the following are installed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Python 3.12+&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;uv&lt;/strong&gt; (Python package manager): &lt;code&gt;curl -LsSf https://astral.sh/uv/install.sh | sh&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://cloud.google.com/sdk/docs/install" rel="noopener noreferrer"&gt;&lt;strong&gt;Google Cloud SDK&lt;/strong&gt;&lt;/a&gt; (&lt;code&gt;gcloud&lt;/code&gt; CLI) installed and authenticated.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://developer.hashicorp.com/terraform/install" rel="noopener noreferrer"&gt;&lt;strong&gt;Terraform&lt;/strong&gt;&lt;/a&gt; (for infrastructure as code).&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://docs.npmjs.com/downloading-and-installing-node-js-and-npm" rel="noopener noreferrer"&gt;&lt;strong&gt;Node.js &amp;amp; npm&lt;/strong&gt;&lt;/a&gt; (required for the Reddit MCP tool).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You will also need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A &lt;a href="https://docs.cloud.google.com/resource-manager/docs/creating-managing-projects" rel="noopener noreferrer"&gt;&lt;strong&gt;Google Cloud Project&lt;/strong&gt;&lt;/a&gt; with billing enabled.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://docs.cloud.google.com/endpoints/docs/openapi/enable-api" rel="noopener noreferrer"&gt;&lt;strong&gt;APIs Enabled&lt;/strong&gt;&lt;/a&gt;: Vertex AI, Cloud Run, Secret Manager, Artifact Registry.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reddit API Credentials&lt;/strong&gt; (Client ID, Secret) - You can get these from the &lt;a href="https://www.reddit.com/prefs/apps" rel="noopener noreferrer"&gt;Reddit Developer Portal&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Developer Knowledge API Key&lt;/strong&gt; (for Google Cloud docs search) - Instructions on how to get it are &lt;a href="https://developers.google.com/knowledge/mcp" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
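&lt;p&gt;For local development, these credentials typically live in the &lt;code&gt;.env&lt;/code&gt; file at the project root. A hypothetical layout is shown below (the Reddit variable names are assumptions; &lt;code&gt;DK_API_KEY&lt;/code&gt; matches the key the agent code reads later):&lt;/p&gt;

```shell
# Hypothetical .env contents. Replace the placeholders with your own
# credentials; variable names other than DK_API_KEY are assumptions.
export GOOGLE_CLOUD_PROJECT="my-project-id"
export GOOGLE_CLOUD_LOCATION="us-central1"
export DK_API_KEY="your-developer-knowledge-api-key"
export REDDIT_CLIENT_ID="your-reddit-client-id"
export REDDIT_CLIENT_SECRET="your-reddit-client-secret"
```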

&lt;h2&gt;
  
  
  Project Setup
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;Dev Signal&lt;/strong&gt; system was built by first running the &lt;a href="https://github.com/GoogleCloudPlatform/agent-starter-pack" rel="noopener noreferrer"&gt;Agent Starter Pack&lt;/a&gt;, following the automated architect workflow described in the &lt;a href="https://www.youtube.com/watch?v=XCGbDx7aSks" rel="noopener noreferrer"&gt;Agent Factory episode&lt;/a&gt; by &lt;a href="https://www.linkedin.com/in/remigiusz-samborski/" rel="noopener noreferrer"&gt;Remigiusz Samborski&lt;/a&gt; and &lt;a href="https://www.linkedin.com/in/vkolesnikov/" rel="noopener noreferrer"&gt;Vlad Kolesnikov&lt;/a&gt;. This foundation provided the project's modular directory structure, which is used to separate concerns between Agent Logic, Server Code, Utilities, and Tools.&lt;/p&gt;

&lt;p&gt;The starter pack acts as a powerful starting point because it automates the creation of professional infrastructure, CI/CD pipelines, and observability tools in seconds. This allows you to focus entirely on the agent's unique intelligence while ensuring the underlying platform remains secure and scalable. By building on top of this generated boilerplate with AI assistance from &lt;a href="https://blog.google/innovation-and-ai/technology/developers-tools/introducing-gemini-cli-open-source-ai-agent/" rel="noopener noreferrer"&gt;Gemini CLI&lt;/a&gt; and &lt;a href="https://antigravity.google/" rel="noopener noreferrer"&gt;Antigravity&lt;/a&gt;, the development process is highly accelerated.&lt;/p&gt;

&lt;p&gt;The agent starter pack high level architecture:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F317n2pofuobvchzxorm0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F317n2pofuobvchzxorm0.png" alt="Agent Starter Pack Architecture" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Initialize the Project
&lt;/h3&gt;

&lt;p&gt;Create a new directory for your project and initialize it. We'll use &lt;code&gt;uv&lt;/code&gt;, which is an extremely fast Python package manager.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv init dev-signal
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Folder Structure
&lt;/h3&gt;

&lt;p&gt;Our project will follow this structure. We will populate these files step-by-step.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;dev-signal/
├── dev_signal_agent/
│   ├── __init__.py
│   ├── agent.py                # Agent logic &amp;amp; orchestration
│   ├── fast_api_app.py         # Application server &amp;amp; memory connection
│   ├── app_utils/              # Env Config
│   │   └── env.py
│   └── tools/                  # External capabilities
│       ├── __init__.py
│       ├── mcp_config.py       # Tool configuration (Reddit, Docs)
│       └── nano_banana_mcp/    # Custom local image generation tool
│           ├── __init__.py
│           ├── main.py
│           ├── nano_banana_pro.py
│           ├── media_models.py
│           ├── storage_utils.py
│           └── requirements.txt
├── deployment/
│   └── terraform/              # Infrastructure as Code
├── .env                        # Local secrets (API keys)
├── Makefile                    # Shortcuts for building/deploying
├── Dockerfile                  # Container definition
└── pyproject.toml              # Dependencies
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Define Dependencies
&lt;/h3&gt;

&lt;p&gt;Update your &lt;code&gt;pyproject.toml&lt;/code&gt; with the necessary dependencies. We use &lt;code&gt;google-adk&lt;/code&gt; for the agent framework and &lt;code&gt;google-genai&lt;/code&gt; for the model interaction.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="nn"&gt;[project]&lt;/span&gt;
&lt;span class="py"&gt;name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"dev-signal"&lt;/span&gt;
&lt;span class="py"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"0.1.0"&lt;/span&gt;
&lt;span class="py"&gt;description&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"A multi-agent system for monitoring and content creation."&lt;/span&gt;
&lt;span class="py"&gt;readme&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"README.md"&lt;/span&gt;
&lt;span class="py"&gt;requires-python&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="py"&gt;"&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;3.12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="err"&gt;&amp;lt;&lt;/span&gt;&lt;span class="mf"&gt;3.14&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="err"&gt;
&lt;/span&gt;&lt;span class="py"&gt;dependencies&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="py"&gt;"google-adk&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="s"&gt;",&lt;/span&gt;&lt;span class="err"&gt;
&lt;/span&gt;    &lt;span class="py"&gt;"google-genai&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="s"&gt;",&lt;/span&gt;&lt;span class="err"&gt;
&lt;/span&gt;    &lt;span class="py"&gt;"mcp&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="s"&gt;",&lt;/span&gt;&lt;span class="err"&gt;
&lt;/span&gt;    &lt;span class="py"&gt;"python-dotenv&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="s"&gt;",&lt;/span&gt;&lt;span class="err"&gt;
&lt;/span&gt;    &lt;span class="py"&gt;"fastapi&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.110&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="s"&gt;",&lt;/span&gt;&lt;span class="err"&gt;
&lt;/span&gt;    &lt;span class="py"&gt;"uvicorn&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.29&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="s"&gt;",&lt;/span&gt;&lt;span class="err"&gt;
&lt;/span&gt;    &lt;span class="py"&gt;"google-cloud-logging&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;3.0&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="s"&gt;",&lt;/span&gt;&lt;span class="err"&gt;
&lt;/span&gt;    &lt;span class="py"&gt;"google-cloud-aiplatform&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1.38&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="s"&gt;",&lt;/span&gt;&lt;span class="err"&gt;
&lt;/span&gt;    &lt;span class="py"&gt;"fastmcp&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;2.13&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="s"&gt;",&lt;/span&gt;&lt;span class="err"&gt;
&lt;/span&gt;    &lt;span class="py"&gt;"google-cloud-storage&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;3.6&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="s"&gt;",&lt;/span&gt;&lt;span class="err"&gt;
&lt;/span&gt;    &lt;span class="py"&gt;"google-auth&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;2.0&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="s"&gt;",&lt;/span&gt;&lt;span class="err"&gt;
&lt;/span&gt;    &lt;span class="py"&gt;"google-cloud-secret-manager&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;2.26&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="s"&gt;",&lt;/span&gt;&lt;span class="err"&gt;
&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run &lt;code&gt;uv sync&lt;/code&gt; to install everything.&lt;/p&gt;

&lt;p&gt;Create a new directory for the agent code.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir &lt;/span&gt;dev_signal_agent
&lt;span class="nb"&gt;cd &lt;/span&gt;dev_signal_agent
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Building the agent capabilities: MCP tools
&lt;/h2&gt;

&lt;p&gt;Our agent needs to interact with the outside world. We use the &lt;a href="https://modelcontextprotocol.io/" rel="noopener noreferrer"&gt;&lt;strong&gt;Model Context Protocol (MCP)&lt;/strong&gt;&lt;/a&gt;, a universal standard for connecting AI agents to external data and tools, to standardize this. Instead of writing custom API wrappers, we use standard MCP servers, which let us connect to APIs (Reddit), knowledge bases (Google Cloud Docs), and even local scripts (image generation with Nano Banana Pro) through a common interface. Create a new directory for the agent tools.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir &lt;/span&gt;tools
&lt;span class="nb"&gt;cd &lt;/span&gt;tools
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Tools Configuration
&lt;/h3&gt;

&lt;p&gt;We'll define our toolsets in &lt;code&gt;dev_signal_agent/tools/mcp_config.py&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This file defines the connection parameters for our three main tools.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Reddit&lt;/strong&gt;: Connected via a local stdio subprocess.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Developer Knowledge&lt;/strong&gt;: Connected via a remote HTTP endpoint.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Nano Banana&lt;/strong&gt;: Connected via a local stdio subprocess (our custom Python script).&lt;/li&gt;
&lt;/ul&gt;
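&lt;p&gt;To make the common-interface idea concrete, here is a minimal standalone sketch of how these three toolsets might later be handed to a single agent. The &lt;code&gt;Agent&lt;/code&gt; class below is a stub standing in for the Google ADK's agent class, and the &lt;code&gt;get_*_toolset&lt;/code&gt; factories return placeholder strings rather than the real &lt;code&gt;McpToolset&lt;/code&gt; objects we define next, so the example runs on its own:&lt;/p&gt;

```python
# Sketch only: Agent is a stub standing in for the Google ADK agent class,
# and the toolset factories return placeholder strings instead of real
# McpToolset objects, so the example runs standalone.
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    tools: list = field(default_factory=list)


def get_reddit_mcp_toolset():       # local stdio subprocess in the real code
    return "reddit-toolset"


def get_dk_mcp_toolset():           # remote HTTP endpoint in the real code
    return "developer-knowledge-toolset"


def get_nano_banana_mcp_toolset():  # local stdio subprocess in the real code
    return "nano-banana-toolset"


# Regardless of transport, every toolset plugs into the same tools list.
agent = Agent(
    name="dev_signal_agent",
    tools=[
        get_reddit_mcp_toolset(),
        get_dk_mcp_toolset(),
        get_nano_banana_mcp_toolset(),
    ],
)
print(len(agent.tools))  # -> 3
```

&lt;p&gt;The transport differences (stdio vs. HTTP) live entirely inside each factory; the agent never needs to know how a tool is reached.&lt;/p&gt;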

&lt;h3&gt;
  
  
  Reddit Search (Discovery Tool)
&lt;/h3&gt;

&lt;p&gt;The &lt;a href="https://github.com/Arindam200/reddit-mcp" rel="noopener noreferrer"&gt;Reddit MCP server&lt;/a&gt; acts as a bridge to the Reddit API, allowing your agent to discover trending posts and analyze engagement without you having to write complex API wrappers. To ensure portability, the code uses a "find or fetch" strategy: it first checks for a local installation and, if missing, automatically uses &lt;code&gt;npx&lt;/code&gt; to download and run the server on demand.&lt;/p&gt;

&lt;p&gt;Instead of a network connection, the agent launches the server as a local subprocess and communicates via standard input and output (stdio). Within the Google ADK, the &lt;code&gt;McpToolset&lt;/code&gt; class acts as a universal wrapper that standardizes these connections, enabling your agent to interact with various tools, from community resources to custom scripts like the Nano Banana image generator, using a common interface. By securely passing API credentials through environment variables, the system ensures these "plug-and-play" modules function as a seamless bridge between the AI and external platforms.&lt;/p&gt;

&lt;p&gt;Paste this code in &lt;code&gt;dev_signal_agent/tools/mcp_config.py&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;shutil&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mcp&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;StdioServerParameters&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.adk.tools&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;McpToolset&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.adk.tools.mcp_tool&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;StreamableHTTPConnectionParams&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;StdioConnectionParams&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_reddit_mcp_toolset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;client_secret&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_agent&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Connects to the Reddit MCP server.
    This server runs as a local subprocess (stdio) and proxies requests to the Reddit API.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="c1"&gt;# Check if 'reddit-mcp' is installed globally, otherwise use npx to run it
&lt;/span&gt;    &lt;span class="n"&gt;cmd&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reddit-mcp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;shutil&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;which&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reddit-mcp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;npx&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;args&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;shutil&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;which&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reddit-mcp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;-y&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--quiet&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reddit-mcp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="c1"&gt;# Inject secrets into the environment of the subprocess only
&lt;/span&gt;    &lt;span class="n"&gt;env&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;DOTENV_CONFIG_SILENT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;true&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;LANG&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;en_US.UTF-8&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;client_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;REDDIT_CLIENT_ID&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client_id&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;client_secret&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;REDDIT_CLIENT_SECRET&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client_secret&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;user_agent&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;REDDIT_USER_AGENT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;user_agent&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;McpToolset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;connection_params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;StdioConnectionParams&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;server_params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;StdioServerParameters&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;  &lt;span class="c1"&gt;# Pass injected secrets directly to the subprocess
&lt;/span&gt;            &lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;120.0&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Google Cloud Docs (Knowledge Tool)
&lt;/h3&gt;

&lt;p&gt;The &lt;a href="https://developers.google.com/knowledge/mcp" rel="noopener noreferrer"&gt;Developer Knowledge MCP server&lt;/a&gt; provides grounding for your agent by allowing it to search the entire corpus of official Google Cloud documentation. Unlike the local Reddit server, this is a managed service hosted by Google and accessed as a remote endpoint over the internet. It exposes specialized tools like &lt;code&gt;google_developer_documentation_search&lt;/code&gt; for semantic queries and &lt;code&gt;google_developer_documentation_fetch&lt;/code&gt; to retrieve full markdown content, ensuring that every technical claim the agent makes is supported by definitive, up-to-date facts.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; You can also connect your coding assistant tools such as &lt;a href="https://blog.google/innovation-and-ai/technology/developers-tools/introducing-gemini-cli-open-source-ai-agent/" rel="noopener noreferrer"&gt;Gemini CLI&lt;/a&gt; or &lt;a href="https://antigravity.google/" rel="noopener noreferrer"&gt;Antigravity&lt;/a&gt; to the Developer Knowledge MCP server to give them convenient access to up-to-date Google Cloud documentation. I used it while writing this blog!&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;To connect, the agent uses the &lt;code&gt;McpToolset&lt;/code&gt; class with &lt;code&gt;StreamableHTTPConnectionParams&lt;/code&gt;, pointing to a web URL instead of launching a local process. It authenticates securely using a &lt;code&gt;DK_API_KEY&lt;/code&gt; (&lt;a href="https://developers.google.com/knowledge/mcp" rel="noopener noreferrer"&gt;create your API key&lt;/a&gt;) passed in the request headers, allowing the agent to perform a "comprehensive research sweep" across official docs, community sentiment, and broader web context through a single standardized interface.&lt;/p&gt;

&lt;p&gt;Paste this code in &lt;code&gt;dev_signal_agent/tools/mcp_config.py&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_dk_mcp_toolset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Connects to Developer Knowledge (Google Cloud Docs).
    This is a remote MCP server accessed via HTTP.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;headers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;X-Goog-Api-Key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Fallback to os.environ for local testing if not passed via API
&lt;/span&gt;        &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;X-Goog-Api-Key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;DK_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;McpToolset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;connection_params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;StreamableHTTPConnectionParams&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://developerknowledge.googleapis.com/mcp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;headers&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The Image Generator (Nano Banana MCP)
&lt;/h3&gt;

&lt;p&gt;While we've used external MCP servers for Reddit and documentation, we can also build our own custom MCP server to wrap specific Python logic. In this case, we are creating an image generation tool powered by Gemini 3 Pro Image (also known as Nano Banana Pro). This demonstrates that any Python function can be standardized into a tool that any agent can understand.&lt;/p&gt;

&lt;p&gt;How the image generation works:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://gofastmcp.com/getting-started/welcome" rel="noopener noreferrer"&gt;&lt;strong&gt;FastMCP&lt;/strong&gt;&lt;/a&gt;: We use the &lt;code&gt;fastmcp&lt;/code&gt; library to drastically simplify server creation, allowing us to register Python functions as tools with just a few lines of code.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gemini Integration&lt;/strong&gt;: The server uses the Google GenAI SDK to call the &lt;code&gt;gemini-3-pro-image-preview&lt;/code&gt; model, which converts the agent's descriptive prompts into raw image bytes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GCS Upload &amp;amp; Hosting:&lt;/strong&gt; Because agent interfaces typically require a URL to display images, the server automatically uploads the generated bytes to Google Cloud Storage (GCS) and returns a public link.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To connect this local tool, we use &lt;code&gt;StdioConnectionParams&lt;/code&gt; because the server runs as a local subprocess communicating via standard input and output. This transport method directly matches the &lt;code&gt;transport="stdio"&lt;/code&gt; configuration we will define in our server entrypoint, ensuring a seamless connection for your custom local scripts.&lt;/p&gt;

&lt;p&gt;The following code defines the MCP connection in &lt;code&gt;dev_signal_agent/tools/mcp_config.py&lt;/code&gt;. We use &lt;code&gt;uv run&lt;/code&gt; to ensure the server starts in an isolated environment with all its dependencies correctly installed.&lt;/p&gt;

&lt;p&gt;Paste this code in &lt;code&gt;dev_signal_agent/tools/mcp_config.py&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_nano_banana_mcp_toolset&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Connects to our local &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Nano Banana&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; image generator.
    This demonstrates how to wrap a local Python script as an MCP tool.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dev_signal_agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tools&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;nano_banana_mcp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;main.py&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;bucket&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AI_ASSETS_BUCKET&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;McpToolset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;connection_params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;StdioConnectionParams&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;server_params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;StdioServerParameters&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;uv&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;run&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
                &lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AI_ASSETS_BUCKET&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;bucket&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;600.0&lt;/span&gt;  &lt;span class="c1"&gt;# Image generation can take time
&lt;/span&gt;        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Implementing the Nano Banana Pro Server Logic
&lt;/h3&gt;

&lt;p&gt;Now, we will implement the actual logic for this server. This implementation is based on the &lt;a href="https://www.youtube.com/watch?v=XCGbDx7aSks&amp;amp;list=PLIivdWyY5sqLXR1eSkiM5bE6pFlXC-OSs&amp;amp;index=2" rel="noopener noreferrer"&gt;Agent Factory&lt;/a&gt; demo &lt;a href="https://github.com/GoogleCloudPlatform/devrel-demos/tree/a9a5f64a3394a4b5ecc64061f397bd5ed82927ee/ai-ml/agent-factory-antigravity-nano-banana-pro/mcp" rel="noopener noreferrer"&gt;code&lt;/a&gt; by Remigiusz Samborski. While Remi's original code provides instructions for deploying the MCP server to Cloud Run, we will run it here as a local subprocess for faster development and testing.&lt;/p&gt;

&lt;p&gt;To get started, create the directory for our new server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; dev_signal_agent/tools/nano_banana_mcp
&lt;span class="nb"&gt;cd &lt;/span&gt;dev_signal_agent/tools/nano_banana_mcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  The Server Entrypoint (&lt;code&gt;main.py&lt;/code&gt;)
&lt;/h4&gt;

&lt;p&gt;This file acts as the "brain" that initializes and starts the MCP server.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;FastMCP Initialization:&lt;/strong&gt; We use the &lt;code&gt;FastMCP&lt;/code&gt; library to create a server named "MediaGenerators" and register our &lt;code&gt;generate_image&lt;/code&gt; function as a tool.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Safe Logging:&lt;/strong&gt; The &lt;code&gt;_initialize_console_logging&lt;/code&gt; function is critical. It forces all logs to &lt;code&gt;sys.stderr&lt;/code&gt;. This is because the MCP "stdio" transport uses &lt;code&gt;sys.stdout&lt;/code&gt; for communication between the agent and the tool; standard logs sent to &lt;code&gt;stdout&lt;/code&gt; would corrupt that protocol.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Execution&lt;/strong&gt;: The &lt;code&gt;mcp.run(transport="stdio")&lt;/code&gt; line starts the server as a local subprocess, allowing it to listen for requests from your agent via standard input.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Paste this code in &lt;code&gt;dev_signal_agent/tools/nano_banana_mcp/main.py&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;sys&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;fastmcp&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;FastMCP&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dotenv&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;load_dotenv&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;nano_banana_pro&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;generate_image&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_initialize_console_logging&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;min_level&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;INFO&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Ensure logs go to STDERR so they don't break the MCP stdio protocol
&lt;/span&gt;    &lt;span class="n"&gt;handler&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;StreamHandler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;basicConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;level&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;min_level&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;handlers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;force&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;tools&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;generate_image&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;mcp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;FastMCP&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;MediaGenerators&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;load_dotenv&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="nf"&gt;_initialize_console_logging&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;mcp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transport&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stdio&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  The Generation Logic (&lt;code&gt;nano_banana_pro.py&lt;/code&gt;)
&lt;/h4&gt;

&lt;p&gt;This is where the actual image generation happens using Gemini.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GenAI Client:&lt;/strong&gt; We initialize the &lt;code&gt;genai.Client()&lt;/code&gt; to interact with Google's generative models.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Model Selection:&lt;/strong&gt; It specifically targets the &lt;code&gt;gemini-3-pro-image-preview&lt;/code&gt; model. We set the &lt;code&gt;response_modalities&lt;/code&gt; to "IMAGE" to tell the model we want pixels, not just text.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Robustness&lt;/strong&gt;: The code includes a &lt;code&gt;MAX_RETRIES&lt;/code&gt; loop (set to 5) to handle any transient generation errors, ensuring the agent has multiple attempts to get a valid image.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Byte Processing:&lt;/strong&gt; Once the model generates the image, it arrives as raw inline data. We extract these bytes and call our helper to move them to the cloud.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;URI Conversion:&lt;/strong&gt; Finally, it replaces the internal &lt;code&gt;gs://&lt;/code&gt; path with a browser-accessible &lt;code&gt;https://&lt;/code&gt; URL so the user can actually see the image.&lt;/li&gt;
&lt;/ul&gt;
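&lt;p&gt;That last step is simple string surgery: swap the &lt;code&gt;gs://&lt;/code&gt; scheme for the public storage host. Here is a hedged sketch of the idea; the &lt;code&gt;to_public_url&lt;/code&gt; helper name is hypothetical and only illustrates the rewrite:&lt;/p&gt;

```python
# Hypothetical helper illustrating the gs:// -> https:// rewrite described
# above; to_public_url is an illustration, not part of the tutorial code.
AUTHORIZED_URI = "https://storage.mtls.cloud.google.com/"


def to_public_url(gs_uri: str) -> str:
    # gs://bucket/path/img.png
    #   -> https://storage.mtls.cloud.google.com/bucket/path/img.png
    return gs_uri.replace("gs://", AUTHORIZED_URI, 1)


print(to_public_url("gs://ai-assets/posts/cover.png"))
# -> https://storage.mtls.cloud.google.com/ai-assets/posts/cover.png
```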

&lt;p&gt;Paste this code in &lt;code&gt;dev_signal_agent/tools/nano_banana_mcp/nano_banana_pro.py&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Literal&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;genai&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.genai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;types&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;media_models&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;MediaAsset&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;storage_utils&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;upload_data_to_gcs&lt;/span&gt;

&lt;span class="n"&gt;AUTHORIZED_URI&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://storage.mtls.cloud.google.com/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;MAX_RETRIES&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate_image&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;aspect_ratio&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Literal&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;16:9&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;9:16&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;16:9&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;MediaAsset&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Generates an image using Gemini 3 Image model.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;genai_client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;genai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Content&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Part&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt; &lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Starting image generation for prompt: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;asset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MediaAsset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;MAX_RETRIES&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;genai_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate_content&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-3-pro-image-preview&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;contents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;GenerateContentConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;response_modalities&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;IMAGE&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
                &lt;span class="n"&gt;image_config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;ImageConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;aspect_ratio&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;aspect_ratio&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;part&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;part&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;inline_data&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;part&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;inline_data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="c1"&gt;# Upload the raw bytes to GCS
&lt;/span&gt;                    &lt;span class="n"&gt;gcs_uri&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;upload_data_to_gcs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mcp-tools&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="n"&gt;part&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;inline_data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="n"&gt;part&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;inline_data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mime_type&lt;/span&gt;
                    &lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="n"&gt;asset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MediaAsset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;gcs_uri&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="k"&gt;break&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;asset&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;break&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;asset&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;asset&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;No image was generated.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Convert gs:// URI to an HTTP accessible URL if needed
&lt;/span&gt;        &lt;span class="n"&gt;asset&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;uri&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;asset&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;gs://&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;AUTHORIZED_URI&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Image URL: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;asset&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;asset&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
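&lt;p&gt;The &lt;code&gt;gs://&lt;/code&gt; to HTTPS rewrite at the end of &lt;code&gt;generate_image&lt;/code&gt; is a plain string replacement. Here is a minimal stdlib-only sketch (the bucket and object names are made up for illustration):&lt;/p&gt;

```python
# Rewrites a gs:// URI into an HTTPS URL that clients can fetch,
# provided the caller has access to the bucket.
AUTHORIZED_URI = "https://storage.mtls.cloud.google.com/"

def to_http_url(gcs_uri: str) -> str:
    # gs://bucket/path becomes https://storage.mtls.cloud.google.com/bucket/path
    return gcs_uri.replace("gs://", AUTHORIZED_URI)

http_uri = to_http_url("gs://my-bucket/assets/mcp-tools/abc123.png")
print(http_uri)
```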



&lt;h4&gt;
  
  
  GCS Upload Helper (&lt;code&gt;storage_utils.py&lt;/code&gt;)
&lt;/h4&gt;

&lt;p&gt;Since agents need a web link to display images, this utility handles the hosting on Google Cloud Storage (GCS).&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Dynamic Bucket Selection&lt;/strong&gt;: It looks for a bucket name in your environment variables, falling back from &lt;code&gt;AI_ASSETS_BUCKET&lt;/code&gt; to &lt;code&gt;LOGS_BUCKET_NAME&lt;/code&gt; to ensure it always has a place to save data.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Unique Filenames:&lt;/strong&gt; We use an MD5 hash of the raw image data to create a unique filename. This prevents filename collisions and acts as a simple way to avoid duplicate uploads of the same image.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cloud Upload:&lt;/strong&gt; The &lt;code&gt;blob.upload_from_string&lt;/code&gt; method pushes the raw image bytes directly to your GCS bucket.&lt;/li&gt;
&lt;/ul&gt;
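&lt;p&gt;To see why this naming scheme avoids both collisions and duplicate uploads, the logic from the bullets above can be isolated into a stdlib-only sketch (&lt;code&gt;make_blob_name&lt;/code&gt; is a hypothetical helper used here for illustration):&lt;/p&gt;

```python
import hashlib
import mimetypes

def make_blob_name(agent_id: str, data: bytes, mime_type: str) -> str:
    # The MD5 hex digest of the raw bytes is the filename, so the
    # same image always maps to the same object name.
    file_name = hashlib.md5(data).hexdigest()
    ext = mimetypes.guess_extension(mime_type) or ""
    return f"assets/{agent_id}/{file_name}{ext}"

# Re-uploading identical bytes produces the identical object name,
# so duplicates simply overwrite each other in GCS.
print(make_blob_name("mcp-tools", b"hello", "image/png"))
```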

&lt;p&gt;Paste this code in &lt;code&gt;dev_signal_agent/tools/nano_banana_mcp/storage_utils.py&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;hashlib&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;mimetypes&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.cloud.storage&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Blob&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dotenv&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;load_dotenv&lt;/span&gt;

&lt;span class="nf"&gt;load_dotenv&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;storage_client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;ai_bucket_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AI_ASSETS_BUCKET&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;LOGS_BUCKET_NAME&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;ai_bucket&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;storage_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;bucket&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ai_bucket_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;upload_data_to_gcs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agent_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bytes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mime_type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;file_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hashlib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;md5&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;hexdigest&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;ext&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mimetypes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;guess_extension&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mime_type&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;
    &lt;span class="n"&gt;blob_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;assets/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;agent_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;file_name&lt;/span&gt;&lt;span class="si"&gt;}{&lt;/span&gt;&lt;span class="n"&gt;ext&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;blob&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Blob&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bucket&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;ai_bucket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;blob_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;blob&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;upload_from_string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;mime_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;storage_client&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gs://&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;ai_bucket_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;blob_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Data Model (&lt;code&gt;media_models.py&lt;/code&gt;)
&lt;/h4&gt;

&lt;p&gt;This file ensures that our data follows a strict structure (a schema).&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Structured Output:&lt;/strong&gt; By using a Pydantic &lt;code&gt;BaseModel&lt;/code&gt;, we guarantee that the tool always returns a consistent JSON object containing a &lt;code&gt;uri&lt;/code&gt; (the link) and an optional &lt;code&gt;error&lt;/code&gt; message. This makes it much easier for the AI agent to understand and process the tool's result.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Paste this code in &lt;code&gt;dev_signal_agent/tools/nano_banana_mcp/media_models.py&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pydantic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseModel&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MediaAsset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Tool Dependencies (&lt;code&gt;requirements.txt&lt;/code&gt;)
&lt;/h4&gt;

&lt;p&gt;While we use &lt;code&gt;uv&lt;/code&gt; to run our code, a &lt;code&gt;requirements.txt&lt;/code&gt; file is still essential: it tells &lt;code&gt;uv&lt;/code&gt; exactly which dependencies to install into the isolated environment before the Nano Banana server starts.&lt;/p&gt;

&lt;p&gt;This file lists the three core libraries required for this tool:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;google-cloud-storage&lt;/strong&gt;: Used for hosting the generated images on the cloud.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;google-genai&lt;/strong&gt;: Provides the logic for the Gemini 3 Pro image generation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;fastmcp&lt;/strong&gt;: The framework that turns our Python script into a standardized MCP tool.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Paste this code in &lt;code&gt;dev_signal_agent/tools/nano_banana_mcp/requirements.txt&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="py"&gt;google-cloud-storage=&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;3.6&lt;/span&gt;&lt;span class="err"&gt;.*&lt;/span&gt;
&lt;span class="py"&gt;google-genai=&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1.52&lt;/span&gt;&lt;span class="err"&gt;.*&lt;/span&gt;
&lt;span class="py"&gt;fastmcp=&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;2.13&lt;/span&gt;&lt;span class="err"&gt;.*&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;In this first part of our series, we focused on establishing the agent's core capabilities by standardizing its external integrations through the Model Context Protocol (MCP). We initialized the project using &lt;code&gt;uv&lt;/code&gt; for high-speed dependency management and successfully configured three critical toolsets: Reddit for trend discovery, Google Cloud Docs for technical grounding, and a custom "Nano Banana" MCP server for multimodal image generation. By utilizing the Google ADK's &lt;code&gt;McpToolset&lt;/code&gt;, we've abstracted away complex API logic into simple, plug-and-play modules, ensuring that our tools share a common interface that decouples integration from intelligence.&lt;/p&gt;

&lt;p&gt;For a deeper look into our technical foundation, you can explore the &lt;a href="https://developers.google.com/knowledge/mcp" rel="noopener noreferrer"&gt;Developer Knowledge MCP server&lt;/a&gt; to learn more about knowledge grounding or visit the &lt;a href="https://github.com/google/adk-python" rel="noopener noreferrer"&gt;Google ADK GitHub repository&lt;/a&gt; to explore the framework's core capabilities.&lt;/p&gt;

&lt;p&gt;With our toolset fully configured and ready for action, we can now move to &lt;a href="https://dev.to/googleai/architect-a-personalized-multi-agent-system-with-long-term-memory-3o15"&gt;Part 2&lt;/a&gt;, where we will build the multi-agent architecture and integrate the Vertex AI memory bank to orchestrate these capabilities. You can also jump ahead to &lt;a href="https://dev.to/googleai/local-testing-of-a-multi-agent-system-with-memory-37mm"&gt;Part 3&lt;/a&gt;, where we will show you how to test the agent locally to verify these components on your workstation. The complete code for the entire series is available in our &lt;a href="https://github.com/GoogleCloudPlatform/devrel-demos/tree/main/ai-ml/dev-signal" rel="noopener noreferrer"&gt;GitHub repository&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Special thanks to &lt;a href="https://www.linkedin.com/in/remigiusz-samborski/" rel="noopener noreferrer"&gt;Remigiusz Samborski&lt;/a&gt; for the helpful review and feedback on this article.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;For more content like this, follow Shir on &lt;a href="https://www.linkedin.com/in/shirmeirlador/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; and &lt;a href="https://x.com/shirmeir86" rel="noopener noreferrer"&gt;X&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>googlecloud</category>
      <category>python</category>
      <category>agents</category>
    </item>
    <item>
      <title>Multimodal RAG with the Gemini API File Search Tool: A Developer Guide</title>
      <dc:creator>Patrick Loeber</dc:creator>
      <pubDate>Tue, 05 May 2026 17:43:58 +0000</pubDate>
      <link>https://dev.to/googleai/multimodal-rag-with-the-gemini-api-file-search-tool-a-developer-guide-5878</link>
      <guid>https://dev.to/googleai/multimodal-rag-with-the-gemini-api-file-search-tool-a-developer-guide-5878</guid>
      <description>&lt;p&gt;The File Search tool in the Gemini API now supports multimodal retrieval by adding support for &lt;a href="https://developers.googleblog.com/en/building-with-gemini-embedding-2/" rel="noopener noreferrer"&gt;Gemini Embedding 2&lt;/a&gt;. This update allows images, such as charts, product photos, and diagrams, to be natively indexed and searched in the same store as your text-based documents.&lt;/p&gt;

&lt;p&gt;This post covers how to use the File Search tool end-to-end: creating a store, uploading documents and images, querying with grounded generation, and retrieving image citations.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is File Search?
&lt;/h2&gt;


&lt;p&gt;&lt;a href="https://ai.google.dev/gemini-api/docs/file-search" rel="noopener noreferrer"&gt;File Search&lt;/a&gt; is the Gemini API's built-in RAG tool. When you upload your documents, the API takes care of the heavy lifting: chunking, embedding, indexing, and retrieval. At query time, pass a &lt;code&gt;file_search&lt;/code&gt; tool alongside your prompt, and the model automatically retrieves relevant chunks from your data to generate a grounded response.&lt;/p&gt;

&lt;p&gt;Compared to rolling your own RAG pipeline, File Search offers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fully managed&lt;/strong&gt;: No vector databases to provision or embedding pipeline to maintain.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost-effective&lt;/strong&gt;: Storage and query-time embeddings are free. You only pay for the initial indexing embeddings and the standard Gemini input/output tokens.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Built-in citations&lt;/strong&gt;: Every response includes grounding metadata that links the answer to specific documents and pages. For multimodal stores, citations also include downloadable image references.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Native image search&lt;/strong&gt;: With the &lt;code&gt;gemini-embedding-2&lt;/code&gt; model, images are embedded directly rather than relying on OCR, enabling true visual retrieval.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Try It in AI Studio
&lt;/h2&gt;

&lt;p&gt;Want to see multimodal File Search in action before writing any code? We built an &lt;a href="https://ai.studio/apps/acb0ca81-7130-43ae-a31f-bedd96d28294" rel="noopener noreferrer"&gt;example app in AI Studio&lt;/a&gt; that lets you chat with your documents and image library. Upload PDFs and images, then ask questions. The app retrieves relevant text and visuals in real time, complete with citations and page numbers so you can trace every answer back to its source.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw6ycvixl7n5v7xzumpqz.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw6ycvixl7n5v7xzumpqz.gif" alt=" " width="480" height="350"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Create a File Search Store
&lt;/h3&gt;

&lt;p&gt;A File Search Store is a persistent container for your document embeddings. Think of it as a managed vector database scoped to a project.&lt;/p&gt;

&lt;p&gt;To enable multimodal search over images, specify &lt;code&gt;gemini-embedding-2&lt;/code&gt; as the embedding model. This parameter is optional; if omitted, the store defaults to &lt;code&gt;gemini-embedding-001&lt;/code&gt;, which is cost-optimized for text-only workloads. Note that the embedding model is fixed at store creation and cannot be changed later.&lt;/p&gt;

&lt;p&gt;To use the new features, make sure to install the latest Python SDK: &lt;code&gt;pip install -U google-genai&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;genai&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.genai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;types&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;genai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Create a multimodal store with gemini-embedding-2
# Omit embedding_model to use the default text-only model (gemini-embedding-001)
&lt;/span&gt;&lt;span class="n"&gt;file_search_store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;file_search_stores&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;display_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;product-catalog&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;embedding_model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;models/gemini-embedding-2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Created store: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;file_search_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Embedding Model&lt;/th&gt;
&lt;th&gt;Best For&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;gemini-embedding-001&lt;/code&gt; (default)&lt;/td&gt;
&lt;td&gt;Text-heavy workloads, cost-optimized&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;gemini-embedding-2&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Multimodal retrieval (documents &lt;em&gt;and&lt;/em&gt; images)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Step 2: Upload Documents and Images
&lt;/h3&gt;

&lt;p&gt;The simplest path is the &lt;code&gt;upload_to_file_search_store&lt;/code&gt; method, which uploads and indexes a file in one step. With &lt;code&gt;gemini-embedding-2&lt;/code&gt;, this works for both documents and images:&lt;/p&gt;

&lt;p&gt;Note: Audio and video formats are currently not supported.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;

&lt;span class="c1"&gt;# Upload a PDF document
&lt;/span&gt;&lt;span class="n"&gt;operation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;file_search_stores&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;upload_to_file_search_store&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;file_search_store_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;file_search_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;product_catalog.pdf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;display_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Product Catalog&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Wait for ingestion to complete
&lt;/span&gt;&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;operation&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;done&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;operation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;operations&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;operation&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Upload product images directly
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;image_file&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sneaker_red.png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sneaker_blue.jpeg&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sneaker_white.png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="n"&gt;op&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;file_search_stores&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;upload_to_file_search_store&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;file_search_store_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;file_search_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;image_file&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;display_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;image_file&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;op&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;done&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;op&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;operations&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;op&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;All files indexed!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Behind the scenes, the API chunks documents, generates embeddings, and indexes the content. When using &lt;code&gt;gemini-embedding-2&lt;/code&gt;, images within PDFs are also natively embedded alongside the text.&lt;/p&gt;

&lt;p&gt;You can also &lt;a href="https://ai.google.dev/gemini-api/docs/file-search#importing-files" rel="noopener noreferrer"&gt;import existing files&lt;/a&gt; from the Files API into a store.&lt;/p&gt;
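&lt;p&gt;The polling loop above repeats for every upload, so it's worth wrapping once. The helper below is our own (not part of the SDK); it only assumes the operation shape shown above: a &lt;code&gt;done&lt;/code&gt; flag that &lt;code&gt;client.operations.get&lt;/code&gt; refreshes.&lt;/p&gt;

```python
import time


def wait_for_operation(client, operation, poll_seconds=5):
    """Poll a long-running operation until its `done` flag is set.

    Mirrors the while-loop used for each upload above; returns the
    final operation object once ingestion has completed.
    """
    while not operation.done:
        time.sleep(poll_seconds)
        operation = client.operations.get(operation)
    return operation
```

&lt;p&gt;With this in place, each upload in the loop becomes a single &lt;code&gt;op = wait_for_operation(client, op)&lt;/code&gt; call.&lt;/p&gt;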

&lt;h3&gt;
  
  
  Step 3: Query with File Search
&lt;/h3&gt;

&lt;p&gt;Query your data by passing the &lt;code&gt;file_search&lt;/code&gt; tool to &lt;code&gt;generate_content&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate_content&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-3-flash-preview&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;contents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Which sneakers come in red?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tools&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;file_search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;file_search_store_names&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;file_search_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The system searches the File Search store for the most relevant chunks and uses them to generate a grounded response.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 4: Inspect Citations and Retrieve Images
&lt;/h3&gt;

&lt;p&gt;Every File Search response includes grounding metadata — essentially, a bibliography for the model's answer. It captures page numbers for the indexed information, allowing applications to point users directly to the right spot in a document. This is especially useful for rigorous fact-checking over large PDFs.&lt;/p&gt;

&lt;p&gt;With multimodal stores, citations can include a &lt;code&gt;media_id&lt;/code&gt; for referenced images, which can be downloaded directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;grounding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;candidates&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;grounding_metadata&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;grounding&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;grounding_chunks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;retrieved_context&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;media_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# This is an image citation — download it
&lt;/span&gt;        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Cited image: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;   Media ID: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;media_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;blob&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;file_search_stores&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;download_media&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;media_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;media_id&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cited_&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;wb&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;blob&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Text citation with exact page number
&lt;/span&gt;        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Cited text: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;page_number&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;   Page: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;page_number&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;   &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# See which parts of the response are grounded in which sources
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;support&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;grounding&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;grounding_supports&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Claim: &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;support&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;segment&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;  Grounded in chunks: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;support&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;grounding_chunk_indices&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is powerful for building user-facing applications. It's now possible to show users the &lt;em&gt;actual images&lt;/em&gt; the model used in its reasoning, not just a text description.&lt;/p&gt;
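&lt;p&gt;For a UI, the citation loop above can be condensed into a small formatter. This is a sketch of our own (not an SDK helper); it assumes chunk objects shaped like the ones above, with a &lt;code&gt;retrieved_context&lt;/code&gt; carrying &lt;code&gt;media_id&lt;/code&gt;, &lt;code&gt;title&lt;/code&gt;, and &lt;code&gt;page_number&lt;/code&gt; fields:&lt;/p&gt;

```python
def format_citations(grounding_chunks):
    """Turn grounding chunks into (kind, label) pairs for display.

    Image citations are labeled by title; text citations include the
    page number when the store recorded one.
    """
    entries = []
    for chunk in grounding_chunks:
        ctx = chunk.retrieved_context
        if getattr(ctx, "media_id", None):
            entries.append(("image", ctx.title))
        elif getattr(ctx, "page_number", None):
            entries.append(("text", f"{ctx.title}, p. {ctx.page_number}"))
        else:
            entries.append(("text", ctx.title))
    return entries
```

&lt;p&gt;The resulting pairs can be rendered as thumbnails (downloading each &lt;code&gt;media_id&lt;/code&gt; as shown above) or as footnote-style source links.&lt;/p&gt;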

&lt;h2&gt;
  
  
  Managing Stores
&lt;/h2&gt;

&lt;p&gt;Here's a quick reference for managing stores and documents:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# List all stores
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;store&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;file_search_stores&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; — &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;display_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# List documents in a store
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;file_search_stores&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;parent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;file_search_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;  &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Delete a specific document
&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;file_search_stores&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fileSearchStores/my-store/documents/old_doc&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Delete an entire store (force=True also deletes all contained documents)
&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;file_search_stores&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;file_search_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;force&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
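&lt;p&gt;During development it's easy to accumulate stale stores. Here's a cleanup sketch built from the calls above (the helper itself is ours, not part of the SDK); note that deletion with &lt;code&gt;force=True&lt;/code&gt; is irreversible and removes all contained documents:&lt;/p&gt;

```python
def delete_all_stores(client):
    """Delete every File Search store under the project.

    Irreversible: force=True also removes the documents inside
    each store. Returns the names that were deleted.
    """
    deleted = []
    for store in client.file_search_stores.list():
        client.file_search_stores.delete(
            name=store.name,
            config={"force": True},
        )
        deleted.append(store.name)
    return deleted
```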



&lt;h2&gt;
  
  
  Power Features
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Custom Metadata and Filtering
&lt;/h3&gt;

&lt;p&gt;You can attach metadata to documents at upload time and use it to filter at query time. This is essential when a store contains diverse documents and searches need to be scoped:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Upload with metadata
&lt;/span&gt;&lt;span class="n"&gt;op&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;file_search_stores&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;upload_to_file_search_store&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;file_search_store_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;file_search_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;shoes_collection.pdf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;display_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Spring 2026 Shoes&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;custom_metadata&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;category&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;string_value&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;footwear&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;season&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;string_value&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;spring-2026&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;price_tier&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;numeric_value&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Query with a metadata filter
&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate_content&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-3-flash-preview&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;contents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Do you have blue spring shoes?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tools&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;file_search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;file_search_store_names&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;file_search_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;metadata_filter&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;category=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;footwear&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; AND season=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;spring-2026&lt;/span&gt;&lt;span class="sh"&gt;"'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
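&lt;p&gt;Filter strings get fiddly to assemble by hand once conditions come from user input. A small helper of our own that emits the &lt;code&gt;key="value" AND ...&lt;/code&gt; form used in the query above (string values quoted, numeric values left bare):&lt;/p&gt;

```python
def build_metadata_filter(**conditions):
    """Join key=value conditions with AND, matching the filter
    syntax shown in the query above. Strings are double-quoted;
    numbers are emitted unquoted."""
    parts = []
    for key, value in conditions.items():
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            parts.append(f"{key}={value}")
        else:
            parts.append(f'{key}="{value}"')
    return " AND ".join(parts)
```

&lt;p&gt;For example, &lt;code&gt;build_metadata_filter(category="footwear", season="spring-2026")&lt;/code&gt; produces exactly the filter string passed in the snippet above.&lt;/p&gt;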



&lt;h3&gt;
  
  
  Structured Output
&lt;/h3&gt;

&lt;p&gt;Starting with Gemini 3 models, File Search can be combined with structured output. This is perfect for extracting structured data from grounded responses:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pydantic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Field&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ProductMatch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Product name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Brief product description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;confidence&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;How confident the match is&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate_content&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-3-flash-preview&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;contents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Find products similar to a red running shoe&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tools&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;file_search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;file_search_store_names&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;file_search_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;response_mime_type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;response_schema&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ProductMatch&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;model_json_schema&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
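&lt;p&gt;Because the output is constrained to the schema, &lt;code&gt;response.text&lt;/code&gt; is JSON that matches &lt;code&gt;ProductMatch&lt;/code&gt;; with pydantic, &lt;code&gt;ProductMatch.model_validate_json(response.text)&lt;/code&gt; returns a typed object in one call. A dependency-free sketch of the same parse (the helper name is ours):&lt;/p&gt;

```python
import json


def parse_product_match(response_text):
    """Parse the JSON emitted under the ProductMatch schema.

    Equivalent to ProductMatch.model_validate_json(response_text),
    but using only the stdlib: load the JSON and check that the
    schema's fields are present.
    """
    data = json.loads(response_text)
    missing = {"name", "description", "confidence"} - data.keys()
    if missing:
        raise ValueError(f"response missing fields: {sorted(missing)}")
    return data
```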



&lt;h3&gt;
  
  
  Chunking Configuration
&lt;/h3&gt;

&lt;p&gt;For more control over how documents are split, the chunking strategy can be configured:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;operation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;file_search_stores&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;upload_to_file_search_store&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;file_search_store_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;file_search_store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;long_document.pdf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;display_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Technical Manual&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;chunking_config&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;white_space_config&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;max_tokens_per_chunk&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;max_overlap_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Use Cases
&lt;/h2&gt;

&lt;p&gt;With multimodal retrieval, File Search opens up scenarios that text-only RAG can't handle:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Visual product search&lt;/strong&gt;: Index catalogs with images and spec sheets, then search by visual similarity or natural language descriptions.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Research and technical documentation&lt;/strong&gt;: Retrieve specific charts, architecture diagrams, or data visualizations from papers and reports.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Insurance and claims processing&lt;/strong&gt;: Combine structured forms with damage photos for unified document and visual assessment.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Design systems&lt;/strong&gt;: Make component libraries searchable by visual appearance, not just naming conventions.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real estate and property listings&lt;/strong&gt;: Match properties based on floor plans, interior photos, and visual preferences.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Pricing
&lt;/h2&gt;

&lt;p&gt;File Search is designed to be cost-effective:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Indexing:&lt;/strong&gt; You pay for embeddings at indexing time (&lt;a href="https://ai.google.dev/gemini-api/docs/pricing#gemini-embedding-2" rel="noopener noreferrer"&gt;embeddings pricing&lt;/a&gt;).
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Storage:&lt;/strong&gt; Free.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Query-time embeddings:&lt;/strong&gt; Free.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Retrieved tokens:&lt;/strong&gt; Charged as regular context tokens.&lt;/li&gt;
&lt;/ul&gt;
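&lt;p&gt;As a rough sketch of how these line items combine, here is a small cost estimator. The per-million-token prices below are placeholders, not real figures; always check the linked pricing page:&lt;/p&gt;

```python
# Placeholder prices for illustration only -- NOT real Gemini API pricing.
EMBED_PRICE_PER_M = 0.15    # $ per 1M tokens embedded at indexing time (assumed)
CONTEXT_PRICE_PER_M = 0.30  # $ per 1M retrieved tokens billed as context (assumed)

def indexing_cost(total_doc_tokens):
    # One-time cost: embeddings are billed when documents are indexed.
    return total_doc_tokens / 1e6 * EMBED_PRICE_PER_M

def monthly_query_cost(retrieved_tokens_per_query, queries_per_month):
    # Storage and query-time embeddings are free; only the retrieved
    # chunks that enter the model context are billed.
    return retrieved_tokens_per_query * queries_per_month / 1e6 * CONTEXT_PRICE_PER_M
```

&lt;p&gt;With these placeholder rates, indexing a 1M-token corpus costs $0.15 once, and 500 queries each retrieving 2,000 tokens cost $0.30.&lt;/p&gt;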

&lt;h2&gt;
  
  
  Get Started
&lt;/h2&gt;

&lt;p&gt;Here's everything needed to get started:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://ai.google.dev/gemini-api/docs/file-search" rel="noopener noreferrer"&gt;File Search documentation&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/google-gemini/cookbook/blob/main/quickstarts/File_Search.ipynb" rel="noopener noreferrer"&gt;File Search quickstart notebook&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/googleapis/python-genai" rel="noopener noreferrer"&gt;The latest Python SDK&lt;/a&gt;: Install it with &lt;code&gt;pip install -U google-genai&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://aistudio.google.com/apikey" rel="noopener noreferrer"&gt;Get an API key&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Create your store with &lt;code&gt;gemini-embedding-2&lt;/code&gt;, upload some images, and start building multimodal RAG applications.&lt;/p&gt;
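&lt;p&gt;As a minimal sketch of that flow with the &lt;code&gt;google-genai&lt;/code&gt; SDK (the store name, file name, and model name below are placeholders; the dict-form tool entry mirrors the typed config shown earlier in this post):&lt;/p&gt;

```python
def file_search_tool(store_names):
    # Dict form of the File Search tool entry passed to generate_content.
    return {"file_search": {"file_search_store_names": list(store_names)}}

def run_demo():
    # Requires GEMINI_API_KEY in the environment; makes network calls.
    from google import genai  # pip install -U google-genai

    client = genai.Client()

    # 1. Create a store and index a document (placeholder file name).
    #    In production, poll the returned operation until indexing completes.
    store = client.file_search_stores.create(
        config={"display_name": "demo-store"}
    )
    client.file_search_stores.upload_to_file_search_store(
        file_search_store_name=store.name,
        file="catalog_page.pdf",
    )

    # 2. Query with the File Search tool attached.
    response = client.models.generate_content(
        model="gemini-flash-latest",  # substitute the model you use
        contents="Which products are waterproof?",
        config={"tools": [file_search_tool([store.name])]},
    )
    return response.text
```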

</description>
      <category>api</category>
      <category>gemini</category>
      <category>rag</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Fine-Tuning Gemma 4 with Cloud Run Jobs: Serverless GPUs (NVIDIA RTX 6000 Pro) for pet breed classification 🐈🐕</title>
      <dc:creator>Shir Meir Lador</dc:creator>
      <pubDate>Tue, 28 Apr 2026 19:54:21 +0000</pubDate>
      <link>https://dev.to/googleai/fine-tuning-gemma-4-with-cloud-run-jobs-serverless-gpus-nvidia-rtx-6000-pro-for-pet-breed-45ib</link>
      <guid>https://dev.to/googleai/fine-tuning-gemma-4-with-cloud-run-jobs-serverless-gpus-nvidia-rtx-6000-pro-for-pet-breed-45ib</guid>
      <description>&lt;p&gt;Google has just announced &lt;a href="https://blog.google/innovation-and-ai/technology/developers-tools/gemma-4/" rel="noopener noreferrer"&gt;the release of &lt;strong&gt;Gemma 4&lt;/strong&gt;&lt;/a&gt;! This new generation of open models brings significant advancements, particularly in reasoning capabilities and architectural efficiency.&lt;/p&gt;

&lt;h2&gt;
  
  
  Bridging Reasoning and Precision with Gemma 4
&lt;/h2&gt;

&lt;p&gt;In my previous blog, I demonstrated how to &lt;a href="https://dev.to/googleai/fine-tuning-gemma-3-with-cloud-run-jobs-serverless-gpus-nvidia-rtx-6000-pro-for-pet-breed-248b"&gt;fine-tune Gemma 3 27B on &lt;strong&gt;Cloud Run Jobs&lt;/strong&gt; using &lt;strong&gt;NVIDIA RTX PRO 6000 Blackwell Edition GPUs&lt;/strong&gt; for pet breed classification&lt;/a&gt;. With the release of Gemma 4, I couldn't wait to update my pipeline and see how the new model performs.&lt;/p&gt;

&lt;p&gt;In this follow-up post, I'll explain what makes Gemma 4 different, the benefits it brings, and exactly what file modifications and workarounds are needed to successfully fine-tune it using PEFT (LoRA) on Cloud Run. We'll cover everything from memory requirements and dynamic label masking to prompt structures for reasoning models. Whether you read the previous post or are new to this pipeline, this guide will provide a complete, working solution for Gemma 4.&lt;/p&gt;

&lt;p&gt;If you'd rather dive straight into the code and explore it at your own pace, you can clone the repository &lt;a href="https://github.com/GoogleCloudPlatform/devrel-demos/tree/main/ai-ml/finetune_gemma" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's New in Gemma 4?
&lt;/h2&gt;

&lt;p&gt;Gemma 4 introduces groundbreaking improvements over Gemma 3, making it Google's most intelligent open model family to date:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Apache 2.0 License&lt;/strong&gt;: Gemma 4 is released under a commercially permissive Apache 2.0 license, providing full developer flexibility.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Highly Competitive Benchmarks&lt;/strong&gt;: The 31B model ranks as the #3 open model on the Arena AI text leaderboard, while the 26B MoE model ranks #6, outcompeting models 20x their size!&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Advanced Reasoning &amp;amp; Agents&lt;/strong&gt;: Purpose-built for multi-step planning and deep logic. It features native support for function-calling, structured JSON output, and native system instructions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multimodal &amp;amp; Long Context&lt;/strong&gt;: Natively processes images, video, and even audio (in edge models). It supports up to a 256K context window for larger models.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Versatile Architectures&lt;/strong&gt;: Includes a 26B Mixture of Experts (MoE) model that only activates 3.8B parameters during inference for fast response times.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Because of these changes, simply dropping Gemma 4 into a Gemma 3 fine-tuning script won't work out of the box. Here is a breakdown of what needed to change in the codebase to make it work.&lt;/p&gt;

&lt;h2&gt;
  
  
  GPU Memory and Parameter Capacity
&lt;/h2&gt;

&lt;p&gt;With the availability of &lt;a href="https://cloud.google.com/blog/products/serverless/cloud-run-supports-nvidia-rtx-6000-pro-gpus-for-ai-workloads" rel="noopener noreferrer"&gt;&lt;strong&gt;NVIDIA RTX PRO 6000&lt;/strong&gt; GPUs&lt;/a&gt; on &lt;a href="https://docs.cloud.google.com/run/docs" rel="noopener noreferrer"&gt;Cloud Run&lt;/a&gt;, we now have access to &lt;strong&gt;96GB of VRAM&lt;/strong&gt;. This is a game-changer for hosting and fine-tuning large models.&lt;/p&gt;

&lt;p&gt;According to the formula discussed in my blog post on &lt;a href="https://cloud.google.com/blog/topics/developers-practitioners/decoding-high-bandwidth-memory-a-practical-guide-to-gpu-memory-for-fine-tuning-ai-models/" rel="noopener noreferrer"&gt;Decoding high-bandwidth memory&lt;/a&gt;: &lt;em&gt;Total HBM ≈ (Model Size) + (Optimizer States) + (Gradients) + (Activations)&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;When using &lt;strong&gt;LoRA&lt;/strong&gt; (Low-Rank Adaptation), we freeze the base model weights and only train a small subset of parameters. This means the memory-hungry gradients and optimizer states are negligible for the base model. For &lt;strong&gt;Gemma 4 31B&lt;/strong&gt; loaded in 16-bit precision (bfloat16), the base model size is roughly &lt;em&gt;31 billion parameters × 2 bytes/parameter ≈ 62 GB.&lt;/em&gt; While this 62GB model fits comfortably within the &lt;strong&gt;96GB of VRAM&lt;/strong&gt; available on the RTX 6000 Pro, we can do even better!&lt;/p&gt;

&lt;p&gt;By applying &lt;strong&gt;4-bit quantization (QLoRA)&lt;/strong&gt; via the bitsandbytes library, we dramatically shrink this base memory footprint to roughly 18–20GB. This leaves an enormous amount of VRAM overhead exclusively dedicated to the high-memory activations required by multi-modal processing and long-context training batches, unlocking unparalleled serverless efficiency!&lt;/p&gt;
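&lt;p&gt;The back-of-envelope arithmetic above is easy to reproduce. This sketch counts weight bytes only; activations, LoRA optimizer states, and 4-bit quantization block metadata come on top, which is why real 4-bit usage lands closer to 18–20GB than the raw 15.5GB:&lt;/p&gt;

```python
BYTES_PER_GB = 1e9  # decimal gigabytes, matching the figures in the text

def weight_footprint_gb(num_params, bytes_per_param):
    # Weight-only footprint of the frozen base model. Gradients and
    # optimizer states for the LoRA adapters are negligible by comparison.
    return num_params * bytes_per_param / BYTES_PER_GB

PARAMS = 31e9  # Gemma 4 31B

bf16_gb = weight_footprint_gb(PARAMS, 2.0)  # bfloat16: 2 bytes/param, about 62 GB
nf4_gb = weight_footprint_gb(PARAMS, 0.5)   # 4-bit NF4: 0.5 bytes/param, about 15.5 GB
```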

&lt;h2&gt;
  
  
  Key Code Changes for Gemma 4 Migration
&lt;/h2&gt;

&lt;p&gt;If you are updating your own script or starting fresh, these are the critical adjustments made to the pipeline:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Multimodal Input Ordering &amp;amp; Integrated Instructions
&lt;/h3&gt;

&lt;p&gt;While Gemma 4 supports interleaved inputs and a &lt;a href="https://ai.google.dev/gemma/docs/core/model_card_4#mixture-of-experts_moe_model" rel="noopener noreferrer"&gt;native system role&lt;/a&gt;, we recommend providing the image data before the text as a stable convention and merging instructions into the user prompt for this pipeline. We found this 'single-turn' structure more effective for maintaining instruction-following precision and simplifying our custom masking logic.&lt;/p&gt;

&lt;p&gt;In the code below, the &lt;code&gt;{"type": "image"}&lt;/code&gt; entry acts as a placeholder that signals the processor to inject special image tokens into the chat template. The actual image tensors are then passed separately during the data collation step to ensure the multimodal architecture is adapted correctly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;full_user_content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s"&gt;Identify the breed of the animal in this image.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;  &lt;span class="c1"&gt;# Image must come first!
&lt;/span&gt;      &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;full_user_content&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;assistant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;example&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;caption&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]}]&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Loading the Correct Multimodal Architecture
&lt;/h3&gt;

&lt;p&gt;Gemma 4 natively processes images, video, and even audio (in the E2B and E4B models), which changes how the model must be loaded. To correctly handle these diverse inputs, we explicitly use the &lt;code&gt;AutoModelForMultimodalLM&lt;/code&gt; class. While &lt;code&gt;AutoModelForImageTextToText&lt;/code&gt; remains a valid option for purely image-based tasks, the multimodal class is the more precise choice for the Gemma 4 architecture, ensuring it is ready to handle video and audio data natively.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AutoModelForMultimodalLM&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;AutoModelForMultimodalLM&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;model_kwargs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Label Masking for Multimodal Data
&lt;/h3&gt;

&lt;p&gt;In Gemma 3, we could hardcode specific token IDs to find where the assistant's response started to mask the prompt. For Gemma 4, we initially tried tokenizing the text prompt separately to find its length, but hit a major snag.&lt;/p&gt;

&lt;p&gt;Gemma 4 is highly efficient with media: each image gets a &lt;a href="https://ai.google.dev/gemma/docs/capabilities/vision#variable-resolution" rel="noopener noreferrer"&gt;dynamic number of soft tokens&lt;/a&gt; exactly fitted to its content. While these image soft tokens are highly stable and pre-computable (their count does not change whether the image is alone or accompanied by text), standard tokenizers can still introduce slight boundary quirks when concatenating text and control tokens &lt;em&gt;after&lt;/em&gt; these media tokens. If you tokenize the prompt in isolation, the length might be slightly off compared to the fully assembled chat template, tanking the model's accuracy.&lt;/p&gt;

&lt;p&gt;To achieve the highest precision, we implemented a bulletproof backward-search collator. Instead of trying to calculate the prompt length, we search the full &lt;code&gt;input_ids&lt;/code&gt; array for the exact tokens of our breed name label. Once found, we step backwards to locate the &lt;code&gt;&amp;lt;|turn&amp;gt;&lt;/code&gt; control token that marks the start of the assistant's response, and mask everything before it. This mathematically guarantees the model is trained exactly on the required template structure and the label, without any masking misalignment.&lt;/p&gt;
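&lt;p&gt;A simplified sketch of that backward-search idea (the token IDs and the assistant-turn control token ID are made up for illustration; the real collator operates on the processor's output tensors):&lt;/p&gt;

```python
IGNORE_INDEX = -100  # value the Hugging Face loss functions skip by default

def find_sublist(haystack, needle):
    # Return the start index of the last occurrence of needle, or -1.
    n = len(needle)
    for start in range(len(haystack) - n, -1, -1):
        if haystack[start:start + n] == needle:
            return start
    return -1

def mask_prompt(input_ids, label_ids, turn_token_id):
    # 1. Locate the label tokens (the breed name) in the full sequence.
    label_start = find_sublist(input_ids, label_ids)
    if label_start == -1:
        # Label not found: mask everything so the example contributes no loss.
        return [IGNORE_INDEX] * len(input_ids)
    # 2. Step backwards from the label to the control token that opens
    #    the assistant turn.
    turn_pos = label_start
    while turn_pos and input_ids[turn_pos] != turn_token_id:
        turn_pos -= 1
    # 3. Mask every position before the assistant turn; keep the rest.
    labels = list(input_ids)
    for i in range(turn_pos):
        labels[i] = IGNORE_INDEX
    return labels
```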

&lt;h3&gt;
  
  
  4. Bypassing Custom Layers &amp;amp; Unlocking the Vision Tower
&lt;/h3&gt;

&lt;p&gt;This was the most critical breakthrough! The official Hugging Face implementation for Gemma 4 uses a custom neural network wrapper called &lt;code&gt;Gemma4ClippableLinear&lt;/code&gt; for its projection layers. This custom class wraps a standard &lt;code&gt;nn.Linear&lt;/code&gt; layer but adds specific logic to clip minimum and maximum activations (&lt;code&gt;input_min&lt;/code&gt;, &lt;code&gt;output_max&lt;/code&gt;, etc.) to stabilize training.&lt;/p&gt;

&lt;p&gt;When we tried to apply standard LoRA by targeting specific layer names like &lt;code&gt;q_proj&lt;/code&gt; or &lt;code&gt;v_proj&lt;/code&gt;, we hit two major issues:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Activation Clipping Bypass&lt;/strong&gt;: Standard PEFT/LoRA doesn't natively recognize &lt;code&gt;Gemma4ClippableLinear&lt;/code&gt;. If forced to attach to the inner &lt;code&gt;.linear&lt;/code&gt; weights, it bypasses the parent wrapper entirely. Without that crucial activation clipping during the forward pass, the model's activations become unstable, and the training loss explodes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Frozen Vision Tower&lt;/strong&gt;: Even if we fixed the text backbone, standard text-focused LoRA configurations often miss the vision tower's projection layers, leaving the model's "eyes" frozen during training.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The solution is to use the macro &lt;code&gt;target_modules="all-linear"&lt;/code&gt;. This tells the PEFT library to recursively scan the entire model tree. It safely identifies and wraps nested linear layers without breaking the outer &lt;code&gt;Gemma4ClippableLinear&lt;/code&gt; clipping logic. Crucially, it also ensures that every linear layer across &lt;strong&gt;both the language model and the vision tower&lt;/strong&gt; is adapted to your data, without sacrificing architectural stability.&lt;/p&gt;
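&lt;p&gt;Conceptually, the &lt;code&gt;"all-linear"&lt;/code&gt; macro performs a recursive walk over the module tree. This toy sketch (the layer names and nesting are illustrative, not the real Gemma 4 layout) shows how such a scan reaches linear layers nested inside wrapper modules and inside the vision tower, which name-based targeting would miss:&lt;/p&gt;

```python
# Toy module tree standing in for the model; "clippable" plays the role
# of a wrapper like Gemma4ClippableLinear with an inner linear layer.
model_tree = {
    "language_model": {
        "q_proj": {"_type": "clippable", "linear": {"_type": "linear"}},
        "v_proj": {"_type": "clippable", "linear": {"_type": "linear"}},
        "mlp_up": {"_type": "linear"},
    },
    "vision_tower": {
        "patch_proj": {"_type": "linear"},
        "out_proj": {"_type": "clippable", "linear": {"_type": "linear"}},
    },
}

def all_linear(tree, prefix=""):
    # Recursively collect the dotted name of every linear leaf, the way
    # target_modules="all-linear" scans the whole model tree. Wrappers are
    # left intact; only their inner linear layers are selected.
    found = []
    for name, child in tree.items():
        if name == "_type":
            continue
        path = prefix + "." + name if prefix else name
        if child.get("_type") == "linear":
            found.append(path)
        found.extend(all_linear(child, path))
    return found
```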

&lt;h3&gt;
  
  
  5. Results
&lt;/h3&gt;

&lt;p&gt;By combining the multimodal architecture, bulletproof masking, and full-tower LoRA, we achieved a nice improvement in the model accuracy.&lt;/p&gt;

&lt;p&gt;Note that Gemma 4's baseline performance &lt;strong&gt;(89% accuracy) was significantly higher&lt;/strong&gt; than &lt;a href="https://dev.to/googleai/fine-tuning-gemma-3-with-cloud-run-jobs-serverless-gpus-nvidia-rtx-6000-pro-for-pet-breed-248b"&gt;Gemma 3's baseline performance &lt;strong&gt;(67% accuracy)&lt;/strong&gt;&lt;/a&gt;, so the headroom for improvement was smaller; the accuracy gain here is more modest, but still meaningful.&lt;/p&gt;

&lt;h4&gt;
  
  
  Intermediate Results (700 Samples, ~50 minutes Run)
&lt;/h4&gt;

&lt;p&gt;Even with a small subset of 700 training images, we saw a nice boost over the baseline in less than one hour:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frv5idqnzvq6inavt8ddo.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frv5idqnzvq6inavt8ddo.webp" alt=" " width="800" height="193"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;&lt;small&gt;Results on 700 training samples and 200 evaluation samples&lt;/small&gt;&lt;/center&gt;

&lt;p&gt; &lt;/p&gt;

&lt;h4&gt;
  
  
  Final Results (Full Dataset, ~4.25 Hours Run)
&lt;/h4&gt;

&lt;p&gt;Running the full &lt;a href="https://www.robots.ox.ac.uk/~vgg/data/pets/" rel="noopener noreferrer"&gt;Oxford-IIIT Pet dataset&lt;/a&gt; (~4,000 training images and 3,669 evaluation images) yielded our peak performance (SOTA for this dataset is 94% accuracy):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fowa1irxg2lzaqix2586k.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fowa1irxg2lzaqix2586k.webp" alt=" " width="800" height="192"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;&lt;small&gt;Results on 4000 training samples and 3669 evaluation samples&lt;/small&gt;&lt;/center&gt;

&lt;p&gt; &lt;/p&gt;

&lt;p&gt;In this run, we utilized a more aggressive LoRA configuration than typical text-only runs: a &lt;strong&gt;Rank 64&lt;/strong&gt; / &lt;strong&gt;Alpha 64&lt;/strong&gt; setup with a &lt;strong&gt;5e-5 learning rate&lt;/strong&gt;. This gave the model enough "surface area" to refine its visual features for the specific nuances of the pet dataset.&lt;/p&gt;

&lt;h3&gt;
  
  
  6. Managing VRAM with QLoRA &amp;amp; Gradient Checkpointing
&lt;/h3&gt;

&lt;p&gt;While 96GB of VRAM on the RTX 6000 Pro is massive, training a 31B parameter model with LoRA still pushes the boundaries of a single GPU. To ensure absolute stability and prevent Out-Of-Memory (OOM) errors during the backward pass, our script implements a two-pronged optimization strategy:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;QLoRA (4-bit Quantization):&lt;/strong&gt; Utilizing &lt;code&gt;BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4")&lt;/code&gt; to drastically reduce the model's footprint when loaded on CUDA.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gradient Checkpointing:&lt;/strong&gt; Specifically enabled for the 31B model, this trades a slight increase in compute time for a significant reduction in VRAM usage by recalculating activations instead of storing them all in memory.&lt;/li&gt;
&lt;/ul&gt;
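&lt;p&gt;Put together, the loading configuration looks roughly like this (a sketch assuming the &lt;code&gt;transformers&lt;/code&gt;, &lt;code&gt;bitsandbytes&lt;/code&gt;, and &lt;code&gt;torch&lt;/code&gt; packages; the compute dtype is our choice here, not a Gemma 4 requirement):&lt;/p&gt;

```python
import torch
from transformers import BitsAndBytesConfig

# 4-bit NF4 quantization shrinks the 31B base weights to roughly 18-20GB.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16, store in 4-bit
)

model_kwargs = {
    "quantization_config": bnb_config,
    "torch_dtype": torch.bfloat16,
    "device_map": "auto",
}

# After loading the model, enable gradient checkpointing to trade a
# little extra compute for a large reduction in activation memory:
#   model.gradient_checkpointing_enable()
```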

&lt;h2&gt;
  
  
  The Complete Fine-Tuning Workflow on Cloud Run
&lt;/h2&gt;

&lt;p&gt;Before you begin the fine-tuning process, ensure you have the following software and environment configurations in place.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Google Cloud Project&lt;/strong&gt; with billing enabled and APIs active (Cloud Run, Artifact Registry, Cloud Build, Secret Manager).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;NVIDIA RTX PRO 6000&lt;/strong&gt; availability in your region (e.g., &lt;code&gt;europe-west4&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hugging Face Token&lt;/strong&gt;: A valid token with access to the Gemma 4 model weights.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Step 0: Set Environment Variables
&lt;/h2&gt;

&lt;p&gt;Set the following environment variables to align with the steps below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;PROJECT_ID&lt;/span&gt;&lt;span class="o"&gt;=[&lt;/span&gt;YOUR_PROJECT_ID]
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;REGION&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;europe-west4
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;HF_TOKEN&lt;/span&gt;&lt;span class="o"&gt;=[&lt;/span&gt;YOUR_HF_TOKEN]
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;SERVICE_ACCOUNT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"finetune-gemma-job-sa"&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;BUCKET_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$PROJECT_ID&lt;/span&gt;&lt;span class="nt"&gt;-gemma4-finetuning-eu&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;AR_REPO&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;gemma4-finetuning-repo
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;SECRET_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;HF_TOKEN
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;IMAGE_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;gemma4-finetune
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;JOB_NAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;gemma4-finetuning-job
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Step 1: Get the Code
&lt;/h2&gt;

&lt;p&gt;Whether you're running locally or on the cloud, you'll need the code. Clone the repository and navigate to the project directory:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/GoogleCloudPlatform/devrel-demos
&lt;span class="nb"&gt;cd &lt;/span&gt;devrel-demos/ai-ml/finetune_gemma/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 2: Test Locally Before Cloud Deployment
&lt;/h2&gt;

&lt;p&gt;Before spinning up massive GPUs in the cloud, it is best practice to verify your pipeline locally using a smaller model variant (like the E2B IT model) on a subset of the data.&lt;/p&gt;

&lt;p&gt;To run a local CPU test, first activate your virtual environment:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;source&lt;/span&gt; .venv/bin/activate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, execute the script with a very small dataset to ensure the pipeline completes successfully:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python3 finetune_and_evaluate.py &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--model-id&lt;/span&gt; google/gemma-4-e2b-it &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--device&lt;/span&gt; cpu &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--train-size&lt;/span&gt; 20 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--eval-size&lt;/span&gt; 20 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--gradient-accumulation-steps&lt;/span&gt; 4 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--num-epochs&lt;/span&gt; 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once you verify that the training pipeline completes successfully, you are ready to scale up to Cloud Run!&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: Stage the Model in GCS
&lt;/h2&gt;

&lt;p&gt;To save startup time and avoid repetitive downloads from the internet during training, stage the model weights (e.g., &lt;code&gt;google/gemma-4-31b-it&lt;/code&gt;) in a GCS bucket located in the same region as your Cloud Run job. We provide a utility script within the repository to perform this transfer directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Navigate to the utility directory&lt;/span&gt;
&lt;span class="nb"&gt;cd &lt;/span&gt;hf-to-gcs
&lt;span class="c"&gt;# Execute the transfer script&lt;/span&gt;
python3 hf_to_gcs.py &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--model-id&lt;/span&gt; google/gemma-4-31b-it &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--bucket&lt;/span&gt; &lt;span class="nv"&gt;$BUCKET_NAME&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--hf-token&lt;/span&gt; &lt;span class="nv"&gt;$HF_TOKEN&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This script ensures that the weights are stored in your project's bucket, enabling high-speed access via volume mounts when the Cloud Run job executes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 4: Build the Container
&lt;/h2&gt;

&lt;p&gt;Use Cloud Build to package your script and dependencies into a container image compatible with CUDA 12.8:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gcloud builds submit &lt;span class="nt"&gt;--tag&lt;/span&gt; &lt;span class="nv"&gt;$REGION&lt;/span&gt;&lt;span class="nt"&gt;-docker&lt;/span&gt;.pkg.dev/&lt;span class="nv"&gt;$PROJECT_ID&lt;/span&gt;/&lt;span class="nv"&gt;$AR_REPO&lt;/span&gt;/&lt;span class="nv"&gt;$IMAGE_NAME&lt;/span&gt;:latest &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;Tip: You can track the real-time progress of your build in the Cloud Build console.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 5: Create and Execute the Cloud Run Job
&lt;/h2&gt;

&lt;p&gt;Create the job with GPU support and volume mounts for the GCS bucket holding the model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gcloud beta run &lt;span class="nb"&gt;jobs &lt;/span&gt;create gemma4-finetuning-job &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--region&lt;/span&gt; &lt;span class="nv"&gt;$REGION&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--image&lt;/span&gt; gcr.io/&lt;span class="nv"&gt;$PROJECT_ID&lt;/span&gt;/gemma4-finetune &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--gpu&lt;/span&gt; 1 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--gpu-type&lt;/span&gt; nvidia-rtx-pro-6000 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--cpu&lt;/span&gt; 30.0 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--memory&lt;/span&gt; 120Gi &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--labels&lt;/span&gt; dev-tutorial&lt;span class="o"&gt;=&lt;/span&gt;finetune-gemma &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--add-volume&lt;/span&gt; &lt;span class="nv"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;model-volume,type&lt;span class="o"&gt;=&lt;/span&gt;cloud-storage,bucket&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;$BUCKET_NAME&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--add-volume-mount&lt;/span&gt; &lt;span class="nv"&gt;volume&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;model-volume,mount-path&lt;span class="o"&gt;=&lt;/span&gt;/mnt/gcs &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--args&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"--model-id"&lt;/span&gt;,&lt;span class="s2"&gt;"/mnt/gcs/google/gemma-4-31b-it/"&lt;/span&gt;,&lt;span class="s2"&gt;"--output-dir"&lt;/span&gt;,&lt;span class="s2"&gt;"/mnt/gcs/gemma4-finetuned"&lt;/span&gt;,&lt;span class="s2"&gt;"--train-size"&lt;/span&gt;,&lt;span class="s2"&gt;"700"&lt;/span&gt;,&lt;span class="s2"&gt;"--eval-size"&lt;/span&gt;,&lt;span class="s2"&gt;"200"&lt;/span&gt;,&lt;span class="s2"&gt;"--merge"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then execute it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gcloud beta run &lt;span class="nb"&gt;jobs &lt;/span&gt;execute gemma4-finetuning-job &lt;span class="nt"&gt;--region&lt;/span&gt; &lt;span class="nv"&gt;$REGION&lt;/span&gt; &lt;span class="nt"&gt;--async&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
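Because the job was started with &lt;code&gt;--async&lt;/code&gt;, the command returns immediately. A sketch of how you might check on the run afterwards with standard gcloud commands (the execution name is a placeholder you would copy from the list output):

```shell
# List executions of the fine-tuning job
gcloud beta run jobs executions list \
  --job gemma4-finetuning-job \
  --region "$REGION"

# Inspect one execution in detail (replace EXECUTION_NAME with a real name)
gcloud beta run jobs executions describe EXECUTION_NAME \
  --region "$REGION"
```

These commands require an authenticated gcloud environment with the job already created, so treat them as a workflow fragment rather than something to run verbatim.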



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Migrating to Gemma 4 requires handling its new architecture and response formats, but the effort pays off with its superior reasoning and adherence to instructions. By leveraging &lt;a href="https://docs.cloud.google.com/run/docs/create-jobs" rel="noopener noreferrer"&gt;Cloud Run Jobs&lt;/a&gt; and &lt;a href="https://cloud.google.com/blog/products/serverless/cloud-run-supports-nvidia-rtx-6000-pro-gpus-for-ai-workloads" rel="noopener noreferrer"&gt;Serverless Blackwell GPUs&lt;/a&gt;, you can train these massive models efficiently without managing servers.&lt;/p&gt;

&lt;p&gt;To get started with inference, explore this codelab: &lt;a href="https://codelabs.developers.google.com/codelabs/cloud-run/cloud-run-gpu-rtx-pro-6000-gemma4-vllm#0" rel="noopener noreferrer"&gt;Run inference of Gemma 4 model on Cloud Run with RTX 6000 Pro GPU with vLLM&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;To learn more about production serving, refer to the &lt;a href="https://docs.cloud.google.com/run/docs/run-gemma-on-cloud-run" rel="noopener noreferrer"&gt;Cloud Run Gemma 4 documentation&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Happy fine-tuning! 🎉&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Special thanks to Ryan Mullins, Juyeong Ji and Gus Martins from the Gemma 4 team for the helpful review and feedback on this blog.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>gemma</category>
      <category>machinelearning</category>
      <category>googlecloud</category>
      <category>cloudrun</category>
    </item>
    <item>
      <title>How I used Gemini CLI to orchestrate a complex RAG migration</title>
      <dc:creator>Remigiusz Samborski</dc:creator>
      <pubDate>Tue, 28 Apr 2026 13:02:03 +0000</pubDate>
      <link>https://dev.to/googleai/how-i-used-gemini-cli-to-orchestrate-a-complex-rag-migration-43ga</link>
      <guid>https://dev.to/googleai/how-i-used-gemini-cli-to-orchestrate-a-complex-rag-migration-43ga</guid>
      <description>&lt;p&gt;Building a complex, multi-phase cloud project like a RAG migration is as much about orchestration as it is about code. You have to manage infrastructure (&lt;a href="https://www.terraform.io/" rel="noopener noreferrer"&gt;Terraform&lt;/a&gt;), backend services (&lt;a href="https://www.python.org/" rel="noopener noreferrer"&gt;Python&lt;/a&gt;), frontend UI (&lt;a href="https://nextjs.org/" rel="noopener noreferrer"&gt;Next.js&lt;/a&gt;), data pipelines (&lt;a href="https://docs.cloud.google.com/bigquery/docs/introduction?utm_campaign=CDR_0x87fa8d40_default_b499342492&amp;amp;utm_medium=external&amp;amp;utm_source=blog" rel="noopener noreferrer"&gt;BigQuery&lt;/a&gt;/&lt;a href="https://docs.cloud.google.com/alloydb/docs/overview?utm_campaign=CDR_0x87fa8d40_default_b499342492&amp;amp;utm_medium=external&amp;amp;utm_source=blog" rel="noopener noreferrer"&gt;AlloyDB&lt;/a&gt;), and documentation - all while maintaining a consistent technical strategy.&lt;/p&gt;

&lt;p&gt;Standard IDE completions are great for snippets, but they lack the system-level context needed for this kind of engineering. To build this reference architecture, I didn't just use an AI to write code. I used an AI to &lt;strong&gt;orchestrate the entire project&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In this final post (see previous &lt;a href="https://medium.com/@rsamborski/6ead93ca4aec" rel="noopener noreferrer"&gt;part 1&lt;/a&gt; and &lt;a href="https://medium.com/@rsamborski/8a0464af6f55" rel="noopener noreferrer"&gt;part 2&lt;/a&gt;), I'll share a behind-the-scenes look at using &lt;a href="https://geminicli.com/" rel="noopener noreferrer"&gt;Gemini CLI&lt;/a&gt; with the &lt;a href="https://github.com/gemini-cli-extensions/conductor" rel="noopener noreferrer"&gt;Conductor extension&lt;/a&gt; to orchestrate this migration.&lt;/p&gt;

&lt;p&gt;In this post, you will learn:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;How to leverage terminal-first AI assistants for system-level engineering
&lt;/li&gt;
&lt;li&gt;How to implement spec-driven development with &lt;a href="https://github.com/gemini-cli-extensions/conductor" rel="noopener noreferrer"&gt;the Conductor extension&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;How to use AI-driven Test-Driven Development (TDD) for reliable code generation
&lt;/li&gt;
&lt;li&gt;How to collaborate with AI agents using the "Human-in-the-Loop" model&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Before we dive into the workflow, let's briefly discuss why orchestration is the next logical step for AI-assisted development.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;The Developer Experience&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Let's walk through my development process step-by-step. The entire specification, plan, and implementation logic is available in &lt;a href="https://github.com/rsamborski/rag-migration/tree/main/conductor" rel="noopener noreferrer"&gt;the &lt;code&gt;conductor&lt;/code&gt; directory&lt;/a&gt; of the &lt;code&gt;rag-migration&lt;/code&gt; repository.&lt;/p&gt;

&lt;h3&gt;
  
  
  Spec-driven development with Conductor
&lt;/h3&gt;

&lt;p&gt;Central to my workflow is the Conductor extension. It's built on the principle of &lt;strong&gt;spec-driven development&lt;/strong&gt;. Instead of jumping straight into code, we define the "source of truth" in Markdown files.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Product Definition (&lt;code&gt;product.md&lt;/code&gt;):&lt;/strong&gt; What are we building?
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tech Stack (&lt;code&gt;tech-stack.md&lt;/code&gt;):&lt;/strong&gt; What tools are we using?
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tracks Registry (&lt;code&gt;tracks.md&lt;/code&gt;):&lt;/strong&gt; What are the major milestones?
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Implementation Plans (&lt;code&gt;plan.md&lt;/code&gt; for each of the tracks):&lt;/strong&gt; What are the step-by-step tasks?
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Workflow (&lt;code&gt;workflow.md&lt;/code&gt;):&lt;/strong&gt; How are we building the solution?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By having these documents in the codebase, the AI agent (Gemini CLI) always has the high-level context it needs to make smart decisions. It's also a good practice to share those with your team so everyone (including AI agents) is on the same page about the project's direction.&lt;/p&gt;
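Concretely, the resulting layout looks roughly like this (assumed from the files named above and the repository links later in the post; your tree may differ):

```
conductor/
├── product.md       # product definition
├── tech-stack.md    # tools and frameworks
├── tracks.md        # registry of major milestones
├── workflow.md      # how we build (TDD, checkpoints)
└── archive/
    └── initial-embeddings_20260303/
        ├── spec.md
        └── plan.md
```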

&lt;h3&gt;
  
  
  Conductor initialization
&lt;/h3&gt;

&lt;p&gt;The first step of project initialization is to create the product definition and tech stack files. This is handled by running:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/conductor:setup
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Gemini CLI will ask you a series of questions to help you define your project, including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What is the name of your product?
&lt;/li&gt;
&lt;li&gt;Who are the primary users?
&lt;/li&gt;
&lt;li&gt;What is the tech stack you are using?
&lt;/li&gt;
&lt;li&gt;What are the major features you want to implement?
&lt;/li&gt;
&lt;li&gt;What is the workflow you want to use?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It will then create the initial project structure in the &lt;code&gt;conductor&lt;/code&gt; directory, including the &lt;a href="https://github.com/rsamborski/rag-migration/blob/main/conductor/product.md" rel="noopener noreferrer"&gt;product.md&lt;/a&gt; and &lt;a href="https://github.com/rsamborski/rag-migration/blob/main/conductor/tech-stack.md" rel="noopener noreferrer"&gt;tech-stack.md&lt;/a&gt; files.&lt;/p&gt;

&lt;h3&gt;
  
  
  The lifecycle of a track
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fat6kt8jvqw6o4oha2npr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fat6kt8jvqw6o4oha2npr.png" alt="Gemini CLI with conductor - lifecycle" width="800" height="324"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Each major feature in this project was implemented as a "Track". A typical track lifecycle consists of:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Track Initialization (&lt;code&gt;/conductor:newTrack&lt;/code&gt;):

&lt;ul&gt;
&lt;li&gt;The agent creates a &lt;code&gt;spec.md&lt;/code&gt; file that describes the goals of the track
&lt;/li&gt;
&lt;li&gt;The agent maps the existing codebase and validates assumptions
&lt;/li&gt;
&lt;li&gt;The agent creates a &lt;code&gt;plan.md&lt;/code&gt; file that describes the steps needed to achieve the goals
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Track Execution (&lt;code&gt;/conductor:implement&lt;/code&gt;):

&lt;ul&gt;
&lt;li&gt;The agent iterates through tasks using a &lt;strong&gt;Plan -&amp;gt; Act -&amp;gt; Validate&lt;/strong&gt; cycle
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Track Completion:

&lt;ul&gt;
&lt;li&gt;The agent verifies the changes made during the track
&lt;/li&gt;
&lt;li&gt;The agent asks for user feedback on the implementation
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Track Archival:

&lt;ul&gt;
&lt;li&gt;Once a track is completed, Gemini CLI archives the track in the &lt;code&gt;conductor/archive&lt;/code&gt; directory&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;For example, when I started the initial embeddings track, I initialized it with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/conductor:newTrack
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Gemini CLI researches the codebase, asks clarifying questions, and creates the &lt;a href="https://github.com/rsamborski/rag-migration/blob/main/conductor/archive/initial-embeddings_20260303/spec.md" rel="noopener noreferrer"&gt;spec.md&lt;/a&gt; and &lt;a href="https://github.com/rsamborski/rag-migration/blob/main/conductor/archive/initial-embeddings_20260303/plan.md" rel="noopener noreferrer"&gt;plan.md&lt;/a&gt; files. Only after I review and approve them does the actual implementation start.&lt;/p&gt;

&lt;h3&gt;
  
  
  Terraform for Infrastructure as Code
&lt;/h3&gt;

&lt;p&gt;My &lt;a href="https://github.com/rsamborski/rag-migration/blob/b17a233a67886bf574601e9d72d53416c6ffcf90/conductor/product.md?plain=1#L21" rel="noopener noreferrer"&gt;product.md&lt;/a&gt; file instructs Gemini CLI to write Terraform code for all the resources created during the project. This works really well: all resources are consistently managed as source code, and it's easy to spin up a new environment when needed.&lt;/p&gt;

&lt;p&gt;You can see all the Terraform files and infrastructure scripts used in the first track in the &lt;a href="https://github.com/rsamborski/rag-migration/tree/main/01-generation/infra" rel="noopener noreferrer"&gt;infra&lt;/a&gt; directory.&lt;/p&gt;

&lt;p&gt;Moreover, over the course of the project I instructed Gemini CLI to always run &lt;code&gt;terraform plan&lt;/code&gt; before &lt;code&gt;terraform apply&lt;/code&gt;. Keeping this rule in the &lt;a href="https://github.com/rsamborski/rag-migration/blob/b17a233a67886bf574601e9d72d53416c6ffcf90/conductor/workflow.md?plain=1#L12" rel="noopener noreferrer"&gt;workflow.md&lt;/a&gt; file ensures it is applied to all tracks.&lt;/p&gt;
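The plan-before-apply discipline maps onto a short command sequence; a sketch (the directory path is taken from the repository linked above, and the plan file name is illustrative):

```shell
cd 01-generation/infra        # the track's Terraform directory
terraform init                # install providers, set up state
terraform plan -out=tfplan    # review the proposed changes first
terraform apply tfplan        # apply exactly the plan that was reviewed
```

Saving the plan with `-out` and applying that file guarantees that what gets applied is what was reviewed; these commands need a configured Terraform workspace and cloud credentials to actually run.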

&lt;h3&gt;
  
  
  TDD with an AI agent
&lt;/h3&gt;

&lt;p&gt;One of the most powerful aspects of this workflow is AI-driven Test-Driven Development (TDD). I didn't just ask the agent to "write the code". It followed a strict protocol:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Write Failing Tests:&lt;/strong&gt; The agent defines the expected behavior in a new test file
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Red Phase:&lt;/strong&gt; It runs the tests and confirms they fail
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Green Phase:&lt;/strong&gt; It writes the minimum code needed to pass the tests
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Refactor:&lt;/strong&gt; It refactors the implementation code and the test code to improve clarity, remove duplication, and enhance performance without changing the external behavior.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Verify Coverage:&lt;/strong&gt; It verifies that the test coverage meets the project requirements (target: &amp;gt;80% coverage for new code).
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Commit Code Changes:&lt;/strong&gt; The agent commits code changes related to the task.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This ensures that the AI-generated code isn't just "syntactically correct" but functionally verified against my requirements. This workflow is described in the &lt;a href="https://github.com/rsamborski/rag-migration/blob/b17a233a67886bf574601e9d72d53416c6ffcf90/conductor/workflow.md?plain=1#L18" rel="noopener noreferrer"&gt;workflow.md&lt;/a&gt; file.&lt;/p&gt;
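From the terminal, the red/green/coverage loop above might look like the following (file and package names are hypothetical, and pytest with the pytest-cov plugin is assumed, since the exact test runner isn't named in the post):

```shell
pytest tests/test_embeddings.py        # Red: the newly written tests should fail
# ...the agent writes the minimum implementation...
pytest tests/test_embeddings.py        # Green: the same tests now pass
pytest --cov=src --cov-fail-under=80   # Coverage gate: fail if coverage drops below 80%
```

The `--cov-fail-under` flag makes the coverage target an enforced gate rather than a suggestion, which is what lets an agent self-verify the step.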

&lt;h3&gt;
  
  
  Checkpoints and quality gates
&lt;/h3&gt;

&lt;p&gt;At the end of every phase, Gemini CLI runs a "Checkpoint" protocol. This includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Automated Verification:&lt;/strong&gt; Running the full test suite.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Manual Verification:&lt;/strong&gt; Providing the user with step-by-step instructions to verify the changes.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auditable Records:&lt;/strong&gt; Attaching a verification report to the git commit using &lt;code&gt;git notes&lt;/code&gt; and updating &lt;code&gt;plan.md&lt;/code&gt; with the new commit hash.&lt;/li&gt;
&lt;/ul&gt;
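The &lt;code&gt;git notes&lt;/code&gt; mechanism is standard Git: it attaches metadata to a commit without rewriting history. A minimal, self-contained sketch (the repository and note text are illustrative):

```shell
# Create a throwaway repo so the example stands alone
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo
git commit -q --allow-empty -m "Phase 1 checkpoint"

# Attach a verification report to the commit without changing its hash
git notes add -m "verification: full test suite passed"

# The note is stored under refs/notes and shown alongside the commit
git notes show HEAD
```

Because notes live in a separate ref, they can be added after the fact by an agent (or a human) without invalidating any commit hashes recorded in `plan.md`.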

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7kk90njuhpzsjwell1ph.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7kk90njuhpzsjwell1ph.png" alt="Conductor commits demonstrating the checkpoint protocol" width="800" height="409"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Conductor commits demonstrating the checkpoint protocol.&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Effective Human-in-the-Loop
&lt;/h3&gt;

&lt;p&gt;To achieve effective human-AI development synergy, I relied heavily on the following solutions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Gemini CLI in a sandbox with YOLO mode enabled - see &lt;a href="https://dev.to/googleai/secure-gemini-cli-for-cloud-development-2mpe"&gt;my past article&lt;/a&gt; for more about it.
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/rsamborski/vibecoding/blob/main/gemini-cli-sandbox/sandbox_notifier.sh" rel="noopener noreferrer"&gt;Custom sandbox notifier script&lt;/a&gt; that runs in another terminal.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This approach provided safe guardrails and let me work on other projects while the AI worked on this one; timely notifications meant I could always jump back quickly. Moreover, Conductor's checkpointing mechanism meant I could always revert unwanted changes or restart from a known working state.&lt;/p&gt;

&lt;p&gt;I also used &lt;a href="https://antigravity.google/?utm_campaign=CDR_0x87fa8d40_default_b499342492&amp;amp;utm_medium=external&amp;amp;utm_source=blog" rel="noopener noreferrer"&gt;Antigravity&lt;/a&gt; to polish the generated code and the documentation. It was particularly helpful for minor tweaks or refactoring of the code that was generated by Gemini CLI.&lt;/p&gt;

&lt;h3&gt;
  
  
  Token usage
&lt;/h3&gt;

&lt;p&gt;Throughout the project I used several models (Gemini 3 Pro, Gemini 3 Flash and Gemini 2.5 Flash Lite). The total token consumption was:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Input tokens: ~19M
&lt;/li&gt;
&lt;li&gt;Cached input tokens: ~66M
&lt;/li&gt;
&lt;li&gt;Output tokens: ~400k&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Notice the high number of cached input tokens, which significantly reduces the spend. The total Vertex AI token cost was &lt;strong&gt;around $30&lt;/strong&gt;. Not bad for several days of AI-assisted work.&lt;/p&gt;

&lt;p&gt;See the &lt;a href="https://cloud.google.com/vertex-ai/generative-ai/pricing?utm_campaign=CDR_0x87fa8d40_default_b499342492&amp;amp;utm_medium=external&amp;amp;utm_source=blog" rel="noopener noreferrer"&gt;pricing page&lt;/a&gt; for more details, and keep in mind that your mileage may vary.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Summary&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Software engineering is evolving from writing code to orchestrating agentic workflows. By using tools like Gemini CLI and frameworks like Conductor, you can scale your impact as an architect while ensuring consistent, high-quality implementation.&lt;/p&gt;

&lt;p&gt;Ready to build your own AI-assisted development projects?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://geminicli.com/" rel="noopener noreferrer"&gt;Check out Gemini CLI&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/gemini-cli-extensions/conductor" rel="noopener noreferrer"&gt;Explore the Conductor extension&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://antigravity.google/?utm_campaign=CDR_0x87fa8d40_default_b499342492&amp;amp;utm_medium=external&amp;amp;utm_source=blog" rel="noopener noreferrer"&gt;Try Antigravity&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/rsamborski/rag-migration" rel="noopener noreferrer"&gt;Check out the full RAG Migration repository&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Thanks for reading&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;If you found this article helpful, please consider adding 50 claps to this post by pressing and holding the clap button 👏 This will help others find it. You can also share it with your friends on socials.&lt;/p&gt;

&lt;p&gt;I'm always eager to share my learnings or chat with fellow developers and AI enthusiasts, so feel free to follow me on &lt;a href="https://www.linkedin.com/in/remigiusz-samborski/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;, &lt;a href="https://x.com/RemikSamborski" rel="noopener noreferrer"&gt;X&lt;/a&gt; or &lt;a href="https://bsky.app/profile/rsamborski.bsky.social" rel="noopener noreferrer"&gt;Bluesky&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>gemini</category>
      <category>agents</category>
      <category>programming</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
