DEV Community

Gate of AI

Posted on • Originally published at gateofai.com

How to Build Long-Running AI Agents with Google Gen AI SDK

🚀 Technical Briefing: This tutorial is part of our deep-dive series on Agentic Workflows at Gate of AI. For the full technical breakdown, interactive code sandbox, and the native Arabic translation, visit the original article here.

Tutorial • Advanced • ⏱ 45 min read • © Gate of AI 2026-05-31

Step away from standard chat APIs. Learn the foundational architecture for building long-running, stateful autonomous agents inspired by the new Gemini Enterprise Unified Inbox.

Prerequisites


  • Python 3.10 or higher
  • Access to the Google Gen AI SDK (Gemini 1.5 Pro or higher)
  • A Google Cloud Project with Billing Enabled
  • Advanced understanding of asynchronous Python (asyncio) and state management

What We're Building


With Google Cloud's announcement of Long-Running Agents in Gemini Enterprise, the development paradigm has officially shifted. In this tutorial, we will construct the foundational "pause-and-resume" architecture required to build these agents.


We won't just build a chatbot. We will build a stateful, asynchronous Python worker that executes a multi-step task, intentionally "pauses" when it requires simulated human approval (mimicking the Unified Inbox), and resumes upon confirmation.

Setup and Installation


We will use the official Google Gen AI SDK and python-dotenv for our environment variables. asyncio ships with the Python standard library, so it does not need to be installed.


pip install google-genai python-dotenv

Secure your API credentials in a .env file.



# .env file
GEMINI_API_KEY=your_gemini_api_key_here
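
A missing or empty key is easier to diagnose if the script stops at startup with a clear message. A minimal sketch, assuming a hypothetical helper named `require_env` (it is not part of the SDK or python-dotenv):

```python
import os


def require_env(name, env=None):
    """Return the value of an environment variable or raise a clear error.

    `require_env` is a hypothetical helper for this tutorial, not an SDK call.
    """
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value
```

Calling `require_env("GEMINI_API_KEY")` right after `load_dotenv()` would surface a configuration problem before any workflow starts.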

Step 1: Architecting the Stateful Client


Unlike a standard chatbot that forgets data between queries, a long-running agent must maintain a rigid state dictionary. We initialize the official genai.Client and set up our state manager.



import os
import asyncio
from google import genai
from dotenv import load_dotenv

load_dotenv()

class LongRunningAgent:
    def __init__(self):
        # Initialize the official Google Gen AI Client
        self.client = genai.Client(api_key=os.getenv("GEMINI_API_KEY"))
        self.model = "gemini-1.5-pro"

        # This dictionary mimics the persistent state stored in a database
        self.state = {
            "status": "idle",  # idle, running, awaiting_approval, completed
            "workflow_history": [],
            "pending_approval_request": None,
        }

    def log_action(self, action):
        """Print an action and append it to the workflow history."""
        print(f"[AGENT LOG]: {action}")
        self.state["workflow_history"].append(action)
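
The four status values only make sense in a certain order, and a plain string field will accept any of them at any time. A minimal sketch of guarding the transitions, assuming a hypothetical `transition` helper and an `ALLOWED` table inferred from the workflow in this tutorial (neither comes from the SDK):

```python
from enum import Enum


class Status(str, Enum):
    IDLE = "idle"
    RUNNING = "running"
    AWAITING_APPROVAL = "awaiting_approval"
    COMPLETED = "completed"


# Which status may follow which. This table is an assumption based on the
# workflow in this tutorial, not an official specification.
ALLOWED = {
    Status.IDLE: {Status.RUNNING},
    Status.RUNNING: {Status.AWAITING_APPROVAL, Status.COMPLETED},
    Status.AWAITING_APPROVAL: {Status.RUNNING},
    Status.COMPLETED: set(),
}


def transition(state, new_status):
    """Update state["status"] in place, rejecting transitions not in ALLOWED."""
    current = Status(state["status"])
    target = Status(new_status)
    if target not in ALLOWED[current]:
        raise ValueError(f"Illegal transition: {current.value} -> {target.value}")
    state["status"] = target.value
    return state
```

Routing every status change through one function like this makes a stray update (for example, approving a workflow that already completed) fail loudly instead of silently corrupting the state.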

Step 2: Building the "Pause-and-Resume" HITL Logic


The core innovation of Gemini's new update is the Human-in-the-Loop (HITL) Inbox. Here, we build the asynchronous logic that allows the agent to pause execution when it hits a restricted action.



    # Methods of LongRunningAgent (continued)
    async def request_human_approval(self, task_description):
        """Simulates pushing a task to the Unified Inbox"""
        self.state["status"] = "awaiting_approval"
        self.state["pending_approval_request"] = task_description
        self.log_action(f"PAUSED: Awaiting human approval for: {task_description}")

        # Simulate waiting for the manager to click "Approve" in the Inbox
        while self.state["status"] == "awaiting_approval":
            await asyncio.sleep(2)  # Check database/state every 2 seconds

        self.log_action("RESUMED: Human approval granted.")
        return True

    def simulate_manager_approval(self):
        """External function called by your UI/Inbox when a user clicks approve"""
        if self.state["status"] == "awaiting_approval":
            self.state["status"] = "running"
            self.state["pending_approval_request"] = None
            print("\n✅ [INBOX]: Manager approved the action.\n")
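
Polling every 2 seconds is easy to follow, but within a single process the same pause can be signalled directly. A minimal sketch of that alternative using `asyncio.Event` with an optional timeout; the `ApprovalGate` class is a hypothetical helper and swaps in a different technique from the polling loop above:

```python
import asyncio


class ApprovalGate:
    """Hypothetical helper: blocks a coroutine until approve() is called."""

    def __init__(self):
        self._event = asyncio.Event()

    def approve(self):
        # Called by the UI/Inbox handler; wakes the waiting coroutine immediately.
        self._event.set()

    async def wait(self, timeout=None):
        """Return True if approved, False if `timeout` seconds pass first."""
        try:
            await asyncio.wait_for(self._event.wait(), timeout)
            return True
        except asyncio.TimeoutError:
            return False


async def demo():
    gate = ApprovalGate()
    # Approve from "outside" after a short delay, as the Inbox would.
    asyncio.get_running_loop().call_later(0.05, gate.approve)
    approved = await gate.wait(timeout=1.0)
    # A second gate that nobody approves runs into its timeout.
    timed_out = not await ApprovalGate().wait(timeout=0.05)
    return approved, timed_out
```

The event wakes the waiter as soon as approval arrives, and the timeout gives the workflow a way to escalate or abort if nobody responds. This only works while the process stays alive; it does not replace persisting state.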

Step 3: Executing the Asynchronous Workflow


Now, we tie it together. We will use the client.models.generate_content method to process data, but wrap it in our async execution loop.



    # Method of LongRunningAgent (continued)
    async def run_multi_day_workflow(self, initial_prompt):
        self.state["status"] = "running"
        self.log_action("Starting long-running workflow...")

        # Phase 1: Autonomous Processing
        # Note: generate_content is synchronous, so it blocks the event loop until it returns
        self.log_action("Analyzing request via Gemini API...")
        response = self.client.models.generate_content(
            model=self.model,
            contents=f"Analyze this task and propose a 3-step execution plan: {initial_prompt}"
        )
        self.log_action(f"Plan generated: {response.text[:100]}...")

        # Phase 2: Hitting a permission wall (Mimicking the Unified Inbox feature)
        await asyncio.sleep(1)  # Simulating heavy compute time

        # The agent realizes it needs access to a restricted system (e.g., Google Drive)
        await self.request_human_approval("Access restricted Drive Folder: 'Q3 Financials'")

        # Phase 3: Post-Approval Execution
        self.log_action("Finalizing workflow with approved access...")
        final_response = self.client.models.generate_content(
            model=self.model,
            contents="The human approved access. Generate the final summary report."
        )

        self.state["status"] = "completed"
        self.log_action("Workflow Completed.")
        return final_response.text
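
Because `generate_content` is called synchronously inside a coroutine, no other task can run until the response arrives. One option is to push the blocking call to a worker thread with `asyncio.to_thread`. The sketch below uses a hypothetical stand-in function, `slow_model_call`, instead of a real API request so that it runs without credentials; the SDK also offers an async interface under `client.aio`, which is worth checking in its documentation.

```python
import asyncio
import time


def slow_model_call(prompt):
    """Hypothetical stand-in for a blocking call such as generate_content."""
    time.sleep(0.2)
    return f"plan for: {prompt}"


async def heartbeat(ticks):
    """Records a timestamp on each pass to show the loop stays responsive."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(0.02)


async def demo():
    ticks = []
    beat = asyncio.create_task(heartbeat(ticks))
    # Run the blocking function in a worker thread; the loop keeps serving other tasks.
    result = await asyncio.to_thread(slow_model_call, "Audit the Q3 Marketing Spend")
    beat.cancel()
    return result, len(ticks)
```

If the heartbeat keeps ticking while the slow call is in flight, the loop is free to handle approvals, timers, and other agents at the same time.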

⚠️ Expert Tip: In a production environment, do not use asyncio.sleep to hold state. You must serialize the self.state dictionary to a persistent database (like Redis or PostgreSQL). When the webhook from your Inbox arrives, you retrieve the state and re-initialize the agent.
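
A minimal sketch of that save-and-restore cycle, using the standard library's `sqlite3` as a stand-in for Redis or PostgreSQL. The table name `agent_state`, the `agent_id` key, and all four function names are assumptions made for illustration:

```python
import json
import sqlite3


def init_store(conn):
    """Create the (hypothetical) agent_state table if it does not exist."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS agent_state ("
        "agent_id TEXT PRIMARY KEY, state TEXT NOT NULL)"
    )


def save_state(conn, agent_id, state):
    """Serialize the state dict to JSON and store it under agent_id."""
    conn.execute(
        "INSERT OR REPLACE INTO agent_state (agent_id, state) VALUES (?, ?)",
        (agent_id, json.dumps(state)),
    )
    conn.commit()


def load_state(conn, agent_id):
    """Return the stored state dict, or None if the agent is unknown."""
    row = conn.execute(
        "SELECT state FROM agent_state WHERE agent_id = ?", (agent_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None


def approve(conn, agent_id):
    """What a webhook handler might do: load, flip the status, save."""
    state = load_state(conn, agent_id)
    if state and state["status"] == "awaiting_approval":
        state["status"] = "running"
        state["pending_approval_request"] = None
        save_state(conn, agent_id, state)
    return state
```

With the state stored this way, the worker can exit after pausing, and a fresh process can call `load_state` and continue from the recorded status once the approval arrives.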

Testing the Unified Inbox Architecture


To run this, we will use Python's asyncio.create_task to run the agent in the background while simulating a human checking their inbox.



async def main():
    agent = LongRunningAgent()

    # Start the agent as a background task
    agent_task = asyncio.create_task(
        agent.run_multi_day_workflow("Audit the Q3 Marketing Spend")
    )

    # Simulate the human manager checking their inbox after 5 seconds
    await asyncio.sleep(5)

    # Guard against a slow API call: approve only once the agent has actually paused
    while agent.state["status"] != "awaiting_approval":
        await asyncio.sleep(0.5)
    agent.simulate_manager_approval()

    # Wait for the agent to finish
    result = await agent_task
    print(f"\n[FINAL OUTPUT]:\n{result}")

if __name__ == "__main__":
    asyncio.run(main())

What to Build Next


  • Replace the simulated wait loop by saving the agent's state to a PostgreSQL database.
  • Build a frontend React/Next.js "Unified Inbox" UI that triggers the webhook to resume the agent.
  • Implement the official genai.types.Tool configurations to let the agent actually execute the actions post-approval.
