Technical Briefing: This tutorial is part of our deep-dive series on Agentic Workflows at Gate of AI.
<span>Tutorial</span>
<span>Advanced</span>
<span>⏱ 45 min read</span>
<span>© Gate of AI 2026-05-31</span>
Step away from standard chat APIs. Learn the foundational architecture for building long-running, stateful autonomous agents inspired by the new Gemini Enterprise Unified Inbox.
Prerequisites
- Python 3.10 or higher
- Access to the Google Gen AI SDK (Gemini 1.5 Pro or higher)
- A Google Cloud Project with Billing Enabled
- Advanced understanding of asynchronous Python (asyncio) and state management
What We're Building
With Google Cloud's announcement of Long-Running Agents in Gemini Enterprise, agent development is moving away from single request-response calls and toward workflows that persist across pauses. In this tutorial, we will construct the foundational "pause-and-resume" architecture required to build these agents.
We won't just build a chatbot. We will build a stateful, asynchronous Python worker that executes a multi-step task, intentionally "pauses" when it requires simulated human approval (mimicking the Unified Inbox), and resumes upon confirmation.
Setup and Installation
We will use the official Google Gen AI SDK and python-dotenv for our environment variables. asyncio ships with Python's standard library, so it does not need to be installed.
pip install google-genai python-dotenv
Secure your API credentials in a .env file.
# .env file
GEMINI_API_KEY=your_gemini_api_key_here
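A missing or empty key otherwise only surfaces later as an authentication error from the API. The following is a minimal sketch of failing fast at startup; `require_env` is a hypothetical helper name, not part of the Gen AI SDK or python-dotenv.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file or export it in your shell."
        )
    return value
```

Calling `require_env("GEMINI_API_KEY")` right after `load_dotenv()` would stop the program before any agent state is created.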
Step 1: Architecting the Stateful Client
Unlike a standard chatbot that forgets data between queries, a long-running agent must maintain a rigid state dictionary. We initialize the official genai.Client and set up our state manager.
import os
import asyncio

from google import genai
from dotenv import load_dotenv

# Load GEMINI_API_KEY from the .env file into the environment
load_dotenv()


class LongRunningAgent:
    def __init__(self):
        # Initialize the official Google Gen AI Client
        self.client = genai.Client(api_key=os.getenv("GEMINI_API_KEY"))
        self.model = "gemini-1.5-pro"
        # This dictionary mimics the persistent state stored in a database
        self.state = {
            "status": "idle",  # idle, running, awaiting_approval, completed
            "workflow_history": [],
            "pending_approval_request": None,
        }

    def log_action(self, action):
        """Print an action and append it to the workflow history."""
        print(f"[AGENT LOG]: {action}")
        self.state["workflow_history"].append(action)
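The four status values in the comment above form a small state machine. As a sketch only, the allowed moves between them could be made explicit so that an illegal jump (for example, from awaiting_approval straight to completed) raises instead of passing silently. `AgentStatus`, `ALLOWED_TRANSITIONS`, and `transition` are hypothetical names introduced for illustration, and the transition table is an assumption about the intended workflow.

```python
from enum import Enum


class AgentStatus(str, Enum):
    IDLE = "idle"
    RUNNING = "running"
    AWAITING_APPROVAL = "awaiting_approval"
    COMPLETED = "completed"


# Hypothetical transition table: which status may follow which.
ALLOWED_TRANSITIONS = {
    AgentStatus.IDLE: {AgentStatus.RUNNING},
    AgentStatus.RUNNING: {AgentStatus.AWAITING_APPROVAL, AgentStatus.COMPLETED},
    AgentStatus.AWAITING_APPROVAL: {AgentStatus.RUNNING},
    AgentStatus.COMPLETED: set(),
}


def transition(state: dict, new_status: AgentStatus) -> None:
    """Move state["status"] to new_status, rejecting illegal jumps."""
    current = AgentStatus(state["status"])
    if new_status not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"Illegal transition: {current.value} -> {new_status.value}")
    # Store the plain string so the dictionary stays JSON-serializable.
    state["status"] = new_status.value
```

Storing the plain string rather than the enum member keeps the dictionary easy to write to a database later.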
Step 2: Building the "Pause-and-Resume" HITL Logic
The core innovation of Gemini's new update is the Human-in-the-Loop (HITL) Inbox. Here, we build the asynchronous logic that allows the agent to pause execution when it hits a restricted action.
    # Methods of LongRunningAgent (continued)
    async def request_human_approval(self, task_description):
        """Simulates pushing a task to the Unified Inbox"""
        self.state["status"] = "awaiting_approval"
        self.state["pending_approval_request"] = task_description
        self.log_action(f"PAUSED: Awaiting human approval for: {task_description}")
        # Simulate waiting for the manager to click "Approve" in the Inbox
        while self.state["status"] == "awaiting_approval":
            await asyncio.sleep(2)  # Check database/state every 2 seconds
        self.log_action("RESUMED: Human approval granted.")
        return True

    def simulate_manager_approval(self):
        """External function called by your UI/Inbox when a user clicks approve"""
        if self.state["status"] == "awaiting_approval":
            self.state["status"] = "running"
            self.state["pending_approval_request"] = None
            print("\n✅ [INBOX]: Manager approved the action.\n")
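The polling loop wakes every 2 seconds whether or not anything has changed. Within a single process, one alternative is to block on an `asyncio.Event` and let the approval call wake the waiter directly. The sketch below assumes everything runs in one event loop; `ApprovalGate` and `demo` are hypothetical names, and the optional timeout is an addition that the tutorial's code does not have.

```python
import asyncio


class ApprovalGate:
    """Hypothetical helper: pauses a coroutine until approve() is called."""

    def __init__(self):
        self._event = asyncio.Event()

    async def wait(self, timeout=None):
        """Return True once approved, or False if the timeout elapses first."""
        try:
            await asyncio.wait_for(self._event.wait(), timeout=timeout)
            return True
        except asyncio.TimeoutError:
            return False

    def approve(self):
        # Wakes every coroutine currently blocked in wait().
        self._event.set()


async def demo():
    gate = ApprovalGate()
    # Schedule the approval 0.1 s from now, standing in for a manager's click.
    asyncio.get_running_loop().call_later(0.1, gate.approve)
    return await gate.wait(timeout=1.0)
```

This only replaces the in-memory wait. It does not survive a process restart, so persisting the state is still needed for real multi-day workflows.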
Step 3: Executing the Asynchronous Workflow
Now, we tie it together. We will use the client.models.generate_content method to process data, but wrap it in our async execution loop.
    # Methods of LongRunningAgent (continued)
    async def run_multi_day_workflow(self, initial_prompt):
        self.state["status"] = "running"
        self.log_action("Starting long-running workflow...")

        # Phase 1: Autonomous Processing
        self.log_action("Analyzing request via Gemini API...")
        # generate_content is a blocking call, so run it in a worker thread
        # to keep the event loop free for other tasks.
        response = await asyncio.to_thread(
            self.client.models.generate_content,
            model=self.model,
            contents=f"Analyze this task and propose a 3-step execution plan: {initial_prompt}",
        )
        self.log_action(f"Plan generated: {response.text[:100]}...")

        # Phase 2: Hitting a permission wall (Mimicking the Unified Inbox feature)
        await asyncio.sleep(1)  # Simulating heavy compute time
        # The agent realizes it needs access to a restricted system (e.g., Google Drive)
        await self.request_human_approval("Access restricted Drive Folder: 'Q3 Financials'")

        # Phase 3: Post-Approval Execution
        self.log_action("Finalizing workflow with approved access...")
        final_response = await asyncio.to_thread(
            self.client.models.generate_content,
            model=self.model,
            contents="The human approved access. Generate the final summary report.",
        )
        self.state["status"] = "completed"
        self.log_action("Workflow Completed.")
        return final_response.text
⚠️ Expert Tip: In a production environment, do not rely on an in-memory asyncio.sleep loop to hold state, because a crash or restart loses the paused workflow. Serialize the self.state dictionary to a persistent store (such as Redis or PostgreSQL). When the webhook from your Inbox arrives, retrieve the state and re-initialize the agent.
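The save, load, and resume cycle described in the tip can be sketched with the standard library alone. The example below uses `sqlite3` purely as a stand-in for Redis or PostgreSQL so that it runs without extra services. The table name, the function names, and the `agent_id` key are all assumptions made for illustration.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a tiny key-value table for agent state."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS agent_state "
        "(agent_id TEXT PRIMARY KEY, state TEXT NOT NULL)"
    )
    return conn


def save_state(conn, agent_id, state):
    # Upsert the JSON-encoded state so a later process can pick it up.
    conn.execute(
        "INSERT OR REPLACE INTO agent_state (agent_id, state) VALUES (?, ?)",
        (agent_id, json.dumps(state)),
    )
    conn.commit()


def load_state(conn, agent_id):
    """Return the stored state dictionary, or None if the agent is unknown."""
    row = conn.execute(
        "SELECT state FROM agent_state WHERE agent_id = ?", (agent_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None


def approve(conn, agent_id):
    """What a webhook handler might do: flip the stored status and persist it."""
    state = load_state(conn, agent_id)
    if state is None or state["status"] != "awaiting_approval":
        return False
    state["status"] = "running"
    state["pending_approval_request"] = None
    save_state(conn, agent_id, state)
    return True
```

With this shape, the worker that paused and the process that receives the approval do not need to be the same process. A fresh worker can call `load_state` and continue from the recorded status.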
Testing the Unified Inbox Architecture
To run this, we will use Python's asyncio.create_task to run the agent in the background while simulating a human checking their inbox.
async def main():
    agent = LongRunningAgent()

    # Start the agent as a background task
    agent_task = asyncio.create_task(
        agent.run_multi_day_workflow("Audit the Q3 Marketing Spend")
    )

    # Simulate the human manager checking their inbox after 5 seconds
    await asyncio.sleep(5)
    agent.simulate_manager_approval()

    # Wait for the agent to finish
    result = await agent_task
    print(f"\n[FINAL OUTPUT]:\n{result}")


if __name__ == "__main__":
    asyncio.run(main())
What to Build Next
- Replace the simulated wait loop by saving the agent's state to a PostgreSQL database.
- Build a frontend React/Next.js "Unified Inbox" UI that triggers the webhook to resume the agent.
- Implement the official genai.types.Tool configurations to let the agent actually execute the actions post-approval.
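Before wiring real tools into the model, it may help to see the gating idea on its own: an action runs only if its name appears in a set of human-approved actions. This is a plain-Python sketch that does not use the SDK's tool types. `ACTIONS`, `register`, `read_drive_folder`, and `execute_action` are hypothetical names, and the Drive function is a placeholder that calls no real API.

```python
from typing import Callable, Dict, Set

# Hypothetical registry mapping action names to plain Python callables.
ACTIONS: Dict[str, Callable[..., str]] = {}


def register(name: str):
    """Decorator that adds a function to the ACTIONS registry."""
    def decorator(func):
        ACTIONS[name] = func
        return func
    return decorator


@register("read_drive_folder")
def read_drive_folder(folder: str) -> str:
    # Placeholder body: a real implementation would call the Drive API here.
    return f"(contents of {folder})"


def execute_action(name: str, approved: Set[str], **kwargs) -> str:
    """Run a registered action only if a human has approved it."""
    if name not in ACTIONS:
        raise KeyError(f"Unknown action: {name}")
    if name not in approved:
        raise PermissionError(f"Action '{name}' has not been approved")
    return ACTIONS[name](**kwargs)
```

The approved set would typically be filled in when the Inbox approval arrives, so a model-proposed action cannot run ahead of the human decision.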