Vertex AI Agent Engine handles the infrastructure for running agents in production. Google's ADK framework handles the logic. Here's how to deploy your first agent: Terraform provisions the infrastructure, ADK defines the behavior, and swapping models is a one-variable change.
In Series 1, we deployed a Vertex AI endpoint and called Gemini directly. Single prompt, single response, stateless.
Vertex AI agents are different. An agent wraps a model with reasoning, session memory, and tool orchestration. It decides what to do next based on the user's request, calls tools when needed, and loops until it has a satisfactory answer. The model is the brain; the agent is the decision-maker.
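To make that loop concrete, here's a toy sketch of the pattern in plain Python. This is *not* ADK's internals — the `model_decide` callback, tool registry, and stop condition are all invented for illustration:

```python
# Toy reasoning loop: illustrates the agent pattern, not ADK's implementation.

def run_agent(request, tools, model_decide, max_steps=5):
    """Ask the model what to do next, call tools, loop until it answers."""
    context = [request]
    for _ in range(max_steps):
        action = model_decide(context)          # the "brain" picks the next step
        if action["type"] == "answer":          # satisfied: return the final answer
            return action["text"]
        tool = tools[action["tool"]]            # orchestration: dispatch a tool
        context.append(tool(action["input"]))   # feed the result back into the loop
    return "max steps reached"

# A stub "model" that calls a calculator tool once, then answers.
def fake_decide(context):
    if len(context) == 1:
        return {"type": "tool", "tool": "calc", "input": "2+2"}
    return {"type": "answer", "text": f"The result is {context[-1]}"}

# Toy tool only -- never eval untrusted input in real code.
result = run_agent("what is 2+2?", {"calc": lambda expr: eval(expr)}, fake_decide)
print(result)  # The result is 4
```

A real agent framework adds structured tool schemas, streaming, and session state around this same decide-act-observe loop.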
GCP's agent stack has two layers: Agent Development Kit (ADK) is an open-source Python framework that defines agent behavior (model, instructions, tools). Agent Engine is a fully managed runtime that hosts ADK agents in production with scaling, sessions, and monitoring built in. Terraform provisions the infrastructure (APIs, IAM, service accounts). ADK defines the agent logic. This post wires them together. 🎯
🧠 Endpoint vs Agent: What's Different
| Capability | Direct Endpoint (Series 1) | Vertex AI Agent (ADK + Engine) |
|---|---|---|
| Interaction | Single request/response | Multi-turn with managed sessions |
| State | Stateless | Session memory + Memory Bank |
| Tool use | Manual implementation | Declarative tool definitions |
| Reasoning | Whatever the model does | Orchestrated reasoning loop |
| Deployment | API call | Managed runtime (Agent Engine) |
| Framework | SDK only | ADK (open-source, model-agnostic) |
🔧 Model Configuration: The Future-Proof Layer
ADK is model-agnostic. It works with Gemini, Claude, Llama, Mistral, and any model in Vertex AI Model Garden or via LiteLLM. The key is externalizing the model ID so swapping is a config change:
```hcl
# variables.tf
variable "agent_model" {
  description = "Model ID for the agent. Change this to upgrade models."
  type = object({
    id      = string # Vertex AI model ID
    display = string # Human-readable name for tags/logs
  })
  default = {
    id      = "gemini-2.5-flash"
    display = "Gemini 2.5 Flash"
  }
}

variable "agent_name" {
  description = "Name of the agent"
  type        = string
  default     = "assistant"
}

variable "agent_instruction" {
  description = "System instruction for the agent"
  type        = string
}
```
Environment-specific overrides:
```hcl
# environments/dev.tfvars
agent_model = {
  id      = "gemini-2.5-flash"
  display = "Gemini 2.5 Flash"
}
```

```hcl
# environments/prod.tfvars
agent_model = {
  id      = "gemini-2.5-pro"
  display = "Gemini 2.5 Pro"
}

# When a new model launches:
# agent_model = {
#   id      = "gemini-3.0-flash"
#   display = "Gemini 3.0 Flash"
# }
```
The model variable flows into both the Terraform infrastructure and the ADK agent code.
🔧 Terraform: Provision the Infrastructure
Terraform handles APIs, IAM, and the service account. The agent logic lives in Python with ADK.
APIs and Service Account
```hcl
# agent/main.tf
resource "google_project_service" "required" {
  for_each = toset([
    "aiplatform.googleapis.com",
    "compute.googleapis.com",
    "cloudbuild.googleapis.com",
  ])

  project = var.project_id
  service = each.value
}

resource "google_service_account" "agent" {
  account_id   = "${var.environment}-${var.agent_name}-agent"
  display_name = "Agent SA: ${var.agent_model.display}"
  project      = var.project_id
}

# Agent needs Vertex AI User to invoke models
resource "google_project_iam_member" "vertex_ai_user" {
  project = var.project_id
  role    = "roles/aiplatform.user"
  member  = "serviceAccount:${google_service_account.agent.email}"
}

# Agent Engine needs permission to manage the agent runtime
resource "google_project_iam_member" "agent_engine_admin" {
  project = var.project_id
  role    = "roles/aiplatform.agentEngineAdmin"
  member  = "serviceAccount:${google_service_account.agent.email}"
}
```
Agent Config Output
Terraform generates a config file that the ADK agent code reads at deployment:
```hcl
# agent/config.tf
resource "local_file" "agent_config" {
  filename = "${path.module}/agent_source/config.json"
  content = jsonencode({
    project_id  = var.project_id
    location    = var.region
    model_id    = var.agent_model.id
    # ADK agent names must be valid Python identifiers, so join with "_"
    agent_name  = "${var.environment}_${var.agent_name}"
    instruction = var.agent_instruction
  })
}
```
This bridges Terraform variables into the Python agent code. When you change agent_model.id in .tfvars, the config file updates, and the next deployment uses the new model.
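To see the handoff concretely, here's the shape of the file Terraform's `jsonencode()` emits and how the Python side consumes it (values are illustrative):

```python
import json
import os
import tempfile

# What jsonencode() in config.tf produces (illustrative values)
config_content = {
    "project_id": "my-project",
    "location": "us-central1",
    "model_id": "gemini-2.5-flash",
    "agent_name": "dev_assistant",
    "instruction": "You are a helpful assistant.",
}

# Write it the way local_file.agent_config would...
path = os.path.join(tempfile.mkdtemp(), "config.json")
with open(path, "w") as f:
    json.dump(config_content, f)

# ...and read it back the way the agent code does.
# The model ID arrives as data, not code.
with open(path) as f:
    config = json.load(f)

print(config["model_id"])  # gemini-2.5-flash
```

The agent code never hardcodes a model name; it only knows what Terraform told it.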
🐍 ADK Agent Code
The agent logic is pure Python using Google's Agent Development Kit:
```python
# agent_source/agent.py
import json

from google.adk.agents import Agent

# Load the config generated by Terraform
with open("config.json") as f:
    config = json.load(f)

# Define the agent. For Gemini models, ADK accepts the model ID
# string directly -- the model is configuration, not code.
agent = Agent(
    name=config["agent_name"],
    model=config["model_id"],
    instruction=config["instruction"],
    description="A helpful assistant deployed via Terraform and ADK",
)
```
Model-agnostic by design. For Gemini models, ADK resolves the model ID string through Vertex AI natively. For non-Google models, pass a LiteLlm instance instead:
```python
# For Claude, Llama, Mistral via LiteLLM
from google.adk.models.lite_llm import LiteLlm

model = LiteLlm(model="vertex_ai/claude-sonnet-4-20250514")
```
Same agent code, different model backend. No framework changes needed.
🚀 Deploy to Agent Engine
Wrap the agent with AdkApp and deploy to the managed runtime:
```python
# agent_source/deploy.py
import json

import vertexai
from vertexai.agent_engines import AdkApp

from agent import agent  # the Agent defined in agent.py

# Load the same Terraform-generated config
with open("config.json") as f:
    config = json.load(f)

# Initialize Vertex AI
vertexai.init(project=config["project_id"], location=config["location"])

# Wrap the agent for deployment
app = AdkApp(agent=agent)

# Deploy to Agent Engine
deployed = vertexai.agent_engines.create(
    agent_engine=app,
    requirements=["google-cloud-aiplatform[agent_engines,adk]"],
    display_name=config["agent_name"],
)
print(f"Deployed: {deployed.api_resource.name}")
```
Agent Engine handles scaling, session management, and networking, and exposes your deployed agent behind a managed query API automatically.
🧪 Query the Deployed Agent
```python
import asyncio

import vertexai

vertexai.init(project="my-project", location="us-central1")
client = vertexai.Client()

# Get the deployed agent
adk_app = client.agent_engines.get(name="RESOURCE_ID")

async def main():
    # Create a session and query
    session = await adk_app.async_create_session(user_id="user-123")
    async for event in adk_app.async_stream_query(
        user_id="user-123",
        session_id=session.id,
        message="What were our Q3 revenue numbers?",
    ):
        if event.get("content", {}).get("parts"):
            for part in event["content"]["parts"]:
                if "text" in part:
                    print(part["text"], end="")

asyncio.run(main())
```
The session_id enables multi-turn conversation. Agent Engine manages session state automatically: in-memory when you run locally, cloud-managed once deployed.
📐 The Upgrade Workflow
When a new Gemini model launches:
1. **Update `.tfvars`** - change `agent_model.id` and `agent_model.display`
2. **Run `terraform apply`** - regenerates `config.json` with the new model ID and updates IAM tags
3. **Redeploy the agent** - `python deploy.py` picks up the new config
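In practice the upgrade is two commands (paths assume the repo layout used in this post):

```shell
# 1. Edit environments/prod.tfvars to point agent_model at the new model,
#    then regenerate config.json:
terraform apply -var-file=environments/prod.tfvars

# 2. Redeploy the agent with the regenerated config:
cd agent_source && python deploy.py
```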
Your query code doesn't change. The resource ID stays the same. The agent just uses a better model.
For non-Gemini models (Claude, Llama), switch the agent's model to a LiteLlm instance in agent.py and update the model string. The rest of the infrastructure stays identical.
⚠️ Gotchas and Tips
**ADK deployment is Python-only.** Agent Engine deployment currently supports only Python agents. If your team is .NET or Java, consider deploying your agents on Cloud Run or GKE instead.

**Pin your dependencies.** Always pin the google-cloud-aiplatform version in your requirements. Unpinned versions can break deployments when the SDK updates.

**Agent Engine pricing.** You pay for compute while the agent is processing queries. Idle agents don't incur compute charges, but deployed agents carry a base infrastructure cost. Check the Agent Engine pricing page for current rates.

**Memory Bank.** ADK agents deployed to Agent Engine get cloud-managed session persistence automatically. For long-term memory across sessions, add PreloadMemoryTool to your agent definition.
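For example, a pinned requirements file might look like this (the version shown is illustrative; pin to whatever version you actually tested against):

```text
# agent_source/requirements.txt
google-cloud-aiplatform[agent_engines,adk]==1.95.1
```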
⏭️ What's Next
This is Post 1 of the GCP AI Agents with Terraform series.
- Post 1: Deploy First Vertex AI Agent (you are here) 🤖
- Post 2: Agent Tools - Connect to APIs
- Post 3: Multi-Agent with Agent Engine
- Post 4: Agent + Google Search Grounding
Your first GCP agent is deployed. ADK defines the logic, Agent Engine runs it in production, and Terraform provisions the infrastructure. When the next Gemini model drops, it's a config change - not a rewrite. 🤖
Found this helpful? Follow for the full AI Agents with Terraform series! 💬