Vertex AI Agent Engine handles the infrastructure for running agents in production. Google's ADK framework handles the logic. Here's how to deploy your first agent: Terraform provisions the infrastructure, ADK defines the behavior, and swapping models is a one-variable change.
In Series 1, we deployed a Vertex AI endpoint and called Gemini directly. Single prompt, single response, stateless.
Vertex AI agents are different. An agent wraps a model with reasoning, session memory, and tool orchestration. It decides what to do next based on the user's request, calls tools when needed, and loops until it has a satisfactory answer. The model is the brain; the agent is the decision-maker.
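To make that loop concrete, here's a toy sketch of the pattern in plain Python. This is *not* ADK's internals — the `model_decide` callback, tool registry, and stop condition are all invented for illustration:

```python
# Toy reasoning loop: illustrates the agent pattern, not ADK's implementation.

def run_agent(request, tools, model_decide, max_steps=5):
    """Ask the model what to do next, call tools, loop until it answers."""
    context = [request]
    for _ in range(max_steps):
        action = model_decide(context)          # the "brain" picks the next step
        if action["type"] == "answer":          # satisfied: return the final answer
            return action["text"]
        tool = tools[action["tool"]]            # orchestration: dispatch a tool
        context.append(tool(action["input"]))   # feed the result back into the loop
    return "max steps reached"

# A stub "model" that calls a calculator tool once, then answers.
def fake_decide(context):
    if len(context) == 1:
        return {"type": "tool", "tool": "calc", "input": "2+2"}
    return {"type": "answer", "text": f"The result is {context[-1]}"}

# Toy tool only -- never eval untrusted input in real code.
result = run_agent("what is 2+2?", {"calc": lambda expr: eval(expr)}, fake_decide)
print(result)  # The result is 4
```

A real agent framework adds structured tool schemas, streaming, and session state around this same decide-act-observe loop.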
GCP's agent stack has two layers: Agent Development Kit (ADK) is an open-source Python framework that defines agent behavior (model, instructions, tools). Agent Engine is a fully managed runtime that hosts ADK agents in production with scaling, sessions, and monitoring built in. Terraform provisions the infrastructure (APIs, IAM, service accounts). ADK defines the agent logic. This post wires them together. 🎯
🧠 Endpoint vs Agent: What's Different
| Capability | Direct Endpoint (Series 1) | Vertex AI Agent (ADK + Engine) |
|---|---|---|
| Interaction | Single request/response | Multi-turn with managed sessions |
| State | Stateless | Session memory + Memory Bank |
| Tool use | Manual implementation | Declarative tool definitions |
| Reasoning | Whatever the model does | Orchestrated reasoning loop |
| Deployment | API call | Managed runtime (Agent Engine) |
| Framework | SDK only | ADK (open-source, model-agnostic) |
🔧 Model Configuration: The Future-Proof Layer
ADK is model-agnostic. It works with Gemini, Claude, Llama, Mistral, and any model in Vertex AI Model Garden or via LiteLLM. The key is externalizing the model ID so swapping is a config change:
```hcl
# variables.tf
variable "agent_model" {
  description = "Model ID for the agent. Change this to upgrade models."
  type = object({
    id      = string # Vertex AI model ID
    display = string # Human-readable name for tags/logs
  })
  default = {
    id      = "gemini-2.5-flash"
    display = "Gemini 2.5 Flash"
  }
}

variable "agent_name" {
  description = "Name of the agent"
  type        = string
  default     = "assistant"
}

variable "agent_instruction" {
  description = "System instruction for the agent"
  type        = string
}
```
Environment-specific overrides:
```hcl
# environments/dev.tfvars
agent_model = {
  id      = "gemini-2.5-flash"
  display = "Gemini 2.5 Flash"
}
```

```hcl
# environments/prod.tfvars
agent_model = {
  id      = "gemini-2.5-pro"
  display = "Gemini 2.5 Pro"
}

# When a new model launches:
# agent_model = {
#   id      = "gemini-3.0-flash"
#   display = "Gemini 3.0 Flash"
# }
```
The model variable flows into both the Terraform infrastructure and the ADK agent code.
🔧 Terraform: Provision the Infrastructure
Terraform handles APIs, IAM, and the service account. The agent logic lives in Python with ADK.
APIs and Service Account
```hcl
# agent/main.tf
resource "google_project_service" "required" {
  for_each = toset([
    "aiplatform.googleapis.com",
    "compute.googleapis.com",
    "cloudbuild.googleapis.com",
  ])

  project = var.project_id
  service = each.value
}

resource "google_service_account" "agent" {
  account_id   = "${var.environment}-${var.agent_name}-agent"
  display_name = "Agent SA: ${var.agent_model.display}"
  project      = var.project_id
}

# Agent needs Vertex AI User to invoke models
resource "google_project_iam_member" "vertex_ai_user" {
  project = var.project_id
  role    = "roles/aiplatform.user"
  member  = "serviceAccount:${google_service_account.agent.email}"
}

# Agent Engine needs permission to manage the agent runtime
resource "google_project_iam_member" "agent_engine_admin" {
  project = var.project_id
  role    = "roles/aiplatform.agentEngineAdmin"
  member  = "serviceAccount:${google_service_account.agent.email}"
}
```
Agent Config Output
Terraform generates a config file that the ADK agent code reads at deployment:
```hcl
# agent/config.tf
resource "local_file" "agent_config" {
  filename = "${path.module}/agent_source/config.json"
  content = jsonencode({
    project_id  = var.project_id
    location    = var.region
    model_id    = var.agent_model.id
    # ADK agent names must be valid Python identifiers, so join with "_"
    agent_name  = "${var.environment}_${var.agent_name}"
    instruction = var.agent_instruction
  })
}
```
This bridges Terraform variables into the Python agent code. When you change agent_model.id in .tfvars, the config file updates, and the next deployment uses the new model.
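To see the handoff concretely, here's the shape of the file Terraform's `jsonencode()` emits and how the Python side consumes it (values are illustrative):

```python
import json
import os
import tempfile

# What jsonencode() in config.tf produces (illustrative values)
config_content = {
    "project_id": "my-project",
    "location": "us-central1",
    "model_id": "gemini-2.5-flash",
    "agent_name": "dev_assistant",
    "instruction": "You are a helpful assistant.",
}

# Write it the way local_file.agent_config would...
path = os.path.join(tempfile.mkdtemp(), "config.json")
with open(path, "w") as f:
    json.dump(config_content, f)

# ...and read it back the way the agent code does.
# The model ID arrives as data, not code.
with open(path) as f:
    config = json.load(f)

print(config["model_id"])  # gemini-2.5-flash
```

The agent code never hardcodes a model name; it only knows what Terraform told it.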
🐍 ADK Agent Code
The agent logic is pure Python using Google's Agent Development Kit:
```python
# agent_source/agent.py
import json

from google.adk.agents import Agent

# Load the config generated by Terraform
with open("config.json") as f:
    config = json.load(f)

# Define the agent. For Gemini models, ADK accepts the model ID
# string directly -- the model is configuration, not code.
agent = Agent(
    name=config["agent_name"],
    model=config["model_id"],
    instruction=config["instruction"],
    description="A helpful assistant deployed via Terraform and ADK",
)
```
Model-agnostic by design. For Gemini models, ADK resolves the model ID string through Vertex AI natively. For non-Google models, pass a LiteLlm instance instead:
```python
# For Claude, Llama, Mistral via LiteLLM
from google.adk.models.lite_llm import LiteLlm

model = LiteLlm(model="vertex_ai/claude-sonnet-4-20250514")
```
Same agent code, different model backend. No framework changes needed.
🚀 Deploy to Agent Engine
Wrap the agent with AdkApp and deploy to the managed runtime:
```python
# agent_source/deploy.py
import json

import vertexai
from vertexai.agent_engines import AdkApp

from agent import agent  # the Agent defined in agent.py

# Load the same Terraform-generated config
with open("config.json") as f:
    config = json.load(f)

# Initialize Vertex AI
vertexai.init(project=config["project_id"], location=config["location"])

# Wrap the agent for deployment
app = AdkApp(agent=agent)

# Deploy to Agent Engine
deployed = vertexai.agent_engines.create(
    agent_engine=app,
    requirements=["google-cloud-aiplatform[agent_engines,adk]"],
    display_name=config["agent_name"],
)
print(f"Deployed: {deployed.api_resource.name}")
```
Agent Engine handles scaling, session management, and networking, and exposes your deployed agent behind a managed query API automatically.
🧪 Query the Deployed Agent
```python
import asyncio

import vertexai

vertexai.init(project="my-project", location="us-central1")
client = vertexai.Client()

# Get the deployed agent
adk_app = client.agent_engines.get(name="RESOURCE_ID")

async def main():
    # Create a session and query
    session = await adk_app.async_create_session(user_id="user-123")
    async for event in adk_app.async_stream_query(
        user_id="user-123",
        session_id=session.id,
        message="What were our Q3 revenue numbers?",
    ):
        if event.get("content", {}).get("parts"):
            for part in event["content"]["parts"]:
                if "text" in part:
                    print(part["text"], end="")

asyncio.run(main())
```
The session_id enables multi-turn conversation. Agent Engine manages session state automatically: in-memory when you run locally, cloud-managed once deployed.
📐 The Upgrade Workflow
When a new Gemini model launches:
1. **Update `.tfvars`** - change `agent_model.id` and `agent_model.display`
2. **Run `terraform apply`** - regenerates `config.json` with the new model ID and updates IAM tags
3. **Redeploy the agent** - `python deploy.py` picks up the new config
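In practice the upgrade is two commands (paths assume the repo layout used in this post):

```shell
# 1. Edit environments/prod.tfvars to point agent_model at the new model,
#    then regenerate config.json:
terraform apply -var-file=environments/prod.tfvars

# 2. Redeploy the agent with the regenerated config:
cd agent_source && python deploy.py
```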
Your query code doesn't change. The resource ID stays the same. The agent just uses a better model.
For non-Gemini models (Claude, Llama), switch the agent's model to a LiteLlm instance in agent.py and update the model string. The rest of the infrastructure stays identical.
⚠️ Gotchas and Tips
**ADK deployment is Python-only.** Agent Engine deployment currently supports only Python agents. If your team is .NET or Java, consider deploying your agents on Cloud Run or GKE instead.

**Pin your dependencies.** Always pin the google-cloud-aiplatform version in your requirements. Unpinned versions can break deployments when the SDK updates.

**Agent Engine pricing.** You pay for compute while the agent is processing queries. Idle agents don't incur compute charges, but deployed agents carry a base infrastructure cost. Check the Agent Engine pricing page for current rates.

**Memory Bank.** ADK agents deployed to Agent Engine get cloud-managed session persistence automatically. For long-term memory across sessions, add PreloadMemoryTool to your agent definition.
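For example, a pinned requirements file might look like this (the version shown is illustrative; pin to whatever version you actually tested against):

```text
# agent_source/requirements.txt
google-cloud-aiplatform[agent_engines,adk]==1.95.1
```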
⏭️ What's Next
This is Post 1 of the GCP AI Agents with Terraform series.
- Post 1: Deploy First Vertex AI Agent (you are here) 🤖
- Post 2: Agent Tools - Connect to APIs
- Post 3: Multi-Agent with Agent Engine
- Post 4: Agent + Google Search Grounding
Your first GCP agent is deployed. ADK defines the logic, Agent Engine runs it in production, and Terraform provisions the infrastructure. When the next Gemini model drops, it's a config change - not a rewrite. 🤖
Found this helpful? Follow for the full AI Agents with Terraform series! 💬