Modern cloud applications don't just move data — they make decisions. A file arrives, a risk score is calculated, and the system needs to route it, quarantine it, or escalate it — all without a human in the loop. That's the promise of agentic AI, and GCP AgentFlow is the toolkit I built to make it practical on Google Cloud.
Why Agentic Orchestration Matters
Traditional event-driven architectures are reactive: an event arrives, a function runs, data moves. Agentic architectures go a step further — they evaluate context, apply decision logic, and determine what should happen next based on the current state of the system.
On Google Cloud, this means combining:
- Pub/Sub for event ingestion and routing
- Cloud Workflows for multi-step orchestration
- Datastore for tracking operational state across events
- BigQuery for logging decisions and analytics
- ML models for scoring, classification, and recommendation
GCP AgentFlow provides the connective tissue across all of these without forcing a rigid architecture on your team.
Core Building Blocks
The library is intentionally lightweight and composable. Each component can be used independently or wired together into a full agentic pipeline.
Decision Engine
This is the heart of the library: decide_next_action evaluates an incoming event and returns a structured decision — whether to route, retry, quarantine, approve, or alert.
from gcp_agentflow import AgentDecisionInput, decide_next_action

event = AgentDecisionInput(
    event_type="file_arrived",
    source="pubsub",
    risk_score=72,
    payload={"bucket": "incoming", "name": "file.csv"},
)

decision = decide_next_action(event)
print(decision.action)  # e.g., "quarantine"
print(decision.reason)  # e.g., "Risk score exceeds threshold of 70"
The AgentDecisionInput schema is intentionally flexible — you bring the risk score from your ML model, and the decision engine applies your routing rules. This separation keeps ML logic and orchestration logic cleanly decoupled.
Workflow Payload Builder
Generates clean, structured payloads ready to pass into Cloud Workflows or Cloud Run HTTP endpoints. This ensures a consistent schema across all your workflow trigger points, which becomes critical as your automation surface grows.
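The builder's exact function names aren't shown above, so here is a minimal sketch of the idea: wrap the decision output and the original event into one consistent envelope, then hand it to a Cloud Workflows execution as its JSON argument. The envelope fields, project, location, and workflow names below are placeholders, not the library's own schema.
import json
from google.cloud.workflows.executions_v1 import ExecutionsClient, Execution

# Hypothetical envelope shape; the library's builder may use different field names.
payload = {
    "event_type": "file_arrived",
    "action": "quarantine",
    "reason": "Risk score exceeds threshold of 70",
    "payload": {"bucket": "incoming", "name": "file.csv"},
}

# Start a Cloud Workflows execution with the structured payload as its argument.
executions = ExecutionsClient()
parent = "projects/my-project/locations/us-central1/workflows/quarantine-review"
executions.create_execution(parent=parent, execution=Execution(argument=json.dumps(payload)))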
Pub/Sub Publisher Wrapper
A safe, retry-aware wrapper around the Pub/Sub publish API. It handles JSON serialization, message attribute injection, and error logging, so your event emission code stays clean and consistent across services.
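Under the hood this amounts to something like the following, shown with the plain google-cloud-pubsub client since the wrapper's signature isn't documented here. Project, topic, and attribute names are placeholders.
import json
from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("my-project", "quarantine-events")

# Serialize the body to JSON and attach message attributes for routing.
message = {"bucket": "incoming", "name": "file.csv", "action": "quarantine"}
future = publisher.publish(
    topic_path,
    data=json.dumps(message).encode("utf-8"),
    event_type="file_quarantined",  # attributes travel alongside the payload
)
future.result(timeout=30)  # block until the publish is acknowledged, or raise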
BigQuery Event Logger
Every decision the agent makes gets logged as a structured analytics event in BigQuery. This gives you a complete audit trail and enables downstream analysis of your automation behavior — which events triggered which actions, how often quarantine fires, where retries cluster.
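Conceptually, each decision becomes a row in a decisions table. A minimal sketch with the standard BigQuery client follows; the table name and column layout are illustrative assumptions, not the library's fixed schema.
from datetime import datetime, timezone
from google.cloud import bigquery

client = bigquery.Client()
table_id = "my-project.agentflow.decisions"  # placeholder dataset and table

row = {
    "event_type": "file_arrived",
    "action": "quarantine",
    "reason": "Risk score exceeds threshold of 70",
    "risk_score": 72,
    "decided_at": datetime.now(timezone.utc).isoformat(),
}

# Streaming insert; returns a list of per-row errors (empty on success).
errors = client.insert_rows_json(table_id, [row])
if errors:
    raise RuntimeError(f"BigQuery insert failed: {errors}")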
Datastore State Store
Saves and retrieves operational state by entity key. This is what makes your agent stateful — it can check whether a file has been seen before, whether a workflow step was already completed, or whether a retry budget has been exhausted.
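As an illustration of what that state check looks like with the underlying Datastore client (the entity kind and property names here are assumptions, not the library's API):
from google.cloud import datastore

client = datastore.Client()
key = client.key("FileState", "incoming/file.csv")  # entity key derived from bucket/object path

# Has this file been handled already? (idempotency check on retry)
existing = client.get(key)
if existing and existing.get("status") == "quarantined":
    print("Already handled, skipping")
else:
    entity = datastore.Entity(key=key)
    entity.update({"status": "quarantined", "retries": 0})
    client.put(entity)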
A Real-World Pattern
Here's how these building blocks compose into a real agentic pipeline:
- A file lands in a Cloud Storage bucket and triggers a Pub/Sub event
- A Cloud Run service receives the event and calls decide_next_action with a risk score from your ML model
- The decision engine returns quarantine because the risk score exceeds your threshold
- The agent updates Datastore to mark the file as quarantined with a timestamp
- The BigQuery event logger records the full decision context for audit and analytics
- A Pub/Sub message is published to a quarantine topic, triggering a downstream review workflow in Cloud Workflows
The entire flow is observable, auditable, and re-runnable — if any step fails, Datastore state prevents duplicate processing on retry, as the sketch below illustrates.
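Here is a condensed sketch of the Cloud Run handler, wiring the documented decide_next_action API to the underlying Datastore and Pub/Sub clients. The project and topic names are placeholders, and the BigQuery logging step is omitted for brevity.
import json
from google.cloud import datastore, pubsub_v1
from gcp_agentflow import AgentDecisionInput, decide_next_action

def handle_file_event(bucket: str, name: str, risk_score: float) -> None:
    ds = datastore.Client()
    key = ds.key("FileState", f"{bucket}/{name}")
    # Idempotency guard: skip files that already have recorded state.
    if ds.get(key):
        return
    decision = decide_next_action(AgentDecisionInput(
        event_type="file_arrived",
        source="pubsub",
        risk_score=risk_score,
        payload={"bucket": bucket, "name": name},
    ))
    # Persist state before emitting events so retries stay deduplicated.
    state = datastore.Entity(key=key)
    state.update({"status": decision.action})
    ds.put(state)
    if decision.action == "quarantine":
        publisher = pubsub_v1.PublisherClient()
        topic = publisher.topic_path("my-project", "quarantine-topic")
        publisher.publish(topic, json.dumps({"bucket": bucket, "name": name}).encode("utf-8")).result()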
CLI for Development and Testing
The CLI lets you simulate agent decisions without spinning up infrastructure:
gcp-agentflow decide --event-type file_arrived --risk-score 72
This is especially useful in CI pipelines to validate that your routing rules behave as expected before deploying changes to production.
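For example, a CI step could assert the expected routing outcome, assuming the CLI prints the chosen action as the decision example above suggests:
# Fail the build if the quarantine rule regresses for a high-risk event
gcp-agentflow decide --event-type file_arrived --risk-score 72 | grep -q quarantine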
Use Cases
- Compliance-driven file processing — Route incoming files through validation, virus scanning, and approval gates based on ML risk scores before they reach downstream systems.
- Multi-step data ingestion pipelines — Decide at each pipeline stage whether to proceed, retry with backoff, or dead-letter a record, with full state tracking in Datastore.
- Fraud and anomaly detection workflows — Score transactions in real time and trigger escalation workflows in Cloud Workflows when confidence thresholds are crossed.
- GDPR and regulatory data routing — Automatically classify and route data records based on content signals, logging every routing decision for regulatory audit trails in BigQuery.
- AI agent backends — Use GCP AgentFlow as the decision and state layer behind Vertex AI Agent Builder agents, giving your AI agents persistent memory via Datastore and structured action logging via BigQuery.
Installation
pip install gcp-agentflow
The library runs cleanly in Cloud Run containers, Dataflow workers, Cloud Functions, or local environments. No heavy dependencies — just Google Cloud client libraries and standard Python.
Building and Publishing
python -m pip install --upgrade build twine
python -m build
twine check dist/*
twine upload dist/*
Design Philosophy
GCP AgentFlow doesn't try to be a full workflow engine — that's what Cloud Workflows is for. It doesn't try to be an ML platform — that's Vertex AI. What it does is fill the gap between those systems: the decision logic, state management, and event wiring that every agentic pipeline needs but nobody wants to rewrite from scratch.
If you're building event-driven automation on Google Cloud and finding yourself writing the same routing logic, state-check boilerplate, and BigQuery logging code across multiple services, GCP AgentFlow is the abstraction layer you've been missing.
Author:
Raghava Chellu | FBCS | Innovation Technologist, MFT and Data Infrastructure