<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Saurabh Mishra</title>
    <description>The latest articles on DEV Community by Saurabh Mishra (@saurabhmi).</description>
    <link>https://dev.to/saurabhmi</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1150046%2Fd78df883-0fa4-4681-8595-56a3b7c82045.png</url>
      <title>DEV Community: Saurabh Mishra</title>
      <link>https://dev.to/saurabhmi</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/saurabhmi"/>
    <language>en</language>
    <item>
      <title>Untrusted Code, Trusted Cluster: Scaling Secure AI Agent Workspaces with GKE Agent Sandbox</title>
      <dc:creator>Saurabh Mishra</dc:creator>
      <pubDate>Sun, 31 May 2026 04:03:42 +0000</pubDate>
      <link>https://dev.to/gde/untrusted-code-trusted-cluster-scaling-secure-ai-agent-workspaces-with-gke-agent-sandbox-1mk1</link>
      <guid>https://dev.to/gde/untrusted-code-trusted-cluster-scaling-secure-ai-agent-workspaces-with-gke-agent-sandbox-1mk1</guid>
      <description>&lt;p&gt;How a gVisor-powered sandbox isolates AI-generated code at the kernel level, and why it changes everything for multi-tenant agentic systems.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fffa10fmqc6d97z39yixp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fffa10fmqc6d97z39yixp.png" alt=" " width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This article covers the following points:&lt;/p&gt;

&lt;p&gt;The problem with AI agents writing code&lt;br&gt;
What is GKE Agent Sandbox?&lt;br&gt;
How gVisor intercepts the kernel&lt;br&gt;
Architecture deep dive&lt;br&gt;
Setting it up: step by step&lt;br&gt;
Production patterns&lt;br&gt;
Conclusion&lt;/p&gt;

&lt;p&gt;There's a moment every engineer running AI agents eventually faces: an LLM generates a perfectly plausible subprocess.run() call, pipes it to bash -c, and you realise that you are one prompt injection away from a full container escape. The code looks reasonable. The agent trusts itself. And your cluster's blast radius just became everyone's problem.&lt;/p&gt;

&lt;p&gt;This is the defining security problem of the agentic era. Language models don't just generate text anymore; they write, execute, and iterate on code in tight feedback loops. The capabilities that make them useful (unrestricted Python, shell access, file I/O) are exactly the capabilities that make them dangerous in a shared cluster.&lt;/p&gt;

&lt;p&gt;Google's answer — &lt;strong&gt;GKE Agent Sandbox&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GKE Agent Sandbox&lt;/strong&gt; is built for agentic workloads that require scale, extensibility, and security. Key benefits include:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Kernel-level isolation&lt;/strong&gt;: Provides strong, kernel-level isolation for untrusted, LLM-generated code by using built-in GKE features like GKE Sandbox. Agent Sandbox also supports the open source Kata Containers software.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sub-second provisioning&lt;/strong&gt;: Offers an out-of-the-box mechanism to provide sandboxes significantly faster than standard Kubernetes Pod scheduling allows (typically &amp;lt;1s).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cloud-native extensibility&lt;/strong&gt;: Leverages the power of the Kubernetes paradigm and the managed infrastructure of GKE.&lt;/p&gt;

&lt;p&gt;Through a declarative, standardized API, GKE Agent Sandbox offers a single-container experience with isolation and persistence characteristics similar to a virtual machine (VM), built entirely on Kubernetes primitives.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The problem with AI agents writing code&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Agentic AI systems, whether you're building with LangGraph, AutoGen, Claude's tool-use API, or rolling your own, share a common architectural pattern: the model generates code, a runtime executes it, results flow back to the model, and the loop continues. At each iteration, the model has broader context about what worked and what didn't. This is enormously powerful for automating complex tasks.&lt;/p&gt;

&lt;p&gt;It also creates an attack surface that traditional Kubernetes security was never designed to handle.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2bnsnippzxnj3bpgls0m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2bnsnippzxnj3bpgls0m.png" alt=" " width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Container escape&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;LLM-generated code exploits known kernel vulnerabilities or misconfigured capabilities to break out of the container boundary.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prompt injection via code output&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Malicious content in retrieved data embeds instructions that manipulate the agent into executing attacker-controlled payloads.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lateral network movement&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;An agent with network access can enumerate internal services, extract credentials, and pivot across your cluster — all through legitimate-looking Python requests.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Filesystem exfiltration&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Without mount restrictions, agents can read service account tokens, Kubernetes secrets mounted as volumes, and host path data.&lt;/p&gt;

&lt;p&gt;Standard container security (&lt;strong&gt;securityContext&lt;/strong&gt;, network policies, Pod Security Admission) provides defence in depth but doesn't address the fundamental issue: containers share the host kernel. If the kernel has a vulnerability, a sufficiently motivated attacker (or sufficiently capable LLM) can exploit it regardless of namespace isolation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is GKE Agent Sandbox?&lt;/strong&gt;&lt;br&gt;
GKE Agent Sandbox is a Google-managed node pool configuration that applies gVisor-based container sandboxing specifically tuned for agentic AI workloads. &lt;/p&gt;

&lt;p&gt;At its core, it combines three things:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;gVisor runtime (runsc) as the default OCI runtime&lt;/strong&gt;&lt;br&gt;
Every container in the sandbox node pool runs under runsc instead of the standard runc. This intercepts all syscalls through a user-space kernel implementation called Sentry.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Agent-specific resource isolation profiles&lt;/strong&gt;&lt;br&gt;
Pre-configured seccomp and AppArmor profiles optimised for Python/Node.js/container-in-container workloads that AI agents commonly generate. No manual tuning of syscall allowlists required for standard use cases.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Integrated observability via Cloud Monitoring&lt;/strong&gt;&lt;br&gt;
Syscall audit logs, sandbox violation events, and resource consumption metrics flow automatically into Cloud Monitoring — giving you behavioural baselines for agent workloads without custom instrumentation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How gVisor intercepts the kernel&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Understanding what gVisor actually does is essential for reasoning about its security guarantees. The mental model most engineers have of containers — "a process with namespaces and cgroups" — breaks down when thinking about gVisor.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl88o6ihjts4yx3x32efa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl88o6ihjts4yx3x32efa.png" alt=" " width="800" height="419"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In a standard container, your application's open(), read(), execve(), and socket() calls go directly to the host Linux kernel via the system call interface. The kernel has to handle them, which means a kernel vulnerability is reachable from inside the container.&lt;/p&gt;

&lt;p&gt;With gVisor, those same syscalls are intercepted by Sentry, a Go implementation of the Linux kernel that runs entirely in user space. Sentry implements the Linux ABI from scratch. When your agent code calls execve(), it's Sentry that handles it, not the host kernel. Sentry then makes a much smaller set of calls to the actual host kernel (through a restricted interface called the "platform") to handle things like memory mapping and scheduling.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;End-to-End Architectural Blueprint&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;To isolate untrusted code execution while maintaining a highly responsive management plane, the architecture splits the cluster into two distinct, specialized node pools.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Standard Node Pool (The Brain)&lt;/strong&gt; - This pool runs your trusted, long-lived orchestration services. Because this code is written and audited by your team, it runs on the standard Linux host kernel for maximum performance and native access to internal cluster resources.&lt;br&gt;
Agent Controller: The core engine managing the life cycle of AI agent tasks, spin-up times, and state tracking.&lt;br&gt;
Tool Router: Mediates external API calls and manages what capabilities (e.g., web search, database querying) are exposed to the agent.&lt;br&gt;
Result Collector: Aggregates outputs, logs, and state changes from the runtime pods.&lt;br&gt;
State &amp;amp; Storage (Postgres/Redis): Highly available data layers tracking session memory and agent state.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Agent Sandbox Node Pool (The Muscle)&lt;/strong&gt; - This pool is dedicated entirely to executing untrusted code generated by AI models. It uses the runtimeClassName: gvisor configuration to enforce strict kernel-level isolation.&lt;br&gt;
Code Executor Pods (N Pods): Ephemeral, rapid-churn pods designed to spin up, run a specific snippet of generated code, and terminate.&lt;br&gt;
The Sentry (User-Space Kernel): gVisor’s core component. Instead of letting a Python agent talk directly to the host Linux kernel via standard system calls (syscall()), the Sentry intercepts them. It implements a core suite of Linux kernel primitives in user space, shielding the host bare-metal or VM infrastructure from container escape vulnerabilities.&lt;/p&gt;
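&lt;p&gt;As a rough illustration of what a Code Executor Pod could look like, the sketch below builds a Pod manifest that sets runtimeClassName to gvisor. The function name, pod name, image, labels, and resource limits are assumptions chosen for the example, not values from GKE Agent Sandbox itself.&lt;/p&gt;

```python
import json


def build_executor_pod(name, image):
    """Return a Pod manifest (as a dict) for one sandboxed code executor."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"app": "code-executor"}},
        "spec": {
            # Run under gVisor (runsc) instead of the default runc runtime.
            "runtimeClassName": "gvisor",
            # Ephemeral executors run once and are not restarted.
            "restartPolicy": "Never",
            # Untrusted code gets no Kubernetes API token (an assumed hardening choice).
            "automountServiceAccountToken": False,
            "containers": [
                {
                    "name": "executor",
                    "image": image,
                    "securityContext": {"allowPrivilegeEscalation": False},
                    # Illustrative limits; tune them for your workload.
                    "resources": {"limits": {"cpu": "500m", "memory": "512Mi"}},
                }
            ],
        },
    }


if __name__ == "__main__":
    pod = build_executor_pod("executor-0", "python:3.12-slim")
    # kubectl accepts JSON manifests, so this output can be piped to "kubectl apply -f -".
    print(json.dumps(pod, indent=2))
```

&lt;p&gt;Generating the manifest in code keeps the isolation settings in one place, so every executor pod the controller creates carries the same runtime class and security context.&lt;/p&gt;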

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkkmtjchyjhzkf893fs5k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkkmtjchyjhzkf893fs5k.png" alt=" " width="800" height="484"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Workload Identity &amp;amp; RBAC Separation&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;By separating Kubernetes Service Accounts (KSAs) and mapping them to distinct Google Cloud IAM Service Accounts, we limit what a compromised agent can reach and reduce the risk of privilege escalation.&lt;/p&gt;
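&lt;p&gt;A minimal sketch of that separation, assuming Workload Identity is enabled on the cluster: each pool gets its own KSA, annotated with a different IAM service account. The account names, namespaces, and project ID are hypothetical. The matching IAM binding (roles/iam.workloadIdentityUser on each IAM service account) is still required and is not shown here.&lt;/p&gt;

```python
def build_service_account(ksa_name, namespace, gsa_email):
    """Return a Kubernetes ServiceAccount manifest bound to one IAM service account."""
    return {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {
            "name": ksa_name,
            "namespace": namespace,
            # Workload Identity annotation linking the KSA to an IAM service account.
            "annotations": {"iam.gke.io/gcp-service-account": gsa_email},
        },
    }


# Hypothetical names: one identity per trust level, never shared between pools.
controller_sa = build_service_account(
    "agent-controller", "agents", "agent-controller@my-project.iam.gserviceaccount.com"
)
executor_sa = build_service_account(
    "code-executor", "sandbox", "code-executor@my-project.iam.gserviceaccount.com"
)
```

&lt;p&gt;With this split, the executor identity can be granted few or no IAM roles, while the controller keeps the permissions it needs.&lt;/p&gt;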

&lt;p&gt;&lt;strong&gt;Observability and Behavioral Analysis&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Because sandboxed workloads must be treated as adversarial, observability shifts from standard application performance monitoring (APM) to real-time behavioral and security auditing.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2zegoq0vtbbkkubdxsxw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2zegoq0vtbbkkubdxsxw.png" alt=" " width="799" height="252"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Syscall Audit Logs&lt;/strong&gt;: gVisor provides structured logs of intercepted system calls via its internal logging mechanisms. Unusual system calls (e.g., attempts to use forbidden network protocols or manipulate raw sockets directly) are immediately streamed to Cloud Logging.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Violation Events&lt;/strong&gt;: Any attempt by a sandboxed container to bypass the Sentry or execute an invalid operation triggers an immediate containment event, surfaced directly in Google Cloud Security Command Center.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cloud Monitoring&lt;/strong&gt;: Aggregates container-level metrics (CPU, memory, churn rate). Crucial for detecting malicious infinite loops or resource-exhaustion (denial-of-service) attempts disguised as AI agent tasks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cloud Trace&lt;/strong&gt;: End-to-end distributed tracing maps exactly how long a request spends routing through the Tool Router versus how long it spends executing inside the gVisor sandbox, allowing you to fine-tune the performance overhead introduced by user-space context switching.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Setting it up: step by step&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Here's a complete walkthrough from a fresh GKE cluster to a running sandboxed agent workload. This assumes you have gcloud, kubectl, and Terraform configured for your project.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Farigxfvp4idh54alhp0t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Farigxfvp4idh54alhp0t.png" alt=" " width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Production patterns&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pattern 1: Warm pool with pre-forked executors&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Cold-starting a new pod for every code execution adds latency. The standard pattern is to maintain a pool of warm executor pods that listen for work over a task queue (Pub/Sub or Redis Streams). The controller dispatches code snippets to idle executors; completed executors reset their environment and return to the pool. A garbage collection sidecar restarts pods that have been warm too long to prevent state accumulation.&lt;/p&gt;
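&lt;p&gt;The warm pool pattern can be modelled in a few lines. The sketch below is an in-memory stand-in, not a Kubernetes controller: the WarmPool class and its method names are hypothetical, and it recycles an executor after a fixed number of runs rather than after a time limit, which is a simplification of the garbage collection described above.&lt;/p&gt;

```python
from collections import deque


class WarmPool:
    """In-memory model of a warm executor pool that recycles after max_uses runs."""

    def __init__(self, executor_ids, max_uses=3):
        self.idle = deque(executor_ids)
        self.uses = {executor_id: 0 for executor_id in executor_ids}
        self.max_uses = max_uses
        self.recycled = []

    def acquire(self):
        """Hand out an idle executor, or None when the pool is exhausted."""
        if not self.idle:
            return None
        return self.idle.popleft()

    def release(self, executor_id):
        """Return an executor to the pool, recycling it once it reaches max_uses runs."""
        self.uses[executor_id] += 1
        if self.uses[executor_id] == self.max_uses:
            # In a real cluster this would delete the pod so a fresh one replaces it;
            # here the replacement simply reuses the same id with a reset counter.
            self.recycled.append(executor_id)
            self.uses[executor_id] = 0
        self.idle.append(executor_id)
```

&lt;p&gt;A controller would call acquire() before dispatching a snippet and release() when the result comes back. A None result from acquire() is the signal to queue the task or scale the pool.&lt;/p&gt;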

&lt;p&gt;&lt;strong&gt;Pattern 2: Execution budget enforcement&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;AI agents can get into infinite loops. Beyond Kubernetes resource limits, apply an application-level timeout using Python's signal.alarm or Go's context cancellation. A 30-second wall-clock timeout with a 10-second CPU-time budget covers almost all legitimate agent code execution patterns while preventing runaway loops from consuming pool capacity.&lt;/p&gt;
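&lt;p&gt;One way to enforce such a budget is from the parent process rather than inside the snippet. The sketch below is an assumption-laden example for Linux executors: instead of signal.alarm it uses a subprocess timeout for the wall-clock limit and RLIMIT_CPU for the CPU-time limit. The function name run_with_budget and its return shape are made up for illustration.&lt;/p&gt;

```python
import resource
import subprocess
import sys


def run_with_budget(code, wall_seconds=30, cpu_seconds=10):
    """Run a Python snippet in a child process under wall-clock and CPU-time limits."""

    def limit_cpu():
        # Runs in the child before exec: the kernel signals the process at the CPU limit.
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))

    try:
        completed = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=wall_seconds,  # wall-clock budget; the child is killed on expiry
            preexec_fn=limit_cpu,  # POSIX only
        )
    except subprocess.TimeoutExpired:
        return {"status": "timeout", "stdout": ""}
    status = "ok" if completed.returncode == 0 else "failed"
    return {"status": status, "stdout": completed.stdout}


if __name__ == "__main__":
    print(run_with_budget("print(1 + 1)"))
    print(run_with_budget("while True: pass", wall_seconds=1))
```

&lt;p&gt;Enforcing the limits outside the untrusted snippet means the generated code cannot simply cancel its own timer.&lt;/p&gt;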

&lt;p&gt;&lt;strong&gt;Pattern 3: Network egress allow-listing per agent type&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Different agent personas have different legitimate network needs. A data analysis agent needs access to BigQuery and GCS. A web research agent needs HTTP egress to the public internet. A code review agent needs neither. Model this with a separate NetworkPolicy per agent type, and use pod labels to bind each agent to the right policy at scheduling time.&lt;/p&gt;
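&lt;p&gt;The three personas above could map to NetworkPolicy manifests roughly as follows. This is a sketch under stated assumptions: the agent-type label key, the namespace, and the CIDR choices are illustrative (the data analysis rule assumes Google APIs are reached through the private.googleapis.com range), and the DNS egress that real pods also need is left out for brevity.&lt;/p&gt;

```python
# Hypothetical egress rules per agent type; CIDRs and ports are illustrative.
EGRESS_RULES = {
    # private.googleapis.com range, assuming Private Google Access is in use.
    "data-analysis": [
        {
            "to": [{"ipBlock": {"cidr": "199.36.153.8/30"}}],
            "ports": [{"protocol": "TCP", "port": 443}],
        }
    ],
    # Public internet over HTTP/HTTPS, excluding private and link-local ranges.
    "web-research": [
        {
            "to": [
                {
                    "ipBlock": {
                        "cidr": "0.0.0.0/0",
                        "except": [
                            "10.0.0.0/8",
                            "172.16.0.0/12",
                            "192.168.0.0/16",
                            "169.254.0.0/16",
                        ],
                    }
                }
            ],
            "ports": [
                {"protocol": "TCP", "port": 80},
                {"protocol": "TCP", "port": 443},
            ],
        }
    ],
    # An empty egress list with policyTypes Egress denies all outbound traffic.
    "code-review": [],
}


def build_egress_policy(agent_type, namespace="sandbox"):
    """Return a NetworkPolicy manifest selecting pods labelled with this agent type."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "egress-" + agent_type, "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": {"agent-type": agent_type}},
            "policyTypes": ["Egress"],
            "egress": EGRESS_RULES[agent_type],
        },
    }
```

&lt;p&gt;Because the policy selects on a label, a pod created with the label agent-type set to code-review picks up the deny-all rule without any per-pod configuration.&lt;/p&gt;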

&lt;p&gt;&lt;strong&gt;Conclusion&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The agentic era is here, and it runs on code execution. Whether you're building autonomous research assistants, DevOps automation agents, or data pipeline orchestrators, you are eventually going to need a principled answer to the question: what happens when the model writes something it shouldn't?&lt;/p&gt;

&lt;p&gt;GKE Agent Sandbox doesn't make the threat go away. Prompt injection is still a model-level problem. Lateral movement still requires complementary network controls. Secrets management still requires RBAC discipline. But the sandbox answers a specific, hard question — what if agent-generated code exploits a kernel vulnerability or escalates privileges? — with a credible, production-tested answer: it runs against Sentry, not your host kernel.&lt;/p&gt;

&lt;p&gt;For most teams running agentic workloads on GKE, the operational cost is low (a single node pool configuration), the performance cost is acceptable (single-digit percentages for typical agent workload patterns), and the security benefit is significant (kernel-level isolation with full Kubernetes observability).&lt;/p&gt;

&lt;p&gt;That's the architectural question GKE Agent Sandbox is designed to answer. Build agentic systems with the assumption that the code will sometimes be wrong, sometimes be manipulated, and occasionally be malicious, and design your execution environment accordingly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;References and Documentation&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/how-to/agent-sandbox" rel="noopener noreferrer"&gt;https://docs.cloud.google.com/kubernetes-engine/docs/how-to/agent-sandbox&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/concepts/machine-learning/agent-sandbox" rel="noopener noreferrer"&gt;https://docs.cloud.google.com/kubernetes-engine/docs/concepts/machine-learning/agent-sandbox&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>googlecloud</category>
      <category>kubernetes</category>
      <category>cloud</category>
    </item>
    <item>
      <title>Running Agentic AI at Scale on Google Kubernetes Engine</title>
      <dc:creator>Saurabh Mishra</dc:creator>
      <pubDate>Wed, 08 Apr 2026 04:15:15 +0000</pubDate>
      <link>https://dev.to/gde/running-agentic-ai-at-scale-on-google-kubernetes-engine-2540</link>
      <guid>https://dev.to/gde/running-agentic-ai-at-scale-on-google-kubernetes-engine-2540</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6a4nj4y6y0hrn65ck1j2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6a4nj4y6y0hrn65ck1j2.png" alt=" " width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The AI industry crossed an inflection point. We stopped asking "can the model answer my question?" and started asking "can the system complete my goal?" That shift from inference to agency changes everything about how we build, deploy, and scale AI in the cloud.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Google Kubernetes Engine (GKE)&lt;/strong&gt; has quietly become the platform of choice for teams running production AI workloads. Its elastic compute, GPU node pools, and rich ecosystem of observability tools make it uniquely suited not just for model serving but for the orchestration challenges that agentic AI introduces.&lt;/p&gt;

&lt;p&gt;This blog walks through the full landscape: what kinds of AI systems exist today, how agentic architectures differ, and what it actually looks like to run them reliably on GKE.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The AI Taxonomy: From Reactive to Autonomous&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Before diving into infrastructure, it's worth establishing what we mean by the different modes of AI deployment. Not all AI is "agentic," and the architecture you choose should match the behavior you need.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reactive / Inference&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Stateless prompt-response. One request, one LLM call, one answer. The model has no memory between turns. Examples: text classifiers, summarizers, one-shot code generators.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Conversational AI&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Multi-turn dialog with session state. The model remembers context within a conversation window. Examples: customer support bots, document Q&amp;amp;A, coding assistants.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Retrieval-Augmented (RAG)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The model can query external knowledge at runtime before generating a response. Introduces a retrieval step: vector DBs, semantic search, tool calls to databases.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Agentic AI&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The model plans, takes actions, observes results, and loops until a goal is reached. It can call tools, spawn subagents, and make decisions across many steps autonomously.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Multi-Agent Systems&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A network of specialized agents collaborating: an orchestrator decomposes a task and delegates to researcher, writer, and executor agents that work in parallel or in sequence.&lt;/p&gt;

&lt;p&gt;Each mode up the stack introduces new infrastructure requirements: more state to manage, longer-lived processes, more concurrent workloads, harder failure modes, and deeper observability needs.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkbzyxhkg7jgrmemr43df.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkbzyxhkg7jgrmemr43df.png" alt=" " width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why GKE for AI Workloads?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Kubernetes is table stakes for any modern distributed system. But GKE specifically brings several features that make it exceptional for AI:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GKE Capabilities for AI&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GPU and TPU Node Pools&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;To handle the heavy lifting of Agentic AI, GKE offers specialized Accelerator Node Pools. This infrastructure allows you to dynamically attach high-end compute resources such as NVIDIA A100, H100, or L4 GPUs and Google TPUs exactly when your agents need them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Workload Identity &amp;amp; Secret Management&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Agentic systems touch many external APIs (databases, external services, third-party tools). Workload Identity Federation lets pods authenticate to Google Cloud services without storing long-lived credentials.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Horizontal Pod Autoscaling with Custom Metrics&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Scale agent runner replicas based on queue depth (Pub/Sub backlog, Redis list length) rather than CPU. This allows demand-driven scaling that matches agent workload patterns precisely.&lt;/p&gt;
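&lt;p&gt;The arithmetic behind queue-depth scaling is simple enough to sketch. The function below is a simplified model of an average-value target: it divides the backlog by the number of tasks one runner should hold and clamps the result. The name desired_replicas and the default bounds are assumptions, and the real autoscaler additionally applies a tolerance and stabilization windows that are ignored here.&lt;/p&gt;

```python
import math


def desired_replicas(queue_depth, target_per_replica, min_replicas=1, max_replicas=50):
    """Replica count so each runner handles about target_per_replica queued tasks."""
    needed = math.ceil(queue_depth / target_per_replica)
    # Clamp to the autoscaler bounds.
    return max(min_replicas, min(max_replicas, needed))


if __name__ == "__main__":
    # 250 queued tasks at 30 per runner rounds up to 9 replicas.
    print(desired_replicas(250, 30))
```

&lt;p&gt;An empty queue falls back to min_replicas, and a very deep backlog is capped at max_replicas, which keeps a burst of agent tasks from exhausting the node pool.&lt;/p&gt;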

&lt;p&gt;&lt;strong&gt;GKE Autopilot &amp;amp; Standard Modes&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Autopilot mode handles node management entirely, ideal for teams wanting to focus on agent logic. Standard mode gives full control when you need custom kernel modules or specialized hardware affinity rules.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cloud Run on GKE for Burst Workloads&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Short-lived tool execution steps in an agent pipeline can be offloaded to Cloud Run, which scales to zero between invocations, avoiding the overhead of always-on Kubernetes pods for infrequent tasks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Anatomy of an Agentic AI System&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;An agentic AI system isn't a single process; it's a distributed workflow. Understanding its components is essential before mapping it onto Kubernetes primitives.&lt;br&gt;
"An agent is an LLM that can observe the world, decide what to do next, and take actions - in a loop, until a goal is satisfied."&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2rds8i4jaksxqlfw6qe0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2rds8i4jaksxqlfw6qe0.png" alt=" " width="800" height="682"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Popular Agentic Frameworks on GKE&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Several frameworks have emerged to help teams build agentic systems without reinventing the orchestration wheel. Each has a different philosophy and maps to GKE differently.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F14blngtf6gnzngjm6m6c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F14blngtf6gnzngjm6m6c.png" alt=" " width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Agent Development Kit (ADK)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Google's native framework for building multi-agent systems on Vertex AI. First-class GKE support, tight Gemini integration, built-in evaluation tools. Best choice for teams already on Google Cloud.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LangGraph&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Graph-based agent orchestration with explicit state machines. Excellent for complex branching workflows. Containerizes cleanly. LangSmith provides tracing that integrates with GKE logging pipelines.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CrewAI&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Defines agents as role-playing entities (Researcher, Writer, Editor) with goals and backstories. Simple to model complex human workflows. Ideal for content, analysis, and research pipelines.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Google ADK on GKE: Native Fit&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The Google Agent Development Kit (ADK) is architected to treat Kubernetes as its primary "home," creating a seamless integration where the framework and the platform operate as one. Because ADK is built with a Kubernetes-native philosophy, it transforms GKE from a simple hosting environment into a specialized runtime for autonomous systems.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgewr3889vkrog78ogj1w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgewr3889vkrog78ogj1w.png" alt=" " width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Observability: The Hard Part&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Agentic systems fail in non-obvious ways. An agent might produce a response - but the response could be hallucinated, based on a failed tool call, or the result of an unintended plan branch. Standard HTTP error monitoring doesn't catch this.&lt;/p&gt;

&lt;p&gt;The recommended observability stack for GKE-based agentic systems:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Observability Stack&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OpenTelemetry Instrumentation&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Instrument each agent with OpenTelemetry. Emit spans for every LLM call, tool invocation, and planning step. Export to Google Cloud Trace for full distributed trace visualization.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Structured Logging to Cloud Logging&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Log each reasoning step as a structured JSON event: task ID, agent ID, step number, prompt hash, tool name, tool result summary, token counts. Query across traces in BigQuery for post-hoc analysis.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Custom Metrics via Cloud Monitoring&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Track agent-specific metrics: tasks completed per minute, average steps per task, tool call success rate, LLM latency P50/P95/P99, and hallucination rate from your eval pipeline.&lt;/p&gt;
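&lt;p&gt;For the structured logging step described earlier, a reasoning-step event could be emitted as one JSON object per line on stdout, which the logging agent on GKE typically ingests as a structured payload. The sketch below is illustrative: the function name log_step, the field names, and the 200-character summary cut-off are assumptions, not a schema from any framework.&lt;/p&gt;

```python
import hashlib
import json
import sys


def log_step(task_id, agent_id, step, prompt, tool_name, tool_result, tokens_in, tokens_out):
    """Write one reasoning step as a single-line JSON event and return the event."""
    event = {
        "severity": "INFO",
        "task_id": task_id,
        "agent_id": agent_id,
        "step": step,
        # Hash the prompt so events can be correlated without storing prompt text.
        "prompt_hash": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "tool_name": tool_name,
        # Keep only a short summary of the tool output (first 200 characters).
        "tool_result_summary": tool_result[:200],
        "tokens": {"input": tokens_in, "output": tokens_out},
    }
    sys.stdout.write(json.dumps(event) + "\n")
    return event


if __name__ == "__main__":
    log_step("task-1", "researcher", 3, "find docs", "web_search", "3 results", 120, 40)
```

&lt;p&gt;Keeping every event on one line with stable field names is what makes later queries by task ID or tool name practical.&lt;/p&gt;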

&lt;p&gt;&lt;strong&gt;LLM-specific Tracing (LangSmith / Vertex AI Eval)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Leverage LangSmith or Vertex AI's built-in evaluation capabilities to capture complete prompt–response interactions along with semantic quality metrics. These insights can then be fed back into your continuous improvement cycle.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Security Considerations for Agentic AI on GKE&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Agents with tool use are a new attack surface. An agent that can execute code, send emails, or write to a database is a powerful actor - and must be treated like one.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpg3bsmcrua52mq5stzgr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpg3bsmcrua52mq5stzgr.png" alt=" " width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prompt Injection&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Malicious content in retrieved documents can instruct the agent to deviate from its goal. Sanitize all retrieved content before insertion into prompts. Use system-level guardrails in your LLM configuration.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Privilege Escalation&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Each agent should operate with the minimum IAM permissions needed for its specific tools. Use Workload Identity with role-specific service accounts, never a single all-powerful SA for all agents.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Human-in-the-Loop Gates&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For irreversible actions (sending emails, deploying code, database writes), require a human approval step before execution. Implement approval workflows via Pub/Sub pause + Cloud Tasks callback.&lt;/p&gt;
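&lt;p&gt;The control flow of such a gate can be sketched without any cloud services. The example below is an in-memory stand-in for the Pub/Sub and Cloud Tasks workflow: the ApprovalGate class, the action names, and the ticket format are all hypothetical, and a real system would persist pending actions rather than hold them in a dict.&lt;/p&gt;

```python
import uuid

# Hypothetical set of action names that must wait for a human decision.
IRREVERSIBLE = {"send_email", "deploy_code", "db_write"}


class ApprovalGate:
    """Holds irreversible actions until a human approves them."""

    def __init__(self):
        self.pending = {}

    def submit(self, action, run):
        """Run reversible actions immediately; park irreversible ones for approval."""
        if action not in IRREVERSIBLE:
            return {"status": "executed", "result": run()}
        ticket = str(uuid.uuid4())
        self.pending[ticket] = run
        return {"status": "pending", "ticket": ticket}

    def approve(self, ticket):
        """Execute a parked action once a human has approved its ticket."""
        run = self.pending.pop(ticket)
        return {"status": "executed", "result": run()}

    def reject(self, ticket):
        """Drop a parked action without running it."""
        self.pending.pop(ticket)
        return {"status": "rejected"}
```

&lt;p&gt;The agent only ever calls submit(). The approve() and reject() calls belong to the human-facing side, so the model cannot approve its own request.&lt;/p&gt;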

&lt;p&gt;&lt;strong&gt;Network Policies&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Use GKE Network Policies to restrict which agent pods can talk to which services. A researcher agent has no reason to reach the database writer service directly - enforce this in the cluster, not just in code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What's Next: The Agentic Platform&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The direction of travel is clear. GKE is evolving from an application runtime into an agentic platform - a place where autonomous AI systems can be deployed, composed, monitored, and governed with the same rigor we apply to microservices today.&lt;br&gt;
Several emerging capabilities are worth tracking:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Agent-to-Agent Communication (A2A Protocol)&lt;/strong&gt; - Google's emerging standard for cross-agent RPC, allowing agents built with different frameworks to interoperate. GKE provides the network fabric for this via internal load balancers and service mesh.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Model Context Protocol (MCP) on Kubernetes&lt;/strong&gt; - MCP is becoming the standard way for agents to discover and call tools. Running MCP servers as sidecar containers or standalone Deployments in GKE makes tool registries cluster-native.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Vertex AI Agent Engine&lt;/strong&gt; - Google's fully managed orchestration layer for agents that sits above GKE, handling session management, tool routing, and evaluation out of the box. The boundary between GKE and managed agent infrastructure will continue to blur.&lt;/p&gt;

&lt;p&gt;"Kubernetes wasn't built for AI. But it turns out the problems of distributed systems - scale, failure, state, observability - are exactly the problems agentic AI inherits."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Core Reference Documentation&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://docs.cloud.google.com/kubernetes-engine/docs/integrations/ai-infra" rel="noopener noreferrer"&gt;https://docs.cloud.google.com/kubernetes-engine/docs/integrations/ai-infra&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/GoogleCloudPlatform/accelerated-platforms/blob/main/docs/platforms/gke/base/use-cases/inference-ref-arch/README.md" rel="noopener noreferrer"&gt;https://github.com/GoogleCloudPlatform/accelerated-platforms/blob/main/docs/platforms/gke/base/use-cases/inference-ref-arch/README.md&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://docs.cloud.google.com/agent-builder/agent-development-kit/overview" rel="noopener noreferrer"&gt;https://docs.cloud.google.com/agent-builder/agent-development-kit/overview&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hands-on Tutorials&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://codelabs.developers.google.com/devsite/codelabs/build-agents-with-adk-foundation" rel="noopener noreferrer"&gt;https://codelabs.developers.google.com/devsite/codelabs/build-agents-with-adk-foundation&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://cloud.google.com/blog/topics/developers-practitioners/build-a-multi-agent-system-for-expert-content-with-google-adk-mcp-and-cloud-run-part-1" rel="noopener noreferrer"&gt;https://cloud.google.com/blog/topics/developers-practitioners/build-a-multi-agent-system-for-expert-content-with-google-adk-mcp-and-cloud-run-part-1&lt;/a&gt;&lt;/p&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>cloud</category>
      <category>kubernetes</category>
    </item>
    <item>
      <title>Hooking up CrewAI with Google Gemini for Multi-Agent Automation Systems</title>
      <dc:creator>Saurabh Mishra</dc:creator>
      <pubDate>Mon, 16 Feb 2026 15:55:24 +0000</pubDate>
      <link>https://dev.to/gde/hooking-up-crewai-with-google-gemini-for-multi-agent-automation-systems-4eh3</link>
      <guid>https://dev.to/gde/hooking-up-crewai-with-google-gemini-for-multi-agent-automation-systems-4eh3</guid>
      <description>&lt;p&gt;Google’s AI ecosystem is vast and powerful, featuring &lt;strong&gt;Google Gemini models&lt;/strong&gt; (accessible via API) and &lt;strong&gt;Google AI Studio&lt;/strong&gt; (a brilliant web IDE for experimenting with and deploying generative AI apps). But what happens when you combine that raw reasoning capability with an autonomous orchestration framework?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CrewAI&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;CrewAI is an open-source Python framework that lets you build and orchestrate multiple AI agents that collaborate on complex tasks, much like a virtual team of specialists. It organizes agents, assigns them roles, and lets them delegate and share tasks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why Gemini + CrewAI?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;CrewAI allows you to define agents with highly specific roles, goals and backstories. Under the hood, it uses LiteLLM (or LangChain wrappers) to route calls to the language model of your choice.&lt;/p&gt;

&lt;p&gt;By hooking CrewAI into one of Google’s Gemini models (gemini-2.5-flash in this walkthrough), we get:&lt;/p&gt;

&lt;p&gt;Lightning-fast reasoning required for agentic loops.&lt;br&gt;
Massive context windows for analyzing huge codebases, logs, or documentation.&lt;br&gt;
Natively integrated Google Search grounding, perfect for agents that need to research complex code, real-time data, or modern architecture patterns.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1: Setup and Authentication&lt;/strong&gt;&lt;br&gt;
To get started, we need to configure CrewAI to use Gemini models.&lt;/p&gt;

&lt;p&gt;Get a Gemini API key:&lt;/p&gt;

&lt;p&gt;Go to Google AI Studio or the Google Cloud console.&lt;br&gt;
Create an API key for Gemini.&lt;br&gt;
Save the API key; we’ll need it to authenticate the LLM in CrewAI.&lt;/p&gt;

&lt;p&gt;Install dependencies: install the required packages.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python -m pip install crewai langchain-google-genai

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;NOTE: langchain-google-genai requires Python 3.9+&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2: The Scenario &amp;amp; Initializing the Brain&lt;/strong&gt;&lt;br&gt;
Let’s build a realistic scenario: an Automated Cloud Infrastructure Design Team. We will create a two-agent crew:&lt;/p&gt;

&lt;p&gt;A Principal Cloud Architect to design the system.&lt;br&gt;
A Lead DevSecOps Engineer to tear it apart and review it for vulnerabilities.&lt;/p&gt;

&lt;p&gt;First, let’s set up our script and initialize the Gemini “brain” using LangChain’s wrapper.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import os
from crewai import Agent, Task, Crew, Process
from langchain_google_genai import ChatGoogleGenerativeAI

# ==========================================
# 1. Configuration &amp;amp; Setup
# ==========================================
# Replace 'YOUR_API_KEY' with your actual Gemini API key, 
# or set it in your environment variables before running the script.
os.environ["GOOGLE_API_KEY"] = os.getenv("GOOGLE_API_KEY", "YOUR_API_KEY")

# Initialize the Gemini model
# Using gemini-2.5-flash for complex reasoning and architecture design
gemini_llm = ChatGoogleGenerativeAI(
    model="gemini-2.5-flash",
    temperature=0.4 # Slightly creative, but grounded in technical reality
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
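
&lt;p&gt;One thing to watch in the setup above: if GOOGLE_API_KEY is not exported, the fallback leaves the literal placeholder in the environment and the failure only surfaces later as an authentication error. A small guard like the following sketch fails earlier with a clearer message. The require_api_key helper and the PLACEHOLDER constant are hypothetical names introduced here, not part of CrewAI or LangChain.&lt;/p&gt;

```python
import os

# The placeholder string used in the setup snippet.
PLACEHOLDER = "YOUR_API_KEY"


def require_api_key(env=None, name="GOOGLE_API_KEY"):
    """Return the key from the environment, or raise if it is missing or a placeholder."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key or key == PLACEHOLDER:
        raise RuntimeError(name + " is not set; export it before running the script.")
    return key


if __name__ == "__main__":
    try:
        require_api_key()
        print("API key found.")
    except RuntimeError as err:
        print(err)
```

&lt;p&gt;Calling require_api_key() once, before the model is constructed, keeps real keys out of source files and turns a missing key into an immediate, readable error.&lt;/p&gt;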



&lt;p&gt;&lt;strong&gt;Step 3: Defining the Agents&lt;/strong&gt;&lt;br&gt;
Agents need a clear identity to function properly. In CrewAI, we define their role, goal, and backstory to give the LLM strict boundaries and deep, specialized context.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# ==========================================
# 2. Define the Agents
# ==========================================
cloud_architect = Agent(
    role='Principal Cloud Architect',
    goal='Design highly scalable, resilient, and cost-effective cloud infrastructures based on user requirements.',
    backstory=(
        "You are a seasoned cloud architect with 15+ years of experience across AWS, GCP, and Azure. "
        "You excel at designing modern microservices, serverless architectures, and event-driven systems. "
        "Your primary focus is ensuring the system can handle massive scale while keeping latency low."
    ),
    verbose=True,
    allow_delegation=False,
    llm=gemini_llm
)

devsecops_engineer = Agent(
    role='Lead DevSecOps Engineer',
    goal='Rigorously review cloud architectures to identify vulnerabilities, ensure compliance, and enforce zero-trust security.',
    backstory=(
        "You are a paranoid but brilliant cybersecurity veteran. You specialize in cloud security posture management, "
        "IAM least-privilege policies, network isolation, and data encryption. You view every architecture through "
        "the lens of a potential attacker and fix flaws before deployment."
    ),
    verbose=True,
    allow_delegation=False,
    llm=gemini_llm
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 4: Defining the Tasks&lt;/strong&gt;&lt;br&gt;
Agents are useless without clear instructions. Tasks in CrewAI define what needs to be done, the expected output, and who is responsible for executing it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# ==========================================
# 3. Define the Tasks
# ==========================================
project_scenario = (
    "A global e-commerce platform transitioning from a monolith to microservices. "
    "It requires secure user authentication, a high-throughput inventory management system, "
    "and seamless integration with third-party payment gateways. It anticipates massive traffic spikes during holiday sales."
)

design_task = Task(
    description=(
        f"Analyze the following project scenario: '{project_scenario}'.\n"
        "Create a comprehensive cloud architecture design. You must specify the cloud provider (or multi-cloud), "
        "compute resources, databases, caching layers, message queues, and content delivery networks. "
        "Justify why you chose these specific services."
    ),
    expected_output="A detailed Architectural Design Document outlining services, data flow, and scaling strategies.",
    agent=cloud_architect
)

security_review_task = Task(
    description=(
        "Critically review the Architectural Design Document produced by the Principal Cloud Architect. "
        "Identify at least 3 potential security vulnerabilities or single points of failure. "
        "Provide concrete, actionable remediations for each vulnerability (e.g., adding WAF, adjusting VPC peering, enforcing KMS encryption)."
    ),
    expected_output="A Security Audit Report listing vulnerabilities found, risk severity, and mandatory architecture modifications.",
    agent=devsecops_engineer
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 5: Form the Crew and Execute!&lt;/strong&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# ==========================================
# 4. Form the Crew and Execute
# ==========================================
cloud_engineering_crew = Crew(
    agents=[cloud_architect, devsecops_engineer],
    tasks=[design_task, security_review_task],
    process=Process.sequential, # The DevSecOps engineer waits for the Architect
    verbose=True
)

if __name__ == "__main__":
    print("Booting up the Automated Cloud Infrastructure Design Team...")
    print("Initiating CrewAI sequence. Please wait while the agents collaborate...\n")

    # Kickoff the process
    result = cloud_engineering_crew.kickoff()

    print("\n" + "="*50)
    print("FINAL DEVSECOPS REVIEW &amp;amp; SECURED ARCHITECTURE")
    print("="*50 + "\n")
    print(result)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
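
&lt;p&gt;To see why task order matters with Process.sequential, here is a toy model of the hand-off in plain Python. It is not how CrewAI is implemented; run_sequential, architect, and reviewer are hypothetical stand-ins that only mimic the idea that the second task works from what the first one produced.&lt;/p&gt;

```python
def run_sequential(tasks):
    """Run (name, worker) pairs in order, feeding each worker the previous output."""
    context = ""
    outputs = {}
    for name, worker in tasks:
        context = worker(context)
        outputs[name] = context
    return outputs, context


def architect(_context):
    # Stand-in for the Principal Cloud Architect agent.
    return "DESIGN: API gateway, inventory service, managed SQL database"


def reviewer(context):
    # Stand-in for the Lead DevSecOps Engineer; it can only review what it receives.
    return "AUDIT of [" + context + "]: add a WAF in front of the API gateway"


if __name__ == "__main__":
    outputs, final = run_sequential([("design", architect), ("review", reviewer)])
    print(final)
```

&lt;p&gt;Swapping the two entries in the list would hand the reviewer an empty context, which is the same reason design_task is listed before security_review_task in the crew.&lt;/p&gt;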



&lt;p&gt;&lt;strong&gt;Complete code&lt;/strong&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import os
from crewai import Agent, Task, Crew, Process
from langchain_google_genai import ChatGoogleGenerativeAI

# ==========================================
# 1. Configuration &amp;amp; Setup
# ==========================================
# Replace 'YOUR_API_KEY' with your actual Gemini API key, 
# or set it in your environment variables before running the script.
os.environ["GOOGLE_API_KEY"] = os.getenv("GOOGLE_API_KEY", "YOUR_API_KEY")

# Initialize the Gemini model
# Using gemini-2.5-flash for complex reasoning and architecture design
gemini_llm = ChatGoogleGenerativeAI(
    model="gemini-2.5-flash",
    temperature=0.4 # Slightly creative, but grounded in technical reality
)

# ==========================================
# 2. Define the Agents
# ==========================================
cloud_architect = Agent(
    role='Principal Cloud Architect',
    goal='Design highly scalable, resilient, and cost-effective cloud infrastructures based on user requirements.',
    backstory=(
        "You are a seasoned cloud architect with 15+ years of experience across AWS, GCP, and Azure. "
        "You excel at designing modern microservices, serverless architectures, and event-driven systems. "
        "Your primary focus is ensuring the system can handle massive scale while keeping latency low."
    ),
    verbose=True,
    allow_delegation=False,
    llm=gemini_llm
)

devsecops_engineer = Agent(
    role='Lead DevSecOps Engineer',
    goal='Rigorously review cloud architectures to identify vulnerabilities, ensure compliance, and enforce zero-trust security.',
    backstory=(
        "You are a paranoid but brilliant cybersecurity veteran. You specialize in cloud security posture management, "
        "IAM least-privilege policies, network isolation, and data encryption. You view every architecture through "
        "the lens of a potential attacker and fix flaws before deployment."
    ),
    verbose=True,
    allow_delegation=False,
    llm=gemini_llm
)

# ==========================================
# 3. Define the Tasks
# ==========================================
# The scenario we want them to work on
project_scenario = (
    "A global e-commerce platform transitioning from a monolith to microservices. "
    "It requires secure user authentication, a high-throughput inventory management system, "
    "and seamless integration with third-party payment gateways. It anticipates massive traffic spikes during holiday sales."
)

design_task = Task(
    description=(
        f"Analyze the following project scenario: '{project_scenario}'.\n"
        "Create a comprehensive cloud architecture design. You must specify the cloud provider (or multi-cloud), "
        "compute resources, databases, caching layers, message queues, and content delivery networks. "
        "Justify why you chose these specific services."
    ),
    expected_output="A detailed Architectural Design Document outlining services, data flow, and scaling strategies.",
    agent=cloud_architect
)

security_review_task = Task(
    description=(
        "Critically review the Architectural Design Document produced by the Principal Cloud Architect. "
        "Identify at least 3 potential security vulnerabilities or single points of failure. "
        "Provide concrete, actionable remediations for each vulnerability (e.g., adding WAF, adjusting VPC peering, enforcing KMS encryption)."
    ),
    expected_output="A Security Audit Report listing vulnerabilities found, risk severity, and mandatory architecture modifications.",
    agent=devsecops_engineer
)

# ==========================================
# 4. Form the Crew and Execute
# ==========================================
cloud_engineering_crew = Crew(
    agents=[cloud_architect, devsecops_engineer],
    tasks=[design_task, security_review_task],
    process=Process.sequential, # The DevSecOps engineer waits for the Architect to finish
    verbose=True
)

if __name__ == "__main__":
    print("🚀 Booting up the Automated Cloud Infrastructure Design Team...")
    print("Initiating CrewAI sequence. Please wait while the agents collaborate...\n")

    # Kickoff the process
    result = cloud_engineering_crew.kickoff()

    print("\n" + "="*50)
    print("FINAL DEVSECOPS REVIEW &amp;amp; SECURED ARCHITECTURE")
    print("="*50 + "\n")
    print(result)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Results:&lt;br&gt;
Run the script in a terminal. Because the agents and the crew are created with verbose=True, CrewAI logs each agent’s progress and output as the crew works.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe74xdclrloo4g9qk4f4o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe74xdclrloo4g9qk4f4o.png" alt=" " width="800" height="280"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbamv0hfmcav4e7kisoev.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbamv0hfmcav4e7kisoev.png" alt=" " width="800" height="107"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl2hyxs7cyzvt3xxt2339.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl2hyxs7cyzvt3xxt2339.png" alt=" " width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc2soh4tu12cnxo44olvz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc2soh4tu12cnxo44olvz.png" alt=" " width="800" height="366"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu5z1xhb4f1g3z16yh49k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu5z1xhb4f1g3z16yh49k.png" alt=" " width="800" height="502"&gt;&lt;/a&gt;&lt;/p&gt;
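
&lt;p&gt;The final report is only printed to the terminal by the script. If you want to keep it, a small helper along these lines could write it to disk. This is a sketch under assumptions: save_report and the security_review.md file name are hypothetical, and the helper relies only on the result being printable, as it is in the print(result) call above.&lt;/p&gt;

```python
from pathlib import Path


def save_report(result, path="security_review.md"):
    """Write the printable form of a crew result to disk and return the path."""
    target = Path(path)
    target.write_text(str(result), encoding="utf-8")
    return target


if __name__ == "__main__":
    # A plain string stands in for the object returned by kickoff().
    saved = save_report("Security Audit Report\n1. Missing WAF")
    print("Report written to", saved)
```

&lt;p&gt;In the full script this would be called as save_report(result) right after the kickoff() call.&lt;/p&gt;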

&lt;p&gt;&lt;strong&gt;Integrate Other Google Tools (Optional)&lt;/strong&gt;&lt;br&gt;
Want to take this to the enterprise level? CrewAI supports integrations with Google Workspace apps through its enterprise platform and tools ecosystem.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Google Drive&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You can allow agents to upload files to and download files from Drive, which is useful for storing outputs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Google Docs&lt;/strong&gt;&lt;br&gt;
Create, read, and edit Google Docs documents.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Google Sheets&lt;/strong&gt;&lt;br&gt;
Create, read, and update Google Sheets spreadsheets and manage worksheet data.&lt;/p&gt;

&lt;p&gt;To enable these, connect your Google account via OAuth in CrewAI’s integrations dashboard, then grant the requested permissions.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvv8p80lug2b95i7iathf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvv8p80lug2b95i7iathf.png" alt=" " width="800" height="322"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Documentation References&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://docs.crewai.com/en/introduction" rel="noopener noreferrer"&gt;https://docs.crewai.com/en/introduction&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://ai.google.dev/gemini-api/docs/crewai-example" rel="noopener noreferrer"&gt;https://ai.google.dev/gemini-api/docs/crewai-example&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://developers.googleblog.com/building-agents-google-gemini-open-source-frameworks/" rel="noopener noreferrer"&gt;https://developers.googleblog.com/building-agents-google-gemini-open-source-frameworks/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>gemini</category>
      <category>crewai</category>
      <category>googlecloud</category>
      <category>antigravity</category>
    </item>
  </channel>
</rss>
