This article was created as part of my official submission to the GKE Turns 10 Hackathon.
The challenge was simple but profound: take a standard microservices application, the Online Boutique, and "supercharge" it with AI without touching a single line of the original code. The goal was to build an external, containerized "brain" that could add a new layer of intelligence.
My answer? I built a fully autonomous, multi-agent AI team on Google Kubernetes Engine (GKE). Here's how I did it, and how GKE's features were fundamental to making this complex architecture work.
The Foundation: GKE and Service Discovery
The entire project relies on multiple new microservices (agents, servers, UIs) communicating with each other and with the original Online Boutique app. GKE's built-in service discovery made this incredibly simple.
By defining a Kubernetes Service, I could give each component a stable DNS name. For example, the Marketing Agent's Service looks like this:
apiVersion: v1
kind: Service
metadata:
  name: marketing-campaigner-service
spec:
  selector:
    app: marketing-campaigner-agent
  ports:
  - port: 8080
    targetPort: 8080
This simple YAML meant that my Business Analyst agent could reliably send A2A messages just by calling http://marketing-campaigner-service:8080/goal. GKE handled all the complex networking behind the scenes.
The "Senses": An MCP Server with a Secure Sidecar
The first component I built was an MCP Server to act as the system's "senses." It ingests real-time "add to cart" events from the app's frontend-proxy. To connect securely to the Cloud SQL database, I used the Cloud SQL Auth Proxy running as a sidecar container, a best practice made easy by GKE's pod architecture.
spec:
  serviceAccountName: mcp-toolbox-sa
  containers:
  - name: server  # My Python App
    image: ...
    env:
    - name: DB_HOST
      value: "127.0.0.1"
  - name: cloud-sql-proxy  # The Sidecar
    image: gcr.io/cloud-sql-connectors/cloud-sql-proxy:2.8.0
    args: ["..."]
The "Strategist": A Scheduled Agent with CronJob
My Business Analyst agent doesn't need to run 24/7; it just needs to wake up periodically to check for trends. A GKE CronJob was the perfect tool for this. With a simple schedule expression, I could define this "sense-think-act" loop to run automatically.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: business-analyst-agent
spec:
  # Run every 5 minutes
  schedule: "*/5 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: business-analyst-agent
            image: ...
After deploying, GKE takes care of the rest, spinning up and shutting down pods on schedule. Watching the pods transition to Completed status was proof the autonomous loop was working.
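For context, the container the CronJob runs is just a plain script that exits when it's done. Here's a simplified sketch of that entrypoint; BusinessAnalyst and find_trending_product() are illustrative stand-ins for the real implementation.

# Simplified sketch of the Business Analyst's "sense-think-act" entrypoint.
# The CronJob runs this script to completion; the pod shows Completed once
# main() returns. BusinessAnalyst and find_trending_product() are
# illustrative stand-ins for the real implementation.

def main():
    analyst = BusinessAnalyst()

    # Sense: check recent "add to cart" events (via the MCP server) for a trend
    trending_product = analyst.find_trending_product()

    # Think + Act: if something is trending, delegate to the Marketing agent
    if trending_product:
        analyst.send_a2a_goal(trending_product)
    # Otherwise exit cleanly; the next scheduled run will check again

if __name__ == "__main__":
    main()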
The "Executor": Agent Collaboration via A2A
When the Analyst agent finds a trend, it delegates a task to the Marketing agent. This Agent2Agent (A2A) communication is just a simple, decoupled POST request, made possible by the GKE Service we defined earlier.
import requests

def send_a2a_goal(self, product_name):
    # GKE DNS resolves 'marketing-campaigner-service' to the Service's stable cluster IP
    self.marketing_agent_url = "http://marketing-campaigner-service:8080/goal"
    a2a_message = {
        "goal": "create_promotional_content",
        "payload": {"product_name": product_name}
    }
    requests.post(self.marketing_agent_url, json=a2a_message)
The Marketing agent, running in its own Deployment, simply listens for these incoming requests and executes its own logic.
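For illustration, that receiving side can be as small as a single HTTP handler. Here's a minimal sketch assuming the agent uses Flask; the actual campaign-generation step is elided.

# Minimal sketch of the Marketing agent's A2A endpoint (assuming Flask).
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/goal", methods=["POST"])
def receive_goal():
    message = request.get_json()
    if message.get("goal") == "create_promotional_content":
        product_name = message["payload"]["product_name"]
        # ... generate promotional content for product_name with Gemini ...
        return jsonify({"status": "accepted", "product": product_name}), 202
    return jsonify({"status": "unknown_goal"}), 400

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)  # matches the Service's targetPort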
Agent 3: The AI Specialist (A2A & MCP Client)
This was the most ambitious step. I replaced the original recommendationservice with my own AI-powered Recommendation Agent. This agent is a showcase of advanced agent design, using both A2A collaboration and the MCP pattern. It acts as a gRPC server, and when it receives a request from the frontend:
- It makes an A2A call to the Marketing agent to ask which product is currently being promoted, demonstrating real-time agent collaboration.
- It acts as an MCP Client, using the application's own productcatalogservice as a direct "tool" (via gRPC) to get rich, grounded context on the user's cart items.
- It uses Gemini to generate intelligent recommendations based on all this combined context.
# recommendation-agent/agent.py
def ListRecommendations(self, request, context):
    # A2A Collaboration: Ask another agent for its state
    promoted_product = requests.get(MARKETING_AGENT_URL).json().get('product_name')

    # MCP Client Action (Tool Use): Get data from the application
    cart_product_names = [self.get_product_name_from_id(pid) for pid in request.product_ids]

    # Think with Gemini using the rich context
    prompt = f"User's Cart: {cart_product_names}, Promoted Product: '{promoted_product}'..."
    response = model.generate_content(prompt)
    # ...
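For reference, the get_product_name_from_id helper behind that list comprehension is just a thin gRPC client around the boutique's own catalog. The sketch below assumes the stubs generated from Online Boutique's demo.proto (demo_pb2 / demo_pb2_grpc) and the catalog's default port 3550; error handling is omitted.

# Sketch of the MCP "tool use" helper: querying productcatalogservice over
# gRPC. Assumes stubs generated from Online Boutique's demo.proto.
import grpc
import demo_pb2
import demo_pb2_grpc

CATALOG_ADDR = "productcatalogservice:3550"  # in-cluster DNS name

def get_product_name_from_id(product_id: str) -> str:
    with grpc.insecure_channel(CATALOG_ADDR) as channel:
        stub = demo_pb2_grpc.ProductCatalogServiceStub(channel)
        product = stub.GetProduct(demo_pb2.GetProductRequest(id=product_id))
        return product.name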
Agent 4: The Conversational Assistant (Grounded MCP Client)
To add an interactive element, I built a Customer Support Agent. This agent is a prime example of an MCP Client that is designed to be user-facing and does not use A2A. Its sole focus is to help the user. It uses the application's productcatalogservice as a tool (via gRPC) to get real-time data. Crucially, it is explicitly "grounded"—instructed to only answer questions based on the data it retrieves. This prevents the AI from hallucinating and builds user trust.
final_prompt = f"""
Answer the user's question based ONLY on the product data provided. Do not make up information.
User's Question: "{question}"
Product Data: "{product_context}"
"""
final_response = model.generate_content(final_prompt)
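Where does product_context come from? It's the output of the same catalog tool. A rough sketch of that grounding step follows, again assuming the demo.proto stubs; using the raw question as the search query is a simplification of the real keyword handling.

# Sketch of the grounding step: retrieve real catalog data before prompting,
# so the model can only restate facts it was actually given. Assumes the
# demo.proto stubs; the naive query is illustrative.
import grpc
import demo_pb2
import demo_pb2_grpc

CATALOG_ADDR = "productcatalogservice:3550"

def build_product_context(question: str) -> str:
    with grpc.insecure_channel(CATALOG_ADDR) as channel:
        stub = demo_pb2_grpc.ProductCatalogServiceStub(channel)
        results = stub.SearchProducts(
            demo_pb2.SearchProductsRequest(query=question)
        ).results
    return "\n".join(f"- {p.name}: {p.description}" for p in results)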
The "Wow" Factor: Public UIs with LoadBalancer
For the final submission, I needed public URLs for the judges. GKE makes this incredibly simple. By changing one line in my dashboard's Service definition to type: LoadBalancer, GKE automatically provisioned a public IP address.
apiVersion: v1
kind: Service
metadata:
  name: dashboard-ui-service
spec:
  type: LoadBalancer
  selector:
    app: dashboard-ui
  ports:
  - port: 80
    targetPort: 8080
This allowed me to build and expose two user-facing components: our Mission Control Dashboard and an interactive Customer Support chatbot.
(Screenshot: the Mission Control Dashboard)
(Screenshot: the Customer Support chat)
Why GKE Was the Perfect Choice
Building this complex, multi-component system in such a short time would not have been possible without GKE. It eased the process by providing:
- Declarative Infrastructure: I could define my entire system in a few YAML files.
- Seamless Service Discovery: GKE's internal DNS let my agents find and talk to each other effortlessly.
- The Right Tool for the Job: Using CronJobs for scheduled tasks and Deployments for 24/7 services was trivial.
- Built-in Resilience: GKE ensures that if a pod crashes, it's automatically restarted.
- Effortless Scalability: If my agents were under heavy load, scaling up would be as simple as changing replicas: 1 to replicas: 5.
GKE truly is the ultimate platform for building and running the next generation of AI workloads.
The Final Architecture
🚀 Live Demos & Code Repository
Code Repository: https://github.com/ki3ani/thanos
Live Demo: Mission Control Dashboard: http://34.9.104.63/
Live Demo: AI Customer Support Chatbot: http://34.123.10.111/