This is a submission for the Google AI Agents Writing Challenge: Learning Reflections & Capstone Showcase
My Learning Journey
Five days ago, I submitted an AI Photography Coach, my multi-agent capstone project for the 5-Day AI Agents Intensive Course with Google, and building it fundamentally changed how I think about intelligent applications. This wasn't just another capstone project. It forced me to confront a question I'd been wrestling with for months: What separates a system that appears intelligent from one that genuinely solves problems?
Coming into the Google AI Agents Intensive, I understood agents conceptually. I'd read the papers. I'd tinkered with LLMs. But there was a critical gap between understanding and architecting—and this course obliterated that gap.
The breakthrough came on Day 2 when we deconstructed the difference between monolithic LLM calls and specialized agent systems. Most people use LLMs like Swiss Army knives—one model trying to do everything. The course showed me something radical: the power isn't in having one smart model; it's in having many focused ones working together.
Key Concepts & Technical Deep Dive
The ADK Native Orchestrator Pattern: Architecture That Actually Works
The game-changer for my photography coach was understanding the ADK-native orchestrator pattern. Instead of building a custom routing system, I leveraged Google's Agent Development Kit's built-in orchestration capabilities.
Here's the architecture that makes this work:
Core Agents (Shared):
- Vision Agent (Sub-Agent 1): Uses Gemini 2.5 Flash Vision for image analysis—EXIF extraction, composition analysis, defect detection with severity scoring, and strength identification
- Orchestrator Agent (Parent): The intelligent coordinator that manages session state, routes requests to specialized sub-agents, implements context compaction, and persists memory using SQLite + ADK Cloud Memory adapters
- Knowledge Agent (Sub-Agent 2): Powered by Gemini 2.5 Flash with hybrid CASCADE RAG for query understanding, knowledge retrieval, response generation, citation grounding, and skill-level adaptation
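To make this concrete, here's a minimal sketch of that parent/sub-agent wiring using ADK's LlmAgent. The model IDs, descriptions, and instruction strings are illustrative stand-ins, not the project's exact code:

```python
from google.adk.agents import LlmAgent

# Sub-Agent 1: image analysis only. Emits structured findings, never coaching.
vision_agent = LlmAgent(
    name="vision_agent",
    model="gemini-2.5-flash",  # illustrative model ID
    description="Analyzes photos: EXIF, composition, defects, strengths.",
    instruction=(
        "Analyze the supplied image and return structured fields: exif, "
        "composition_summary, detected_issues, strengths. Do NOT give advice."
    ),
)

# Sub-Agent 2: teaching only. Works from the Vision Agent's structured output.
knowledge_agent = LlmAgent(
    name="knowledge_agent",
    model="gemini-2.5-flash",
    description="Generates grounded, skill-adapted coaching with citations.",
    instruction=(
        "Given a structured image analysis and user context, produce a "
        "coaching response with citations. Do NOT re-analyze the image."
    ),
)

# Parent: ADK's LLM-driven delegation routes requests to sub-agents
# based on their descriptions, so no custom router is needed.
orchestrator = LlmAgent(
    name="photo_coach_orchestrator",
    model="gemini-2.5-flash",
    instruction="Coordinate analysis and coaching. Delegate; never analyze.",
    sub_agents=[vision_agent, knowledge_agent],
)
```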
Key Pattern: Orchestrator Mediates All Communication
This is critical: the orchestrator mediates all agent communication. The Vision Agent doesn't talk directly to the Knowledge Agent. Instead:
- Vision Agent outputs structured analysis (exif dict, composition_summary, detected_issues, strengths)
- Orchestrator aggregates this with session context
- Knowledge Agent receives unified input context and generates the coaching response
- Orchestrator updates conversation history and persists session state
This eliminates cascading errors and makes the entire system debuggable in ways direct agent-to-agent communication could never achieve. It's a pattern, not just a feature—and it's built into ADK natively.
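In plain Python, the mediation step looks roughly like this. The dataclass mirrors the structured fields listed above; build_knowledge_context is a hypothetical orchestrator helper, not ADK API:

```python
from dataclasses import dataclass, field

@dataclass
class VisionOutput:
    """Structured analysis the Vision Agent emits, never free-form prose."""
    exif: dict
    composition_summary: str
    detected_issues: list = field(default_factory=list)
    strengths: list = field(default_factory=list)

def build_knowledge_context(vision: VisionOutput, session: dict) -> dict:
    """Orchestrator step: merge vision output with session state into the
    single unified input the Knowledge Agent sees."""
    return {
        "analysis": {
            "exif": vision.exif,
            "composition": vision.composition_summary,
            "issues": vision.detected_issues,
            "strengths": vision.strengths,
        },
        "skill_level": session.get("skill_level", "beginner"),
        "recent_turns": session.get("history", [])[-5:],  # compacted history
    }
```

Because every hop passes through a function like this, a bad vision result surfaces at one inspectable seam instead of propagating silently.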
Three Infrastructure Approaches: One System, Multiple Deployments
What fascinated me was how the same core agents can run through three different interfaces. This distinction between agent architecture and deployment architecture was the second major revelation:
1. ADK Runner (Cloud)
- Components: LlmAgent, Runner, Sessions
- Interface: Vertex AI / Cloud Run
- When to use: Production-grade photo coaching with cloud scalability and managed infrastructure
2. MCP Server (Desktop)
- Components: JSON-RPC 2.0 over stdio transport
- Capabilities: 3 tools exposed per agent
- Deploy: Claude Desktop, local machine
- When to use: Local development, integration with Claude, running alongside other MCP-compatible tools
- This was the breakthrough for me: the MCP protocol meant I could integrate my agents with any compatible application without rewriting core logic
3. Python API (Custom)
- Components: Direct imports, function calls
- Deploy: Notebooks, custom apps, Streamlit dashboards
- When to use: Research, experimentation, embedded systems, educational contexts
The realization: agent architecture is orthogonal to deployment architecture. Design the agent system once (orchestrator + specialized agents), then expose it through whichever interface makes sense for your use case. This separation of concerns is elegant and powerful.
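As an illustration of interface #2, here's roughly what exposing one capability over MCP looks like with the official Python SDK's FastMCP. The tool and the photo_coach.core module are hypothetical wrappers around the same shared agents:

```python
from mcp.server.fastmcp import FastMCP

# FastMCP speaks JSON-RPC 2.0 over stdio by default, which is exactly
# what Claude Desktop expects from a local MCP server.
mcp = FastMCP("photo-coach")

@mcp.tool()
def analyze_photo(image_path: str) -> dict:
    """Run the Vision Agent on a local image; return its structured output."""
    # Hypothetical import: the same core logic used by the other interfaces.
    from photo_coach.core import run_vision_agent
    return run_vision_agent(image_path)

if __name__ == "__main__":
    mcp.run(transport="stdio")
```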
The Critical Insight: Negative Space Design
During debugging, I discovered something counterintuitive: the best agent isn't the one with the smartest prompts; it's the one with the clearest responsibilities.
I spent as much time defining what each agent should not do as defining what it should do:
- Vision Agent: Analyzes only what's in the image. Never generates teaching advice or pedagogical content.
- Knowledge Agent: Teaches based on provided analysis. Never re-analyzes images or duplicates vision work.
- Orchestrator: Routes and aggregates. Never generates original analysis or coaching—only synthesis.
This negative space design—drawing boundaries tighter than seemed necessary—eliminated entire categories of bugs. It forced each agent's responsibility to be so crystalline that context compaction became natural, error handling became obvious, and delegation logic became transparent.
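One way to make those boundaries executable rather than aspirational is a guard at the orchestrator seam. This validator is a hypothetical sketch of the idea, not the project's code:

```python
def validate_vision_output(output: dict) -> dict:
    """Enforce the Vision Agent's negative space: structured analysis only.
    Any field outside the contract is treated as a boundary violation."""
    allowed = {"exif", "composition_summary", "detected_issues", "strengths"}
    extra = set(output) - allowed
    if extra:
        raise ValueError(f"Vision Agent exceeded its boundary: {sorted(extra)}")
    return output
```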
Context Engineering and Memory as Foundation
The course's emphasis on context compaction changed how I architect systems. In a multi-agent ecosystem, context is a resource, not a convenience.
The photography coach uses a two-tier memory system:
- Session memory: Short-term context about current analysis and conversation
- User model: Long-term history of preferences, skill progression, learning patterns
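A minimal sketch of the two tiers in SQLite (schema and column names are illustrative):

```python
import sqlite3

def init_memory(db_path: str = "coach_memory.db") -> sqlite3.Connection:
    """Two tables mirroring the two memory tiers: short-lived session
    context and a long-lived per-user model."""
    conn = sqlite3.connect(db_path)
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS session_memory (
            session_id TEXT PRIMARY KEY,
            current_analysis TEXT,   -- JSON: latest vision output
            history TEXT             -- JSON: compacted recent turns
        );
        CREATE TABLE IF NOT EXISTS user_model (
            user_id TEXT PRIMARY KEY,
            skill_level TEXT,
            preferences TEXT,        -- JSON: genres, gear, goals
            progression TEXT         -- JSON: skills observed over time
        );
    """)
    return conn
```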
The orchestrator implements context compaction before passing context between agents:
- Summarizing vision analysis into structured fields (rather than raw model output)
- Truncating conversation history intelligently
- Maintaining only relevant user profile context
This isn't optimization; it's architectural necessity. With three agents and multiple turns, uncompressed context balloons quickly. Compaction forces rigor in what information actually matters for decision-making.
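In code, compaction can be a pure function that whitelists what crosses the boundary; everything else is dropped by construction. Field names here are illustrative:

```python
def compact_context(vision_output: dict, history: list[str],
                    user_profile: dict, max_turns: int = 5) -> dict:
    """Reduce inter-agent context to the fields that actually inform coaching."""
    return {
        # Structured fields only, never the raw model transcript.
        "issues": vision_output.get("detected_issues", []),
        "strengths": vision_output.get("strengths", []),
        # Keep only the last few conversation turns.
        "history": history[-max_turns:],
        # Only the profile fields the Knowledge Agent uses.
        "skill_level": user_profile.get("skill_level"),
        "goals": user_profile.get("goals"),
    }
```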
Tools: The Backbone of Agent Capability
The course reframed my entire thinking: agents aren't intelligent because of their prompts; they're intelligent because of their tools.
For the photography coach:
- Vision APIs: Constrain analysis to structured outputs
- Vector Database (CASCADE hybrid RAG): Guarantee knowledge comes from grounded sources
- Custom Tools: Photography-specific calculations (depth of field relationships, shutter speed ratios, focal length conversions)
- Memory Tools: SQLite adapters for persistence
Each tool is a constraint that prevents hallucination. When a Vision Agent can only output structured EXIF data and composition summaries, it can't invent. When the Knowledge Agent can only pull from photography principles via RAG, its advice has traceable citations. Tools aren't features you add; they're guardrails you build into the system's fabric.
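As an example of the custom-tool category, a depth-of-field helper can be an ordinary Python function built on the standard thin-lens formulas (ADK can wrap plain functions as tools). The name and defaults are illustrative:

```python
def depth_of_field(focal_mm: float, f_number: float,
                   subject_mm: float, coc_mm: float = 0.03) -> tuple:
    """Near/far limits of acceptable focus, thin-lens model.

    coc_mm defaults to the ~0.03 mm circle of confusion commonly quoted
    for full-frame sensors; all distances are in millimetres.
    """
    hyperfocal = focal_mm ** 2 / (f_number * coc_mm) + focal_mm
    near = (subject_mm * (hyperfocal - focal_mm)
            / (hyperfocal + subject_mm - 2 * focal_mm))
    if subject_mm >= hyperfocal:
        return near, float("inf")  # focus extends to infinity
    far = subject_mm * (hyperfocal - focal_mm) / (hyperfocal - subject_mm)
    return near, far

# 50 mm lens at f/8, subject at 3 m:
# depth_of_field(50, 8, 3000) -> approx (2338, 4185) mm, i.e. ~2.3 m to ~4.2 m
```

Handing the model a calculator like this means the optics numbers in a coaching response come from arithmetic, not token prediction.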
Reflections & Takeaways
What the Course Got Right
The hands-on codelabs genuinely built intuition. I didn't just read about multi-agent systems; I implemented them, broke them, debugged them, rebuilt them. The guest speakers—engineers shipping agentic AI at scale—grounded theory in production reality. Learning about the ADK's orchestrator pattern in isolation, then building it into a real system, created understanding that no lecture could achieve.
The emphasis on architecture as design constraint was transformative. Before this course, I thought about features and interfaces. Now I think about specialization, coordination, failure modes, and the boundaries between components.
Honest Critique
The course could dive deeper into failure modes in multi-agent systems. They fail in new ways: cascading errors compounding across agents, subtle bugs in delegation logic, context compaction artifacts that only emerge in production. A dedicated deep-dive would be invaluable.
More explicit guidance on choosing deployment interfaces would help practitioners. The fact that one agent system can work through ADK Runner, MCP Server, or custom Python API is powerful—but knowing when to use each requires hands-on experience or mentorship.
How This Changes What I Build Next
I'm now architecting systems fundamentally differently:
- Define agent specialization and boundaries first, before any code
- Treat the orchestrator pattern as primitive, not optional
- Make context compaction a first-class design concern
- Use tools to constrain behavior, not enhance capability
- Choose deployment interface after agent architecture is finalized, not before
The photography coach is just the beginning. The real power is understanding that intelligent systems are built through specialization and clear boundaries, not through smarter prompts or larger models. Architecture beats parameters every time.
The Bigger Picture
If you're considering the AI Agents Intensive: do it. But go in expecting it to change your architecture mindset, not just teach you new libraries.
The future of AI isn't smarter models—it's smarter systems. Systems that know their limitations, delegate to specialists, maintain clear boundaries, and communicate through structured protocols. Systems where architecture is a design tool, not an afterthought. That's what this course teaches. That's what matters now.
Technical Stack & Architecture
Core Agents (ADK Native):
- Vision Agent: Gemini 2.5 Flash Vision (image analysis, EXIF extraction, composition scoring, defect detection)
- Orchestrator Agent: Session management, context compaction, routing, memory persistence
- Knowledge Agent: Gemini 2.5 Flash + Hybrid CASCADE RAG (knowledge retrieval, citations, skill adaptation)
Memory & Persistence:
- SQLite for session state
- ADK Cloud Memory adapters
- Conversation history management
- User model tracking
Deployment Options:
- ADK Runner: Cloud/Vertex AI production deployment
- MCP Server: Desktop deployment with JSON-RPC 2.0 (Claude Desktop, local tools)
- Python API: Notebooks, Streamlit, custom applications
Integration Patterns:
- Orchestrator-mediated agent communication (no direct agent-to-agent)
- Structured context passing between agents
- RAG-grounded knowledge retrieval with citations
- Context compaction before inter-agent communication
Project Links:

