Noble Ackerson

Posted on Jul 19

Your multi-agent system is probably slower than it needs to be

#multiagent #generativeui #agentdevelopmentkit #agentorchestration

From Sequential to Dynamic: Evolving a Generative UI Multi-Agent Architecture

The Problem I Started With

Building dashboards sucks. You spend hours configuring charts, mapping data, and making sure everything works together. I thought: what if I could just ask for what I want in plain English and get back a working React dashboard?

So I built a system that does exactly that. Natural language in, interactive React components out. The magic happens through AI agents that specialize in different parts of dashboard creation.

But here's the thing - my first version was painfully slow. Users would ask for a dashboard and then... wait. And wait some more. Everything happened one step at a time, like being stuck behind someone counting exact change at the grocery store.

This post is essentially part two of my Agentic Component Recognition and GenerativeUI applied research (see part one below). It covers how I fixed that slowness using Google's Agent Development Kit (ADK) and went from "please wait 30 seconds" to "here's your dashboard streaming in real-time."

Version 1: The Slow Sequential Approach

Our original system worked like an assembly line from the 1950s. One agent would finish, pass the work to the next agent, who would finish, pass it on, and so on.

# What I started with - everything in sequence
from google.adk.agents import SequentialAgent, LlmAgent

def create_dashboard_the_slow_way():
    # chart_tools, map_tools, and layout_tools are tool functions defined elsewhere
    chart_agent = LlmAgent(name="chart_maker", tools=[chart_tools])
    map_agent = LlmAgent(name="map_maker", tools=[map_tools])
    layout_agent = LlmAgent(name="layout_maker", tools=[layout_tools])

    # This runs one agent at a time, in order
    # Chart agent finishes -> Map agent starts -> Layout agent starts
    return SequentialAgent(
        name="slow_dashboard_maker",
        sub_agents=[chart_agent, map_agent, layout_agent]
    )

The math was brutal:

Chart agent: 5 seconds
Map agent: 5 seconds
Layout agent: 5 seconds
Total: 15+ seconds of staring at loading spinners

Users would request a dashboard and then go make coffee. That's not the experience I wanted.

The Fix: Three Phases of Getting Faster

Phase 1: Parallel Tools Within Agents

First fix was obvious once I saw it. Individual agents were doing multiple things sequentially too. Our chart agent would create a bar chart, wait for it to finish, then create a line chart, then create a pie chart.

Why not do all three at the same time?

# Let agents do multiple things at once
import asyncio

async def make_charts_parallel(chart_types):
    tasks = []
    for chart_type in chart_types:
        tasks.append(create_chart(chart_type))

    # All charts get created simultaneously
    results = await asyncio.gather(*tasks)
    return results

This was like going from dial-up to broadband for individual agents. Our chart agent went from 10 seconds to 2 seconds when it needed multiple charts.
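One thing the snippet above leaves open is what happens when a single chart fails. By default `asyncio.gather` raises the first exception to the caller, so here is a minimal sketch that keeps the charts that did succeed. `create_chart` below is a stand-in I made up for illustration (the real one calls the chart tools), and the chart type names are assumptions.

```python
import asyncio

SUPPORTED = {"bar", "line", "pie"}  # assumed chart types, for illustration only

async def create_chart(chart_type: str) -> dict:
    """Stand-in for the real chart tool: sleeps briefly, then returns a spec."""
    await asyncio.sleep(0.01)  # simulates I/O latency (model or API call)
    if chart_type not in SUPPORTED:
        raise ValueError(f"unsupported chart type: {chart_type}")
    return {"type": chart_type, "status": "ok"}

async def make_charts_tolerant(chart_types):
    """Run all chart tasks concurrently and separate successes from failures."""
    results = await asyncio.gather(
        *(create_chart(t) for t in chart_types),
        return_exceptions=True,  # failures come back as values instead of raising
    )
    charts, errors = [], []
    for chart_type, result in zip(chart_types, results):
        if isinstance(result, Exception):
            errors.append((chart_type, str(result)))
        else:
            charts.append(result)
    return charts, errors

if __name__ == "__main__":
    charts, errors = asyncio.run(make_charts_tolerant(["bar", "sankey", "pie"]))
    print(charts)
    print(errors)
```

With `return_exceptions=True`, one bad chart type no longer takes down the whole batch, and the caller can decide whether to retry or show a partial dashboard.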

Phase 2: Parallel Agents

Next problem: independent agents were still waiting in line. Chart generation and map generation have nothing to do with each other, so why does the map agent wait for the chart agent to finish?

ADK has a ParallelAgent that handles this perfectly:

# Multiple agents working simultaneously
from google.adk.agents import ParallelAgent

def create_dashboard_faster(query):
    # Figure out what we actually need
    agents_needed = []

    if "chart" in query.lower():
        agents_needed.append(chart_agent)
    if "map" in query.lower():
        agents_needed.append(map_agent)
    if "accessible" in query.lower():
        agents_needed.append(accessibility_agent)

    # Run only what we need, all at the same time
    parallel_maker = ParallelAgent(
        name="fast_dashboard_maker",
        sub_agents=agents_needed
    )

    return parallel_maker

Now instead of 15 seconds sequential, we're looking at 5-7 seconds with results streaming in as each agent finishes.
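The arithmetic behind that drop is simple: sequential time is roughly the sum of the agent times, while parallel time is roughly the slowest agent plus some orchestration overhead. Here is a back-of-the-envelope sketch using the 5-second figures from earlier. The `overhead` parameter is a placeholder I added, not a measured number.

```python
def sequential_latency(durations):
    """Agents run one after another, so their times add up."""
    return sum(durations)

def parallel_latency(durations, overhead=0.0):
    """Agents run together, so the slowest one sets the pace."""
    return max(durations, default=0.0) + overhead

# Per-agent times from the Version 1 estimate above
agent_seconds = {"chart": 5.0, "map": 5.0, "layout": 5.0}

if __name__ == "__main__":
    print(sequential_latency(agent_seconds.values()))
    print(parallel_latency(agent_seconds.values(), overhead=1.0))
```

This is only a model. It ignores steps that still have to run in order, which is part of why the real number lands at 5-7 seconds rather than a flat 5.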

Phase 3: Dynamic Agent Creation

The real breakthrough was realizing I didn't need to keep all these agents sitting around doing nothing. Why have a map agent loaded in memory if someone just wants a simple chart?

I built an agent factory that creates exactly what each query needs:

# Create agents on demand
class AgentFactory:
    def __init__(self):
        # Templates for different types of agents
        self.templates = {
            "chart_basic": ChartTemplate(model="gemini-flash", complexity="basic"),
            "chart_advanced": ChartTemplate(model="gemini-pro", complexity="advanced"),
            "map_basic": MapTemplate(model="gemini-flash", complexity="basic"),
            # etc...
        }

    async def make_dashboard(self, query, requirements):
        # What do I actually need for this query?
        needed_capabilities = self.analyze_query(query)

        # Create only those agents
        agents = []
        for capability in needed_capabilities:
            template = self.templates[capability]
            agent = template.create_agent()
            agents.append(agent)

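        # Run them in parallel
        if len(agents) > 1:
            # ADK agents need a name, so give the orchestrator one
            orchestrator = ParallelAgent(name="dashboard_orchestrator", sub_agents=agents)
        else:
            orchestrator = agents[0]

        try:
            # Simplified: in ADK, agents are executed through a Runner
            # rather than by calling run() on them directly
            return await orchestrator.run(query)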
        finally:
            # Clean up when done
            self.cleanup_agents(agents)

This is where things got interesting. I could:

Spin up exactly the right agents for each query
Use lightweight agents for simple requests
Use heavyweight agents for complex analysis
Clean up everything when done
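`analyze_query` is not shown above. One way it could work is a keyword table that maps a query to template keys. This is a hypothetical sketch, not the implementation in the repo: the keyword lists and the rule for choosing `chart_advanced` are assumptions, and only the template keys come from the factory code.

```python
import re

# Keywords per capability; these lists are illustrative assumptions.
MAP_WORDS = {"map", "region", "regional", "geographic"}
ADVANCED_WORDS = {"forecast", "correlation", "compare", "breakdown"}

def analyze_query(query: str) -> list[str]:
    """Return the template keys a query needs (hypothetical heuristic)."""
    words = set(re.findall(r"[a-z0-9]+", query.lower()))
    capabilities = []
    # Every request gets a chart; use the heavier template only when needed.
    capabilities.append("chart_advanced" if words & ADVANCED_WORDS else "chart_basic")
    if words & MAP_WORDS:
        capabilities.append("map_basic")
    return capabilities

if __name__ == "__main__":
    print(analyze_query("Show me Q4 sales trends"))
    print(analyze_query("Compare regional sales on a map"))
```

Keyword matching is cheap and predictable but brittle. A small classifier model could fill the same role if queries get more varied.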

Real-World Examples

Simple query: "Show me Q4 sales trends"

Factory creates: 1 basic chart agent
Time: ~3 seconds
Memory: Minimal

Complex query: "Show regional sales with accessible high-contrast visualization and executive summary"

Factory creates: Chart agent + Map agent + Accessibility agent + Summary agent
Time: ~6 seconds (all running in parallel)
Memory: Only what's needed, cleaned up after

Follow-up query: "Make that chart bigger"

Factory creates: 1 lightweight chart modifier agent
Time: ~1 second
Memory: Almost nothing

ADK Made This Possible

None of this would have worked without ADK's built-in features:

Session Management: Agents automatically share data through ADK's session state. Chart agent puts data in session, map agent reads it out. No complicated message passing.

Streaming: Results show up in the UI as soon as each agent finishes, not when everything is done.

Callbacks: We can monitor performance, catch errors, and log everything without cluttering our agent code.

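# ADK handles the hard parts: callbacks are passed to the agent constructor
import time
from google.adk.agents import LlmAgent
from google.adk.agents.callback_context import CallbackContext

started_at = {}  # agent name -> start time

def log_start(callback_context: CallbackContext):
    started_at[callback_context.agent_name] = time.perf_counter()
    print(f"Starting {callback_context.agent_name}")

def log_finish(callback_context: CallbackContext):
    name = callback_context.agent_name
    elapsed_ms = (time.perf_counter() - started_at.pop(name)) * 1000
    print(f"{name} finished in {elapsed_ms:.0f}ms")

chart_agent = LlmAgent(name="chart_maker", tools=[chart_tools],
                       before_agent_callback=log_start, after_agent_callback=log_finish)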

The Results

Before: "Please wait 20-30 seconds for your dashboard"
After: "Here's your chart... here's your map... here's your final dashboard" (streaming in real-time)

Numbers:

70% faster for complex dashboards
60% less memory usage (only create what you need)
90% fewer timeout errors (parallel execution is more resilient)

But the real win was user experience. Instead of staring at loading spinners, users watch their dashboard come together piece by piece.
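To make the "piece by piece" idea concrete, here is a minimal sketch of delivering results in completion order with plain `asyncio`. It is not ADK's streaming mechanism. `run_agent` is a stand-in that just sleeps, and the agent names and delays are made up.

```python
import asyncio

async def run_agent(name: str, seconds: float) -> str:
    """Stand-in for an agent run: waits, then returns a rendered component."""
    await asyncio.sleep(seconds)
    return f"<{name} component>"

async def stream_dashboard(agent_delays: dict):
    """Yield each component as soon as its agent finishes."""
    tasks = [asyncio.create_task(run_agent(n, s)) for n, s in agent_delays.items()]
    for finished in asyncio.as_completed(tasks):
        yield await finished

async def main():
    received = []
    delays = {"map": 0.3, "chart": 0.02, "summary": 0.15}  # assumed timings
    async for component in stream_dashboard(delays):
        print(component)  # in a real UI, push this to the client instead
        received.append(component)
    return received

if __name__ == "__main__":
    asyncio.run(main())
```

The fastest agent's output arrives first regardless of the order the agents were started in, so the user sees something long before the slowest agent is done.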

What I Learned

Start simple: I began with SequentialAgent, figured out what worked, then made it faster.

Use ADK's built-ins: Don't reinvent session management, streaming, or callbacks. ADK already solved these problems.

Parallel isn't always better: For dependencies (like layout composition), you still need sequential steps.

Clean up your mess: Dynamic agent creation means dynamic cleanup. Don't leave agents sitting around eating memory.

Monitor everything: You can't optimize what you don't measure. ADK's callback system makes this easy.
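The "parallel isn't always better" point can be sketched as a two-stage pipeline: fan out the independent work, then run the dependent step once everything is back. The functions below are stand-ins written with plain `asyncio`, and their names and return shapes are assumptions for illustration.

```python
import asyncio

async def build_component(name: str) -> str:
    """Stand-in for an independent agent (chart, map, ...)."""
    await asyncio.sleep(0.01)
    return f"{name}-component"

async def compose_layout(components: list) -> dict:
    """Stand-in for the layout step, which needs every component first."""
    await asyncio.sleep(0.01)
    return {"layout": "grid", "children": components}

async def build_dashboard(names):
    # Stage 1 (parallel): independent components have no ordering constraint.
    components = await asyncio.gather(*(build_component(n) for n in names))
    # Stage 2 (sequential): layout depends on the output of stage 1.
    return await compose_layout(list(components))

if __name__ == "__main__":
    print(asyncio.run(build_dashboard(["chart", "map"])))
```

In ADK terms this shape roughly corresponds to a SequentialAgent whose first sub-agent is a ParallelAgent and whose second is the layout agent.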

Future Plans

What's next? I'm working on:

Smart agent sizing: Use small models for simple queries, big models for complex analysis.

Learning from history: Track which agent combinations work best for different types of queries.

Cost optimization: Balance speed vs. cost by choosing the right models for each task.

Cross-query caching: If two users ask for similar dashboards, reuse what we can.

If you're interested in contributing, create an issue or fork the repo (link below).

Try It Yourself

The complete code is available in our GenUI repo. You can run the ADK web interface locally:

# Install ADK
pip install google-adk

# Run the web UI
adk web genui_agent/ --port 8080

# Visit http://localhost:8080

The ADK documentation has everything you need to build your own multi-agent systems. Start with their quickstart guides and work your way up to parallel orchestration.

Bottom Line

Building responsive AI systems isn't about having the biggest models or the most complex architecture. It's about using the right tools efficiently and not making users wait when they don't have to.

ADK gave us the building blocks. I just had to stop thinking sequentially and start thinking in parallel.

Code examples and implementation details are available at github.com/stigsfoot/google-ai-sprint. Questions? Find me on LinkedIn @noblea or check out the ADK documentation.

Resources:

Part 1 of this series
Accompanying video - part 1 - old version
Subscribe to my YouTube - Subscribers should reach out for access to the Colab and NotebookLM to learn about ACR.
Google ADK Docs - Start here
ADK GitHub - Source code
