Hiroshi Toyama

Two Nasty Gotchas When Building Multi-Agent Systems with Google ADK

Google's Agent Development Kit (ADK) makes it straightforward to compose LlmAgent instances into multi-agent hierarchies. But two bugs that aren't documented anywhere bit me hard in production. Here's what happened and how to fix them.

The Setup

A root router LlmAgent with two sub-agents. Both sub-agents are module-level singletons — instantiated at import time, referenced from the root agent's constructor.

# Agents/my_app/root_agent.py
from google.adk.agents import LlmAgent

from Agents.my_app.sub_agent_a.agent import sub_agent_a
from Agents.my_app.sub_agent_b.agent import sub_agent_b

def _build_sub_agents() -> list:
    # Both sub-agents are module-level singletons created at import time.
    return [sub_agent_a, sub_agent_b]

root_agent = LlmAgent(
    name="my_app",
    sub_agents=_build_sub_agents(),
    ...
)

Worked fine locally with adk web. Blew up on Cloud Run.


Bug 1: Agent already has a parent agent on module reload

The error

pydantic_core._pydantic_core.ValidationError: 1 validation error for LlmAgent
  Value error, Agent `SubAgentA` already has a parent agent,
  current parent: `my_app`, trying to add: `my_app`

What's happening

ADK's agent_loader calls importlib.import_module(agent_name) on every request. On the first request, it loads the module fresh and creates root_agent. The LlmAgent constructor sets sub_agent.parent_agent = root_agent for each sub-agent.

On the second request, agent_loader reloads the module. Because sub_agent_a and sub_agent_b are module-level singletons whose own modules are not re-executed, they're the same Python objects from the previous load, still carrying their parent_agent reference. When the new LlmAgent tries to assign the parent again, pydantic's validator rejects it.

# Inside ADK's LlmAgent.__init__ (simplified)
for sub in sub_agents:
    if sub.parent_agent is not None:
        raise ValueError(f"Agent `{sub.name}` already has a parent agent ...")
    sub.parent_agent = self

This never surfaces locally because adk web loads the module only once per session. Cloud Run's reload-per-request behavior is what triggers it.
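
The failure can be reproduced without ADK installed. What follows is a minimal sketch, not ADK's code: `FakeAgent` is a hypothetical stand-in that copies only the simplified parent check shown above, and `load_root_module` is a made-up helper that simulates the root module being executed a second time while the sub-agent object stays alive.

```python
class FakeAgent:
    """Hypothetical stand-in for an ADK agent; mimics only the parent check."""

    def __init__(self, name, sub_agents=()):
        self.name = name
        self.parent_agent = None
        self.sub_agents = list(sub_agents)
        for sub in self.sub_agents:
            if sub.parent_agent is not None:
                raise ValueError(
                    f"Agent `{sub.name}` already has a parent agent, "
                    f"current parent: `{sub.parent_agent.name}`, "
                    f"trying to add: `{self.name}`"
                )
            sub.parent_agent = self


# Created once and kept alive, like a module-level singleton.
sub_agent_a = FakeAgent("SubAgentA")


def load_root_module():
    """Simulates one execution of root_agent.py."""
    return FakeAgent("my_app", sub_agents=[sub_agent_a])


first_root = load_root_module()  # first load: parent gets assigned
try:
    load_root_module()  # second load: same sub-agent object, parent already set
    second_load_error = None
except ValueError as exc:
    second_load_error = str(exc)

print(second_load_error)
```

The second call fails for the same reason the second request does: the sub-agent object outlives the root that first claimed it.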

The fix

Reset parent_agent to None before passing sub-agents to the constructor:

def _build_sub_agents() -> list:
    agents = [sub_agent_a, sub_agent_b]
    for agent in agents:
        agent.parent_agent = None  # reset before each reload
    return agents

This is safe because the assignment happens synchronously before the new parent is set.
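
To check that the reset is enough, here is a small sketch using a hypothetical `FakeAgent` class that stands in for ADK's agent and reproduces only the simplified parent check. It is an assumption-laden model rather than ADK itself; the point is just that building the root repeatedly no longer raises.

```python
class FakeAgent:
    """Hypothetical stand-in for an ADK agent; mimics only the parent check."""

    def __init__(self, name, sub_agents=()):
        self.name = name
        self.parent_agent = None
        self.sub_agents = list(sub_agents)
        for sub in self.sub_agents:
            if sub.parent_agent is not None:
                raise ValueError(f"Agent `{sub.name}` already has a parent agent")
            sub.parent_agent = self


# Module-level singletons, as in the setup.
sub_agent_a = FakeAgent("SubAgentA")
sub_agent_b = FakeAgent("SubAgentB")


def _build_sub_agents() -> list:
    agents = [sub_agent_a, sub_agent_b]
    for agent in agents:
        agent.parent_agent = None  # drop the stale parent from the previous load
    return agents


# Simulate three consecutive loads of the root module.
roots = [FakeAgent("my_app", sub_agents=_build_sub_agents()) for _ in range(3)]
print(sub_agent_a.parent_agent is roots[-1])
```

After each rebuild the sub-agents point at the newest root. Older root objects still list them in `sub_agents`, so the approach assumes those older roots are no longer in use.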


Bug 2: Context variable not found in instruction strings

The error

KeyError: 'Context variable not found: `hostname`.'

Traceback points here:

File ".../google/adk/utils/instructions_utils.py", line 124, in inject_session_state
    return await _async_sub(r'{+[^{}]*}+', _replace_match, template)

What's happening

ADK injects session state into agent instructions at runtime. The mechanism scans the instruction string with the regex r'{+[^{}]*}+' and replaces every {var_name} with the corresponding session state value.

If your instruction contains an example URL or any template-like text with curly braces:

The URL format is `https://{hostname}/api/{resource_id}/`

ADK sees {hostname}, looks it up in session state, finds nothing, raises KeyError.

My first instinct was to double-brace escape like Python's .format():

https://{{hostname}}/api/{{resource_id}}/

This does not work. The regex is {+[^{}]*}+ — it matches one or more { characters followed by non-brace characters followed by one or more } characters. {{hostname}} still matches.
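
The matching behavior is easy to confirm with Python's `re` module alone. This sketch uses the pattern exactly as it appears in the traceback; the three sample strings and the variable names are mine, and it only shows what the regex matches, not what ADK does with each match afterwards.

```python
import re

# Pattern copied from the traceback in instructions_utils.py.
STATE_PATTERN = re.compile(r'{+[^{}]*}+')

single = "https://{hostname}/api/{resource_id}/"
double = "https://{{hostname}}/api/{{resource_id}}/"
angled = "https://<hostname>/api/<resource_id>/"

single_matches = STATE_PATTERN.findall(single)
double_matches = STATE_PATTERN.findall(double)
angled_matches = STATE_PATTERN.findall(angled)

print(single_matches)  # ['{hostname}', '{resource_id}']
print(double_matches)  # ['{{hostname}}', '{{resource_id}}']
print(angled_matches)  # []
```

Doubling the braces only makes each match longer. The text between the braces is still picked up as a candidate variable.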

The fix

Don't use curly braces for literal placeholder text in instructions:

The URL format is `https://<hostname>/api/<resource_id>/`

More broadly: any {word} pattern in an ADK instruction string is treated as a session state variable, regardless of how many braces you use. Use angle brackets, square brackets, or prose for template-like text in prompts.
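
One way to catch this before deploying is to scan instruction strings in a unit test. The helper below is a hypothetical sketch, not part of ADK: `find_unresolved_placeholders` and the `user_name` state key are names I made up, and stripping the braces to get the variable name is my assumption about how a match maps to a key. It also ignores any special placeholder forms ADK may accept.

```python
import re

# Same pattern as in the traceback.
STATE_PATTERN = re.compile(r'{+[^{}]*}+')


def find_unresolved_placeholders(instruction: str, state_keys: set) -> list:
    """Return brace-wrapped names that are not expected session state keys."""
    unresolved = []
    for match in STATE_PATTERN.finditer(instruction):
        # Assumption: the variable name is the match with its braces removed.
        name = match.group().strip("{}").strip()
        if name not in state_keys:
            unresolved.append(name)
    return unresolved


instruction = (
    "You help user {user_name}. "
    "The URL format is `https://{hostname}/api/{{resource_id}}/`"
)
unresolved = find_unresolved_placeholders(instruction, {"user_name"})
print(unresolved)  # ['hostname', 'resource_id']
```

An assertion such as `assert not find_unresolved_placeholders(text, expected_keys)` in the test suite would flag stray braces locally instead of at request time.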


Summary

| Bug | Trigger | Fix |
| --- | --- | --- |
| parent_agent collision | Module-level singleton sub-agents + ADK module reload per request | Reset agent.parent_agent = None before passing to constructor |
| Context variable not found | {word} patterns in instruction strings | Use <word> or square brackets instead |

Both are easy to fix once you know what's happening, but the error messages don't immediately point to the root cause. The parent_agent one is especially sneaky — it only appears in production where the module is reloaded per request, never in adk web during local development.
