
Multi-Agents: Parallel Agents using Google ADK 🤖 Gemini, FastAPI, Streamlit - Report Generation

In the past 4-5 months, two powerful AI agent development frameworks have been released:

  • Google Agent Development Kit (ADK)
  • AWS Strands Agents

This post is part of a Google ADK series; you can find the other posts on my profile. Here, we'll dive into the Google Agent Development Kit (ADK) and show how to build parallel workflow agents using ADK, Gemini 2.5, FastAPI, and a Streamlit interface.


What is Google Agent Development Kit?

  • The Agent Development Kit (ADK) is an open-source framework for developing AI agents that can run anywhere (a minimal agent definition is sketched below):
    • VS Code, terminal
    • Docker containers
    • Google Cloud Run
    • Kubernetes
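
Before moving on to multi-agent workflows, here is a minimal single-agent sketch. It uses the same LlmAgent class and Gemini model that appear in the full example later in this post; the file name, agent name, and instruction are placeholders.

# hello_agent.py -- minimal single-agent sketch (placeholder names)
from google.adk.agents.llm_agent import LlmAgent

hello_agent = LlmAgent(
    name="HelloAgent",
    model="gemini-2.5-flash",  # any Gemini model your API key can access
    instruction="Answer the user's question in one short paragraph.",
)
# The agent is executed through a Runner and a session service,
# exactly as shown in the Backend - Agent section below.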

What Are Multi-Agent Parallel Agents?

The ParallelAgent is a workflow agent designed to run its sub-agents simultaneously, significantly accelerating workflows where tasks are independent.

When the run_async() method of the ParallelAgent is invoked:

  • Simultaneous Execution: It triggers the run_async() method for all agents in the sub_agents list at the same time.
  • Isolated Execution: Each sub-agent operates in its own execution branch, without automatically sharing conversation history or state with the others.
  • Gathering Results: After execution, the ParallelAgent handles collecting the results from each sub-agent. These results are typically accessible through a list or event-based mechanism, but their order might not be predictable.

It's important to recognize that sub-agents within a ParallelAgent operate independently. If coordination or data sharing is required between them, it must be handled manually. Here are some possible strategies:

  • Shared InvocationContext: You can provide all sub-agents with the same InvocationContext, using it as a shared data container. However, to prevent race conditions, concurrent access must be managed carefully—typically with synchronization mechanisms like locks.
  • External State Handling: Leverage external tools such as databases or message queues to manage shared state and facilitate inter-agent communication.
  • Post-Execution Coordination: Let each sub-agent complete its task independently, then process and reconcile their outputs afterward (this is the approach sketched below and used in the sample app in this post).
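
For example, the post-execution approach can be expressed with ADK's output_key mechanism: each parallel branch writes its final text into session state under its own key, and a later agent reads those keys from its instruction template. A minimal sketch, using the same classes and model name as the full example later in this post (agent names and instructions are placeholders):

# coordination_sketch.py -- fan-out / fan-in via session state (placeholder names)
from google.adk.agents.llm_agent import LlmAgent
from google.adk.agents.parallel_agent import ParallelAgent
from google.adk.agents.sequential_agent import SequentialAgent

MODEL = "gemini-2.5-flash"

# Each parallel branch stores its final response in session state under output_key.
branch_a = LlmAgent(name="BranchA", model=MODEL,
                    instruction="Answer part A of the user's question.", output_key="part_a")
branch_b = LlmAgent(name="BranchB", model=MODEL,
                    instruction="Answer part B of the user's question.", output_key="part_b")

# The branches run concurrently and do not see each other's output.
fan_out = ParallelAgent(name="FanOut", sub_agents=[branch_a, branch_b])

# A downstream agent reconciles the results after the parallel step finishes;
# {part_a} and {part_b} are injected from session state into the instruction.
fan_in = LlmAgent(name="FanIn", model=MODEL,
                  instruction="Combine these partial answers into one reply: {part_a} / {part_b}")

pipeline = SequentialAgent(name="FanOutFanIn", sub_agents=[fan_out, fan_in])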

(Figure: parallel agents workflow diagram)

Details: https://google.github.io/adk-docs/agents/workflow-agents/parallel-agents/#independent-execution-and-state-management

Agent App

Sample project on GitHub:

Installing Dependencies & Accessing the Gemini Model

  • Create a .env file in the project root and paste your Gemini API key (GOOGLE_GENAI_USE_VERTEXAI=FALSE means the API key is used directly rather than Vertex AI):

# .env
GOOGLE_GENAI_USE_VERTEXAI=FALSE
GOOGLE_API_KEY=PASTE_YOUR_ACTUAL_API_KEY_HERE
  • Install the required packages:
fastapi
uvicorn
google-adk
google-generativeai
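
If the package list above is saved as requirements.txt (the file name here is just a convention), everything can be installed in one step:

$ pip install -r requirements.txt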

Frontend - Streamlit UI

# app.py
import streamlit as st
import requests

st.set_page_config(page_title="Agent Chat", layout="centered")

if "messages" not in st.session_state:
    st.session_state.messages = []

st.title("Multi-Agent Parallel Researcher")

for msg in st.session_state.messages:
    with st.chat_message(msg["role"]):
        st.markdown(msg["content"])

user_query = st.chat_input("Ask me to research any topic...")

# send and display user + assistant messages
if user_query:
    st.chat_message("user").markdown(user_query)
    st.session_state.messages.append({"role": "user", "content": user_query})
    try:
        response = requests.post(
            "http://localhost:8000/ask",
            json={"query": user_query}
        )
        response.raise_for_status()
        # The backend returns a list of responses (one per final agent response), not a single string.
        all_replies = response.json().get("responses", ["No response."])

        for reply in all_replies:
            st.chat_message("assistant").markdown(reply)
            st.session_state.messages.append({"role": "assistant", "content": reply})

    except Exception as e:
        error_msg = f"Error: {str(e)}"
        st.chat_message("assistant").markdown(error_msg)
        st.session_state.messages.append({"role": "assistant", "content": error_msg})

Backend - Agent

# agent.py
from fastapi import FastAPI
from pydantic import BaseModel
from dotenv import load_dotenv
from google.genai import types
from google.adk.agents.llm_agent import LlmAgent
from google.adk.agents.sequential_agent import SequentialAgent
from google.adk.agents.parallel_agent import ParallelAgent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.adk.memory import InMemoryMemoryService
from google.adk.tools import google_search
import uvicorn

# Load environment variables
load_dotenv()

# Define model and app constants
MODEL = "gemini-2.5-flash"
APP_NAME = "search_memory_app"
USER_ID = "user123"
SESSION_ID = "session123"

# --- Agent Definitions ---
# the Pydantic model for the output of the TopicSetterAgent
class TopicOutput(BaseModel):
    subtopic_1: str
    subtopic_2: str
    subtopic_3: str

topic_setter = LlmAgent(
    name="TopicSetterAgent",
    model=MODEL,
    instruction="""
    You are a research planner. Your task is to break the user's input query into three distinct and relevant subtopics.
    Respond ONLY in this format:

    {
    "subtopic_1": "...",
    "subtopic_2": "...",
    "subtopic_3": "..."
    }
    """,
    output_schema=TopicOutput,
    output_key="topic_output"
)

researcher_agent_1 = LlmAgent(
    name="SubResearcherOne",
    model=MODEL,
    instruction="""
    You are a research assistant. Research the subtopic: "{topic_output.subtopic_1}".
    Summarize your findings in 1-2 sentences.
    """,
    tools=[google_search],
    output_key="result_1"
)

researcher_agent_2 = LlmAgent(
    name="SubResearcherTwo",
    model=MODEL,
    instruction="""
    You are a research assistant. Research the subtopic: "{topic_output.subtopic_2}".
    Summarize your findings in 1-2 sentences.
    """,
    tools=[google_search],
    output_key="result_2"
)

researcher_agent_3 = LlmAgent(
    name="SubResearcherThree",
    model=MODEL,
    instruction="""
    You are a research assistant. Research the subtopic: "{topic_output.subtopic_3}".
    Summarize your findings in 1-2 sentences.
    """,
    tools=[google_search],
    output_key="result_3"
)

# final synthesis
merger_agent = LlmAgent(
    name="ResearchSynthesizer",
    model=MODEL,
    instruction="""
    You are a synthesis assistant. Merge the three research summaries below into a single cohesive research report.

    Subtopic 1 ({topic_output.subtopic_1}): {result_1}
    Subtopic 2 ({topic_output.subtopic_2}): {result_2}
    Subtopic 3 ({topic_output.subtopic_3}): {result_3}

    Write in this format:

    ## Research Summary on the Given Topic

    ### {topic_output.subtopic_1}
    [result_1]

    ### {topic_output.subtopic_2}
    [result_2]

    ### {topic_output.subtopic_3}
    [result_3]

    ### Conclusion
    Write 2-3 sentences summarizing the topic overall.
    """
)

parallel_research_agent = ParallelAgent(
    name="ParallelWebResearchAgent",
    sub_agents=[researcher_agent_1, researcher_agent_2, researcher_agent_3],
    description="Runs multiple research agents in parallel to gather information."
)

research_pipeline = SequentialAgent(
    name="GeneralResearchPipeline",
    description="Extracts a topic, researches it in parallel, then synthesizes a report.",
    sub_agents=[topic_setter, parallel_research_agent, merger_agent],
)

root_agent = research_pipeline

session_service = InMemorySessionService()
memory_service = InMemoryMemoryService()

app = FastAPI()

class QueryRequest(BaseModel):
    query: str

@app.on_event("startup")
async def startup_event():
    """
    Initializes the session and runner on application startup.
    """
    await session_service.create_session(
        app_name=APP_NAME,
        user_id=USER_ID,
        session_id=SESSION_ID
    )

    global runner
    runner = Runner(
        agent=root_agent,
        app_name=APP_NAME,
        session_service=session_service,
        memory_service=memory_service
    )

@app.post("/ask")
async def ask_agent(req: QueryRequest):
    """
    Endpoint to send a query to the agent pipeline.
    """
    content = types.Content(role="user", parts=[types.Part(text=req.query)])
    # `runner.run()` is a generator, not a coroutine, so it should not be awaited.
    events = runner.run(user_id=USER_ID, session_id=SESSION_ID, new_message=content)

    responses = []
    for event in events:
        if event.is_final_response() and event.content and event.content.parts:
            responses.append(event.content.parts[0].text)

    # Await both the get_session and add_session_to_memory calls.
    session = await session_service.get_session(app_name=APP_NAME, user_id=USER_ID, session_id=SESSION_ID)
    await memory_service.add_session_to_memory(session)

    return {"responses": responses or ["No response received."]}

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)

Run & Demo

Run the backend (agent.py). The "Invalid config for agent TopicSetterAgent" warning in the output below is expected: because TopicSetterAgent sets output_schema, ADK automatically disables agent transfer for it (disallow_transfer_to_parent=True, disallow_transfer_to_peers=True), and the pipeline still runs as intended:

$  uvicorn agent:app --host 0.0.0.0 --port 8000
Invalid config for agent TopicSetterAgent: output_schema cannot co-exist with agent transfer configurations. Setting disallow_transfer_to_parent=True, disallow_transfer_to_peers=True
INFO:     Started server process [17790]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
INFO:     127.0.0.1:45096 - "POST /ask HTTP/1.1" 200 OK
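
With the backend up, you can also exercise the /ask endpoint directly, without the Streamlit UI, to verify the whole pipeline. A minimal sketch using the requests library (the query text is just an example):

# quick_test.py -- call the backend directly, bypassing the Streamlit UI
import requests

resp = requests.post(
    "http://localhost:8000/ask",
    json={"query": "I want to research about the \"LLM Agents\""},  # example query, same as the demo below
    timeout=300,  # parallel web research can take a while
)
resp.raise_for_status()

# The endpoint returns {"responses": [...]}; print each agent response.
for i, answer in enumerate(resp.json().get("responses", []), start=1):
    print(f"--- response {i} ---\n{answer}\n")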

Run frontend (app.py):

$ python3 -m streamlit run app.py

  You can now view your Streamlit app in your browser.

  Local URL: http://localhost:8501
  Network URL: http://172.28.246.163:8501

Test Prompt:

I want to research about the "LLM Agents"

The topic_setter agent gives the following response:

{"subtopic_1": "Definition and architecture of LLM Agents", 
"subtopic_2": "Applications and use cases of LLM Agents", 
"subtopic_3": "Challenges and future directions for LLM Agents"}

Research Agent 1 responds (note that, as mentioned earlier, the order of parallel results is not guaranteed):

LLM agents face challenges including hallucinations, limited memory and 
context windows, and struggles with real-time decision-making and complex 
calculations. Future directions involve developing hybrid systems 
combining LLMs with deterministic tools for precision tasks, improving 
continuous learning and scalability, and enhancing multi-agent 
collaboration and ethical considerations.

Research Agent 2 responds:

LLM agents are advanced AI systems that use large language models (LLMs) 
to reason through problems, create plans, and execute tasks with the help 
of various tools, going beyond simple text generation to handle complex, 
multi-step assignments autonomously. Their architecture typically includes 
a core LLM (the "brain"), memory modules for retaining context, tools for 
external interactions (like APIs or databases), and planning components to 
break down complex tasks and determine actions.

Research Agent 3 responds:

LLM agents are utilized across various industries and applications due to 
their ability to understand language, perform complex tasks, and interact 
with external tools and data sources. Key applications include enhancing 
customer service through personalized interactions and 24/7 support, and 
streamlining internal operations such as HR and IT support. They also play 
a significant role in automating tasks like market analysis, financial 
reporting, supply chain management, and compliance checks by gathering and 
evaluating vast amounts of data. Furthermore, LLM agents are valuable in 
software development for generating code and assisting with debugging, and 
in research and development for hypothesis testing and data analysis.

The merger agent produces the final report:

Definition and architecture of LLM Agents
LLM agents are advanced AI systems that use large language models (LLMs) 
to reason through problems, create plans, and execute tasks with the help 
of various tools, going beyond simple text generation to handle complex, 
multi-step assignments autonomously. Their architecture typically includes 
a core LLM (the "brain"), memory modules for retaining context, tools for 
external interactions (like APIs or databases), and planning components to 
break down complex tasks and determine actions.

Applications and use cases of LLM Agents
LLM agents are utilized across various industries and applications due to 
their ability to understand language, perform complex tasks, and interact 
with external tools and data sources. Key applications include enhancing 
customer service through personalized interactions and 24/7 support, and 
streamlining internal operations such as HR and IT support. They also play 
a significant role in automating tasks like market analysis, financial 
reporting, supply chain management, and compliance checks by gathering and 
evaluating vast amounts of data. Furthermore, LLM agents are valuable in 
software development for generating code and assisting with debugging, and 
in research and development for hypothesis testing and data analysis.

Challenges and future directions for LLM Agents
LLM agents face challenges including hallucinations, limited memory and 
context windows, and struggles with real-time decision-making and complex 
calculations. Future directions involve developing hybrid systems 
combining LLMs with deterministic tools for precision tasks, improving 
continuous learning and scalability, and enhancing multi-agent 
collaboration and ethical considerations.

Conclusion
LLM agents represent a significant advancement in AI, enabling autonomous 
execution of complex, multi-step tasks by integrating LLMs with planning, 
memory, and external tools. While offering vast potential across diverse 
applications from customer service to software development, they currently 
grapple with issues like hallucinations and computational limitations. 
Future developments aim to overcome these challenges through hybrid 
architectures, enhanced learning capabilities, and robust ethical frameworks.

Demo GIF: GIF on GitHub

Conclusion

In this post, we covered:

  • how to access Google Gemini 2.5 through ADK,
  • how to implement a parallel multi-agent research pipeline with ADK, FastAPI, and Streamlit.

If you found the tutorial interesting, I'd love to hear your thoughts. Feel free to share your reactions or leave a comment; I truly value your input and engagement 😉

For other posts 👉 https://dev.to/omerberatsezer 🧐

Follow for Tips, Tutorials, Hands-On Labs:


Your comments 🤔

  • Which tools are you using to develop AI agents (e.g. Google ADK, AWS Strands, CrewAI, LangChain, etc.)?
  • What do you think about Google ADK?
  • Did you implement multi-agents using any framework (ADK, Strands, CrewAI)?

=> Any comments below related to AI and agents are welcome for brainstorming 🤯

Top comments (2)

Ömer Berat Sezer • Edited

This time, I tested how to implement multi-agents using Google ADK. I’m also testing several AI multi-agent frameworks, CrewAI, AWS Strands, and Google ADK, across use cases such as multi-agent collaboration, MCP tool integration, support for multiple language models, and workflow orchestration. Google ADK and AWS Strands stand out for their ease of deployment. Unlike CrewAI, they don’t include built-in task implementations, but both provide comparable feature sets and integrate seamlessly with open-source tools like LiteLLM, MCP components (e.g., StdioServerParameters), and short- and long-term session memory management.

Eminence Technology

Makes sense — the dependency array is just telling React when to re-run the effect, and in this case post changing is the real trigger. Including the state setters isn’t wrong, but it’s not necessary since they don’t change between renders.