In this story, I have a quick tutorial showing you how to use Google ADK, MCP, RAG, and Ollama to build a powerful multi-agent chatbot for your business or personal use.
If this is your first time watching me, I highly recommend checking out my previous stories. I made a video about MCP, which became a big hit in the AI community.
Artificial intelligence (AI) is becoming more and more prevalent in our surroundings. Among them, “AI agents” are programs that think and act autonomously with specific goals in mind, like smart digital assistants that gather information and complete tasks for us.
At the Google Cloud Next 25 conference, Google open-sourced its first Agent Development Kit (ADK). It is also the second standardised agent SDK released by a major company, after OpenAI's.
ADK greatly simplifies the development of intelligent agents with complex workflows. It covers everything from model selection and automated process orchestration to testing and application deployment in one place, and it supports two-way audio and video, MCP, and the latest A2A protocol.
For example, developing a cross-platform voice customer-service agent with ADK takes only around 100 lines of code, or even fewer. There is no need to switch between different platform APIs, select models by hand, or write complex interaction logic, which greatly improves development efficiency.
So, let me give you a quick demo of a live chatbot to show you what I mean.
I will ask the chatbot a question: “How can I learn about an AI Agent from GaoDalie’s YouTube Channel?” Feel free to ask any questions you like.
If you take a look at how the chatbot generates the output, you’ll see that the agent functions as a seamless research assistant.
When I prompt the chatbot, the agent first searches YouTube using the MCP Tool to find relevant videos, formatting the results for display while extracting and storing video details in the FAISS vectorstore.
Next, the agent automatically queries this same vectorstore using rag_search_tool to retrieve similar content from its knowledge base, which continues to grow as more searches are performed.
One of the problems I faced was Unicode encoding errors on Windows when handling non-ASCII characters. This was solved by wrapping Python’s stdout/stderr with UTF-8 encoders and setting the console to UTF-8 mode.
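A minimal sketch of that fix. This mirrors the approach described rather than the project's exact code, and the function name `force_utf8_stdio` is my own:

```python
import io
import sys

def force_utf8_stdio():
    # Re-wrap stdout/stderr so non-ASCII text (video titles, emoji) prints
    # safely on Windows consoles that default to a legacy code page (cp1252).
    for name in ("stdout", "stderr"):
        stream = getattr(sys, name)
        if hasattr(stream, "buffer") and (stream.encoding or "").lower() not in ("utf-8", "utf8"):
            setattr(sys, name, io.TextIOWrapper(
                stream.buffer, encoding="utf-8", errors="replace", line_buffering=True))

force_utf8_stdio()
print("Unicode check: 日本語 OK")
```

On Windows you would also switch the console itself to UTF-8 mode (code page 65001) so the terminal can render what Python now emits.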
Another issue involved RAG integration, which initially prevented YouTube video content from being properly added to the vectorstore. I resolved this by implementing explicit vectorstore initialization with a default document, building a robust YouTube-to-RAG pipeline, adding detailed logging, and improving error handling for various result formats.
Finally, the agent combines both information sources into a structured three-part response: “Video Resources” presents the best YouTube tutorials, “Knowledge Base” shows relevant stored information (including previously indexed videos), and “Combined Analysis” synthesises insights from both sources.
This approach is embedded in the agent’s instruction prompt, ensuring every query leverages YouTube results and accumulated knowledge for comprehensive answers.
Note (Important): This works perfectly when running with adk run, but I still don't know why it doesn't run successfully with adk web; perhaps the request times out. I tried to fix it, but no luck so far.
What is the Agent Development Kit?
Google’s Agent Development Kit is an open-source Python SDK designed to simplify the creation of advanced AI agents, providing flexibility and control in design and deployment.
ADK is model- and deployment-agnostic, supports seamless integration with any AI model, and can be deployed in a variety of environments, including local, cloud, or custom infrastructure.
Its main features include two-way audio and video streaming for real-time interaction, user interface playgrounds for testing and debugging, and traceability tools that provide detailed workflow insights.
ADK follows modular and extensible design principles, making it easy for developers of all experience levels to use while supporting the creation of complex multi-agent systems.
Why ADK is important:
AI development often involves complex workflows, integrating various tools, and ensuring scalability for real-world applications. ADK addresses these challenges by providing a simplified approach to creating production-ready agents.
Its main goal is to reduce friction in the development process while maintaining flexibility and scalability. The model- and deployment-agnostic foundation provided by ADK enables you to focus on innovation without being restricted to specific tools, platforms, or environments.
This adaptability ensures that your AI solutions remain forward-looking and diverse, no matter how project requirements change.
Core features of ADK:
ADK is built on three basic principles: being model-agnostic, deployment-agnostic, and interoperable. These make it a powerful tool for developers creating complex AI agents.
Model agnostic: ADK supports seamless integration with any AI model, whether developed by Google or other providers. This flexibility enables you to choose the model that best suits your specific use case, ensuring optimal performance and customization.
Deployment agnostic: The toolkit allows you to deploy agents in a variety of environments, including on-premises, in the cloud, or on custom infrastructure. This freedom ensures that your deployment strategy is aligned with the technical and operational requirements of your project.
Interoperability: ADK integrates seamlessly with existing tools, services, and frameworks, ensuring smooth workflows and compatibility with current systems. This feature reduces the need for large-scale reconfiguration, allowing you to focus on development rather than integration challenges.
Developer Tools and Capabilities:
ADK includes a complete set of tools designed to enhance the development process and improve efficiency. These tools provide the functionality needed to build, test, and optimize AI agents, ensuring your project is both robust and scalable.
Two-way audio and video streaming: Real-time interactive capabilities make it possible to create agents that can communicate seamlessly in multimodal environments, enhancing their ability to handle complex tasks.
UI Playground: Built-in testing and visualisation environment allows you to debug and optimise your agent locally before deployment. This feature ensures that your agent behaves as expected in real-world scenarios.
Comparison with other frameworks
How is ADK different from other AI agent frameworks?
Let’s start coding
Let us now explore, step by step, how to create the ADK + RAG + MCP app. First, we will install the libraries that support the model. For this, we run pip with a requirements file:
pip install -r requirements.txt
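The article does not list the file's contents, but a plausible requirements.txt, inferred from the imports used later in the code, would look something like this (the package names are my best guesses; pin versions as needed):

```
google-adk
litellm
python-dotenv
langchain
langchain-community
langchain-google-genai
faiss-cpu
requests
beautifulsoup4
# the mcp-youtube-search MCP server is installed separately
```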
The next step is the usual one, where we will import the relevant libraries, the significance of which will become evident as we proceed.
We initiate the code by importing the classes we need from Google ADK, LangChain, and LiteLLM.
Developing with Google ADK is a bit like playing with Lego blocks: you can pick whichever blocks you like and fully unleash your wild ideas without having to worry about the tools and underlying technologies.
LiteLLM allows developers to integrate a diverse range of LLM providers behind one interface; here it lets the agent call models served by Ollama, with support for fallbacks.
from google.adk.agents import LlmAgent
from google.adk.tools.mcp_tool.mcp_toolset import MCPToolset, StdioServerParameters
import os
import dotenv
import asyncio
import json
import sys
import codecs
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain.schema import Document
from google.adk.models.lite_llm import LiteLlm

dotenv.load_dotenv()  # load GOOGLE_API_KEY / SERP_API_KEY from a .env file

# Module-level state shared by the tools below
vectorstore = None       # FAISS index, created lazily
initialized_urls = []    # URLs already ingested into the RAG system
topics_added = []        # search topics already indexed
I created a function called create_mcp_tool_executor that returns an asynchronous executor designed to interact with an MCP tool. It connects to the MCP server by calling MCPToolset.from_server, providing StdioServerParameters configured with the given command, arguments, and environment.
Once connected, I print how many tools are available, and then I iterate through each tool, trying to run it asynchronously with the provided kwargs. I print the tool's name and preview the result.
If a tool fails, I log the error and try the next one. If none succeed, I return a fallback response with an empty "results" list. I ensure proper cleanup by using exit_stack and calling aclose() when finished.
# Create MCP tool executor
def create_mcp_tool_executor(command, args=None, env=None):
    async def mcp_tool_executor(**kwargs):
        print(f"Connecting to MCP server with command: {command}")
        # Guard against env being None, and avoid printing the key itself
        print(f"SERP_API_KEY configured: {'yes' if (env or {}).get('SERP_API_KEY') else 'no'}")
        try:
            tools, exit_stack = await MCPToolset.from_server(
                connection_params=StdioServerParameters(
                    command=command,
                    args=args or [],
                    env=env or {},
                )
            )
            print(f"Connected successfully, found {len(tools)} tools")
            try:
                print(f"Calling tool with arguments: {kwargs}")
                for tool in tools:
                    try:
                        print(f"Trying tool: {tool.name}")
                        result = await tool.run_async(args=kwargs, tool_context=None)
                        print("Tool execution successful")
                        print(f"Tool result type: {type(result)}")
                        print(f"Tool result preview: {str(result)[:200] if result else 'No result'}...")
                        return result
                    except Exception as e:
                        print(f"Tool {tool.name} failed: {e}")
                        continue
                # If we get here, all tools failed
                return {"results": []}
            finally:
                await exit_stack.aclose()
        except Exception as e:
            print(f"MCP error: {e}")
            import traceback
            traceback.print_exc()
            return {"results": []}
    return mcp_tool_executor
Then I developed a function, get_embeddings, that creates a Google Generative AI embeddings object using the "embedding-001" model and the GOOGLE_API_KEY from the environment.
Then I made an asynchronous function, ensure_rag_system, to initialise or update the RAG system. It manages a global vectorstore and a list of initialized_urls.
First, I check if the vectorstore exists; if not, I initialize it using a default document with some placeholder text. I then generate embeddings using the get_embeddings function and build a FAISS vector store from the document.
If URLs are provided, I filter out any already processed ones, then fetch content from the new URLs using the requests library. I use BeautifulSoup to extract clean text from the HTML pages and convert that into Document objects with metadata indicating the source.
I then split these documents into chunks with RecursiveCharacterTextSplitter, making them suitable for vector storage. Finally, I add the chunks to the vectorstore and update the list of initialised URLs to avoid duplication.
# Get embeddings function
def get_embeddings():
    return GoogleGenerativeAIEmbeddings(
        model="models/embedding-001",
        google_api_key=os.environ.get("GOOGLE_API_KEY")
    )

# Initialize or update RAG system
async def ensure_rag_system(urls=None):
    global vectorstore, initialized_urls
    print("Ensuring RAG system exists")
    # Initialize vectorstore if it doesn't exist
    if vectorstore is None:
        print("Initializing new vectorstore with default document")
        default_doc = Document(
            page_content="RAG System Initialization Document",
            metadata={"source": "init", "type": "system"}
        )
        print("Creating embedding model")
        embedding_model = get_embeddings()
        print("Creating vectorstore with default document")
        vectorstore = FAISS.from_documents([default_doc], embedding_model)
        print("Successfully created new vectorstore")
    # Add URLs if provided
    if urls:
        new_urls = [url for url in urls if url not in initialized_urls]
        if not new_urls:
            return "All URLs already in the RAG system."
        print(f"Adding {len(new_urls)} URLs to the RAG system")
        # Fetch and process documents
        import requests
        from bs4 import BeautifulSoup
        documents = []
        for url in new_urls:
            try:
                print(f"Fetching content from: {url}")
                response = requests.get(url, timeout=10)
                soup = BeautifulSoup(response.text, 'html.parser')
                text = soup.get_text(separator='\n', strip=True)
                documents.append(Document(page_content=text, metadata={"source": url, "type": "web"}))
            except Exception as e:
                print(f"Error fetching {url}: {e}")
        # Split documents and add to vectorstore
        text_splitter = RecursiveCharacterTextSplitter(
            chunk_size=10000,
            chunk_overlap=500,
            length_function=len,
        )
        split_documents = text_splitter.split_documents(documents)
        print(f"Adding {len(split_documents)} documents to vectorstore")
        vectorstore.add_documents(split_documents)
        initialized_urls.extend(new_urls)
        return f"Added {len(new_urls)} URLs to the RAG system."
    return "RAG system initialized."
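The retrieval step above ultimately reduces to vector similarity: FAISS ranks stored chunks by how close their embeddings are to the query embedding. Here is a toy sketch of that idea with made-up 3-dimensional vectors (real embedding-001 vectors have hundreds of dimensions, and FAISS uses optimized index structures rather than a linear scan):

```python
import math

def cosine(a, b):
    # Cosine similarity: dot product of the vectors divided by their norms
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" standing in for what the embedding model would return
docs = {
    "adk tutorial": [0.9, 0.1, 0.0],
    "cooking pasta": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # embedding of a query about ADK

# The most similar document wins, just like similarity_search(k=1)
best = max(docs, key=lambda name: cosine(query, docs[name]))
```

With these numbers, `best` is `"adk tutorial"`, since its vector points in nearly the same direction as the query.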
I developed a YouTube integration workflow that includes three main functions to search, process, and format video data. I made an asynchronous function called search_youtube, which performs a YouTube search using an MCP tool executor.
I set up the executor by calling create_mcp_tool_executor, specifying "mcp-youtube-search" as the command and passing the SERP_API_KEY through the environment.
After executing the search with the provided query and max results, I check the result's format: if it's a dictionary with a "results" key, I return it directly.
If the result has a .content attribute, I try parsing it as JSON; if parsing fails, I handle the error gracefully. Next, I made another async function, process_youtube_results, that turns the search results into a list of Document objects, each enriched with video metadata like title, channel, link, views, and description, which can later be added to a vector store for retrieval tasks.
# Process YouTube results
async def process_youtube_results(search_query, results):
    if isinstance(results, dict) and "results" in results and results["results"]:
        documents = []
        print(f"Processing {len(results['results'])} YouTube videos")
        for video in results["results"]:
            doc_content = f"""
YOUTUBE VIDEO
Title: {video.get('title', 'No Title')}
Channel: {video.get('channel', 'Unknown Channel')}
Link: {video.get('link', '')}
Description: {video.get('description', 'No description')}
Published: {video.get('published_date', 'Unknown date')}
Views: {video.get('views', 'Unknown views')}
Duration: {video.get('duration', 'Unknown duration')}
Search Query: {search_query}
"""
            documents.append(Document(
                page_content=doc_content,
                metadata={
                    "source": video.get('link', ''),
                    "title": video.get('title', 'No Title'),
                    "type": "youtube_video",
                    "search_query": search_query
                }
            ))
        return documents
    return []

# Format YouTube results
def format_youtube_videos(results):
    if isinstance(results, dict) and "results" in results and results["results"]:
        formatted_videos = []
        for video in results["results"]:
            title = video.get('title', 'No Title')
            link = video.get('link', '')
            channel = video.get('channel', 'Unknown Channel')
            description = video.get('description', 'No description')
            published_date = video.get('published_date', 'Unknown date')
            views = video.get('views', 'Unknown views')
            duration = video.get('duration', 'Unknown duration')
            formatted_video = f"* **{title}** ({link}) by {channel}: {description} ({published_date}, {views}, {duration})"
            formatted_videos.append(formatted_video)
        print(formatted_videos)
        return formatted_videos
    return []

# YouTube search function
async def search_youtube(search_query: str, max_results: int = 10):
    print(f"Searching YouTube for: {search_query}")
    youtube_search = create_mcp_tool_executor(
        command="mcp-youtube-search",
        args=[],
        env={"SERP_API_KEY": os.environ.get("SERP_API_KEY")}
    )
    result = await youtube_search(search_query=search_query, max_results=max_results)
    print(f"YouTube search returned data type: {type(result)}")
    # Handle different result formats
    if isinstance(result, dict) and "results" in result:
        return result
    elif hasattr(result, 'content') and result.content:
        try:
            content_text = result.content[0].text
            return json.loads(content_text)
        except Exception as e:
            print(f"Error parsing YouTube search result: {e}")
            return {"results": []}
    else:
        print(f"Unknown YouTube search result format: {type(result)}")
        return {"results": []}
After that, I designed a RAG workflow that connects YouTube search results and external URLs into a searchable vector store. I created the search_rag function, which performs a similarity search on the vector store using the given query.
If the store isn’t initialized, I automatically call ensure_rag_system to set up a default environment, ensuring the system never breaks on first use. Then, I built the youtube_search_tool function to fetch YouTube videos related to a search query using the MCP-powered search_youtube function.
I formatted the results for readability and converted them into structured Document objects with metadata. These documents are added to the vector store only if they haven’t already been integrated under the same topic, which I track using the topics_added list.
Finally, I created the rag_create_tool, which takes a comma-separated list of URLs, cleans them into a list, and passes them to the ensure_rag_system function to ingest their contents into the vector store.
# RAG search function
async def search_rag(query: str, vectorstore_obj: FAISS = None):
    print(f"Performing similarity search in vectorstore for: {query}")
    vs = vectorstore_obj or vectorstore
    if vs is None:
        print("RAG system not initialized, creating default RAG")
        await ensure_rag_system()
        vs = vectorstore
        if vs is None:
            print("Failed to initialize vectorstore, returning empty results")
            return []
    results = vs.similarity_search(query, k=5)
    print(f"RAG search returned {len(results)} results")
    return results

# Tool functions
async def youtube_search_tool(search_query: str):
    global vectorstore, topics_added
    print(f"YouTube search for: {search_query}")
    print(f"Current RAG status - vectorstore exists: {vectorstore is not None}")
    print(f"Topics already in RAG: {topics_added}")
    # Search YouTube
    youtube_data = await search_youtube(search_query)
    # Process results and add to RAG
    if youtube_data and isinstance(youtube_data, dict) and "results" in youtube_data:
        formatted_videos = format_youtube_videos(youtube_data)
        if formatted_videos:
            # Add to RAG system
            if vectorstore is None:
                print("Creating new vectorstore since none exists")
                await ensure_rag_system()
            if vectorstore:
                youtube_docs = await process_youtube_results(search_query, youtube_data)
                if youtube_docs:
                    print(f"RAG Integration: Adding {len(youtube_docs)} YouTube videos to vectorstore")
                    vectorstore.add_documents(youtube_docs)
                    if search_query not in topics_added:
                        topics_added.append(search_query)
                        print(f"Updated topics in RAG: {topics_added}")
            return "\n".join(formatted_videos)
    return "No videos found for that search."

async def rag_create_tool(links: str):
    url_list = [url.strip() for url in links.split(',')]
    print(f"Creating RAG from URLs: {url_list}")
    return await ensure_rag_system(url_list)
I created the research_tool function to automate the end-to-end research process using the Retrieval-Augmented Generation system. I initialized the RAG system with ensure_rag_system() to make sure the vectorstore is ready to store and retrieve documents.
I then searched YouTube for the given topic with search_youtube(), capturing the results and checking their format to ensure valid data. I processed those YouTube results into document format using process_youtube_results(), and if the vectorstore wasn't already set, I re-initialised it to avoid storage issues.
I added the YouTube documents to the vectorstore and updated the topics_added list to prevent reprocessing the same topic in the future. After enriching the knowledge base, I performed a similarity search using the search_rag() function to fetch semantically relevant documents.
I formatted the YouTube results for display with format_youtube_videos() and combined them with the knowledge base results into a well-structured Markdown response.
async def research_tool(topic: str):
    global topics_added, vectorstore
    # Step 1: Ensure RAG system is initialized
    print("Initializing RAG system for research")
    init_msg = await ensure_rag_system()
    print(f"RAG initialization: {init_msg}")
    # Step 2: Search YouTube for the topic
    print(f"Searching YouTube for: {topic}")
    youtube_data = await search_youtube(search_query=topic)
    print(f"YouTube search returned data type: {type(youtube_data)}")
    # Step 3: Add YouTube content to the RAG system
    if youtube_data and isinstance(youtube_data, dict) and "results" in youtube_data:
        youtube_docs = await process_youtube_results(topic, youtube_data)
        if youtube_docs:
            if vectorstore is None:
                print("YouTube results found but no vectorstore, creating now")
                await ensure_rag_system()
            if vectorstore:
                print(f"Adding {len(youtube_docs)} YouTube videos to RAG system")
                vectorstore.add_documents(youtube_docs)
                if topic not in topics_added:
                    topics_added.append(topic)
                    print(f"Added topic '{topic}' to RAG system. Topics now: {topics_added}")
    # Step 4: Search RAG system for the topic
    print(f"Searching RAG system for: {topic}")
    rag_results = []
    if vectorstore:
        rag_results = await search_rag(topic, vectorstore)
        print(f"RAG search returned {len(rag_results)} results")
    else:
        print("WARNING: Vectorstore is None, cannot search RAG")
    # Format YouTube results
    formatted_youtube = format_youtube_videos(youtube_data) if youtube_data and isinstance(youtube_data, dict) and "results" in youtube_data else []
    # Build response
    combined_response = f"# Research Results for: {topic}\n\n"
    if formatted_youtube:
        combined_response += "## YouTube Videos:\n"
        combined_response += "\n".join(formatted_youtube)
        combined_response += "\n\n"
    else:
        combined_response += "No YouTube videos found for this topic.\n\n"
    if rag_results:
        combined_response += "## Knowledge Base Information:\n\n"
        for i, doc in enumerate(rag_results, 1):
            source = doc.metadata.get('source', 'Unknown')
            doc_type = doc.metadata.get('type', 'unknown')
            if doc_type == 'youtube_video':
                title = doc.metadata.get('title', 'No Title')
                content = doc.page_content.replace("YOUTUBE VIDEO", "").strip()
                combined_response += f"### Video Knowledge: {title}\n\n{content[:800]}...\n\n"
            else:
                content = doc.page_content[:800]
                last_period = content.rfind('.')
                if last_period > 600:
                    content = content[:last_period+1]
                combined_response += f"### Source {i}: {source}\n\n{content}...\n\n"
    else:
        combined_response += "No relevant information found in the knowledge base.\n\n"
    combined_response += "---\n"
    combined_response += f"Research complete: Found {len(formatted_youtube)} videos and {len(rag_results)} knowledge base entries."
    return combined_response
Finally, I created an intelligent agent, multi_tool_agent, using the LlmAgent class. The model is a local gemma3:12b served by Ollama through LiteLLM (a gemini-2.0-flash option is left commented out), and the agent acts as a research assistant that combines YouTube videos with a RAG knowledge base.
I wrote strict instructions so the agent always performs a YouTube search first, then uses either rag_search_tool or the integrated research_tool depending on the topic: educational queries use research_tool, while complex ones use rag_search_tool with multiple queries.
The response format includes "Video Resources," "Knowledge Base," and "Combined Analysis." If RAG data is missing, the agent offers to create new entries using rag_create_tool. I registered all four tools and exported the setup as root_agent for ADK deployment.
# Thin wrapper so the rag_search_tool name referenced in the instructions
# exists as a registered tool (it returns the matched chunks as plain text)
async def rag_search_tool(query: str):
    results = await search_rag(query)
    return "\n\n".join(doc.page_content for doc in results)

# Create the agent
agent = LlmAgent(
    name="multi_tool_agent",
    # model="gemini-2.0-flash",
    # Ollama via LiteLLM
    model=LiteLlm(
        model="ollama/gemma3:12b",
        response_format={"type": "text"},
        force_json=False,
        temperature=0.1,
    ),
    instruction="""You are a powerful research assistant that seamlessly integrates YouTube videos with RAG-based knowledge. For EVERY query, you MUST use both YouTube and the RAG knowledge base to provide comprehensive answers.
When a user asks about any topic, follow this exact process:
1. First, search YouTube using the youtube_search_tool to find relevant videos
2. Then, ALWAYS use either rag_search_tool or research_tool to search your knowledge base
3. Combine both sources of information in your response
This 2-step process is MANDATORY for ALL queries - never skip the RAG step. Your most important feature is the ability to combine video content with your knowledge base.
Specific tool usage instructions:
- For general queries: First use youtube_search_tool, then use rag_search_tool
- For educational topics: Use the research_tool which automatically combines YouTube and RAG
- For detailed analysis: First use youtube_search_tool, then use rag_search_tool with multiple queries
Your responses must ALWAYS include:
1. A "Video Resources" section with the most relevant YouTube videos
2. A "Knowledge Base" section with information from your RAG system
3. A "Combined Analysis" section that synthesizes insights from both sources
If the RAG system doesn't contain relevant information, you must say "RAG knowledge is limited on this topic" and offer to create a new RAG entry using rag_create_tool with relevant URLs.
Remember: Your primary value comes from COMBINING video tutorials with your growing knowledge base. Every response must leverage both systems - this dual-source approach is what makes you unique and powerful.
Technical details you should know:
- Your RAG system automatically indexes all YouTube videos you search for
- The topics_added list tracks which topics have been added to your knowledge base
- You can check if information exists in your knowledge base using rag_search_tool
""",
    tools=[youtube_search_tool, rag_create_tool, rag_search_tool, research_tool],
)
# Export for ADK
root_agent = agent
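With root_agent exported, the module slots into the standard ADK project layout and is launched from the CLI. The directory and file names below are assumptions; use whatever your package is actually called:

```
parent_folder/
└── multi_tool_agent/
    ├── __init__.py   # contains: from . import agent
    └── agent.py      # the code above, ending with root_agent = agent
```

Then, from parent_folder, run `adk run multi_tool_agent` for a terminal chat session, or `adk web` for the browser playground.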
Summary:
Integrating MCP and RAG with Google ADK is a cutting-edge approach that transforms an LLM into a tool-driven agent. This configuration is a strong foundation as the future of AI development accelerates the transition from simple prompt execution to "actionable AI." With integration into the Google ecosystem also progressing, now is the time for developers to learn ADK + MCP + RAG.
If this article might be helpful to your friends, please forward it to them.
🧙♂️ I am a Generative AI expert! If you want to collaborate on a project, drop an inquiry here or book a 1-on-1 Consulting Call With Me.
I would highly appreciate it if you
- ❣ Join my Patreon: https://www.patreon.com/GaoDalie_AI
- Book an Appointment with me: https://topmate.io/gaodalie_ai
- Support the Content (every dollar goes back into the video): https://buymeacoffee.com/gaodalie98d
- Subscribe to the Newsletter for free: https://substack.com/@gaodalie