In this video, I have a super quick tutorial showing you how to build a powerful multi-agent chatbot with LangGraph, MCP, and GPT-5 that you can use for business or personal projects.
In 2025, the evolution of AI has entered a new stage. OpenAI's long-awaited next-generation model, GPT-5, is a "super assistant" that overturns conventional wisdom.
This new model, which OpenAI touts as "the smartest, fastest, and most useful model on the market," goes beyond simply improving performance and has the potential to revolutionize the very way we interact with AI.
GPT-3 felt like talking to a high school student - there were great moments and frustrating ones, but people still found value in using it. GPT-4 felt closer to speaking with a college student, showing signs of real intelligence and practical utility. With GPT-5, it feels like you're talking to an expert - a PhD-level specialist who can provide on-demand support in virtually any area you need.
An MCP server is like a "window" that exposes the external functions (tools) a user wants to use in one place. For example, a single server can handle multiple tools: searching local files, retrieving information from YouTube, or querying weather APIs.
Because it integrates flexibly with AI tools like Cursor, it is becoming an indispensable feature for developers and creators.
So, let me give you a quick demo of a live chatbot to show you what I mean.
I will ask the chatbot two different questions. The first one is: "Summarise this YouTube video in 50 words. Here is the video link: https://www.youtube.com/watch?v=2f3K43FHRKo." Feel free to ask any questions you want.
If you look at how the chatbot generates the output, you will see that the agent uses the YouTube transcript server, which parses the video ID from the URL and fetches the transcript with the YouTube Transcript API. For the second question, I asked, "What is FastMCP?" The agent uses the Tavily search MCP server to pull data from the web and formats the response to return only the results array.
What's so great about it?
Just 20 minutes into the press conference, we concluded that this was "AI's moon landing moment."
GPT-5 is not just an iteration of GPT-4; GPT-5 is a true paradigm shift in intelligence!
1. Fusion Model
GPT-5 is an integrated model: there is no longer any need to switch models manually, because GPT-5 decides on its own when it needs to think more deeply. It is also the new default option for all logged-in users.
2. Say goodbye to hallucinations
Topping the benchmark charts the moment a large model is released is no longer the most important thing. What really matters is practicality.
GPT-5's accuracy has been improved to an epic degree, and one can almost say, "Say goodbye to model hallucinations, starting with GPT-5."
3. It has also gotten better at writing (and is more emotional)
In a text demo, it generated eulogies with more rhythm and emotion than GPT-4o. This means it not only has a higher IQ but also a higher EQ (emotional intelligence), responding in a more human-like and heartfelt way than 4o.
4. A huge leap in programming capabilities
In ChatGPT, GPT-5 can explain complex concepts like the Bernoulli effect through interactive content and generate hundreds of lines of code in minutes. It is very good at writing software.
It created a web app for learning French in just a few minutes. (GPT-5 has personality, too; it named the French-learning website "Midnight in Paris.")
For developers, the API offers the strongest vibe-coding capabilities yet.
OpenAI's president, Greg Brockman, had a lot of fun playing the games GPT-5 generated on the spot during the press conference.
5. Improved memory and personalization
• Its memory is becoming smarter, allowing it to give more tailored answers. It will be able to link with Gmail and Google Calendar (starting next week), tell you when your schedule is free, and remind you which emails you forgot to reply to!
AI is finally becoming a full-fledged "everyday support partner."
6. A formal step into healthcare
Altman also invited a cancer patient to talk about how she used ChatGPT to help fight the disease.
The almost scary part is that GPT-5 not only helps patients understand complex pathology reports, but also provides correct treatment recommendations.
The patient's husband said with emotion that GPT-5 is fully capable of understanding "the problem behind the problem" and is quite professional.
Let's start coding
Let us now explore, step by step, how to create the LangGraph & MCP application. First, we install the libraries that support the model with a pip install:
pip install -r requirements.txt
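The requirements file's contents aren't shown, but based on the imports used below, a requirements.txt along these lines should cover everything (package names are the standard PyPI ones; pin versions as needed):
mcp
httpx
youtube-transcript-api
langchain-mcp-adapters
langgraph
langchain-core
langchain-openai
python-dotenv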
The next step is the usual one, where we will import the relevant libraries, the significance of which will become evident as we proceed.
We initiate the code by importing classes from the following libraries:
youtube_transcript_api: a Python API that lets you fetch the transcripts/subtitles for a given YouTube video.
langchain_mcp_adapters: converts MCP tools into LangChain tools for use with the LangGraph agent, and provides a client implementation that lets you connect to and load tools from multiple MCP servers.
# Server-side imports: the FastMCP framework plus the libraries the tools use
from mcp.server.fastmcp import FastMCP
import os
import httpx
from youtube_transcript_api import YouTubeTranscriptApi
# Client-side imports: the MCP client, the LangGraph ReAct agent, and GPT-5
import asyncio
from langchain_mcp_adapters.client import MultiServerMCPClient
from langgraph.prebuilt import create_react_agent
from langchain_core.messages import SystemMessage, HumanMessage
from langchain_openai import ChatOpenAI
Math MCP File
Let's define two functions: add(), which takes two integers and returns their sum, and multiply(), which takes two integers and returns their product. The @mcp.tool() decorator automatically registers these functions as tools that GPT-5 can call, handling protocol communication, argument validation, and response formatting behind the scenes. When run as the main module, the script starts the server using the stdio transport.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("Math")

@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two integers and return their sum."""
    return a + b

@mcp.tool()
def multiply(a: int, b: int) -> int:
    """Multiply two integers and return their product."""
    return a * b

if __name__ == "__main__":
    mcp.run(transport="stdio")
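Before wiring this into the agent, you can sanity-check the server directly with the MCP Python SDK's stdio client. The following is a minimal sketch, assuming the file above is saved as servers/math.py (the path the agent file uses later):

import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    # Spawn the math server as a subprocess and talk to it over stdio
    params = StdioServerParameters(command="python", args=["servers/math.py"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool("add", {"a": 2, "b": 3})
            print(result.content[0].text)  # -> 5

asyncio.run(main())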
Tavily MCP File
Next, we created an MCP server that integrates Tavily's web-search functionality. It loads the Tavily API key from environment variables and defines an async search_tavily function that makes HTTP requests to Tavily's API with configurable search parameters: basic depth, a maximum of 5 results, a 3-day timeframe, inclusion of answers, and exclusion of raw content and images.
The server handles various error cases, including missing API keys, network errors, and HTTP errors. This functionality is exposed through the @mcp.tool()-decorated get_tavily_results function, which formats the response to return only the results array. It runs the server on the stdio transport, enabling MCP clients to perform web searches through Tavily's service.
import os
import sys
import httpx
from dotenv import load_dotenv
from mcp.server.fastmcp import FastMCP

# Load environment variables from .env
load_dotenv()

# Initialize FastMCP server with name "tavily_search"
mcp = FastMCP("tavily_search")

# Tavily API details
TAVILY_API_KEY = os.getenv("TAVILY_API_KEY")
TAVILY_SEARCH_URL = "https://api.tavily.com/search"

async def search_tavily(query: str) -> dict:
    """Performs a Tavily web search and returns up to 5 results."""
    if not TAVILY_API_KEY:
        return {"error": "Tavily API key is missing. Please set TAVILY_API_KEY in your .env file."}
    payload = {
        "query": query,
        "topic": "general",
        "search_depth": "basic",
        "chunks_per_source": 3,
        "max_results": 5,
        "time_range": None,
        "days": 3,
        "include_answer": True,
        "include_raw_content": False,
        "include_images": False,
        "include_image_descriptions": False,
        "include_domains": [],
        "exclude_domains": []
    }
    headers = {
        "Authorization": f"Bearer {TAVILY_API_KEY}",
        "Content-Type": "application/json"
    }
    try:
        async with httpx.AsyncClient(timeout=30.0) as client:
            response = await client.post(TAVILY_SEARCH_URL, json=payload, headers=headers)
            response.raise_for_status()
            return response.json()
    except httpx.RequestError as e:
        return {"error": f"Network error: {str(e)}"}
    except httpx.HTTPStatusError as e:
        return {"error": f"HTTP error {e.response.status_code}: {e.response.text}"}
    except Exception as e:
        return {"error": f"Unexpected error: {str(e)}"}

@mcp.tool()
async def get_tavily_results(query: str):
    """Fetches Tavily search results for a given query."""
    results = await search_tavily(query)
    if isinstance(results, dict):
        return {"results": results.get("results", [])}
    else:
        return {"error": "Unexpected Tavily response format"}

if __name__ == "__main__":
    # Log to stderr: stdout is reserved for the MCP stdio protocol
    print("Tavily MCP server is running and waiting for requests...", file=sys.stderr)
    mcp.run(transport="stdio")
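Both this server and the agent below read their secrets from a .env file in the project root. A minimal example with placeholder values might look like this (OPENAI_API_KEY is consumed later by the GPT-5 agent):
TAVILY_API_KEY=tvly-your-tavily-key
OPENAI_API_KEY=sk-your-openai-key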
YouTube Transcript File
After that, we built an MCP server for YouTube transcript extraction. It uses regex to parse YouTube video IDs from URLs and the YouTube Transcript API to fetch transcripts - first listing available transcripts, then finding and retrieving the English version.
The server concatenates all transcript entries into a single text string and includes extensive debug logging to stderr for troubleshooting, which is crucial since MCP servers run in isolation.
It also provides a test tool for connectivity verification and handles errors gracefully by returning structured error responses. This allows MCP clients to reliably extract and work with YouTube video transcripts programmatically.
import re
import sys
from mcp.server.fastmcp import FastMCP
from youtube_transcript_api import YouTubeTranscriptApi

mcp = FastMCP("youtube_transcript")

@mcp.tool()
def test_tool(message: str) -> dict:
    """A simple test tool that always works."""
    print(f"DEBUG: Test tool called with: {message}", file=sys.stderr)
    return {"result": f"Test successful: {message}"}

@mcp.tool()
def get_youtube_transcript(url: str) -> dict:
    """Fetches transcript from a given YouTube URL."""
    print(f"DEBUG: YouTube tool called with: {url}", file=sys.stderr)
    try:
        video_id_match = re.search(r"(?:v=|\/)([0-9A-Za-z_-]{11}).*", url)
        if not video_id_match:
            return {"error": "Invalid YouTube URL"}
        video_id = video_id_match.group(1)
        print(f"DEBUG: Video ID: {video_id}", file=sys.stderr)
        api = YouTubeTranscriptApi()
        transcript_list = api.list(video_id)
        transcript = transcript_list.find_transcript(['en'])
        transcript_data = transcript.fetch()
        transcript_text = "\n".join([entry.text for entry in transcript_data])
        return {"transcript": transcript_text}
    except Exception as e:
        print(f"DEBUG: Exception: {str(e)}", file=sys.stderr)
        return {"error": str(e)}

if __name__ == "__main__":
    print("DEBUG: Server starting", file=sys.stderr)
    mcp.run(transport="stdio")
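As a quick sanity check of the ID-parsing regex, here is what it extracts from the demo URL we used earlier:

import re

url = "https://www.youtube.com/watch?v=2f3K43FHRKo"
match = re.search(r"(?:v=|\/)([0-9A-Za-z_-]{11}).*", url)
# The alternation skips ahead to "v=" and captures the next 11 ID characters
print(match.group(1))  # -> 2f3K43FHRKo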
Agent File
Finally, we developed a multi-tool AI agent using LangGraph that orchestrates multiple MCP servers. It sets up a Multi-Server MCP Client to manage three separate MCP servers (YouTube transcript, Tavily search, and math tools) running as subprocesses via stdio transport.
The client loads all available tools from each server into a unified toolset, with error handling for failed servers. It then creates a ReAct agent using GPT-5 that can reason about which tools to use and call them sequentially.
A system message instructs the agent about its available capabilities. The agent processes user queries by deciding which tools to invoke based on the request context, and returns the final response after all tool interactions are complete - essentially creating an intelligent assistant that can extract YouTube transcripts, perform web searches, and do mathematical calculations as needed.
import asyncio
from langchain_mcp_adapters.client import MultiServerMCPClient
from langgraph.prebuilt import create_react_agent
from langchain_core.messages import SystemMessage, HumanMessage
from langchain_openai import ChatOpenAI
from dotenv import load_dotenv

load_dotenv()

query = input("Query: ")
model = ChatOpenAI(model="gpt-5")

async def run_agent():
    client = MultiServerMCPClient({
        "youtube_transcript": {
            "command": "python",
            "args": ["servers/yt_transcript.py"],
            "transport": "stdio",
        },
        "tavily": {
            "command": "python",
            "args": ["servers/tavily.py"],
            "transport": "stdio",
        },
        "math": {
            "command": "python",
            "args": ["servers/math.py"],
            "transport": "stdio",
        }
    })
    # Load all tools, one server at a time, so a failing server doesn't block the rest.
    # get_tools() binds each tool to its server config and opens a fresh session per
    # call, so the tools remain usable after loading.
    tools = []
    for server_name in ["youtube_transcript", "tavily", "math"]:
        try:
            server_tools = await client.get_tools(server_name=server_name)
            tools.extend(server_tools)
        except Exception as e:
            print(f"✗ {server_name}: {e}")
    # Create agent with all tools
    agent = create_react_agent(model, tools)
    system_message = SystemMessage(content=(
        "You have access to tools including get_youtube_transcript for YouTube URLs. "
        "Use the appropriate tools based on the user's request."
    ))
    agent_response = await agent.ainvoke({
        "messages": [system_message, HumanMessage(content=query)]
    })
    return agent_response["messages"][-1].content

if __name__ == "__main__":
    response = asyncio.run(run_agent())
    print("\nFinal Response:", response)
Conclusion
Personally, I feel that GPT-5 is an update that greatly improves practicality rather than chasing numerical breakthroughs. Claude 3 and Opus 4 also received little attention immediately after their releases, but their reputations have grown over time.
Similarly, I believe that the more we use GPT-5, the more its true value will become apparent. Even now, you can clearly sense the creativity unique to OpenAI's models and the distinctive nuances seen in the o3 series.
GPT-5 is also extremely easy to use, and if it matches the agentic capabilities demonstrated by Claude Opus, it is a model that will only become more exciting to watch.
I would highly appreciate it if you:
❣ Join my Patreon: https://www.patreon.com/GaoDalie_AI
Book an Appointment with me: https://topmate.io/gaodalie_ai
Support the Content (every dollar goes back into the video): https://buymeacoffee.com/gaodalie98d
Subscribe to the Newsletter for free: https://substack.com/@gaodalie