Ahmad Ragab

Building a Model Context Protocol Server using Jina.ai and FastMCP in Python

In this post, we'll discuss the Model Context Protocol and why it might be important, then walk through building an MCP server that talks to Jina.ai and adds web search and fact-checking functionality to Claude Desktop using Python and FastMCP.

The Model Context Protocol

Anthropic announced it around Thanksgiving last year. It garnered some attention, but arguably less than it deserves, considering it could be a pivotal stepping stone in developing the next layer of the AI software stack.

What

The Model Context Protocol (MCP) is a standardized communication protocol designed specifically for large language models (LLMs) and other Gen AI tools.

Think of it as the "HTTP of Gen AI": just as HTTP standardized how web browsers communicate with web servers, MCP standardizes how LLM applications communicate with tools and data sources.

Why Do We Need MCP?

The current landscape of LLM development faces several hurdles:

  1. Tool Integration Complexity: Each LLM service (like OpenAI, Anthropic, etc.) has its own way of implementing tool calls and function calling, making it complex to build portable tools.

  2. Context Management: LLMs need access to various data sources and tools, but managing this access securely and efficiently has been challenging.

  3. Standardization: Without a standard protocol, developers must rebuild integration layers for each LLM platform they want to support.

MCP solves these challenges by providing:

  • A standardized way to expose tools and data to LLMs
  • A secure client-server architecture
  • A consistent interface regardless of the underlying LLM

How Does MCP Work?

MCP follows a client-server architecture with three main components:

  1. MCP Server: A service that exposes:

    • Tools (functions that LLMs can call)
    • Resources (data sources)
    • Prompts (templated instructions)
    • Context (dynamic information)
  2. MCP Client: The application that connects to MCP servers and manages communication between the LLM and the servers. Client support is in its early stages: only a handful of tools implement any part of the protocol specification so far, and some functionality is not yet supported by any client.

Feature Support Matrix for Clients

And, of course, the LLM...

Workflow

  1. An MCP server registers its capabilities (tools, resources, etc.)
  2. A client connects to the server or launches it as a child process
  3. The LLM can then use these capabilities through a standardized interface
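
To make the workflow concrete, here is a toy, in-process sketch of the three steps. MCP messages are JSON-RPC 2.0, and `tools/list` and `tools/call` are method names from the specification; everything else (the `ToyServer` class, its `register` and `handle` methods, and the trimmed-down result shapes) is hypothetical and leaves out initialization, capability negotiation, input schemas, and error handling.

```python
import json
from typing import Any, Callable


class ToyServer:
    """Hypothetical stand-in for an MCP server; this is not the real SDK."""

    def __init__(self) -> None:
        self._tools: dict[str, Callable[..., Any]] = {}

    def register(self, fn: Callable[..., Any]) -> Callable[..., Any]:
        # Step 1: the server registers a capability (here, a tool).
        self._tools[fn.__name__] = fn
        return fn

    def handle(self, raw: str) -> str:
        # Steps 2 and 3: the client sends JSON-RPC requests on the LLM's behalf.
        request = json.loads(raw)
        method = request["method"]
        if method == "tools/list":
            tools = [
                {"name": name, "description": fn.__doc__ or ""}
                for name, fn in self._tools.items()
            ]
            result: Any = {"tools": tools}
        elif method == "tools/call":
            params = request["params"]
            value = self._tools[params["name"]](**params["arguments"])
            result = {"content": [{"type": "text", "text": str(value)}]}
        else:
            raise ValueError(f"unsupported method: {method}")
        return json.dumps({"jsonrpc": "2.0", "id": request["id"], "result": result})


server = ToyServer()


@server.register
def add(a: int, b: int) -> int:
    """Add two numbers"""
    return a + b


list_request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}
call_request = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {"name": "add", "arguments": {"a": 2, "b": 3}},
}
listing = json.loads(server.handle(json.dumps(list_request)))
call = json.loads(server.handle(json.dumps(call_request)))
print(listing["result"]["tools"][0]["name"])
print(call["result"]["content"][0]["text"])
```

A real server and client exchange the same kind of messages over one of the transports described next, rather than through a direct method call.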

The Transport Protocols

  • The protocol supports multiple transport mechanisms
    • SSE (Server-Sent Events)
      • Communicates over HTTP: the server streams messages to the client over SSE, and the client sends its messages as HTTP POST requests. The server process is isolated from the client
    • Stdio (Standard Input/Output)
      • Communicates over standard input/output pipes; the server process is essentially a child process of the client
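
As a rough illustration of the stdio transport, the sketch below reads newline-delimited JSON-RPC messages from one stream and writes replies to another, which is essentially what a child-process server does with its stdin and stdout. The `serve` function is hypothetical and only answers the `ping` method; in-memory `io.StringIO` streams stand in for the real pipes so the example runs on its own.

```python
import io
import json
from typing import TextIO


def serve(stdin: TextIO, stdout: TextIO) -> None:
    """Answer one JSON-RPC message per line until the input stream ends."""
    for line in stdin:
        line = line.strip()
        if not line:
            continue
        request = json.loads(line)
        if request.get("method") == "ping":
            reply = {"jsonrpc": "2.0", "id": request["id"], "result": {}}
        else:
            # -32601 is the standard JSON-RPC "Method not found" error code.
            reply = {
                "jsonrpc": "2.0",
                "id": request.get("id"),
                "error": {"code": -32601, "message": "Method not found"},
            }
        stdout.write(json.dumps(reply) + "\n")
        stdout.flush()


# In-memory streams stand in for the pipes the client would attach.
fake_stdin = io.StringIO(
    json.dumps({"jsonrpc": "2.0", "id": 1, "method": "ping"}) + "\n"
    + json.dumps({"jsonrpc": "2.0", "id": 2, "method": "nope"}) + "\n"
)
fake_stdout = io.StringIO()
serve(fake_stdin, fake_stdout)
replies = [json.loads(reply) for reply in fake_stdout.getvalue().splitlines()]
print(replies)
```

In a real child process the call would be `serve(sys.stdin, sys.stdout)`, and anything written to stdout that is not a protocol message would corrupt the stream, so diagnostics belong on stderr.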

Security

The security situation is more nuanced. Servers using the stdio transport are typically colocated with the client, so API keys are not necessarily exposed to the internet. Still, they do seem to get passed around fairly casually, IMO.

These keys need to be loaded into the client when the server starts so they can be passed to the child process, and they even appeared in the desktop app logs, which was...concerning.

The widespread use of API keys is a broader issue affecting Gen AI services, platforms, and tooling. Companies like Okta and Auth0 are working on a solution to manage and authorize Gen AIs without exclusively relying on keys.

SDKs

Anthropic officially supports low-level SDKs for TypeScript, Python, and Kotlin. Some recently created higher-level wrappers already cover much of the boilerplate and add other nice features, such as a packaged CLI for debugging, inspecting, and installing servers on the client, which makes developing MCP servers easier.

Getting Started with FastMCP

jlowin / fastmcp

The fast, Pythonic way to build Model Context Protocol servers 🚀

FastMCP 🚀

The fast, Pythonic way to build MCP servers.

Model Context Protocol (MCP) servers are a new, standardized way to provide context and tools to your LLMs, and FastMCP makes building MCP servers simple and intuitive. Create tools, expose resources, and define prompts with clean, Pythonic code:

# demo.py

from fastmcp import FastMCP


mcp = FastMCP("Demo 🚀")


@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers"""
    return a + b

That's it! Give Claude access to the server by running:

fastmcp install demo.py

FastMCP handles all the complex protocol details and server management, so you can focus on building great tools. It's designed to be high-level and Pythonic - in most cases, decorating a function is all you need.

Key features:

  • Fast: High-level interface means less code and faster development
  • Simple…

FastMCP is one such framework. We'll now explore how to create an almost practical tool for reading websites, answering search queries through the web, and fact-checking information. We will be using Jina.ai.

It is a very slick service that provides a "Search Foundation platform" that combines "Embeddings, Rerankers, and Small Language Models" to aid businesses in building Gen AI and Multimodal search experiences.

Prerequisites

  • uv

You will need uv installed. It is the recommended way to create and manage Python projects. It's part of a relatively recent but exciting Python toolchain from Astral (astral.sh). I recommend you check it out.

uv aims to be a one-stop shop for managing projects, dependencies, virtual environments, and Python versions, and for executing Python scripts and modules (linting is handled by its sibling tool, Ruff). It's written in Rust. Do with that information what you will 😏.

  • Claude Desktop App

You will also need to install the Claude Desktop App. For our purposes, the Claude Desktop App will serve as the MCP Client and is a key target client for Anthropic.

ASRagab / mcp-jinaai-reader

Model Context Protocol (MCP) Server for the Jina.ai Reader API




Project Setup

Using uv you can initialize a project with:

uv init mcp-jinaai-reader --python 3.11

This will create a folder called mcp-jinaai-reader and a .python-version along with a pyproject.toml.

cd mcp-jinaai-reader
uv venv 

This will create a virtual env corresponding to the python version we chose.

After creating the environment, it will provide instructions on how to activate it for the session.

source .venv/bin/activate

Add a src directory and install the one dependency we need:

uv add fastmcp

Create a .env file at the project root and add your JINAAI_API_KEY to the file. You can obtain one for free by signing up at Jina. Generally, any API keys or other env variables your server needs to run will go in this file.

JINAAI_API_KEY=jina_*************

In the src directory, create a server.py file...and you should be ready to code.
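
When running the server outside of the FastMCP tooling, the variables in .env have to reach the process some other way. The sketch below is a dependency-free illustration of reading simple KEY=VALUE lines; `parse_env` is a hypothetical helper, and a library such as python-dotenv is the more usual choice. It only handles plain assignments, blank lines, and `#` comments.

```python
import os


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Sample content; a real run would read it with Path(".env").read_text().
sample = "# local secrets\nJINAAI_API_KEY=jina_example\n"
for key, value in parse_env(sample).items():
    # setdefault keeps any value already present in the real environment.
    os.environ.setdefault(key, value)
```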

Server Code

from fastmcp import FastMCP
import httpx
from urllib.parse import urlparse
import os

Starting with the imports: httpx is the library we use here to make HTTP requests. The urlparse function helps us determine whether a string is possibly a valid URL.

# Initialize the MCP server
mcp = FastMCP("search", dependencies=["uvicorn"])

This initializes the server; the first argument is the server's name. I am not 100% sure why uvicorn needs to be explicitly added as a dependency here, since it is a transitive dependency of FastMCP, but it does seem to be required.

It's possibly due to how the fastmcp CLI (more on that shortly) installs the server. If you have other dependencies, you must add them here so the client knows to install them before running the server; we will see how that works in a moment.

# Configuration
JINAAI_SEARCH_URL = "https://s.jina.ai/"
JINAAI_READER_URL = "https://r.jina.ai/"
JINAAI_GROUNDING_URL = "https://g.jina.ai/"
JINAAI_API_KEY = os.getenv("JINAAI_API_KEY")

You can probably suss out the pattern here: Jina uses different subdomains to route particular requests. The search endpoint expects a query, the reader endpoint expects a URL, and the grounding endpoint takes a statement and returns a fact-check result that the LLM can use as a specific response or answer.
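
Since the query or URL is appended straight onto the endpoint, free-text searches containing characters such as `?` or `#` can be misread as part of the URL structure. The sketch below shows one way to guard against that with `urllib.parse.quote`; `build_jina_url` is a hypothetical helper that the server code in this post does not use, and it assumes the search endpoint accepts a percent-encoded query in the path.

```python
from urllib.parse import quote

JINAAI_SEARCH_URL = "https://s.jina.ai/"
JINAAI_READER_URL = "https://r.jina.ai/"


def build_jina_url(query_or_url: str, is_url: bool) -> str:
    """Hypothetical helper: pick the endpoint and escape free-text queries."""
    if is_url:
        # The reader endpoint takes the target URL appended as-is.
        return f"{JINAAI_READER_URL}{query_or_url}"
    # Percent-encode the search text so '?', '#', and spaces stay in the path.
    return f"{JINAAI_SEARCH_URL}{quote(query_or_url, safe='')}"


reader_url = build_jina_url("https://example.com/post", is_url=True)
search_url = build_jina_url("what is MCP?", is_url=False)
print(reader_url)
print(search_url)
```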

Grounding is a much larger topic and is used with other techniques, such as RAG and fine-tuning, to assist LLMs in reducing hallucinations and improving decision-making.

Our first tool

@mcp.tool()
async def read(query_or_url: str) -> str:
    """
    Read content from a URL or perform a search query.
    """
    try:
        if not JINAAI_API_KEY:
            return "JINAAI_API_KEY environment variable is not set"
        headers = {
            "Authorization": f"Bearer {JINAAI_API_KEY}",
            "X-Retain-Images": "none",
            "X-Timeout": "20",
            "X-Locale": "en-US",
        }

        # httpx defaults to a 5-second timeout, shorter than the 20 seconds
        # requested through X-Timeout, so allow the request more time.
        async with httpx.AsyncClient(timeout=30.0) as client:
            if is_valid_url(query_or_url):
                # URLs go to the reader endpoint, with a summary of page links
                headers["X-With-Links-Summary"] = "true"
                url = f"{JINAAI_READER_URL}{query_or_url}"
            else:
                # Anything else is treated as a search query
                url = f"{JINAAI_SEARCH_URL}{query_or_url}"
            response = await client.get(url, headers=headers)
            return response.text
    except Exception as e:
        # Return the error as text so the LLM can see what went wrong
        return str(e)

The @mcp.tool decorator does a lot of the heavy lifting. Similar decorators for resources and prompts exist in the library. It inspects the function signature and return type to create an input and output schema that the LLM uses to call the tool, configures the tool so the client understands the server's capabilities, and registers the function as the handler for that tool.
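
To give a feel for what "extracting the details of the function signature" can mean, here is a simplified sketch that derives a tool description from type hints and a docstring using only the standard library. It is an illustration of the general idea, not FastMCP's actual implementation; `describe_tool` and the type mapping are hypothetical, and the `read` function here is a stub with the same signature as the tool above.

```python
import inspect
from typing import Any, Callable, get_type_hints

# Simplified mapping from Python types to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def describe_tool(fn: Callable[..., Any]) -> dict[str, Any]:
    """Build a tool description from a function's signature and docstring."""
    hints = get_type_hints(fn)
    properties: dict[str, Any] = {}
    required: list[str] = []
    for name, param in inspect.signature(fn).parameters.items():
        # Unknown or missing annotations fall back to "string" in this sketch.
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "inputSchema": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


async def read(query_or_url: str) -> str:
    """Read content from a URL or perform a search query."""
    return ""


schema = describe_tool(read)
print(schema)
```

This is also why type hints and a clear docstring matter: they are the raw material for what the client and the LLM get to see about the tool.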

Next, you'll notice that the function is async. No runtime configuration is needed, and no asyncio.run stuff either. If, for some reason, you need to run the server as a standalone service, you do need to handle some of this yourself. There is an example in the FastMCP repo showing how to do this.

The function body is reasonably uninteresting; it validates whether it is receiving a URL, sets the appropriate headers, calls the Jina endpoint, and returns the text.

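def is_valid_url(url: str) -> bool:
    """
    Validate if the given string is a proper URL.
    """
    try:
        result = urlparse(url)
        # Require an http(s) scheme and a non-empty host
        return all([result.scheme in ("http", "https"), result.netloc])
    except ValueError:
        # urlparse raises ValueError for malformed input
        return False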

Here is the rather simple function for determining the validity of a possible url string.
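
A few sample inputs make the behavior of this check easier to see. The snippet below repeats the function so it runs on its own (catching `ValueError`, which is what `urlparse` raises on malformed input); the candidate strings are made up for illustration.

```python
from urllib.parse import urlparse


def is_valid_url(url: str) -> bool:
    """Require an http(s) scheme and a non-empty host."""
    try:
        result = urlparse(url)
        return all([result.scheme in ("http", "https"), result.netloc])
    except ValueError:
        return False


candidates = (
    "https://jina.ai/news",   # scheme and host present
    "jina.ai/news",           # no scheme, so it is treated as a search query
    "ftp://example.com",      # scheme other than http(s)
    "latest MCP news",        # plain text
)
for candidate in candidates:
    print(candidate, is_valid_url(candidate))
```

Note that a bare domain such as "jina.ai/news" fails the check, so the read tool would send it to the search endpoint rather than the reader.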

Second Tool

@mcp.tool()
async def fact_check(query: str) -> str:
    """
    Perform a fact-checking query.
    """
    try:
        if not JINAAI_API_KEY:
            return "JINAAI_API_KEY environment variable is not set"

        # Ask for JSON so the result can be parsed below
        headers = {
            "Authorization": f"Bearer {JINAAI_API_KEY}",
            "Accept": "application/json",
        }

        # Allow more than httpx's 5-second default timeout
        async with httpx.AsyncClient(timeout=30.0) as client:
            url = f"{JINAAI_GROUNDING_URL}{query}"
            response = await client.get(url, headers=headers)
            res = response.json()
            if res["code"] != 200:
                return "Failed to fetch fact-check result"
            return res["data"]["reason"]
    except Exception as e:
        # Return the error as text so the LLM can see what went wrong
        return str(e)

And that's it...
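
The response handling in fact_check can be pulled into a small pure function, which makes it easy to exercise without calling the API. This is a sketch: `extract_reason` is a hypothetical helper, and the sample payloads contain only the two fields the tool reads (`code` and `data.reason`), not the full response the service returns.

```python
from typing import Any


def extract_reason(payload: dict[str, Any]) -> str:
    """Return the explanation from a grounding response, or a failure note."""
    if payload.get("code") != 200:
        return "Failed to fetch fact-check result"
    data = payload.get("data") or {}
    return data.get("reason", "No reason provided")


# Hypothetical payloads limited to the fields the tool actually reads.
ok = {"code": 200, "data": {"reason": "The claim matches the cited sources."}}
failed = {"code": 402, "data": None}
print(extract_reason(ok))
print(extract_reason(failed))
```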

Testing and Debugging

fastmcp dev src/server.py --with-editable .

Running the above command will start the MCP Inspector, a tool the SDK provides for testing and debugging server responses. The --with-editable flag allows you to make changes to the server without having to relaunch the inspector (highly, HIGHLY recommended).

You should see:

🔍 MCP Inspector is up and running at http://localhost:5173 🚀

By default, the inspector runs on port 5173 and the server (the code you just wrote) runs on port 3000. You can change these by setting the CLIENT_PORT and SERVER_PORT environment variables before invocation.

SERVER_PORT=3000 fastmcp dev src/server.py

The Inspector

Screenshot of the inspector tool

If all goes well, you should see something like the above. On the left, you can add the environment variables you'll need; here, JINAAI_API_KEY is the only one.

If you click on Tools in the top menu bar and then List Tools, you should see the tools we created. Notice that the docstring serves as the description for the tool.

Clicking on a particular tool will bring up the text boxes for entering the parameters needed to call the tool.

Installing the Server

After you are satisfied things are working as expected, you are now ready to install the server on the Claude Desktop App client.

fastmcp install src/server.py -f .env

This command installs the server on the Claude Desktop App. I am sure other clients will be supported in the future, but for now, this is all you need to do. The -f .env flag passes the env variables to the app client.

Under the hood, this updates claude_desktop_config.json with the command and arguments needed to run the server. By default, this uses uv, which must be available on your PATH.
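
For orientation, the snippet below builds and prints roughly the kind of entry that ends up in claude_desktop_config.json. The `mcpServers` key with `command`, `args`, and `env` fields is the shape Claude Desktop reads; the specific arguments and the file path are assumptions for illustration and will differ from what the installer writes on your machine.

```python
import json

# Hypothetical entry; the path and exact arguments are placeholders.
entry = {
    "mcpServers": {
        "search": {
            "command": "uv",
            "args": [
                "run",
                "--with", "fastmcp",
                "--with", "uvicorn",
                "fastmcp", "run",
                "/path/to/mcp-jinaai-reader/src/server.py",
            ],
            # Values from the .env file land here in plain text.
            "env": {"JINAAI_API_KEY": "jina_*************"},
        }
    }
}
print(json.dumps(entry, indent=2))
```

The plain-text `env` block is the reason the API key shows up in the app's settings, as noted in the Security section.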

If you now open the Claude Desktop App, go to the menu bar, click Claude > Settings, and then click Developer, you should see the name you set when initializing the server.

Screenshot of Claude Desktop App Settings

Clicking on it should bring up its config. Not only will you see how it gets executed, but in the Advanced Options you'll also see the env variables that have been set 😬.

You can also edit this config directly, but I wouldn't necessarily recommend it here.

Profit

If all goes well, when you go to the Desktop App you should see no errors (if you do, going to Settings should give you a button to check out the logs and investigate from there).

Chat Window with Hammer Symbol

Additionally, you should see a hammer symbol with the number of individual tools you have at your disposal (note: your count should be 2 unless you've installed other MCP servers).

Rather than invoking the tool directly, you chat with the app as you normally would, and when it decides that the tool would be helpful, it will ask if you want to use it. No additional code or configuration is necessary here.

I think it relies on both the tool name and description to decide whether a tool is appropriate, so it's worth crafting a clear, simple description of what the tool does.

You will get a prompt like the following:

Prompt for tool

And you can just "chat" with it. Admittedly, the tool as written sometimes runs into issues: occasionally it decides it can't access the internet, and sometimes it fails to retrieve results. But sometimes you get this:

Chatting Using Tool

This had a fairly natural flow: it read the page and provided a summary, and then you could ask it to go to a specific article and read that.

Final Notes

Hopefully, that gave you some insight into MCP Servers. There's plenty to read and watch but one more site I'll recommend is glama.ai. They maintain a fairly comprehensive list of available MCP Servers to download and try out, including other web search tools that are more reliable than our toy example. Check it out, and thank you for following along.
