Ajit Kumar

Building Your First Agentic AI: Complete Guide to MCP + Ollama Tool Calling

Learn how to build AI agents that can use tools, make decisions, and take actions - all running locally on your machine!

πŸ“‹ Structure:

  1. Introduction - What agentic AI is with a clear example
  2. Background - Explains Ollama, tool calling, and MCP from scratch
  3. Architecture - Visual diagram showing how everything connects
  4. Ollama Setup - Complete installation guide with verification steps
  5. Model Guide - Tables showing which models support tools (βœ…) and which don't (❌)
  6. Project Setup - Virtual environment and dependencies
  7. Building the MCP Server - Step-by-step with code explanations
  8. Building the Client - Detailed walkthrough with annotations
  9. Running the System - Multi-terminal setup guide
  10. How It Works - Deep dive into the request flow with diagrams
  11. Customization - Examples of adding new tools
  12. Troubleshooting - 5 common problems with solutions
  13. 6 Real Project Ideas - Each with tool implementations, examples, and tech stack

🎯 Project Ideas Included:

  1. Email Assistant - Manage emails with AI
  2. Personal Knowledge Base - Smart note-taking system
  3. Finance Manager - Track expenses and budgets
  4. Smart Home Controller - Control IoT devices
  5. Data Analysis Assistant - Analyze datasets and create reports
  6. Study Assistant - Flashcards and quiz system

Each project includes actual code snippets, example usage, and recommended tech stack!




🎯 What We're Building

By the end of this tutorial, you'll have an AI agent that can:

  • Understand natural language requests
  • Decide which tools to use
  • Execute functions automatically
  • Return intelligent responses

Example:

You: "Hey, greet Alice and then calculate 150 + 75"

AI: *Thinking... I need to use the greet tool and the add tool*
    *Calls greet("Alice") β†’ "Hello, Alice! Welcome!"*
    *Calls add(150, 75) β†’ 225*

AI: "Hello Alice! Welcome! The sum of 150 and 75 is 225."

All of this runs locally on your machine - no API keys, no cloud costs!


πŸ“š Background: What You Need to Know

What is Ollama?

Ollama is like Docker, but for AI models. It lets you:

  • Download AI models with one command
  • Run them locally on your laptop
  • Use them without internet or API keys

Think of it as "AI models in a box" - simple, fast, and private.

What is Tool Calling?

Tool calling (also called function calling) is when an AI model can:

  1. Recognize it needs to use a tool
  2. Choose the right tool for the job
  3. Provide the correct parameters
  4. Interpret the results

Example Without Tool Calling:

You: "What's 1,247 Γ— 893?"
AI: "Approximately 1,113,571" ❌ (Could be wrong!)

Example WITH Tool Calling:

You: "What's 1,247 Γ— 893?"
AI: *Uses calculator tool*
AI: "1,113,571" βœ… (Always correct!)

What is MCP (Model Context Protocol)?

MCP is a standardized way for AI models to connect with tools and data sources. Think of it as USB-C for AI:

Without MCP (Chaos):
AI Model A β†’ Custom API β†’ Tool 1
AI Model B β†’ Different API β†’ Tool 1
AI Model C β†’ Another API β†’ Tool 1

With MCP (Standardized):
AI Model A ─┐
AI Model B ─┼→ MCP β†’ Tool 1
AI Model C β”€β”˜

FastMCP is a Python library that makes implementing MCP servers super easy.


πŸ—οΈ Architecture Overview

Here's how all the pieces fit together:

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                    YOUR COMPUTER                            β”‚
β”‚                                                             β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚  β”‚              β”‚    β”‚              β”‚    β”‚              β”‚ β”‚
β”‚  β”‚   Ollama     │◄──►│    Python    │◄──►│   FastMCP    β”‚ β”‚
β”‚  β”‚   (AI Brain) β”‚    β”‚   Client     β”‚    β”‚   Server     β”‚ β”‚
β”‚  β”‚              β”‚    β”‚(Orchestrator)β”‚    β”‚   (Tools)    β”‚ β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β”‚                                                             β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

The Flow:

  1. You ask a question
  2. Python Client sends it to Ollama with available tools
  3. Ollama decides if it needs tools
  4. Client executes tools via FastMCP
  5. Client sends results back to Ollama
  6. Ollama generates final answer
  7. You get the response
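
That's the whole agent loop. Condensed into code, it looks roughly like this - a sketch only, where run_mcp_tool stands in for the MCP call we'll write in Part 4 (the full, runnable version is built there step by step):

# Sketch of the agent loop (illustrative - see Part 4 for the real thing)
response = ollama.chat(model="llama3.2", messages=messages, tools=tools)

if response["message"].get("tool_calls"):              # Step 3: AI decided it needs tools
    for call in response["message"]["tool_calls"]:
        result = run_mcp_tool(call)                    # Step 4: execute via FastMCP
        messages.append({"role": "tool", "content": str(result)})
    response = ollama.chat(model="llama3.2", messages=messages)  # Steps 5-6: final answer

print(response["message"]["content"])                  # Step 7: show the user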

πŸš€ Part 1: Installing Ollama

Step 1: Download and Install

For Linux:

curl -fsSL https://ollama.ai/install.sh | sh

For macOS:

brew install ollama

For Windows:
Download from: https://ollama.ai/download

Step 2: Verify Installation

ollama --version

You should see something like: ollama version 0.1.x

Step 3: Start Ollama Server

ollama serve

Keep this terminal open! Ollama needs to be running in the background.

Step 4: Pull a Model

Open a new terminal and pull a model that supports tool calling:

ollama pull llama3.2

This downloads about 2GB - grab a coffee! β˜•

Step 5: Test Your Model

ollama run llama3.2

Try chatting with it:

>>> Hello! Who are you?
I'm Llama 3.2, an AI assistant...

>>> /bye

βœ… Checkpoint: You now have a working local AI model!


πŸ” Understanding Ollama Models

Not all models support tool calling! Here's what you need to know:

βœ… Models That Support Tools

| Model    | Size | Speed  | Best For                      |
|----------|------|--------|-------------------------------|
| llama3.2 | ~2GB | Fast   | Recommended for this tutorial |
| llama3.1 | ~5GB | Medium | More accurate responses       |
| mistral  | ~4GB | Fast   | Good general purpose          |
| qwen2.5  | ~4GB | Fast   | Multilingual support          |

Pull any of these:

ollama pull llama3.2      # Start with this one!
ollama pull llama3.1      # If you want more power
ollama pull mistral       # Alternative option

❌ Models That DON'T Support Tools

| Model     | Why Not?                             |
|-----------|--------------------------------------|
| codellama | Built only for code generation       |
| llama2    | Older architecture, no tool support  |
| phi       | Too small for complex tool reasoning |

What happens if you use them?

Error: does not support tools (status code: 400)

Checking Running Models

# List installed models
ollama list

# See which models are currently loaded in memory
ollama ps

# Stop a running model to free memory
ollama stop llama3.2

πŸ› οΈ Part 2: Setting Up the Project

Step 1: Create Project Directory

mkdir mcp-ollama-tutorial
cd mcp-ollama-tutorial

Step 2: Create Virtual Environment

# Create virtual environment
python -m venv myenv

# Activate it
source myenv/bin/activate  # Linux/Mac
# OR
myenv\Scripts\activate     # Windows

# You should see (myenv) in your prompt

Step 3: Install Dependencies

pip install fastmcp ollama requests

What we're installing:

  • fastmcp - For creating MCP tool servers
  • ollama - Python client for Ollama
  • requests - For HTTP communication

Step 4: Verify Installation

python -c "import fastmcp; import ollama; print('βœ… All packages installed!')"

πŸ”§ Part 3: Creating the MCP Server

The MCP server is where we define our tools. Let's create mcp_server.py:

# mcp_server.py
from fastmcp import FastMCP

# Create the MCP server instance
mcp = FastMCP("My First MCP Server")

# Define Tool 1: Add two numbers
@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers together"""
    return a + b

# Define Tool 2: Greet someone
@mcp.tool()
def greet(name: str) -> str:
    """Greet someone by name"""
    return f"Hello, {name}! Welcome!"

# Define Tool 3: Multiply numbers
@mcp.tool()
def multiply(a: float, b: float) -> float:
    """Multiply two numbers"""
    return a * b

# Define Tool 4: Get current time
@mcp.tool()
def get_time() -> str:
    """Get the current time"""
    from datetime import datetime
    return datetime.now().strftime("%I:%M %p")

if __name__ == "__main__":
    # Start the server
    mcp.run(transport="sse", port=8080)

Understanding the Code

Line by line breakdown:

from fastmcp import FastMCP

Import the FastMCP library.

mcp = FastMCP("My First MCP Server")

Create a server instance. The name is just for identification.

@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers together"""
    return a + b
  • @mcp.tool() - Decorator that registers this function as a tool
  • a: int, b: int - Type hints tell the AI what parameters to send
  • """Add two numbers together""" - Description the AI reads
  • return a + b - The actual logic

FastMCP automatically:

  • Exposes this function over HTTP
  • Generates JSON schema for the AI
  • Handles all the communication

Start the Server

python mcp_server.py

You should see:

INFO:     Started server process
INFO:     Uvicorn running on http://127.0.0.1:8080

βœ… Checkpoint: Your tool server is running!

Keep this terminal open. Open a new terminal for the next steps.


πŸ€– Part 4: Creating the Ollama Client

Now let's create the client that connects everything. Create client_ollama.py:

# client_ollama.py
import json
import ollama
from fastmcp import Client as MCPClient
import asyncio
import sys

# Configuration
OLLAMA_MODEL = "llama3.2"
# Note: depending on your FastMCP version, the SSE endpoint may be served at
# /sse instead of /mcp - adjust this path if the connection is refused.
MCP_SERVER_URL = "http://127.0.0.1:8080/mcp"

# ------------------------------------------------------------
# Step 1: Discover available tools from MCP server
# ------------------------------------------------------------
async def load_mcp_tools():
    """Connect to MCP server and get list of available tools"""
    try:
        async with MCPClient(MCP_SERVER_URL) as mcp:
            # Ask server: "What tools do you have?"
            tools_list = await mcp.list_tools()

            # Convert to format Ollama understands
            ollama_tools = []
            for tool in tools_list:
                ollama_tools.append({
                    "type": "function",
                    "function": {
                        "name": tool.name,
                        "description": tool.description,
                        "parameters": tool.inputSchema,
                    },
                })
            return ollama_tools
    except Exception as e:
        print(f"❌ ERROR connecting to MCP server: {e}")
        print(f"\nMake sure the server is running:")
        print("  python mcp_server.py")
        sys.exit(1)

# ------------------------------------------------------------
# Step 2: Execute a tool when AI requests it
# ------------------------------------------------------------
async def execute_tool(tool_name: str, arguments: dict):
    """Call a tool on the MCP server with given arguments"""
    try:
        async with MCPClient(MCP_SERVER_URL) as mcp:
            result = await mcp.call_tool(tool_name, arguments)
            return result
    except Exception as e:
        print(f"❌ ERROR executing tool {tool_name}: {e}")
        return {"error": str(e)}

# ------------------------------------------------------------
# Step 3: Main conversation loop
# ------------------------------------------------------------
async def main():
    print("πŸ” Loading MCP tools...")
    tools = await load_mcp_tools()
    print(f"βœ… Loaded {len(tools)} tools:")
    for tool in tools:
        print(f"   - {tool['function']['name']}: {tool['function']['description']}")
    print()

    # The user's question
    user_msg = "Please greet John and then add 150 + 75."
    print(f"πŸ‘€ User: {user_msg}\n")

    # Send to Ollama with tools available
    try:
        response = ollama.chat(
            model=OLLAMA_MODEL,
            messages=[{"role": "user", "content": user_msg}],
            tools=tools,  # ← AI now knows these tools exist!
            stream=False,
        )
    except Exception as e:
        print(f"❌ ERROR calling Ollama: {e}")
        print(f"\nMake sure:")
        print(f"  1. Ollama is running (ollama serve)")
        print(f"  2. Model is installed (ollama pull {OLLAMA_MODEL})")
        sys.exit(1)

    # Check: Did AI want to use tools?
    if not response.get("message", {}).get("tool_calls"):
        print("πŸ€– AI answered directly (no tools needed):")
        print(response["message"]["content"])
        return

    # Process tool calls
    messages = [
        {"role": "user", "content": user_msg},
        response["message"]
    ]

    for tool_call in response["message"]["tool_calls"]:
        tool_name = tool_call["function"]["name"]
        args = tool_call["function"]["arguments"]

        # Parse if arguments are JSON string
        if isinstance(args, str):
            args = json.loads(args)

        print(f"πŸ”§ Tool requested: {tool_name}")
        print(f"πŸ“ Arguments: {args}")

        # Execute the tool
        tool_result = await execute_tool(tool_name, args)
        print(f"βœ… Tool result: {tool_result}\n")

        # Add tool response to conversation
        messages.append({
            "role": "tool",
            "content": json.dumps(tool_result) if isinstance(tool_result, dict) else str(tool_result),
        })

    # Send tool results back to AI for final answer
    final = ollama.chat(
        model=OLLAMA_MODEL,
        messages=messages,
    )

    print("πŸ€– Final AI response:")
    print(final["message"]["content"])

if __name__ == "__main__":
    asyncio.run(main())

Understanding the Client Code

The Three-Step Process:

  1. Discovery Phase (load_mcp_tools)

    • Connects to MCP server
    • Gets list of available tools
    • Converts to Ollama's format
  2. Execution Phase (execute_tool)

    • When AI requests a tool
    • Client calls MCP server
    • Returns result
  3. Conversation Loop (main)

    • Send question + tools to Ollama
    • If AI wants tools β†’ execute them
    • Send results back to AI
    • Get final answer

🎬 Part 5: Running the Complete System

Step 1: Start MCP Server (Terminal 1)

cd mcp-ollama-tutorial
source myenv/bin/activate
python mcp_server.py

Wait for: Uvicorn running on http://127.0.0.1:8080

Step 2: Start Ollama (Terminal 2)

ollama serve

Step 3: Run Client (Terminal 3)

cd mcp-ollama-tutorial
source myenv/bin/activate
python client_ollama.py

Expected Output

πŸ” Loading MCP tools...
βœ… Loaded 4 tools:
   - add: Add two numbers together
   - greet: Greet someone by name
   - multiply: Multiply two numbers
   - get_time: Get the current time

πŸ‘€ User: Please greet John and then add 150 + 75.

πŸ”§ Tool requested: greet
πŸ“ Arguments: {'name': 'John'}
βœ… Tool result: Hello, John! Welcome!

πŸ”§ Tool requested: add
πŸ“ Arguments: {'a': 150, 'b': 75}
βœ… Tool result: 225

πŸ€– Final AI response:
Hello, John! Welcome! The sum of 150 and 75 is 225.

πŸŽ‰ Success! Your AI agent is working!


πŸ”¬ Part 6: How It Actually Works

Let's trace exactly what happens behind the scenes:

Request Flow Diagram

1. User asks question
   ↓
2. Client gets tools from MCP server
   {
     "tools": [
       {"name": "add", "description": "Add two numbers", ...},
       {"name": "greet", "description": "Greet someone", ...}
     ]
   }
   ↓
3. Client sends to Ollama:
   {
     "model": "llama3.2",
     "messages": [{"role": "user", "content": "Greet John and add 150 + 75"}],
     "tools": [...tool definitions...]
   }
   ↓
4. Ollama AI thinks:
   "I need to:
    1. Use 'greet' tool with name='John'
    2. Use 'add' tool with a=150, b=75"

   Returns:
   {
     "tool_calls": [
       {"function": {"name": "greet", "arguments": {"name": "John"}}},
       {"function": {"name": "add", "arguments": {"a": 150, "b": 75}}}
     ]
   }
   ↓
5. Client executes each tool via MCP (simplified here - on the wire, MCP speaks JSON-RPC):

   MCP Request 1:
   POST http://127.0.0.1:8080/mcp/call_tool
   {"tool": "greet", "arguments": {"name": "John"}}

   Response: "Hello, John! Welcome!"

   MCP Request 2:
   POST http://127.0.0.1:8080/mcp/call_tool
   {"tool": "add", "arguments": {"a": 150, "b": 75}}

   Response: 225
   ↓
6. Client sends results back to Ollama:
   {
     "messages": [
       {"role": "user", "content": "Greet John..."},
       {"role": "assistant", "tool_calls": [...]},
       {"role": "tool", "content": "Hello, John! Welcome!"},
       {"role": "tool", "content": "225"}
     ]
   }
   ↓
7. Ollama generates final natural language response:
   "Hello, John! Welcome! The sum of 150 and 75 is 225."
   ↓
8. Client displays to user

The Magic of Tool Schemas

The AI knows how to call tools because of the schema:

{
  "type": "function",
  "function": {
    "name": "add",
    "description": "Add two numbers together",
    "parameters": {
      "type": "object",
      "properties": {
        "a": {"type": "integer", "description": "First number"},
        "b": {"type": "integer", "description": "Second number"}
      },
      "required": ["a", "b"]
    }
  }
}

The AI reads this and understands:

  • βœ… There's a tool called "add"
  • βœ… It needs two integers: a and b
  • βœ… It's for adding numbers
  • βœ… Both parameters are required
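
You never write these schemas by hand - FastMCP generates them from your type hints and docstrings. If you're curious what the AI actually receives, this small script (assuming the Part 3 server is running) prints the generated schema for every tool, using the same client API as client_ollama.py:

# inspect_schemas.py - print the JSON schema FastMCP generated for each tool
import asyncio
import json
from fastmcp import Client as MCPClient

async def show_schemas():
    async with MCPClient("http://127.0.0.1:8080/mcp") as mcp:
        for tool in await mcp.list_tools():
            print(f"=== {tool.name}: {tool.description}")
            print(json.dumps(tool.inputSchema, indent=2))

asyncio.run(show_schemas())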

🎨 Part 7: Customizing and Extending

Adding a Weather Tool

Add to mcp_server.py:

@mcp.tool()
def get_weather(city: str) -> str:
    """Get weather information for a city"""
    # In real app, call a weather API
    # For demo, return fake data
    weather_data = {
        "New York": "Sunny, 72Β°F",
        "London": "Rainy, 15Β°C",
        "Tokyo": "Cloudy, 20Β°C"
    }
    return weather_data.get(city, f"Weather data not available for {city}")

Restart the server, and now the AI can check weather!

# Test it:
user_msg = "What's the weather in Tokyo?"

Adding a File Operations Tool

@mcp.tool()
def save_note(title: str, content: str) -> str:
    """Save a note to a file"""
    import os
    filename = f"notes/{title.replace(' ', '_')}.txt"
    os.makedirs("notes", exist_ok=True)
    with open(filename, "w") as f:
        f.write(content)
    return f"Note saved to {filename}"

@mcp.tool()
def list_notes() -> list:
    """List all saved notes"""
    import os
    if not os.path.exists("notes"):
        return []
    return os.listdir("notes")

Now your AI can manage notes!

Adding a Database Tool

@mcp.tool()
def search_users(name: str) -> list:
    """Search for users by name"""
    # Connect to your database
    # For demo, return fake data
    users = [
        {"id": 1, "name": "Alice", "email": "alice@example.com"},
        {"id": 2, "name": "Bob", "email": "bob@example.com"}
    ]
    return [u for u in users if name.lower() in u["name"].lower()]
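
Example usage:

User: "Do we have a user named Alice?"
AI: *Uses search_users tool*
AI: "Yes - Alice is user #1 (alice@example.com)."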

πŸ› Troubleshooting Guide

Problem 1: "does not support tools"

Error:

requests.exceptions.HTTPError: 400 Client Error: 
registry.ollama.ai/library/codellama:latest does not support tools

Solution:
Your model doesn't support tool calling. Switch to a compatible model:

ollama pull llama3.2

Then update client_ollama.py:

OLLAMA_MODEL = "llama3.2"

Problem 2: "Cannot connect to MCP server"

Error:

ERROR connecting to MCP server: Connection refused

Solutions:

  1. Make sure server is running:
   python mcp_server.py
  2. Check the port:
   netstat -an | grep 8080
   # Should show: LISTEN on 8080
  3. Try a different port. In mcp_server.py, use mcp.run(transport="sse", port=8081), and in client_ollama.py, set MCP_SERVER_URL = "http://127.0.0.1:8081/mcp".

Problem 3: "Ollama not found"

Error:

ollama: command not found

Solution:
Reinstall Ollama:

# Linux
curl -fsSL https://ollama.ai/install.sh | sh

# Mac
brew install ollama

# Windows
# Download from https://ollama.ai/download

Problem 4: AI Doesn't Call Tools

Symptoms:

  • AI answers directly without using tools
  • Even when tools would be helpful

Solutions:

  1. Be explicit in your prompt:
   # Instead of:
   user_msg = "What's 150 + 75?"

   # Try:
   user_msg = "Use the add tool to calculate 150 + 75"
  2. Improve tool descriptions:
   @mcp.tool()
   def add(a: int, b: int) -> int:
       """Add two numbers. Use this for any arithmetic addition."""
       return a + b
  3. Use a larger model:
   ollama pull llama3.1:8b

Problem 5: Import Errors

Error:

ModuleNotFoundError: No module named 'fastmcp'

Solution:
Activate your virtual environment:

source myenv/bin/activate  # Linux/Mac
myenv\Scripts\activate     # Windows

# Then reinstall:
pip install fastmcp ollama requests

πŸ’‘ Real-World Project Ideas

1. πŸ“§ Email Assistant

What it does: AI that can read, send, and manage emails.

Tools to implement:

@mcp.tool()
def send_email(to: str, subject: str, body: str) -> str:
    """Send an email"""
    # Use smtplib or email API
    pass

@mcp.tool()
def search_emails(query: str, limit: int = 10) -> list:
    """Search emails by keyword"""
    # Connect to IMAP or Gmail API
    pass

@mcp.tool()
def mark_as_read(email_id: str) -> bool:
    """Mark an email as read"""
    pass

Example usage:

User: "Send an email to boss@company.com thanking them for the meeting"
AI: *Uses send_email tool*
AI: "Email sent successfully!"

Tech stack:

  • Gmail API or SMTP
  • OAuth for authentication
  • SQLite for local email cache
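
As a starting point, here's a minimal send_email sketch built on Python's standard smtplib. The SMTP host, user, and password below are placeholders for your provider's settings, not working values:

import smtplib
from email.message import EmailMessage

SMTP_HOST = "smtp.example.com"  # placeholder - your provider's SMTP server
SMTP_USER = "you@example.com"   # placeholder
SMTP_PASS = "app-password"      # placeholder - prefer an app password or OAuth

@mcp.tool()
def send_email(to: str, subject: str, body: str) -> str:
    """Send an email"""
    msg = EmailMessage()
    msg["From"] = SMTP_USER
    msg["To"] = to
    msg["Subject"] = subject
    msg.set_content(body)
    # SMTP over SSL on port 465; use starttls() on 587 if your provider requires it
    with smtplib.SMTP_SSL(SMTP_HOST, 465) as server:
        server.login(SMTP_USER, SMTP_PASS)
        server.send_message(msg)
    return f"Email sent to {to}"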

2. πŸ—‚οΈ Personal Knowledge Base

What it does: AI that stores and retrieves notes, documents, and ideas.

Tools to implement:

@mcp.tool()
def save_note(title: str, content: str, tags: list) -> str:
    """Save a note with tags"""
    pass

@mcp.tool()
def search_notes(query: str) -> list:
    """Search notes using semantic search"""
    # Use embeddings for smart search
    pass

@mcp.tool()
def summarize_document(file_path: str) -> str:
    """Summarize a PDF or text document"""
    pass

Example usage:

User: "Save this: Meeting notes - Q4 review, discussed revenue targets"
AI: *Saves with appropriate tags*

User: "What did we discuss about revenue?"
AI: *Searches and finds relevant notes*

Tech stack:

  • Vector database (ChromaDB, FAISS)
  • Document parsing (PyPDF2, python-docx)
  • Embeddings for semantic search
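
For the semantic search piece, here's one possible sketch using ChromaDB (pip install chromadb; the collection name and storage path are arbitrary choices, and the API shown is ChromaDB 0.4+):

import chromadb

# Local persistent vector store (the path is arbitrary)
chroma = chromadb.PersistentClient(path="./kb")
collection = chroma.get_or_create_collection("notes")

@mcp.tool()
def save_note(title: str, content: str, tags: list) -> str:
    """Save a note with tags"""
    collection.add(ids=[title], documents=[content],
                   metadatas=[{"tags": ", ".join(tags)}])
    return f"Saved note '{title}'"

@mcp.tool()
def search_notes(query: str) -> list:
    """Search notes using semantic search"""
    results = collection.query(query_texts=[query], n_results=3)
    return results["documents"][0]  # texts of the best-matching notes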

3. πŸ’° Personal Finance Manager

What it does: AI that tracks expenses, creates budgets, generates reports.

Tools to implement:

@mcp.tool()
def add_transaction(amount: float, category: str, description: str) -> str:
    """Record a transaction"""
    pass

@mcp.tool()
def get_spending_report(month: str, category: str = None) -> dict:
    """Get spending summary"""
    pass

@mcp.tool()
def set_budget(category: str, amount: float) -> str:
    """Set a budget for a category"""
    pass

@mcp.tool()
def check_budget_status() -> list:
    """Check all budgets and spending"""
    pass

Example usage:

User: "I spent $45 on groceries at Walmart"
AI: *Records transaction, checks budget*
AI: "Transaction recorded. You've spent $245/$300 of your grocery budget this month."

Tech stack:

  • SQLite for transaction storage
  • Pandas for data analysis
  • Matplotlib for visualizations
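
A minimal add_transaction backed by SQLite (standard library) might look like this - the table layout and file name are just one reasonable choice:

import sqlite3

def _db():
    conn = sqlite3.connect("finance.db")
    conn.execute("""CREATE TABLE IF NOT EXISTS transactions (
        id INTEGER PRIMARY KEY,
        amount REAL, category TEXT, description TEXT,
        created_at TEXT DEFAULT CURRENT_TIMESTAMP)""")
    return conn

@mcp.tool()
def add_transaction(amount: float, category: str, description: str) -> str:
    """Record a transaction"""
    with _db() as conn:  # the context manager commits on success
        conn.execute(
            "INSERT INTO transactions (amount, category, description) VALUES (?, ?, ?)",
            (amount, category, description))
    return f"Recorded ${amount:.2f} in {category}"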

4. 🏠 Smart Home Controller

What it does: AI that controls IoT devices, sets schedules, monitors home.

Tools to implement:

@mcp.tool()
def control_light(room: str, action: str, brightness: int = 100) -> str:
    """Control smart lights (on/off/dim)"""
    # Integrate with Philips Hue, HomeKit, etc.
    pass

@mcp.tool()
def set_thermostat(temperature: int) -> str:
    """Set home temperature"""
    pass

@mcp.tool()
def get_camera_feed(camera_id: str) -> str:
    """Get snapshot from security camera"""
    pass

@mcp.tool()
def create_scene(name: str, devices: list) -> str:
    """Create a scene (e.g., 'Movie Night')"""
    pass

Example usage:

User: "I'm going to bed"
AI: *Creates bedtime routine*
   - Turns off living room lights
   - Dims bedroom lights to 20%
   - Lowers thermostat to 68Β°F
   - Locks front door
AI: "Goodnight! Your bedtime routine is active."

Tech stack:

  • Home Assistant API
  • MQTT for IoT communication
  • Webhooks for device control
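
If you run Home Assistant, control_light can be a thin wrapper over its REST API. The URL, token, and entity naming below are assumptions about your setup:

import requests

HA_URL = "http://homeassistant.local:8123"  # placeholder - your instance
HA_TOKEN = "your-long-lived-access-token"   # placeholder

@mcp.tool()
def control_light(room: str, action: str, brightness: int = 100) -> str:
    """Control smart lights (on/off/dim)"""
    service = "turn_off" if action == "off" else "turn_on"
    payload = {"entity_id": f"light.{room}"}  # assumes entities named light.<room>
    if service == "turn_on":
        payload["brightness_pct"] = brightness
    resp = requests.post(f"{HA_URL}/api/services/light/{service}",
                         headers={"Authorization": f"Bearer {HA_TOKEN}"},
                         json=payload, timeout=10)
    resp.raise_for_status()
    return f"Light in {room}: {action} at {brightness}%"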

5. πŸ“Š Data Analysis Assistant

What it does: AI that analyzes datasets, creates visualizations, finds insights.

Tools to implement:

@mcp.tool()
def load_dataset(file_path: str) -> str:
    """Load a CSV or Excel file"""
    pass

@mcp.tool()
def describe_data(dataset_id: str) -> dict:
    """Get statistical summary"""
    pass

@mcp.tool()
def create_chart(dataset_id: str, chart_type: str, x: str, y: str) -> str:
    """Create visualization"""
    pass

@mcp.tool()
def find_correlations(dataset_id: str) -> list:
    """Find correlated variables"""
    pass

@mcp.tool()
def export_report(dataset_id: str, format: str) -> str:
    """Generate PDF or HTML report"""
    pass

Example usage:

User: "Analyze sales_2024.csv and show me trends"
AI: *Loads data, analyzes, creates charts*
AI: "I found that sales peak in Q4, with 35% growth. 
     Electronics category drives most revenue. 
     Here's a chart showing monthly trends."

Tech stack:

  • Pandas for data manipulation
  • Matplotlib/Plotly for visualization
  • Scipy for statistical analysis
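
The pandas half is straightforward. Here's a sketch of three of the tools, with an in-memory dict standing in for real dataset management (Excel loading additionally needs openpyxl installed):

import pandas as pd

DATASETS: dict = {}  # illustrative in-memory registry keyed by file path

@mcp.tool()
def load_dataset(file_path: str) -> str:
    """Load a CSV or Excel file"""
    df = pd.read_csv(file_path) if file_path.endswith(".csv") else pd.read_excel(file_path)
    DATASETS[file_path] = df
    return f"Loaded {len(df)} rows; dataset_id = {file_path}"

@mcp.tool()
def describe_data(dataset_id: str) -> dict:
    """Get statistical summary"""
    return DATASETS[dataset_id].describe().to_dict()

@mcp.tool()
def find_correlations(dataset_id: str) -> list:
    """Find correlated variables"""
    corr = DATASETS[dataset_id].corr(numeric_only=True)
    pairs = [(a, b, round(corr.loc[a, b], 2))
             for a in corr.columns for b in corr.columns if a < b]
    return sorted(pairs, key=lambda p: abs(p[2]), reverse=True)[:5]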

6. πŸŽ“ Study Assistant & Flashcard System

What it does: AI that creates study materials, quizzes you, tracks progress.

Tools to implement:

@mcp.tool()
def create_flashcard(question: str, answer: str, topic: str) -> str:
    """Create a new flashcard"""
    pass

@mcp.tool()
def start_quiz(topic: str, num_questions: int = 10) -> list:
    """Start a quiz session"""
    pass

@mcp.tool()
def record_answer(question_id: str, correct: bool) -> dict:
    """Record quiz performance"""
    pass

@mcp.tool()
def get_weak_topics() -> list:
    """Identify topics that need more study"""
    pass

@mcp.tool()
def summarize_textbook(file_path: str) -> str:
    """Create study notes from textbook"""
    pass

Example usage:

User: "Quiz me on Python basics"
AI: *Generates quiz from flashcards*
AI: "Question 1: What is a list comprehension? 
     Type 'answer' when ready!"

User: "It's a concise way to create lists"
AI: *Checks answer*
AI: "Correct! Score: 1/10. Next question..."

Tech stack:

  • Spaced repetition algorithm
  • NLP for answer checking
  • PDF parsing for textbook import
  • SQLite for progress tracking
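
As a sketch of the progress-tracking half, record_answer and get_weak_topics can share a single SQLite table. The schema and the "topic:question" id convention are assumptions made for the example:

import sqlite3

def _study_db():
    conn = sqlite3.connect("study.db")
    conn.execute("""CREATE TABLE IF NOT EXISTS answers (
        question_id TEXT, topic TEXT, correct INTEGER)""")
    return conn

@mcp.tool()
def record_answer(question_id: str, correct: bool) -> dict:
    """Record quiz performance"""
    topic = question_id.split(":")[0]  # assumes ids like "python:listcomp-01"
    with _study_db() as conn:
        conn.execute("INSERT INTO answers VALUES (?, ?, ?)",
                     (question_id, topic, int(correct)))
        total, right = conn.execute(
            "SELECT COUNT(*), SUM(correct) FROM answers WHERE topic = ?",
            (topic,)).fetchone()
    return {"topic": topic, "attempts": total, "correct": right}

@mcp.tool()
def get_weak_topics() -> list:
    """Identify topics that need more study"""
    with _study_db() as conn:
        rows = conn.execute("""SELECT topic, AVG(correct) FROM answers
                               GROUP BY topic ORDER BY AVG(correct)""").fetchall()
    return [topic for topic, accuracy in rows if accuracy < 0.7]  # under 70% correct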

🎯 Next Steps

Congratulations! You've built your first agentic AI system. Here's what to explore next:

1. Add More Models

ollama pull mistral        # Try different personalities
ollama pull qwen2.5        # Better multilingual support
ollama pull llama3.1:70b   # More powerful reasoning

2. Add Authentication

Secure your MCP server with API keys:

@mcp.tool()
def secret_tool(api_key: str, data: str) -> str:
    if api_key != "your-secret-key":
        return "Unauthorized"
    # ... rest of logic
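
Note: a key passed as a tool parameter is visible to the model (and to anything logging the conversation). For real deployments, validate a secret at the HTTP layer - e.g. a header your server checks - rather than trusting the model to supply it.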

3. Add Streaming Responses

Make the AI respond in real-time:

response = ollama.chat(
    model=OLLAMA_MODEL,
    messages=messages,
    tools=tools,
    stream=True,  # ← Enable streaming
)

for chunk in response:
    print(chunk['message']['content'], end='', flush=True)

4. Deploy to Production

  • Use Docker for consistent environments
  • Add logging and monitoring
  • Implement rate limiting
  • Set up health checks

5. Connect Multiple AI Agents

Create a multi-agent system where different AIs specialize in different tasks!



🀝 Join the Community

  • Share your projects on Twitter with #MCPOllama
  • Join Ollama Discord: https://discord.gg/ollama
  • Contribute to FastMCP on GitHub

πŸŽ‰ Conclusion

You now know how to:

  • βœ… Install and use Ollama locally
  • βœ… Understand tool calling and why it matters
  • βœ… Build MCP servers with custom tools
  • βœ… Connect AI models to real-world functions
  • βœ… Create agentic AI applications

The possibilities are endless. What will you build?

Share your creations in the comments below! πŸ‘‡


Found this helpful? Give it a ❀️ and follow me for more AI tutorials!
