Give Your Local AI Tool-Calling Superpowers with Open WebUI and MCP

#ai #beginners #opensource #tutorial

Want a ChatGPT-like experience where your AI can search the web, read your files, query databases, and run code? Open WebUI + MCP makes it possible - all running locally on your hardware.

The Model Context Protocol (MCP) is an open standard that lets AI connect to external tools. Open WebUI supports MCP natively, turning your local Ollama setup into a tool-equipped AI assistant.

Prerequisites

GPU: RTX 3060 12GB or better (for Qwen3 14B at Q8)
Software: Docker + Docker Compose
Time: ~25 minutes setup

Installation

Create a docker-compose.yml:

services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
      - MCP_ENABLE=true
      - ENABLE_TOOLS=true
    volumes:
      - open-webui:/app/backend/data
    ports:
      - "3000:8080"

  ollama:
    image: ollama/ollama:latest
    volumes:
      - ollama:/root/.ollama
    ports:
      - "11434:11434"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

volumes:
  ollama:
  open-webui:

docker compose up -d

Pull a model with strong tool-calling:

docker exec ollama ollama pull qwen3:14b:q8_0

Open http://localhost:3000 and create your admin account.

Adding MCP Tools

Go to Admin Panel → Settings → External Tools in Open WebUI.

Web Search Tool

npx -y @anthropic/mcp-server-brave-search

Filesystem Access

npx -y @modelcontextprotocol/server-filesystem /allowed/path

Configure each tool in the Open WebUI admin panel to give your AI real-world capabilities.

Usage

Start a new chat and click the tools icon (wrench) next to the input box. Select which tools the AI can use, then ask:

"Search the web for latest AI news"
"Read my project's README and summarize it"
"Query the sales database for Q3 results"

The AI decides when to call tools and incorporates results into its responses.

Results

With Qwen3 14B Q8 on an RTX 4070 Super: tool calls complete in 3-5 seconds. Web search results are returned in 2-3 seconds. All data stays on your machine.

Originally published on everylocalai.com