Forget Cloud AI — Build Your Own Private Chat App with Web Context Using Ollama

Why You Should Have Your Own Local LLM

The AI revolution is hard to ignore — chatbots, coding assistants, and AI writing tools are everywhere. But behind that convenience often lies a hidden cost: your data, your privacy, and your control.

Most popular AI tools rely on cloud-based large language models (LLMs) running on servers you don’t own, with prompts and data sent over the internet. Even when companies promise security, the reality is simple: you don't fully control what happens to your inputs or the models generating your outputs.

That’s where local LLMs change the game.

Thanks to projects like Ollama and the rise of efficient, open-source models like LLaMA, Mistral, and others, it’s now possible to run capable AI models entirely on your own machine.

Why Should You Care About Local AI?

Full Privacy: Your conversations, data, and context never leave your device.

No API Costs: Forget subscriptions or API limits.

Offline Friendly: Your AI assistant works even without internet.

Tinker & Learn: Dive into how prompts, context, and LLMs really work under the hood.

Total Control: You decide what models to use, how they're updated, and how your data flows.

In this post, I’ll show you how I built ContextChat — a simple, private chat app that combines a local LLM with web context you control — and how you can do the same.

What is ContextChat? — Project Overview

Imagine having your own AI assistant — one that not only chats with you, but also understands the web pages you care about — all without sending a single byte of data to the cloud.

That’s exactly what ContextChat does.

In Simple Terms:

ContextChat is a fully local, open-source AI chat application that enhances your conversations with context from web pages you choose. It runs entirely on your machine, powered by open-source LLMs like LLaMA or Mistral through Ollama.

But unlike a basic chat app, it doesn’t just rely on the model’s static knowledge. You can add URLs, and ContextChat automatically extracts information from those pages, injecting that into your chat session as live, dynamic context.

Who is it For?

  • Developers exploring AI tooling
  • Privacy-conscious users avoiding cloud AI
  • Researchers experimenting with context-aware LLMs
  • Anyone curious about building practical, local AI apps
  • Tinkerers who want to fork, extend, or modify their own private AI assistant

Key Features at a Glance:

  • Desktop chat app with clean, simple UI
  • Runs a local server to manage chat, context, and LLM interaction
  • Lets you add web pages as context sources
  • Sends combined prompts to your locally running LLM
  • Fully private — no internet required after initial setup (except to fetch the pages you explicitly add as context)
  • Open source — anyone can fork, modify, and build on top of it

In short, ContextChat lets you experience a more useful, privacy-first AI assistant — without giving up control of your data or relying on external APIs — and gives you the source code to make it your own.

Under the Hood: Architecture Breakdown

Building a private AI chat app with web context sounds complex, but the core system is designed to be simple, modular, and entirely local.

Here’s how ContextChat works behind the scenes.

The Core Components

The project is divided into three main parts:

  1. GUI Chat App (Tkinter, Python)

    A minimal desktop interface for sending messages and viewing responses. Built with Tkinter for quick prototyping and cross-platform compatibility.

  2. MCP Server (FastAPI, Python)

    The Message/Context/Prompt (MCP) server handles the logic (see the code sketch after this list):

    • Manages conversation history
    • Stores added URLs
    • Crawls web pages for content
    • Assembles the final prompt with context
    • Sends it to the local LLM via Ollama
  3. LLM Inference (Ollama + GGUF Models)
    • Ollama runs a chosen open-source LLM entirely on your machine
    • Supports models like LLaMA, Mistral, and others in efficient GGUF format
    • Processes the prompt and returns the AI-generated response
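
To make this concrete, here's a minimal sketch of what an MCP-style server can look like in FastAPI. The /chat endpoint name, the in-memory stores, and the hard-coded mistral model are illustrative assumptions rather than ContextChat's actual code; the call to Ollama uses its standard local /api/generate API:

from fastapi import FastAPI
from pydantic import BaseModel
import requests

app = FastAPI()

history: list[str] = []       # recent conversation turns
page_context: list[str] = []  # text extracted from added URLs

class ChatRequest(BaseModel):
    message: str

@app.post("/chat")
def chat(req: ChatRequest):
    # Assemble one prompt from stored web context, chat history, and the new message
    prompt = "\n".join(page_context + history + [f"User: {req.message}", "Assistant:"])
    # Forward it to the locally running Ollama service (default port 11434)
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "mistral", "prompt": prompt, "stream": False},
        timeout=120,
    )
    answer = resp.json()["response"]
    history.extend([f"User: {req.message}", f"Assistant: {answer}"])
    return {"response": answer}

A file like this runs with uvicorn, exactly as shown in the setup steps later in this post.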

The Data Flow

A typical interaction looks like this:

  1. You type a message in the chat GUI
  2. The GUI sends your message to the MCP server
  3. MCP gathers relevant context:
    • Recent chat history
    • Any added web page content
  4. MCP combines everything into a single prompt
  5. The prompt is sent to your local Ollama LLM
  6. The LLM generates a response, returned via MCP to the GUI

All of this happens locally. No external APIs, no cloud dependencies — giving you full control over your data and AI interaction.
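
For example, the GUI's side of this flow can be a single HTTP call. This sketch assumes the hypothetical /chat endpoint and uvicorn's default port from the server sketch above:

import requests

def send_message(text: str) -> str:
    # Steps 2 and 6 of the flow: post the message to the local MCP server
    # and return the LLM's reply that comes back
    resp = requests.post(
        "http://127.0.0.1:8000/chat",  # assumed local MCP server address
        json={"message": text},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["response"]

print(send_message("Summarize the pages I added."))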

Step-by-Step: Setting Up Your Own ContextChat

One of the biggest advantages of ContextChat is how easy it is to get started. In just a few steps, you can have a private, local AI chat app running on your own machine.

Prerequisites

  • A computer with enough free RAM for your chosen model (small quantized models run fine on CPU)
  • Python 3.9 or newer installed
  • Basic familiarity with terminal or command prompt

Note: The project is primarily tested on Mac and Linux. Windows works, but some steps may vary slightly depending on your setup.

Step 1: Install Ollama

Ollama handles running the local LLM efficiently. To install it, follow the official instructions: https://ollama.com/download

For Linux users, here’s a quick example:

curl -fsSL https://ollama.com/install.sh | sh

Once installed, start the Ollama service and pull your preferred model (e.g., Mistral or LLaMA):

ollama serve
ollama pull mistral

The ollama serve command starts the Ollama service, making the LLM available for local apps like ContextChat. If ollama serve runs in the foreground on your system, open a second terminal for the pull command.
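
Before wiring anything else up, you can verify Ollama is responding by querying its local HTTP API directly (it listens on port 11434 by default):

curl http://localhost:11434/api/generate -d '{"model": "mistral", "prompt": "Say hello", "stream": false}'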

Step 2: Set Up the MCP Server

The MCP Server handles context gathering, prompt construction, and communication with Ollama.

Navigate to the mcp_server directory:

cd contextchat/mcp_server
pip install -r requirements.txt
uvicorn main:app --reload

For Windows users: You can use Command Prompt or PowerShell for the same commands. If uvicorn isn't recognized, ensure your Python Scripts folder is added to your PATH.
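
Once uvicorn is running, a quick way to confirm the server is reachable is to request FastAPI's auto-generated docs page, assuming ContextChat keeps uvicorn's default address:

curl http://127.0.0.1:8000/docs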

Step 3: Run the GUI Chat App

The chat interface lives in the gui_app folder:

cd contextchat/gui_app
pip install -r requirements.txt
python app.py

For Windows users: Same commands apply in Command Prompt or PowerShell.
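
If you're curious what the Tkinter side involves, a stripped-down chat window can be this small. This is a sketch rather than ContextChat's actual GUI code, and it reuses the assumed /chat endpoint from earlier (a real app would also move the blocking HTTP call off the UI thread):

import tkinter as tk
import requests

def on_send():
    text = entry.get()
    entry.delete(0, tk.END)
    log.insert(tk.END, f"You: {text}\n")
    # Same call as in the data-flow sketch: ask the local MCP server for a reply
    reply = requests.post("http://127.0.0.1:8000/chat",
                          json={"message": text}, timeout=120).json()["response"]
    log.insert(tk.END, f"AI: {reply}\n")

root = tk.Tk()
root.title("ContextChat sketch")
log = tk.Text(root, height=20, width=60)
log.pack()
entry = tk.Entry(root, width=48)
entry.pack(side=tk.LEFT)
tk.Button(root, text="Send", command=on_send).pack(side=tk.LEFT)
root.mainloop()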

Troubleshooting Tips

  • Make sure ollama serve is running before starting ContextChat.
  • If you run into missing package errors, double-check that all requirements.txt dependencies are installed.
  • Windows users: You may need to adjust environment variables or use python vs. python3 depending on your setup.
  • The project is still evolving, so Windows-specific bugs may occur.

The ContextChat Approach: Local, Dynamic Knowledge

ContextChat enhances your AI assistant by injecting real-time context from web pages you choose — entirely on your device.

Here’s how it works (step 2 is sketched in code below):

  1. You add URLs via the chat interface
  2. The MCP server fetches and extracts text content from those pages
  3. When you send a message, the extracted content is combined with your prompt
  4. This richer, context-aware prompt goes to your local LLM via Ollama
  5. The response reflects both your query and the added knowledge

All of this happens without your data or browsing activity leaving your machine.
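
As an illustration of step 2 above, here's one common way to fetch a page and reduce it to plain text using requests and BeautifulSoup. The function name and prompt format are assumptions for this sketch; ContextChat's actual crawler may work differently:

import requests
from bs4 import BeautifulSoup

def extract_page_text(url: str) -> str:
    # Download the page, drop scripts and styles, and keep only visible text
    html = requests.get(url, timeout=30).text
    soup = BeautifulSoup(html, "html.parser")
    for tag in soup(["script", "style"]):
        tag.decompose()
    return " ".join(soup.get_text().split())

# Inject the extracted text into the prompt alongside the user's question
context = extract_page_text("https://example.com")
prompt = f"Context:\n{context}\n\nQuestion: What is this page about?"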

Why It Matters

  • Get AI responses tailored to your chosen information sources
  • Explore new ways of making LLMs truly useful in your workflow
  • Maintain complete privacy and control — no third-party servers involved
  • Pave the way for more advanced local AI use cases, like document summarization or research assistants

Future Possibilities

This system lays the groundwork for even more powerful features:

  • Ingesting PDF or text documents as context
  • More advanced web crawlers that handle JavaScript-heavy pages
  • Real-time context updates during conversations
  • Smarter context filtering to avoid overwhelming the LLM

Future Development & How You Can Contribute

ContextChat is just getting started. While the current version offers a fully local, privacy-respecting AI chat experience with web context, there’s plenty of room to grow.

Here’s what’s planned — and how you can be part of it.

Planned Features

The following improvements are on the roadmap:

  • Show Added URLs in the GUI: So you can see, manage, and review your context sources at a glance.
  • Reset Context with One Click: A simple button to clear conversation history and added URLs.
  • Save and Load Chat History: Preserve your conversations across sessions.
  • Visual Theme Improvements: A more polished, user-friendly interface with better layouts and fonts.
  • Streaming AI Responses: See responses appear in real-time for a more natural chat feel.
  • Modern GUI Options: Exploring frameworks like Flet or PyQt to enhance the desktop app experience.
  • Document Ingestion: Add PDFs or text files as context, not just web pages.
  • Advanced Web Crawler: Better handling of complex websites, including JavaScript-rendered content.
  • Standalone Desktop Builds: Easy installers for Mac and Linux without requiring manual setup.

How You Can Contribute

ContextChat is open source — designed to evolve with community feedback and contributions.

Ways to get involved:

  • Fork the project and experiment with your own improvements
  • Submit pull requests for new features or bug fixes
  • Report issues or suggest ideas on the GitHub repository
  • Share feedback if you’re using it in your workflow
  • Help test on Windows or other environments

The vision is to build a practical, privacy-first AI toolkit that anyone can use and extend — without compromising control or security.

Conclusion

Cloud AI tools offer convenience — but at the price of privacy and control. With open-source projects like ContextChat, you no longer have to make that trade-off.

By running your own AI chat app locally, enhanced with web context you choose, you gain:

  • Complete privacy — your prompts, data, and browsing never leave your device
  • Total control over what models you use and how your assistant behaves
  • The ability to experiment, extend, and build on an open-source foundation

ContextChat is just the beginning. Whether you're a developer, researcher, or simply curious about private AI, this project shows what’s possible — and how easy it is to get started.
