🧠 Meet TalkLLM: A Local AI Assistant in React
Over the weekend, I built TalkLLM, a minimal, privacy-focused AI chat app that runs directly in the browser using WebLLM.
It uses Meta's Llama 3 model via MLC AI's WebLLM engine, which means you get:
✅ Fast, local response generation
🛡️ Full privacy (no API keys, no cloud)
⚡ A smooth chat interface with history, loading states, and error handling
🛠️ What is WebLLM?
WebLLM lets you run large language models in-browser using WebGPU + WebAssembly, with no servers or Python required.
It's powered by MLC, a compiler stack for machine learning models, and WebLLM brings that to your frontend.
All you need is a compatible browser, and you can run powerful models like Llama 2/3, Phi, or Mistral completely client-side.
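Here's roughly what that looks like in code: a minimal sketch using the @mlc-ai/web-llm API. The model ID is the one TalkLLM uses later in this post; the progress callback is optional.

```javascript
import { CreateMLCEngine } from "@mlc-ai/web-llm";

// Download (or load from cache) and compile the model in the browser.
// initProgressCallback reports download/compile progress as text.
const engine = await CreateMLCEngine("Llama-3.1-8B-Instruct-q4f32_1-MLC", {
  initProgressCallback: (progress) => console.log(progress.text),
});

// WebLLM mirrors the OpenAI chat-completions request/response shape.
const reply = await engine.chat.completions.create({
  messages: [{ role: "user", content: "Explain WebGPU in one sentence." }],
});

console.log(reply.choices[0].message.content);
```

Note the OpenAI-style `chat.completions.create()` call; that compatibility is what makes porting an existing chat UI to WebLLM straightforward.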
⚙️ Tech Stack
React (with hooks)
@mlc-ai/web-llm
Llama 3.1 8B Instruct
Sass for lightweight styling
📦 Project Setup
🔗 GitHub Repo: https://github.com/mahmud-r-farhan/TalkLLM
🖥️ Requirements
✅ A WebGPU-compatible browser
  - Chrome (v113+), Edge, Safari (macOS), or Brave
  - You can test your setup with the snippet after this list
✅ WebAssembly support
✅ Node.js v16+
✅ npm or yarn
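If you want a quick programmatic check, this small snippet (my addition, not from the TalkLLM repo) probes the two browser requirements via the standard `navigator.gpu` and `WebAssembly` globals:

```javascript
// Rough capability check; WebLLM performs its own, more thorough checks on load.
if (!("gpu" in navigator)) {
  console.warn("WebGPU is not available in this browser.");
}
if (typeof WebAssembly === "undefined") {
  console.warn("WebAssembly is not available in this browser.");
}
```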
🚀 Installation
```bash
git clone https://github.com/mahmud-r-farhan/TalkLLM
cd TalkLLM
npm install
npm run dev
```
The app will run locally at http://localhost:5173.
🔍 Key Features in the Code
- Local model initialization using `CreateMLCEngine()`
- Loading indicators during model setup and generation
- A clean `useCallback()`-wrapped `sendMessageToLlm()` function
- Input validation (no empty prompts)
- UI blocking during loading states
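To make those bullets concrete, here's a hedged sketch of how they can fit together as a custom hook. The hook name, state variables, and the exact shape of `sendMessageToLlm()` are my assumptions (the repo's code may differ), but the engine calls follow the @mlc-ai/web-llm API:

```javascript
import { useState, useCallback, useEffect, useRef } from "react";
import { CreateMLCEngine } from "@mlc-ai/web-llm";

// Hypothetical hook illustrating the features listed above.
function useTalkLlm(selectedModel) {
  const engineRef = useRef(null);
  const [loading, setLoading] = useState(true);       // model download/compile
  const [generating, setGenerating] = useState(false); // per-message state
  const [messages, setMessages] = useState([]);

  // Local model initialization, with a loading indicator while it runs.
  useEffect(() => {
    CreateMLCEngine(selectedModel, {
      initProgressCallback: (p) => console.log(p.text),
    }).then((engine) => {
      engineRef.current = engine;
      setLoading(false);
    });
  }, [selectedModel]);

  // useCallback()-wrapped send function with input validation and UI blocking.
  const sendMessageToLlm = useCallback(async (input) => {
    const text = input.trim();
    if (!text || loading || generating) return; // no empty prompts, no re-entry
    const history = [...messages, { role: "user", content: text }];
    setMessages(history);
    setGenerating(true);
    try {
      const reply = await engineRef.current.chat.completions.create({
        messages: history,
      });
      setMessages([...history, reply.choices[0].message]);
    } finally {
      setGenerating(false);
    }
  }, [messages, loading, generating]);

  return { messages, loading, generating, sendMessageToLlm };
}
```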
🧠 Chat Memory Example
```javascript
const [messages, setMessages] = useState([
  { role: "system", content: "Hello, I am TalkLLM. How can I assist you today?" }
]);
```
The model keeps context using a `messages` array, just like the OpenAI API format.
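Because the format matches OpenAI's, memory is just a matter of appending each turn and resending the whole array. A small sketch, assuming `engine` is an already-initialized WebLLM engine:

```javascript
// Each turn appends to the same array, so the model always sees the full history.
messages.push({ role: "user", content: "What's WebGPU?" });

const reply = await engine.chat.completions.create({ messages });
messages.push(reply.choices[0].message); // { role: "assistant", content: "..." }
```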
🧪 Bonus: Want to Use a Different Model?
WebLLM supports multiple models. To switch, update this line:
```javascript
const selectedModel = "Llama-3.1-8B-Instruct-q4f32_1-MLC";
```
Check out the model catalog for available variants.
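You can also list the bundled model IDs programmatically via WebLLM's exported `prebuiltAppConfig`; here's a short sketch:

```javascript
import { prebuiltAppConfig } from "@mlc-ai/web-llm";

// Print every model ID bundled with this version of WebLLM.
for (const model of prebuiltAppConfig.model_list) {
  console.log(model.model_id);
}
```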
🙋‍♂️ Why Build This?
I wanted a way to explore LLMs without vendor lock-in. And with WebGPU maturing, it felt like the perfect time to experiment with truly local AI.
This app is a proof of concept, and a great starting point if you're building privacy-first AI tools or offline chat experiences.
📚 Resources
- TalkLLM on GitHub: https://github.com/mahmud-r-farhan/TalkLLM
- WebLLM: https://github.com/mlc-ai/web-llm
- MLC AI: https://mlc.ai
🏁 Final Thoughts
WebLLM is changing the game. Whether you're building privacy-focused apps, internal tools, or just want to tinker with LLMs, this opens up huge opportunities.
Follow for more!