Build Your Own Private ChatGPT in 15 Minutes – Local AI, Zero Cloud Cost


Want a ChatGPT-like experience that runs entirely on your own GPU? No monthly fees, no data leaving your machine, and it works offline. Here's how to set it up in 15 minutes.

What You'll Build

A full ChatGPT-style web UI running locally
Your choice of open-weight LLM (Qwen3 14B or Llama 3.1 8B)
Multiple user accounts for your LAN
100% private - nothing leaves your network

Prerequisites

An NVIDIA GPU with 12GB+ VRAM for Qwen3 14B (RTX 3060 12GB works great); an 8GB card can run Llama 3.1 8B instead
Docker + Docker Compose installed
NVIDIA Container Toolkit for GPU passthrough (Linux), or Docker Desktop with the WSL2 backend (Windows)
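
Before writing the compose file, it can help to confirm that the prerequisites are actually in place. The following is a minimal sketch, assuming a Linux or WSL2 shell. The helper names `have`, `check_prereqs`, and `check_gpu_in_docker` are made up for this example and are not part of Docker or the NVIDIA tooling.

```shell
#!/bin/sh
# have: succeeds if the named command is on PATH (hypothetical helper).
have() {
  command -v "$1" >/dev/null 2>&1
}

# check_prereqs: report which of the required tools are installed.
# It only looks for the binaries; it does not start any container.
check_prereqs() {
  missing=0
  for tool in docker nvidia-smi; do
    if have "$tool"; then
      echo "ok: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# check_gpu_in_docker: run nvidia-smi inside a throwaway container.
# If this prints your GPU, passthrough into containers is working.
check_gpu_in_docker() {
  docker run --rm --gpus all ubuntu nvidia-smi
}
```

Run `check_prereqs && check_gpu_in_docker` once. `docker compose version` confirms the Compose plugin is present, and `nvidia-smi --query-gpu=name,memory.total --format=csv,noheader` shows how much VRAM each card has.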

Setup

Create a docker-compose.yml file:

services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    restart: unless-stopped
    volumes:
      - ollama:/root/.ollama
    ports:
      - "11434:11434"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    depends_on:
      - ollama
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    volumes:
      - open-webui:/app/backend/data
    ports:
      - "3000:8080"
    restart: unless-stopped

volumes:
  ollama:
  open-webui:

Run It

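# start both containers in the background
docker compose up -d
# download the model into the ollama volume (use llama3.1:8b on an 8GB card)
docker exec ollama ollama pull qwen3:14b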

Open http://localhost:3000 and create your account (the first account created becomes the admin). Then pick qwen3:14b from the model dropdown and start chatting.
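
If the model dropdown stays empty, it can be useful to talk to Ollama directly on port 11434, which the compose file publishes on the host. Here is a rough sketch using Ollama's REST API, where `/api/tags` lists pulled models and `/api/generate` runs a prompt. The helper names are hypothetical, and the payload builder does no JSON escaping, so it assumes the prompt contains no double quotes or backslashes.

```shell
#!/bin/sh
OLLAMA_URL="http://localhost:11434"   # host port from docker-compose.yml

# ollama_payload MODEL PROMPT: print a non-streaming /api/generate request body.
# Assumption: neither argument contains double quotes or backslashes.
ollama_payload() {
  printf '{"model":"%s","prompt":"%s","stream":false}' "$1" "$2"
}

# list_models: show the models Ollama has pulled, as raw JSON.
list_models() {
  curl -s "$OLLAMA_URL/api/tags"
}

# ask_ollama MODEL PROMPT: send one prompt and print the raw JSON reply.
ask_ollama() {
  curl -s "$OLLAMA_URL/api/generate" -d "$(ollama_payload "$1" "$2")"
}
```

For example, `list_models` should mention qwen3:14b once the pull has finished, and `ask_ollama qwen3:14b "Say hello in five words"` returns a JSON object whose `response` field holds the generated text. If both work but the web UI shows nothing, `docker compose logs open-webui` is the next place to look.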

What Makes It Great

$0/month in subscription fees vs $20/month for ChatGPT Plus
Full privacy - conversations stay on your machine
Works offline - no internet connection needed once the images and model are downloaded
Multi-user - share with family or your team on the same LAN
Model switching - swap between different models mid-conversation
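
Sharing on the LAN works because the compose file publishes port 3000 on the host, so other devices open the host's IP address instead of localhost. The sketch below assumes a Linux host where `hostname -I` is available. The helper names are hypothetical, and the first address listed is not guaranteed to be the LAN interface on every machine.

```shell
#!/bin/sh
WEBUI_PORT=3000   # host side of "3000:8080" in docker-compose.yml

# webui_url HOST_IP: print the address other devices on the LAN should open.
webui_url() {
  printf 'http://%s:%s\n' "$1" "$WEBUI_PORT"
}

# first_lan_ip: first address reported by `hostname -I` (Linux only).
first_lan_ip() {
  hostname -I | awk '{print $1}'
}
```

Running `webui_url "$(first_lan_ip)"` prints the link to hand out. With a made-up address, `webui_url 192.168.1.50` prints `http://192.168.1.50:3000`. If other devices cannot connect, check that the host firewall allows inbound traffic on port 3000.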

Performance

On an RTX 3060 12GB, Qwen3 14B at Q4 quantization generates roughly 20-25 tokens per second, which is smooth for chat. On 8GB cards, use Llama 3.1 8B instead (docker exec ollama ollama pull llama3.1:8b).
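
The model choice above can be written as a small lookup on total VRAM. This is only a sketch: `pick_model` is a hypothetical helper, and its cut-offs (11000 and 7500 MiB) are assumptions set a little below 12288 and 8192 MiB to leave some slack. The article makes no recommendation for cards under 8GB, so the function prints "none" for them.

```shell
#!/bin/sh
# pick_model VRAM_MIB: print the model tag this article suggests for that much VRAM.
# The thresholds are assumptions, not limits published by Ollama.
pick_model() {
  if [ "$1" -ge 11000 ]; then
    echo "qwen3:14b"
  elif [ "$1" -ge 7500 ]; then
    echo "llama3.1:8b"
  else
    echo "none"
  fi
}
```

To feed it the real value, pass the first GPU's total memory: `pick_model "$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits | head -n 1)"`. The printed tag can then go straight into `docker exec ollama ollama pull`.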

Originally published on everylocalai.com
