DEV Community

EveryLocalAI

Build Your Own Private ChatGPT in 15 Minutes – Local AI, Zero Cloud Cost

Want a ChatGPT-like experience that runs entirely on your own GPU? No monthly fees, no data leaving your machine, and it works offline. Here's how to set it up in 15 minutes.

What You'll Build

  • A full ChatGPT-style web UI running locally
  • Your choice of open-source LLM (Qwen3 14B or Llama 3.1 8B)
  • Multiple user accounts for your LAN
  • 100% private - nothing leaves your network

Prerequisites

  • A GPU with 12GB+ VRAM for Qwen3 14B (RTX 3060 12GB works great); 8GB cards can run Llama 3.1 8B instead
  • Docker + Docker Compose installed
  • NVIDIA Container Toolkit for GPU passthrough on Linux, or Docker Desktop with the WSL2 backend on Windows
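
Before writing any config, it can help to confirm the prerequisites above are actually in place. The following is a minimal sketch, assuming a Bash-compatible shell; `check_cmd` is a hypothetical helper name, not part of Docker or NVIDIA's tooling.

```shell
#!/usr/bin/env bash
# check_cmd is a hypothetical helper: it reports whether a command is on PATH.
# Prints "ok <name>" or "missing <name>" and never aborts the script.
check_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok $1"
  else
    echo "missing $1"
  fi
}

check_cmd docker
check_cmd nvidia-smi

# Compose v2 ships as a Docker CLI plugin, so probe it through docker itself.
if docker compose version >/dev/null 2>&1; then
  echo "ok docker compose"
else
  echo "missing docker compose"
fi
```

If everything reports ok, running `docker run --rm --gpus all ubuntu nvidia-smi` is a common end-to-end check that containers can see the GPU. It should print the same GPU table you get from `nvidia-smi` on the host.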

Setup

Create a docker-compose.yml file:

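services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    volumes:
      # Named volume so downloaded models survive container recreation
      - ollama:/root/.ollama
    ports:
      # Ollama HTTP API, reachable from the host on port 11434
      - "11434:11434"
    # Bring Ollama back after a reboot, matching open-webui below
    restart: unless-stopped
    deploy:
      resources:
        reservations:
          devices:
            # Give the container access to every NVIDIA GPU on the host
            - driver: nvidia
              count: all
              capabilities: [gpu]

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    depends_on:
      - ollama
    environment:
      # Reach Ollama by service name over the Compose network
      - OLLAMA_BASE_URL=http://ollama:11434
    volumes:
      # Accounts, chats, and settings persist here
      - open-webui:/app/backend/data
    ports:
      # Web UI on host port 3000 (the container listens on 8080)
      - "3000:8080"
    restart: unless-stopped

volumes:
  ollama:
  open-webui: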

Run It

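# Start both containers in the background
docker compose up -d
# Download the model into the ollama volume (on 8GB cards, pull llama3.1:8b instead)
docker exec ollama ollama pull qwen3:14b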

Open http://localhost:3000 and create your account (the first account registered becomes the admin), then pick qwen3:14b from the model dropdown and start chatting.
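
If the dropdown is empty or replies never arrive, it can be quicker to talk to Ollama directly, since the compose file publishes its API on port 11434. Here is a rough sketch, assuming `curl` is installed on the host; `OLLAMA_URL`, `generate_payload`, `list_models`, and `ask` are illustrative names, while `/api/tags` and `/api/generate` are Ollama's REST endpoints.

```shell
#!/usr/bin/env bash
# Base URL of the Ollama API as published by the compose file (override if needed).
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# Build the JSON body for a non-streaming /api/generate request.
# Note: this sketch does not escape quotes or backslashes in the prompt.
generate_payload() {
  printf '{"model":"%s","prompt":"%s","stream":false}' "$1" "$2"
}

# List the models Ollama has downloaded so far (raw JSON).
list_models() {
  curl -fsS "$OLLAMA_URL/api/tags"
}

# Send a single prompt to a model and print the raw JSON reply.
ask() {
  curl -fsS "$OLLAMA_URL/api/generate" -d "$(generate_payload "$1" "$2")"
}
```

After sourcing the file, `list_models` should mention qwen3:14b once the pull has finished, and `ask qwen3:14b "Say hello"` should return a JSON object whose `response` field holds the model's answer.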

What Makes It Great

  • $0/month in subscription fees vs $20/month for ChatGPT Plus
  • Full privacy - conversations stay on your machine
  • Works offline - no internet connection needed after setup
  • Multi-user - share with family or your team on the same LAN
  • Model switching - swap between different models mid-conversation
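
For the multi-user point above, other devices reach the UI through the host's LAN address rather than localhost. This is a small sketch, assuming a Linux host where `hostname -I` is available; `webui_url` and `WEBUI_PORT` are hypothetical names, and the port matches the `3000:8080` mapping in the compose file.

```shell
#!/usr/bin/env bash
# Host port the compose file maps to Open WebUI.
WEBUI_PORT=3000

# Build the address other LAN devices should open, given the host's IP.
webui_url() {
  echo "http://$1:${WEBUI_PORT}"
}
```

On Linux, `webui_url "$(hostname -I | awk '{print $1}')"` prints the link to share. Your firewall may also need to allow inbound connections on that port.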

Performance

On an RTX 3060 12GB, Qwen3 14B at 4-bit quantization (Q4) runs at roughly 20-25 tok/s, which is smooth for chat. For 8GB cards, use Llama 3.1 8B instead.
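
The sizing rule above can be turned into a tiny helper that picks a model tag from the card's VRAM. Treat it as a sketch under assumptions: `pick_model` and `gpu_vram_mib` are made-up names, and the 12000 MiB cutoff is my own threshold for "12GB class" cards, not a figure from Ollama.

```shell
#!/usr/bin/env bash
# Map total VRAM in MiB to an Ollama model tag.
# Assumption: 12000 MiB or more counts as a 12GB+ card.
pick_model() {
  if [ "$1" -ge 12000 ]; then
    echo "qwen3:14b"
  else
    echo "llama3.1:8b"
  fi
}

# Total memory of the first GPU in MiB; prints nothing if nvidia-smi is unavailable.
gpu_vram_mib() {
  nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits 2>/dev/null | head -n 1
}
```

On a machine with a working driver, `docker exec ollama ollama pull "$(pick_model "$(gpu_vram_mib)")"` pulls the matching model. Check that `gpu_vram_mib` prints a number first, because `pick_model` expects an integer argument.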


Originally published on everylocalai.com
