DEV Community

Cover image for We Built a Self-Hosted AI Platform That Runs 100% on Your Hardware — Introducing local-ai.run
Rizwan Hameed
Rizwan Hameed

Posted on

We Built a Self-Hosted AI Platform That Runs 100% on Your Hardware — Introducing local-ai.run

TL;DR: local-ai.run is a free, open-source, self-hosted AI platform. Chat with your files, generate audio, bring your own models — all offline, all on your hardware, zero data leaving your network. One command to install.

🔗 Website: local-ai.run
⭐ GitHub: github.com/360solutions-dev/local-ai


The problem that started it all

Every time I needed to analyze a document with AI, I had to choose between convenience and privacy. Paste it into ChatGPT? Fast, but that document is now on someone else's server. Run a local model through a terminal? Private, but clunky — no UI, no file uploads, no real workflow.

I kept looking for a self-hosted platform that handled the full stack: file ingestion, vector embeddings, model routing, and a clean interface. Something I could spin up with Docker and actually hand to a non-technical teammate.

I couldn't find one that hit all the marks, so I built it.


What is local-ai.run?

local-ai.run is an open-source, self-hosted AI platform that runs entirely on your own hardware. It gives you a production-ready web interface for AI tools — chat with your files, generate audio, manage your models — without a single byte leaving your network.

It is MIT licensed, Docker-based, and designed to work in fully air-gapped environments.


What's included right now

💬 Chat with Files (RAG pipeline)

Upload PDFs, Word documents, spreadsheets, CSVs, Markdown files, or code. local-ai indexes them using local embeddings, stores them in a vector database, and lets you have natural-language conversations against your own data.

  • Supported formats: .pdf, .docx, .xlsx, .csv, .txt, .md, .py, .js, .ts, .json
  • Vector stores: ChromaDB (default), Qdrant, Milvus, Weaviate
  • Embedding models: nomic-embed-text, all-minilm, bge-large, or any GGUF model
  • Full conversation history, multi-file context, source attribution

🔊 Text to Audio

Convert any text to natural-sounding speech using locally-running TTS models. Adjust voice, speed, and pitch. Export audio files. No cloud TTS API, no per-character billing.

🔌 Pluggable Model Engines

This is the part I'm most proud of. local-ai.run is not tied to any single model runtime. You can connect:

  • Ollama (default) — easiest to set up, huge model library
  • LM Studio — great for GUI-driven model management
  • vLLM — production-grade inference server
  • llama.cpp — maximum control and portability
  • Any OpenAI-compatible API — self-hosted or commercial

You switch engines by changing a single environment variable. Your data, your prompts, your conversations stay exactly where they are.


How it works under the hood

The stack is four isolated services, each running in its own container:

[ React UI : 3000 ]  →  [ Django API Gateway : 8000 ]
                              ↓                  ↓
                   [ Model Engine : 11434 ]  [ ChromaDB : 8001 ]
Enter fullscreen mode Exit fullscreen mode
Layer Tech
Web UI React 18 + TypeScript + Tailwind CSS, served via Nginx
API Gateway Python 3.11 / Django / Django REST Framework / LangChain
Model Engine Ollama (default), LM Studio, vLLM, llama.cpp, any OpenAI-compat
Storage ChromaDB for vectors, SQLite for metadata, Docker Volumes for files

Everything is stateless at the service level. Your files and conversation history live in Docker volumes that you own and control.


Getting started in under 2 minutes

Option 1 — Quick install (recommended)

curl -sSL https://get.local-ai.run | bash
Enter fullscreen mode Exit fullscreen mode

This checks your Docker installation, pulls all images, configures defaults, and starts the stack. Works on Linux, macOS, and WSL2.

Option 2 — Docker Compose (manual)

git clone https://github.com/360solutions-dev/local-ai
cd local-ai
cp .env.example .env
docker compose up -d
Enter fullscreen mode Exit fullscreen mode

Then open http://localhost:3000.

Key environment variables

MODEL_ENGINE=ollama
VECTOR_STORE=chromadb
OLLAMA_MODEL=llama3.2
EMBED_MODEL=nomic-embed-text
ENABLE_GPU=true
UI_PORT=3000
MAX_UPLOAD_SIZE=50
Enter fullscreen mode Exit fullscreen mode

Swap MODEL_ENGINE to lmstudio, vllm, or llamacpp and restart — that's it.


Why self-hosted AI actually matters

Every major AI assistant today has the same tradeoff buried in its terms: when you send data to their API, it may be used to improve their models, stored on their servers, or subject to subpoenas you'll never know about.

For most personal use cases, that's probably fine. But if you work with:

  • Internal company documents
  • Legal or medical records
  • Source code with business logic
  • Anything in a regulated industry

...then "just use ChatGPT" is not a real answer. local-ai.run is built for exactly these scenarios. It works fully offline — including in air-gapped environments with no internet access at all.


What's coming next

A few things actively in progress:

  • Image Analysis — upload images, ask questions, run vision models locally
  • AI Agents — multi-step tasks using tool-calling and local model function support
  • Document Summarizer — one-click long-document summarization
  • Semantic Search — full-corpus vector search across all indexed files
  • Helm Chart — for Kubernetes deployments (helm install local-ai local-ai/local-ai)

The honest part

This is a v1 launch. There are rough edges. The documentation is thinner than I'd like. Some features listed above are still in progress.

But the core — file chat, text-to-audio, pluggable model engines, Docker-based deployment — is solid and works today. I've been running it daily on my own hardware.

I'm releasing it publicly because I'd rather get real feedback from real users than polish it in private forever.


Try it and tell me what breaks

curl -sSL https://get.local-ai.run | bash
Enter fullscreen mode Exit fullscreen mode

If you run into issues, open an issue on GitHub. If you want to contribute, PRs are welcome — the codebase is React + Django + ChromaDB + Ollama, all documented in the repo.

🔗 Website: local-ai.run
GitHub: github.com/360solutions-dev/local-ai


Built with React, Django, ChromaDB, Ollama, and Docker. MIT licensed.


Top comments (0)