<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Rizwan Hameed</title>
    <description>The latest articles on DEV Community by Rizwan Hameed (@rizwan_butt_e9c6522604d1c).</description>
    <link>https://dev.to/rizwan_butt_e9c6522604d1c</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3983248%2Fd59e5f39-daa2-4a45-81be-04b0d574e3c7.jpg</url>
      <title>DEV Community: Rizwan Hameed</title>
      <link>https://dev.to/rizwan_butt_e9c6522604d1c</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/rizwan_butt_e9c6522604d1c"/>
    <language>en</language>
    <item>
      <title>We Built a Self-Hosted AI Platform That Runs 100% on Your Hardware — Introducing local-ai.run</title>
      <dc:creator>Rizwan Hameed</dc:creator>
      <pubDate>Sun, 14 Jun 2026 18:48:54 +0000</pubDate>
      <link>https://dev.to/rizwan_butt_e9c6522604d1c/we-built-a-self-hosted-ai-platform-that-runs-100-on-your-hardware-introducing-local-airun-57he</link>
      <guid>https://dev.to/rizwan_butt_e9c6522604d1c/we-built-a-self-hosted-ai-platform-that-runs-100-on-your-hardware-introducing-local-airun-57he</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; local-ai.run is a free, open-source, self-hosted AI platform. Chat with your files, generate audio, bring your own models — all offline, all on your hardware, zero data leaving your network. One command to install.&lt;/p&gt;

&lt;p&gt;🔗 Website: &lt;a href="https://local-ai.run" rel="noopener noreferrer"&gt;local-ai.run&lt;/a&gt;&lt;br&gt;
⭐ GitHub: &lt;a href="https://github.com/360solutions-dev/local-ai" rel="noopener noreferrer"&gt;github.com/360solutions-dev/local-ai&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;The problem that started it all&lt;/h2&gt;

&lt;p&gt;Every time I needed to analyze a document with AI, I had to choose between convenience and privacy. Paste it into ChatGPT? Fast, but that document is now on someone else's server. Run a local model through a terminal? Private, but clunky — no UI, no file uploads, no real workflow.&lt;/p&gt;

&lt;p&gt;I kept looking for a self-hosted platform that handled the full stack: file ingestion, vector embeddings, model routing, and a clean interface. Something I could spin up with Docker and actually hand to a non-technical teammate.&lt;/p&gt;

&lt;p&gt;I couldn't find one that hit all the marks, so I built it.&lt;/p&gt;




&lt;h2&gt;What is local-ai.run?&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;local-ai.run&lt;/strong&gt; is an open-source, self-hosted AI platform that runs entirely on your own hardware. It gives you a production-ready web interface for AI tools — chat with your files, generate audio, manage your models — without a single byte leaving your network.&lt;/p&gt;

&lt;p&gt;It is MIT licensed, Docker-based, and designed to work in fully air-gapped environments.&lt;/p&gt;




&lt;h2&gt;What's included right now&lt;/h2&gt;

&lt;h3&gt;💬 Chat with Files (RAG pipeline)&lt;/h3&gt;

&lt;p&gt;Upload PDFs, Word documents, spreadsheets, CSVs, Markdown files, or code. local-ai.run indexes them with local embeddings, stores the vectors in a vector database, and lets you ask natural-language questions about your own data.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Supported formats: &lt;code&gt;.pdf&lt;/code&gt;, &lt;code&gt;.docx&lt;/code&gt;, &lt;code&gt;.xlsx&lt;/code&gt;, &lt;code&gt;.csv&lt;/code&gt;, &lt;code&gt;.txt&lt;/code&gt;, &lt;code&gt;.md&lt;/code&gt;, &lt;code&gt;.py&lt;/code&gt;, &lt;code&gt;.js&lt;/code&gt;, &lt;code&gt;.ts&lt;/code&gt;, &lt;code&gt;.json&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Vector stores: ChromaDB (default), Qdrant, Milvus, Weaviate&lt;/li&gt;
&lt;li&gt;Embedding models: &lt;code&gt;nomic-embed-text&lt;/code&gt;, &lt;code&gt;all-minilm&lt;/code&gt;, &lt;code&gt;bge-large&lt;/code&gt;, or any GGUF model&lt;/li&gt;
&lt;li&gt;Full conversation history, multi-file context, source attribution&lt;/li&gt;
&lt;/ul&gt;
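
&lt;p&gt;To make the index-then-retrieve idea concrete, here is a minimal, self-contained sketch. It is not the project's code: the bag-of-words &lt;code&gt;embed&lt;/code&gt; function stands in for a real embedding model such as &lt;code&gt;nomic-embed-text&lt;/code&gt;, a plain Python list stands in for the vector store, and every function and file name is hypothetical.&lt;/p&gt;

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words embedding; a real pipeline would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(documents, chunk_words=40):
    """Split each document into fixed-size word chunks and embed every chunk."""
    index = []
    for source, text in documents.items():
        words = text.split()
        for start in range(0, len(words), chunk_words):
            chunk = " ".join(words[start:start + chunk_words])
            index.append({"source": source, "text": chunk, "vector": embed(chunk)})
    return index


def retrieve(index, question, top_k=2):
    """Return the top_k chunks most similar to the question, best first."""
    query = embed(question)
    ranked = sorted(index, key=lambda entry: cosine(query, entry["vector"]), reverse=True)
    return ranked[:top_k]


# Hypothetical sample files, kept tiny so the example runs instantly.
documents = {
    "handbook.md": "Employees accrue vacation days monthly. Unused vacation days expire in March.",
    "setup.txt": "Install Docker first, then start the stack with docker compose.",
}
index = build_index(documents)
hits = retrieve(index, "When do vacation days expire?", top_k=1)
print(hits[0]["source"])  # prints "handbook.md"
```

&lt;p&gt;Each retrieved chunk keeps its &lt;code&gt;source&lt;/code&gt; field, which is one simple way to support the source attribution mentioned above: the retrieved text goes into the model prompt, and the file name is shown next to the answer.&lt;/p&gt;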

&lt;h3&gt;🔊 Text to Audio&lt;/h3&gt;

&lt;p&gt;Convert any text to natural-sounding speech using locally-running TTS models. Adjust voice, speed, and pitch. Export audio files. No cloud TTS API, no per-character billing.&lt;/p&gt;

&lt;h3&gt;🔌 Pluggable Model Engines&lt;/h3&gt;

&lt;p&gt;This is the part I'm most proud of. local-ai.run is not tied to any single model runtime. You can connect:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Ollama&lt;/strong&gt; (default) — easiest to set up, huge model library&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LM Studio&lt;/strong&gt; — great for GUI-driven model management&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;vLLM&lt;/strong&gt; — production-grade inference server&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;llama.cpp&lt;/strong&gt; — maximum control and portability&lt;/li&gt;
&lt;li&gt;Any &lt;strong&gt;OpenAI-compatible API&lt;/strong&gt; — self-hosted or commercial&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You switch engines by changing a single environment variable. Your data, your prompts, your conversations stay exactly where they are.&lt;/p&gt;
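
&lt;p&gt;As a rough illustration of how a single variable can select the runtime, here is a hedged sketch. The variable name &lt;code&gt;MODEL_ENGINE&lt;/code&gt; and its values come from the environment variables listed later in this post. The base URLs are assumptions: they use each runtime's usual default port, and the real gateway may resolve engines quite differently.&lt;/p&gt;

```python
import os

# Hypothetical base URLs: the ports are each runtime's usual default,
# but the values local-ai.run actually uses are not stated in the post.
ENGINE_BASE_URLS = {
    "ollama": "http://localhost:11434",
    "lmstudio": "http://localhost:1234",
    "vllm": "http://localhost:8000",
    "llamacpp": "http://localhost:8080",
}


def resolve_engine(environ=None):
    """Pick the model engine and its base URL from MODEL_ENGINE."""
    environ = os.environ if environ is None else environ
    engine = environ.get("MODEL_ENGINE", "ollama").strip().lower()
    if engine not in ENGINE_BASE_URLS:
        known = ", ".join(sorted(ENGINE_BASE_URLS))
        raise ValueError(f"Unknown MODEL_ENGINE {engine!r}; expected one of: {known}")
    return engine, ENGINE_BASE_URLS[engine]


# Passing a dict instead of os.environ keeps the example deterministic.
print(resolve_engine({"MODEL_ENGINE": "vllm"}))  # prints ('vllm', 'http://localhost:8000')
```

&lt;p&gt;Keeping the lookup in one table means the rest of the application only ever sees an engine name and a base URL, which is what makes swapping runtimes cheap.&lt;/p&gt;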




&lt;h2&gt;How it works under the hood&lt;/h2&gt;

&lt;p&gt;The stack is four isolated services, each running in its own container:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[ React UI : 3000 ]  →  [ Django API Gateway : 8000 ]
                              ↓                  ↓
                   [ Model Engine : 11434 ]  [ ChromaDB : 8001 ]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Tech&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Web UI&lt;/td&gt;
&lt;td&gt;React 18 + TypeScript + Tailwind CSS, served via Nginx&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API Gateway&lt;/td&gt;
&lt;td&gt;Python 3.11 / Django / Django REST Framework / LangChain&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Model Engine&lt;/td&gt;
&lt;td&gt;Ollama (default), LM Studio, vLLM, llama.cpp, any OpenAI-compatible API&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Storage&lt;/td&gt;
&lt;td&gt;ChromaDB for vectors, SQLite for metadata, Docker Volumes for files&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The application services keep no state inside their containers: your files and conversation history live in Docker volumes that you own and control.&lt;/p&gt;




&lt;h2&gt;Getting started in under 2 minutes&lt;/h2&gt;

&lt;h3&gt;Option 1 — Quick install (recommended)&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-sSL&lt;/span&gt; https://get.local-ai.run | bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The script checks your Docker installation, pulls all images, configures defaults, and starts the stack. It works on Linux, macOS, and WSL2.&lt;/p&gt;

&lt;h3&gt;Option 2 — Docker Compose (manual)&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/360solutions-dev/local-ai
&lt;span class="nb"&gt;cd &lt;/span&gt;local-ai
&lt;span class="nb"&gt;cp&lt;/span&gt; .env.example .env
docker compose up &lt;span class="nt"&gt;-d&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then open &lt;code&gt;http://localhost:3000&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;Key environment variables&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;MODEL_ENGINE=ollama
VECTOR_STORE=chromadb
OLLAMA_MODEL=llama3.2
EMBED_MODEL=nomic-embed-text
ENABLE_GPU=true
UI_PORT=3000
MAX_UPLOAD_SIZE=50
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Swap &lt;code&gt;MODEL_ENGINE&lt;/code&gt; to &lt;code&gt;lmstudio&lt;/code&gt;, &lt;code&gt;vllm&lt;/code&gt;, or &lt;code&gt;llamacpp&lt;/code&gt; and restart — that's it.&lt;/p&gt;
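
&lt;p&gt;If you want to sanity-check a &lt;code&gt;.env&lt;/code&gt; file before restarting, a small parser like the one below can help. This is a sketch under assumptions: the keys and values are the ones shown above, while the parsing rules, the typed field names, and the fallback defaults are illustrative and not taken from the project's code.&lt;/p&gt;

```python
ENV_TEXT = """\
MODEL_ENGINE=ollama
VECTOR_STORE=chromadb
OLLAMA_MODEL=llama3.2
EMBED_MODEL=nomic-embed-text
ENABLE_GPU=true
UI_PORT=3000
MAX_UPLOAD_SIZE=50
"""


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def load_settings(text):
    """Convert the raw strings into typed settings (field names are hypothetical)."""
    raw = parse_env(text)
    return {
        "model_engine": raw.get("MODEL_ENGINE", "ollama"),
        "vector_store": raw.get("VECTOR_STORE", "chromadb"),
        "enable_gpu": raw.get("ENABLE_GPU", "false").lower() == "true",
        "ui_port": int(raw.get("UI_PORT", "3000")),
        "max_upload_size": int(raw.get("MAX_UPLOAD_SIZE", "50")),
    }


settings = load_settings(ENV_TEXT)
print(settings["model_engine"], settings["ui_port"], settings["enable_gpu"])

# Simulate editing the file to switch engines.
swapped = load_settings(ENV_TEXT.replace("MODEL_ENGINE=ollama", "MODEL_ENGINE=vllm"))
print(swapped["model_engine"])
```

&lt;p&gt;Converting &lt;code&gt;UI_PORT&lt;/code&gt; and &lt;code&gt;MAX_UPLOAD_SIZE&lt;/code&gt; to integers up front surfaces a mistyped value as an immediate &lt;code&gt;ValueError&lt;/code&gt; rather than a confusing failure later.&lt;/p&gt;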




&lt;h2&gt;Why self-hosted AI actually matters&lt;/h2&gt;

&lt;p&gt;Most hosted AI assistants carry the same tradeoff in their terms: data you send to the provider's API may be stored on its servers, used to improve its models, or disclosed under a subpoena you'll never know about.&lt;/p&gt;

&lt;p&gt;For most personal use cases, that's probably fine. But if you work with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Internal company documents&lt;/li&gt;
&lt;li&gt;Legal or medical records&lt;/li&gt;
&lt;li&gt;Source code with business logic&lt;/li&gt;
&lt;li&gt;Anything in a regulated industry&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;...then "just use ChatGPT" is not a real answer. local-ai.run is built for exactly these scenarios. It works fully offline — including in air-gapped environments with no internet access at all.&lt;/p&gt;




&lt;h2&gt;What's coming next&lt;/h2&gt;

&lt;p&gt;A few things actively in progress:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Image Analysis&lt;/strong&gt; — upload images, ask questions, run vision models locally&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI Agents&lt;/strong&gt; — multi-step tasks using tool calling (function calling) with local models&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Document Summarizer&lt;/strong&gt; — one-click long-document summarization&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Semantic Search&lt;/strong&gt; — full-corpus vector search across all indexed files&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Helm Chart&lt;/strong&gt; — for Kubernetes deployments (&lt;code&gt;helm install local-ai local-ai/local-ai&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;The honest part&lt;/h2&gt;

&lt;p&gt;This is a v1 launch. There are rough edges. The documentation is thinner than I'd like. Some features listed above are still in progress.&lt;/p&gt;

&lt;p&gt;But the core — file chat, text-to-audio, pluggable model engines, Docker-based deployment — is solid and works today. I've been running it daily on my own hardware.&lt;/p&gt;

&lt;p&gt;I'm releasing it publicly because I'd rather get real feedback from real users than polish it in private forever.&lt;/p&gt;




&lt;h2&gt;Try it and tell me what breaks&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-sSL&lt;/span&gt; https://get.local-ai.run | bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you run into issues, open an issue on GitHub. If you want to contribute, PRs are welcome — the codebase is React + Django + ChromaDB + Ollama, all documented in the repo.&lt;/p&gt;

&lt;p&gt;🔗 &lt;strong&gt;Website:&lt;/strong&gt; &lt;a href="https://local-ai.run" rel="noopener noreferrer"&gt;local-ai.run&lt;/a&gt;&lt;br&gt;
⭐ &lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/360solutions-dev/local-ai" rel="noopener noreferrer"&gt;github.com/360solutions-dev/local-ai&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built with React, Django, ChromaDB, Ollama, and Docker. MIT licensed.&lt;/em&gt;&lt;/p&gt;




</description>
      <category>ai</category>
      <category>llm</category>
      <category>selfhosted</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
