<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Loki Bein Blodsson </title>
    <description>The latest articles on DEV Community by Loki Bein Blodsson  (@stingingraven).</description>
    <link>https://dev.to/stingingraven</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3920781%2F87eb6443-afa2-4109-996c-c8fc7d4bd9c8.png</url>
      <title>DEV Community: Loki Bein Blodsson </title>
      <link>https://dev.to/stingingraven</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/stingingraven"/>
    <language>en</language>
    <item>
      <title>Open-WebUI + Ollama Guide: Run LLMs Locally with Docker</title>
      <dc:creator>Loki Bein Blodsson </dc:creator>
      <pubDate>Sat, 09 May 2026 00:07:17 +0000</pubDate>
      <link>https://dev.to/stingingraven/open-webui-ollama-guide-run-llms-locally-with-docker-54el</link>
      <guid>https://dev.to/stingingraven/open-webui-ollama-guide-run-llms-locally-with-docker-54el</guid>
      <description>&lt;p&gt;1️⃣ &lt;strong&gt;Introduction&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Welcome to the ultimate Open-WebUI guide. If you've ever wanted the power and sleek interface of ChatGPT but with the privacy of a local server, you are in the right place.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flit67ymrz866idd6pff3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flit67ymrz866idd6pff3.png" alt=" " width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Ollama is a lightweight inference engine that makes running large language models (LLMs) dead simple, while Open-WebUI (formerly Ollama WebUI) provides a beautiful, feature-rich, and extensible front-end. By combining them, you can build your own private AI assistant.&lt;br&gt;
Why a self-hosted FOSS version matters:&lt;br&gt;
Absolute Privacy: Your chats, code snippets, and intellectual property never leave your machine.&lt;br&gt;
Zero Subscription Costs: Run powerful open-source models for free.&lt;br&gt;
Offline Access: Work seamlessly even without an internet connection.&lt;br&gt;
TL;DR - What you will accomplish today:&lt;br&gt;
Install Docker &amp;amp; Docker Compose.&lt;br&gt;
Deploy a unified Ollama and Open-WebUI stack using a single file.&lt;br&gt;
Prevent data loss with persistent volumes.&lt;br&gt;
Download and run Llama locally.&lt;/p&gt;

&lt;p&gt;2️⃣ &lt;strong&gt;Prerequisites&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvyf69ct0c6s6cro6awr9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvyf69ct0c6s6cro6awr9.png" alt=" " width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Before we spin up our Ollama local LLM stack, ensure your system meets these baseline requirements:&lt;br&gt;
Hardware:&lt;br&gt;
RAM: 8 GB minimum (16 GB highly recommended to run 7B-8B parameter models).&lt;br&gt;
CPU: A modern multi-core processor.&lt;br&gt;
GPU (optional but recommended): An NVIDIA GPU with at least 6 GB of VRAM will drastically improve token generation speed.&lt;br&gt;
Software: Docker and Docker Compose installed on your system. (If you haven't done this yet, check out our Beginner's Guide to Docker.)&lt;br&gt;
Network: Ports 8080 (WebUI) and 11434 (Ollama API) available. A quick way to verify all of this is shown below.&lt;/p&gt;
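
&lt;p&gt;If you want a quick sanity check before continuing, the commands below are a minimal sketch for a Linux host; they assume the ss and free utilities are available, and nvidia-smi only applies if the NVIDIA driver is installed:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Confirm Docker and the Compose plugin are installed
docker --version
docker compose version

# Check available RAM
free -h

# Make sure nothing is already listening on the ports we need
ss -ltn | grep -E ':(8080|11434)' || echo "Ports 8080 and 11434 are free"

# Optional: confirm the NVIDIA driver is visible (GPU setups only)
nvidia-smi
&lt;/code&gt;&lt;/pre&gt;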

&lt;p&gt;3️⃣ &lt;strong&gt;Quick-Start Installation&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The biggest mistake beginners make is running Open-WebUI and Ollama in separate, disjointed Docker commands, leading to localhost connection errors. We will solve this by deploying them together in a single docker-compose.yml file.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk4rwzwgcxdz6or4g0vi1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk4rwzwgcxdz6or4g0vi1.png" alt=" " width="800" height="560"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Create a new directory and create your compose file:&lt;/p&gt;

&lt;p&gt;mkdir open-webui-stack &amp;amp;&amp;amp; cd open-webui-stack&lt;br&gt;
nano docker-compose.yml&lt;/p&gt;

&lt;p&gt;Paste the following configuration:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;version: '3.8'

services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    volumes:
      - ollama_data:/root/.ollama
    ports:
      - "11434:11434"
    restart: unless-stopped
    # Uncomment the following lines if you have an NVIDIA GPU and the NVIDIA Container Toolkit installed
    # deploy:
    #   resources:
    #     reservations:
    #       devices:
    #         - driver: nvidia
    #           count: 1
    #           capabilities: [gpu]

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    volumes:
      - open-webui_data:/app/backend/data
    ports:
      - "8080:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
      - WEBUI_AUTH=True
    depends_on:
      - ollama
    restart: unless-stopped

volumes:
  ollama_data:
  open-webui_data:
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Save the file and run:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;docker compose up -d
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;First-run verification: Wait about 60 seconds for the containers to initialize, then open your browser and navigate to &lt;a href="http://localhost:8080" rel="noopener noreferrer"&gt;http://localhost:8080&lt;/a&gt;. You should be greeted by the Open-WebUI login screen!&lt;/p&gt;
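
&lt;p&gt;If the login screen doesn't appear, a few quick checks from the terminal usually pinpoint the problem. This is just a sketch using the container and service names from the compose file above:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Both containers should show a "running" status
docker compose ps

# Ollama answers on its API port with a short status message
curl http://localhost:11434/

# Tail the WebUI logs if the page still won't load
docker compose logs --tail 50 open-webui
&lt;/code&gt;&lt;/pre&gt;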

&lt;p&gt;4️⃣ &lt;strong&gt;Detailed Configuration&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Let's break down why this configuration solves the most common self-hosting headaches:&lt;br&gt;
Persistent Storage (Volumes): Notice the ollama_data and open-webui_data volumes? Without them, every container update or re-creation would wipe your downloaded models and chat history. With this setup, that data persists across restarts and upgrades (you can confirm it with the commands below).&lt;br&gt;
Internal Network Routing: By setting OLLAMA_BASE_URL=&lt;a href="http://ollama:11434" rel="noopener noreferrer"&gt;http://ollama:11434&lt;/a&gt;, we tell the WebUI to talk directly to the Ollama container via Docker's internal DNS. This sidesteps the usual localhost / 127.0.0.1 routing conflicts entirely.&lt;br&gt;
Authentication (WEBUI_AUTH=True): This forces users to create an account before accessing the AI, securing your server against unauthorized use.&lt;/p&gt;
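
&lt;p&gt;To confirm the volumes were actually created and see where Docker stores them on disk, you can inspect them directly. A minimal sketch; note that Compose normally prefixes volume names with the project (folder) name, so the open-webui-stack_ prefix below is an assumption you should check against docker volume ls:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# List the named volumes created by the stack
docker volume ls | grep -E 'ollama_data|open-webui_data'

# Show the on-disk mountpoint of the Ollama volume
# (adjust the prefix to match your project folder name)
docker volume inspect open-webui-stack_ollama_data
&lt;/code&gt;&lt;/pre&gt;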

&lt;p&gt;&lt;strong&gt;Pro-Tip&lt;/strong&gt;: If you want to access this outside your home network, we highly recommend putting Open-WebUI behind Nginx Proxy Manager or Traefik with an SSL certificate.&lt;/p&gt;

&lt;p&gt;5️⃣ &lt;strong&gt;Common Use-Cases &amp;amp; Mini-Projects&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Downloading Your First Model&lt;br&gt;
Once logged into Open-WebUI, click the Settings gear, navigate to Models, and type llama3 or llama3.1 into the pull-model field, then click download.&lt;br&gt;
Alternatively, you can pull and chat with a model directly from your terminal:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;docker exec -it ollama ollama run llama3.1
&lt;/code&gt;&lt;/pre&gt;
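
&lt;p&gt;Once a model is downloaded, you can manage it with the same ollama CLI inside the container. A small sketch of the most common commands:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# List every model currently stored in the ollama_data volume
docker exec -it ollama ollama list

# Download a model without starting an interactive chat
docker exec -it ollama ollama pull llama3.1

# Remove a model you no longer need to free up disk space
docker exec -it ollama ollama rm llama3.1
&lt;/code&gt;&lt;/pre&gt;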

&lt;p&gt;API Access for Developers&lt;br&gt;
Because we exposed port 11434, you can call your new local LLM server over HTTP just as you would a hosted API. Test Ollama's native generate endpoint with this simple curl request:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;curl -X POST http://localhost:11434/api/generate -d '{
  "model": "llama3.1",
  "prompt": "Why is the sky blue? Explain in one sentence.",
  "stream": false
}'
&lt;/code&gt;&lt;/pre&gt;
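
&lt;p&gt;Recent Ollama releases also expose an OpenAI-compatible endpoint under /v1, which is handy for SDKs and tools that already speak that format. A hedged sketch, assuming your Ollama version includes this compatibility layer:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Chat-completions request against Ollama's OpenAI-compatible API
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.1",
    "messages": [
      {"role": "user", "content": "Why is the sky blue? One sentence."}
    ]
  }'
&lt;/code&gt;&lt;/pre&gt;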

&lt;p&gt;6️⃣ &lt;strong&gt;Troubleshooting &amp;amp; FAQ&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Q: My GPU isn't being detected by Ollama. Tokens are generating very slowly! &lt;/p&gt;

&lt;p&gt;A: If you are on Linux with an NVIDIA card, you must install the NVIDIA Container Toolkit (the successor to the older nvidia-docker2 package). Once installed, uncomment the deploy block in docker-compose.yml and recreate the stack (docker compose up -d --force-recreate).&lt;/p&gt;
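
&lt;p&gt;On Debian/Ubuntu hosts, the install and verification typically look like the sketch below; it assumes NVIDIA's apt repository is already configured, so treat these steps as a starting point rather than a full guide:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Install the toolkit and wire it into the Docker runtime
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

# Recreate the stack with the deploy block uncommented
docker compose up -d --force-recreate

# The GPU should now be visible from inside the ollama container
docker exec -it ollama nvidia-smi
&lt;/code&gt;&lt;/pre&gt;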

&lt;p&gt;Q: I keep getting "Out of Memory" errors on my 8GB RAM machine. &lt;/p&gt;

&lt;p&gt;A: Standard 7B or 8B models might be too heavy for your system. Switch to a smaller, highly efficient model. Try pulling gemma:2b or Microsoft's phi3 inside the Open-WebUI interface.&lt;/p&gt;
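
&lt;p&gt;You can also pull the smaller models from the terminal; these tags exist in the Ollama model library, though exact names can change over time:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Compact models (roughly 2B-4B parameters) that fit comfortably in 8 GB of RAM
docker exec -it ollama ollama pull gemma:2b
docker exec -it ollama ollama pull phi3
&lt;/code&gt;&lt;/pre&gt;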

&lt;p&gt;Q: Open-WebUI says "Ollama connection failed." &lt;/p&gt;

&lt;p&gt;A: Double-check that your OLLAMA_BASE_URL is set to &lt;a href="http://ollama:11434" rel="noopener noreferrer"&gt;http://ollama:11434&lt;/a&gt; (not localhost) and that the ollama container is running without restart loops (docker ps).&lt;/p&gt;
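
&lt;p&gt;When the WebUI reports a failed connection, these checks usually narrow it down. A quick sketch using the container and service names from the compose file above:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Is the ollama container up, or stuck in a restart loop?
docker ps --filter name=ollama

# Confirm the WebUI container actually received the right base URL
docker exec -it open-webui env | grep OLLAMA_BASE_URL

# Look for errors on both sides
docker compose logs --tail 50 ollama open-webui
&lt;/code&gt;&lt;/pre&gt;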

&lt;p&gt;7️⃣ &lt;strong&gt;Security &amp;amp; Production Hardening&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk7ag0ictt97hpmnigdhr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk7ag0ictt97hpmnigdhr.png" alt=" " width="800" height="465"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you plan to expose this setup to the internet, you must harden it:&lt;br&gt;
Disable Open Signups: Once you have created your admin account, go to the WebUI Admin Panel -&amp;gt; Settings -&amp;gt; General, and turn off "Enable New User Signups".&lt;br&gt;
Backup Strategy: Regularly back up your Docker volumes. You can tarball the volumes stored under /var/lib/docker/volumes/ to keep your models and chat history safe (a sample backup command follows below).&lt;br&gt;
Reverse Proxy: Never expose port 8080 directly to the web. Route it through a proxy manager with Let's Encrypt SSL.&lt;/p&gt;
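
&lt;p&gt;A common pattern for backing up named volumes is to mount them read-only into a throwaway container and tar them into the current directory. A minimal sketch; the open-webui-stack_ prefix is an assumption based on the project folder name Compose adds, so adjust it to match docker volume ls:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Stop the stack first ("docker compose stop") for a fully consistent snapshot
# Back up both volumes into timestamped archives in the current directory
for vol in open-webui-stack_ollama_data open-webui-stack_open-webui_data; do
  docker run --rm \
    -v "$vol":/data:ro \
    -v "$(pwd)":/backup \
    alpine tar czf "/backup/${vol}_$(date +%F).tar.gz" -C /data .
done
&lt;/code&gt;&lt;/pre&gt;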

&lt;p&gt;8️⃣ &lt;strong&gt;Extending the Stack&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbftknz8fer5mfa1e3gt3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbftknz8fer5mfa1e3gt3.png" alt=" " width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Your private AI assistant doesn't have to exist in a vacuum. You can seamlessly integrate this setup with other FOSS homelab tools:&lt;br&gt;
Give it internet access: Connect Open-WebUI to SearXNG (a self-hosted metasearch engine) so your LLM can pull in live web results.&lt;br&gt;
Safe Code Execution: Integrate Open-Terminal to give your AI agents a sandboxed, browser-based shell where they can write and test code safely.&lt;/p&gt;

&lt;p&gt;9️⃣ &lt;strong&gt;Conclusion &amp;amp; Next Steps&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgbffub34jxv3f67pv4nl.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgbffub34jxv3f67pv4nl.jpg" alt=" " width="800" height="618"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You now have a fully functional, private, and persistent Ollama local LLM server with a gorgeous user interface. You've eliminated third-party privacy risks and unlocked the world of open-weight AI models.&lt;/p&gt;

</description>
      <category>docker</category>
      <category>llm</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
