Running a local LLM is easy. Turning it into something useful for daily IT administration is a different story.
This article is not only about installing Ollama and Open WebUI in Docker. There are already many tutorials for that. What I wanted to build was a local assistant that could actually help with SysOps tasks: reading logs, explaining errors, preparing safe Bash commands, writing Ansible playbooks, reviewing YAML files, and helping with Docker-related troubleshooting.
My goal was to prepare a local AI setup that could later be moved to a dedicated Actina machine and used as a small internal IT assistant.
The interesting part was not the installation itself.
The interesting part was what happened after the installation.
I had to solve a few practical problems:
- how to choose a default model,
- how to route coding-related prompts to a better model,
- how to avoid manually switching models in the UI,
- how to make Open WebUI start with the right assistant,
- how to make the whole setup survive a VM restart,
- and how to keep the models and data persistent.
This is Part 1 of the setup.
Final goal
The final idea was simple:
I wanted to open Open WebUI, select one assistant, and use it for IT administration.
Under the hood, however, I wanted different models to be used depending on the task.
For example:
General chat, logs, CSV analysis, explanations -> qwen3:8b
Bash, Python, Docker, Ansible, YAML, errors -> qwen2.5-coder:7b
So instead of manually switching between models, I created a custom Pipe Function in Open WebUI.
From the user perspective, it looked like one model:
AUTO Local IT Assistant
But internally it worked as a simple model router.
Environment overview
The local setup was based on Docker Compose.
The project directory looked like this:
/home/kkadzielawa/ai-local
├── docker-compose.yml
├── .env
└── data
├── ollama
└── open-webui
The main containers were:
ollama
open-webui
The local models were:
qwen3:8b
qwen2.5-coder:7b
The idea was:
qwen3:8b -> general local IT assistant
qwen2.5-coder:7b -> coding, scripting, Docker, Ansible, YAML
At first, I thought that setting a default model would be enough.
It was not.
Problem 1: A default model is not enough
The first approach was to configure Open WebUI with a default model.
In Docker Compose, this can be done using environment variables like this:
environment:
- DEFAULT_MODELS=qwen3:8b
- DEFAULT_PINNED_MODELS=qwen3:8b,qwen2.5-coder:7b
This works, but only solves one problem:
Which model should be selected by default?
It does not solve the more important problem:
Which model is best for this specific prompt?
For example, this prompt is probably fine for a general model:
Explain what should be checked before upgrading a Debian server in production.
But this one should go to a coding-oriented model:
Write an Ansible playbook that checks hostname, uptime, OS version and free disk space.
Read-only tasks only. Do not modify the system.
I did not want to remember which model to select every time.
I wanted Open WebUI to make that decision automatically.
So I needed a custom layer between the user prompt and the actual model.
Problem 2: Tools, Pipelines or Functions?
This was one of the first confusing parts.
Open WebUI has several places that may look similar at first:
Workspace -> Tools
Settings -> Pipelines
Admin -> Functions
My first assumption was that this kind of logic should be created as a tool or pipeline.
But that was not the right direction.
A model router is not a normal tool called by the model during a conversation. It is not something like a calculator, web search or custom API tool.
What I needed was something that behaves like a model in the UI, receives the user request, decides what to do with it, and then forwards it to the selected real model.
The right place was:
Admin Settings
-> Functions
-> New Function
And the right type of function was a Pipe Function.
Problem 3: Open WebUI showed an Example Filter, but I needed a Pipe
When creating a new function, Open WebUI may show an example based on a filter.
That can be misleading.
For this use case, a filter was not what I needed.
A filter can modify input or output, but I wanted to create something that appears as a selectable model and routes the request to another model.
So the function had to be based on:
class Pipe:
not:
class Filter:
This was an important detail.
The custom function was named:
AUTO Local IT Assistant
The function had its own configurable valves:
Default Model: qwen3:8b
Coder Model: qwen2.5-coder:7b
Show Selected Model: false
This allowed me to change the underlying models later without rewriting the whole function.
The routing logic
The first version of the router was intentionally simple.
It checked the user message and looked for keywords or patterns suggesting that the prompt was related to coding, scripting or infrastructure automation.
Examples of keywords:
ansible
playbook
yaml
bash
python
powershell
docker compose
terraform
traceback
exception
stack trace
Examples of patterns:
docker ...
systemctl
journalctl
apt install
def function_name(...)
class Something
#!/bin/bash
- hosts:
tasks:
If the prompt matched coding or automation-related patterns, the function routed it to:
qwen2.5-coder:7b
Otherwise, it used:
qwen3:8b
This is not an advanced agent.
It is a simple and predictable router.
And that was exactly what I wanted at this stage.
For a local IT assistant, predictable behavior is often better than clever behavior that is hard to debug.
Adding a safety-oriented system prompt
Because this assistant is meant for IT administration, I did not want it to randomly suggest destructive commands.
So I added a system prompt with practical safety rules.
The assistant should:
- answer in Polish,
- classify commands as read-only, system-changing or risky,
- suggest diagnostics before making production changes,
- avoid destructive commands without backups,
- explain Bash, Python and Ansible snippets,
- prefer safe troubleshooting steps first.
This is a very small addition, but it changes the behavior a lot.
For example, instead of immediately suggesting a dangerous command, the assistant should first propose read-only diagnostics:
df -h
free -m
uptime
systemctl status docker
journalctl -u docker --no-pager -n 100
For me, this is one of the most important parts of building a local SysOps assistant.
The model should not only answer.
It should answer in a way that matches how administrators actually work with production systems.
Problem 4: The router existed, but Open WebUI still showed qwen3
After creating the Pipe Function, I wanted Open WebUI to start with the router selected by default.
So I changed the Docker Compose environment variables to something like this:
environment:
- DEFAULT_MODELS=auto-local-it-assistant
- DEFAULT_PINNED_MODELS=auto-local-it-assistant,qwen3:8b,qwen2.5-coder:7b
At first, it looked like it did not work.
Open WebUI still showed:
qwen3:8b
This was confusing, because the configuration seemed correct.
The reason was simple but easy to miss:
Open WebUI can remember the model selected in an existing chat or user session.
So the fix was not another Docker Compose change.
The fix was:
1. Hard refresh the browser with CTRL + F5
2. Start a new chat
3. Search for "auto" in the model selector
4. Select AUTO Local IT Assistant
After that, the router worked as expected.
The lesson here was important:
DEFAULT_MODELS affects the default state,
but existing chats or user sessions may still remember the old model.
This is a very practical troubleshooting point.
Without knowing this, it is easy to waste time restarting containers, changing YAML files or assuming the Pipe Function is broken.
Problem 5: Containers must start after VM reboot
Another issue was more operational.
This setup was running on a VM, and I did not want the local assistant to disappear after a reboot.
The solution had two parts.
First, Docker and containerd should start with the system:
sudo systemctl enable docker.service
sudo systemctl enable containerd.service
Second, the containers should have a proper restart policy.
In docker-compose.yml:
services:
ollama:
image: ollama/ollama:latest
container_name: ollama
volumes:
- ./data/ollama:/root/.ollama
restart: unless-stopped
open-webui:
image: ghcr.io/open-webui/open-webui:main
container_name: open-webui
volumes:
- ./data/open-webui:/app/backend/data
restart: unless-stopped
I used:
restart: unless-stopped
instead of:
restart: always
because I prefer this behavior for a local admin tool.
If I manually stop the container, I usually mean it. I do not want Docker to bring it back immediately against my decision.
To verify the restart policy:
docker inspect -f '{{.HostConfig.RestartPolicy.Name}}' ollama
docker inspect -f '{{.HostConfig.RestartPolicy.Name}}' open-webui
Expected result:
unless-stopped
unless-stopped
Then the real test:
sudo reboot
After the VM comes back:
docker ps
Both containers should be running again.
Problem 6: Models and data must survive container recreation
Another thing that matters a lot with Ollama and Open WebUI is persistence.
I did not want downloaded models or Open WebUI data to live only inside containers.
That is why I used local directories mounted into the containers:
volumes:
- ./data/ollama:/root/.ollama
and:
volumes:
- ./data/open-webui:/app/backend/data
This gives me a clean project structure:
/home/kkadzielawa/ai-local
├── docker-compose.yml
├── .env
└── data
├── ollama
└── open-webui
It also makes the setup easier to back up or move to another machine:
cd ~
tar -czf ai-local-backup-$(date +%F).tar.gz ai-local
This was important because the environment was prepared with a future migration in mind.
The VM was only the first step.
The target was a dedicated machine for local AI-assisted administration.
Testing the assistant
I tested the router with a few different prompts.
A general SysOps question:
Explain what should be checked before upgrading a Debian server in production.
Expected model:
qwen3:8b
A Bash-related prompt:
Write a Bash script that checks CPU, RAM, disks, IP addresses and whether Docker is running.
Expected model:
qwen2.5-coder:7b
An Ansible prompt:
Write an Ansible playbook that checks hostname, uptime, OS version and free disk space.
Read-only tasks only.
Expected model:
qwen2.5-coder:7b
A YAML troubleshooting prompt:
Check this docker-compose.yml and explain why Open WebUI cannot reach Ollama.
Expected model:
qwen2.5-coder:7b
A log analysis prompt:
Analyze this log and tell me what looks suspicious.
Expected model:
qwen3:8b
At this stage, the routing does not need to be perfect.
It needs to be predictable and useful.
That is enough for Part 1.
What I learned
The biggest lesson from this setup is:
Local AI is not only about running a model.
It is about building the workflow around the model.
Ollama and Open WebUI give you a very good starting point.
But the real value starts when you customize the environment for your own use case.
In my case, the most important improvements were:
- one visible assistant in the UI,
- automatic routing between a general model and a coding model,
- a safety-oriented system prompt,
- persistent model and Open WebUI data,
- container autostart after VM reboot,
- a structure that can be moved to another machine later.
The most useful parts were not the obvious installation steps.
The most useful parts were the small issues found along the way:
- a model router is not a normal Tool,
- the correct Open WebUI feature is a Function,
- the function should be a Pipe, not a Filter,
- Open WebUI may remember the previous model in an existing chat,
- DEFAULT_MODELS may look broken until a new chat is created,
- restart policies should be verified, not assumed,
- persistent volumes are mandatory if the setup is meant to survive recreation.
These are the details that usually do not appear in a clean installation guide.
But these are exactly the details that matter when building something usable.
What comes next
This is only the first part of the project.
The current setup gives me a local IT assistant with simple model routing.
But there are many possible next steps.
The most obvious one is adding a vision model.
That would allow the assistant to handle screenshots, UI errors, diagrams or monitoring views.
For example:
screenshot or image -> vision model
code or automation -> coder model
general IT question -> general model
Another important step is RAG.
I want the assistant to work with local documentation, internal notes, procedures, PDFs, CSV exports and troubleshooting guides.
That would require adding an embedding model and a local knowledge base.
Possible future improvements:
1. Add a vision model for screenshots and UI troubleshooting
2. Add a local knowledge base with RAG
3. Add an embedding model for document search
4. Improve the routing logic
5. Add monitoring for Ollama and Open WebUI
6. Add backups for models and Open WebUI data
7. Add reverse proxy and HTTPS
8. Restrict access to VPN or internal LAN only
9. Build more specialized assistants for specific IT tasks
So this is not a finished platform.
It is the base layer.
The first working version of a local SysOps assistant.
And for me, this is the most interesting part of local AI: not just running a model, but slowly turning it into a practical internal tool.
Top comments (0)