Jovan Chan

Posted on Jun 2 • Originally published at runaihome.com

Home AI Server with Tailscale: Access Your LLM from Anywhere (2026)

#tailscale #ollama #remoteaccess #homeserver

The problem with running local AI at home: your 24/7 Ollama server sitting on the basement rig is useless when you're at a coffee shop, at the office, or visiting family. You're on a different network, and your GPU is idle.

Port forwarding is the first solution most people try. It works, but it exposes port 11434 (or 3000 for Open WebUI) to the public internet, and Ollama has no built-in API authentication. One port scan from a botnet, and someone else is burning through your GPU for free. Or worse, they're using your server to run inference on content you'd rather not host.

Tailscale solves this cleanly. It's a private WireGuard-based mesh network: you install it on your server and your laptop, they join the same encrypted virtual network, and your server gets a stable 100.x.x.x address plus a human-readable hostname. Traffic between devices is end-to-end encrypted, and no port is exposed to the public internet. Setup takes under 30 minutes.

This guide walks through the full setup for a Linux AI server: installing Tailscale, configuring Ollama to accept remote connections, testing from a client, adding Open WebUI access, and enabling MagicDNS for clean hostnames instead of raw IPs.

What you need

A Linux server running Ollama (Ubuntu 22.04 or 24.04 recommended; also works on Debian and Arch)
Ollama already installed (curl -fsSL https://ollama.com/install.sh | sh if not)
At least one client device: laptop, phone, or second machine
A free Tailscale account — the Personal plan as of the April 2026 pricing overhaul supports 6 users and unlimited devices at no cost

Windows server note: If Ollama is running on Windows, the approach is the same but you configure OLLAMA_HOST via System Properties → Environment Variables rather than systemd. The Tailscale install is a standard Windows installer from tailscale.com.

Step 1: Install Tailscale on the server

Tailscale provides a one-liner installer that handles the apt repo, package install, and daemon setup:

curl -fsSL https://tailscale.com/install.sh | sh

If you prefer the auditable manual approach on Ubuntu:

curl -fsSL "https://pkgs.tailscale.com/stable/ubuntu/$(lsb_release -cs).noarmor.gpg" \
  | sudo tee /usr/share/keyrings/tailscale-archive-keyring.gpg > /dev/null

curl -fsSL "https://pkgs.tailscale.com/stable/ubuntu/$(lsb_release -cs).tailscale-keyring.list" \
  | sudo tee /etc/apt/sources.list.d/tailscale.list

sudo apt update && sudo apt install tailscale -y

Start the daemon and authenticate:

sudo systemctl enable --now tailscaled
sudo tailscale up

This prints an authentication URL. Open it in a browser, sign in to your Tailscale account, and the server joins your tailnet. Confirm with:

tailscale ip -4

You'll see a 100.x.x.x address. That's the server's permanent Tailscale IP—it stays stable across reboots and network changes.
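If you script against this address later, a quick sanity check can confirm that a value really looks like a Tailscale address before it is used. The sketch below assumes Tailscale's usual 100.64.0.0/10 range (first octet 100, second octet 64 through 127). The function name is_tailscale_ip is made up for this example.

```shell
# is_tailscale_ip ADDR
# Succeeds when ADDR is a dotted IPv4 address inside 100.64.0.0/10.
# Hypothetical helper for illustration; not part of the tailscale CLI.
is_tailscale_ip() {
  # Reject empty input, leading/trailing dots, and any non-digit characters.
  case "$1" in
    ''|.*|*.|*[!0-9.]*) return 1 ;;
  esac

  # Split the address on dots into the positional parameters.
  old_ifs=$IFS
  IFS=.
  set -- $1
  IFS=$old_ifs

  # Exactly four non-empty octets, each no larger than 255.
  [ "$#" -eq 4 ] || return 1
  for octet in "$@"; do
    [ -n "$octet" ] && [ "$octet" -le 255 ] || return 1
  done

  # 100.64.0.0/10 covers 100.64.0.0 through 100.127.255.255.
  [ "$1" -eq 100 ] && [ "$2" -ge 64 ] && [ "$2" -le 127 ]
}
```

On the server, something like is_tailscale_ip "$(tailscale ip -4)" && echo "looks right" would exercise it against the real address.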

Step 2: Configure Ollama to accept Tailscale connections

By default, Ollama binds to 127.0.0.1:11434—localhost only. Requests from any other interface, including Tailscale, are refused. You need to change the bind address.

The correct way on systemd systems is a drop-in override file. Editing the main service file directly works but gets overwritten on Ollama upgrades—don't do that. Instead:

sudo systemctl edit ollama

This opens a blank drop-in file. Add:

[Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"

Save and exit. Then reload systemd and restart Ollama:

sudo systemctl daemon-reload
sudo systemctl restart ollama

Verify Ollama is now listening on all interfaces:

ss -tlnp | grep 11434

Output should show 0.0.0.0:11434 (some systems print the wildcard as *:11434 instead). If it still shows 127.0.0.1:11434, the override wasn't applied, so double-check that the drop-in file exists at /etc/systemd/system/ollama.service.d/override.conf.
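To make that check scriptable, the listening address can be classified from the ss output. This is a rough sketch: it assumes the usual ss -tln column layout, where the local address and port are the fourth field, and it treats 0.0.0.0, * and [::] as wildcard binds. The function name classify_ollama_bind is invented for this example.

```shell
# classify_ollama_bind
# Reads `ss -tln` (or `ss -tlnp`) output on stdin and prints one of:
#   all-interfaces, specific-address, loopback-only, not-listening
# Assumes the local address:port is the 4th whitespace-separated field.
classify_ollama_bind() {
  awk '
    $4 ~ /:11434$/ {
      addr = $4
      sub(/:11434$/, "", addr)          # keep only the address part
      if (addr == "0.0.0.0" || addr == "*" || addr == "[::]")
        wild = 1                        # wildcard bind
      else if (addr == "127.0.0.1" || addr == "[::1]")
        loop = 1                        # localhost only
      else
        other = 1                       # bound to one specific address
    }
    END {
      if (wild)       print "all-interfaces"
      else if (other) print "specific-address"
      else if (loop)  print "loopback-only"
      else            print "not-listening"
    }'
}
```

Run it as ss -tln | classify_ollama_bind on the server. Anything other than all-interfaces after Step 2 suggests the override did not take effect.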

Tighten the firewall if you're on a public-facing machine

Binding to 0.0.0.0 means Ollama listens on all interfaces, including any public NIC. If your server has a public IP (VPS, machine in DMZ), add a firewall rule to block 11434 externally while allowing it on the Tailscale interface:

sudo ufw deny in on eth0 to any port 11434
sudo ufw allow in on tailscale0 to any port 11434
sudo ufw reload

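Replace eth0 with your actual external interface name (ip link lists them). Home machines behind NAT don't strictly need this, since the router blocks inbound connections to 11434 from the internet anyway, but it's good practice. Keep in mind that with 0.0.0.0 any other device on your home LAN can also reach Ollama unless a firewall rule says otherwise.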
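If you're unsure which interface is the external one, the default route is usually a good hint. The sketch below pulls the interface name out of ip route show default output. It assumes the common "default via ... dev NAME ..." line format and returns only the first match if there are several default routes. The function name default_iface is made up for this example.

```shell
# default_iface
# Reads `ip route show default` output on stdin and prints the interface
# name that follows the "dev" keyword on the first default route.
default_iface() {
  awk '
    $1 == "default" {
      for (i = 2; i < NF; i++) {
        if ($i == "dev") {
          print $(i + 1)   # the token after "dev" is the interface
          exit
        }
      }
    }'
}
```

Used as ip route show default | default_iface, the printed name is what would replace eth0 in the ufw rule above. Check it against ip link before applying any firewall change.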

Step 3: Install Tailscale on your client devices

macOS:

brew install tailscale

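The Homebrew formula installs the command-line client and daemon, so you still need to start the daemon and run tailscale up yourself. Alternatively, download the macOS app from tailscale.com/download. Either way, sign in with the same Tailscale account as the server.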

Linux client:

curl -fsSL https://tailscale.com/install.sh | sh
sudo tailscale up

Windows client: Download the MSI installer from tailscale.com and run it. Tailscale will authenticate via browser.

iOS / Android: Install from the App Store or Play Store, sign in to your Tailscale account. Mobile access works the same as desktop once connected.

All devices on the same Tailscale account are automatically placed in the same tailnet. No configuration beyond sign-in.

Step 4: Test the connection

From your client machine, query the Ollama API using the server's Tailscale IP (the 100.x.x.x from Step 1):

curl http://100.x.x.x:11434/api/tags

You should receive a JSON list of your pulled models. If the request hangs or returns connection refused, check these in order:

Is Ollama running? sudo systemctl status ollama on the server
Is OLLAMA_HOST=0.0.0.0:11434 in the running config? Run systemctl show ollama --property=Environment on the server (systemctl status does not print environment variables)
Is the client connected to Tailscale? tailscale status should show the server as active
Can you ping the server's Tailscale IP? ping 100.x.x.x
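The checklist above can be turned into a small pass/fail runner so the first failing step stands out. This is only a sketch: run_check is a hypothetical helper, and the commented example commands are suggestions that use the same 100.x.x.x placeholder as the rest of this guide.

```shell
# run_check LABEL COMMAND [ARGS...]
# Runs COMMAND quietly and prints "PASS  LABEL" or "FAIL  LABEL".
# Returns non-zero on failure so callers can stop at the first broken step.
run_check() {
  label=$1
  shift
  if "$@" >/dev/null 2>&1; then
    printf 'PASS  %s\n' "$label"
  else
    printf 'FAIL  %s\n' "$label"
    return 1
  fi
}

# Example client-side sequence (replace 100.x.x.x with the server's address):
#   run_check "tailscale is connected"  tailscale status        &&
#   run_check "server answers ping"     ping -c 1 100.x.x.x     &&
#   run_check "Ollama API responds"     curl -fsS --max-time 5 http://100.x.x.x:11434/api/tags
```

Chaining with && stops at the first failure, which mirrors checking the list in order. The two server-side checks (service status and environment) still have to be run on the server itself.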

Run an actual inference test once the API responds:

curl http://100.x.x.x:11434/api/generate -d '{
  "model": "llama3.2:3b",
  "prompt": "Respond in exactly three words.",
  "stream": false
}'

Latency over Tailscale is typically within 5–10 ms of your raw LAN latency when both devices are on the same ISP. Cross-country connections add real latency, but for non-streaming queries it's barely noticeable next to the time the model spends generating. For streaming responses (token-by-token), even 50 ms of added latency is imperceptible.
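If latency is much higher than expected, it can help to see whether the two devices are talking directly or through one of Tailscale's relay (DERP) servers. The tailscale ping command reports this in its output. The sketch below assumes lines of the form "pong from NAME (IP) via DERP(region) in Nms" for relayed replies and "via IP:PORT" for direct ones. That format reflects the CLI as I understand it and may change between versions. The function name tailscale_path is invented for this example.

```shell
# tailscale_path
# Reads `tailscale ping HOST` output on stdin and prints:
#   direct   if at least one reply arrived over a direct connection
#   relayed  if every reply came through a DERP relay
#   unknown  if no reply lines were seen
# Assumes reply lines start with "pong from " (an assumption about CLI output).
tailscale_path() {
  awk '
    /^pong from / && / via DERP\(/  { relayed = 1 }
    /^pong from / && !/ via DERP\(/ { direct = 1 }
    END {
      if (direct)       print "direct"
      else if (relayed) print "relayed"
      else              print "unknown"
    }'
}
```

For example, tailscale ping 100.x.x.x | tailscale_path run from the client. A result of relayed usually means extra latency that no Ollama setting will fix.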

Step 5: Open WebUI over Tailscale

If you have Open WebUI running (the Docker-based chat UI covered in Open WebUI multi-user setup), remote access requires no additional config. Just hit the server's Tailscale IP on port 3000:

http://100.x.x.x:3000

This works from any device on your tailnet—laptop, phone, tablet—exactly as if you were on your home LAN.

If you're running Open WebUI on a separate machine from the Ollama server, point it at the Ollama Tailscale IP via the OLLAMA_BASE_URL environment variable:

docker run -d \
  --name open-webui \
  -e OLLAMA_BASE_URL=http://100.x.x.x:11434 \
  -p 3000:8080 \
  ghcr.io/open-webui/open-webui:main

This is cleaner than trying to route WebUI traffic through localhost when the two services aren't colocated.

Step 6: MagicDNS — hostnames instead of IPs

Remembering 100.x.x.x addresses is annoying. Tailscale's MagicDNS feature automatically assigns short hostnames to every device in your tailnet. Your server becomes ai-server (or whatever hostname the machine already has), and Tailscale resolves it to the right 100.x.x.x address on any tailnet device.

Enable it in the Tailscale admin console at admin.tailscale.com → DNS → Enable MagicDNS. Newer tailnets usually have it enabled already, in which case there is nothing to change.

After enabling, instead of:

http://100.x.x.x:11434

you can use:

http://ai-server:11434
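In scripts, it can be convenient to build the base URL in one place so that switching between the raw IP and the MagicDNS name is a one-line change. A minimal sketch follows. The function name ollama_base_url and the OLLAMA_SERVER variable are hypothetical, and ai-server is just the example hostname used above.

```shell
# ollama_base_url [HOST] [PORT]
# Prints an http:// base URL. HOST falls back to $OLLAMA_SERVER (a
# hypothetical variable for this example), then to "ai-server".
# PORT defaults to Ollama's 11434.
ollama_base_url() {
  host=${1:-${OLLAMA_SERVER:-ai-server}}
  port=${2:-11434}
  printf 'http://%s:%s\n' "$host" "$port"
}
```

With that in place, curl "$(ollama_base_url)/api/tags" targets the MagicDNS name, while ollama_base_url 100.x.x.x 3000 would produce the Open WebUI address from Step 5.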

The full tailnet hostname format is machine-name.tailNNNN.ts.net (where tailNNNN is your tailnet's unique identifier, visible in
