The first wave of content here on Vibes Coder was meta by design: a blog about building a blog, from a cabana in Cabo, on an iPhone. But that was always just the foundation. The thing I actually want to explore is local and self-hosted AI, and that starts with infrastructure.
This post is the journey from "I should build a home lab" to a fully running Coder server with GitHub integration, workspace templates, multi-user support, and AI agents that are genuinely useful out of the box. Everything here was done conversationally through Coder Agents.
Why Self-Hosted, Why Now
We're in the middle of an explosion in local hardware capabilities. Apple's shipped insanely powerful M-series silicon for generations. Qualcomm's latest Snapdragon Elite processors are serious. NVIDIA keeps pushing consumer GPUs with more VRAM, and is now getting into CPUs with the N1 chips. The combination of CPUs, GPUs, and NPUs available today far exceeds what standard productivity apps actually require.
It's pretty clear where this is heading: sophisticated LLMs running directly on our devices. I genuinely believe the future is Siri interfacing with a local model on an iPhone. A self-hosted home lab is the best approximation for testing that future before on-device capabilities go mainstream.
So I broke this into three phases:
- Get the hardware capable of real inference
- Build the dev environment to work with it (Coder, agents, templates)
- Run local models and wire them into the coding workflow
This post covers phases 1 and 2. Phase 3 is the next post.
The Hardware Hack: Buy a Gaming PC
How do you get a machine powerful enough for serious AI work when RAM, storage, and GPU prices are brutal?
Buy a pre-built gaming PC.
Individually sourcing components means paying extreme markups, thanks to AI's ripple effects on GPUs, memory, and storage. But gaming PCs built a few months ago with all the latest parts are just sitting on shelves at Best Buy, Newegg, and Micro Center. These complete systems are actually worth more parted out than what they're selling for. The pricing is inverted.
I picked up a rig from Newegg. Here's what's inside:
| Category | Component | Spec |
|---|---|---|
| CPU | AMD Ryzen 9 9950X3D | 16-core Zen 5, 5.75 GHz boost |
| GPU | Zotac RTX 5090 | 32 GB GDDR7 |
| RAM | G.Skill Trident Z5 RGB | 64 GB (2x32 GB) DDR5-6000 |
| Storage | Samsung 9100 Pro | 2 TB Gen5 NVMe |
| PSU | Thermaltake Toughpower GT | 1200W 80+ Gold ATX 3.1 |
| OS | Ubuntu 24.04 LTS | NVIDIA driver 590.48.01 |
The spec that matters most for local LLMs is VRAM. The 32 GB on the RTX 5090 is the sweet spot: enough to run 27B-35B parameter models at full quality, or 70B models at aggressive quantization. The 64 GB of system RAM provides headroom for KV cache spillover, and the 2 TB NVMe means models load fast and there's plenty of room to store them. More on all of that in the next post. Now I have a capable AI workstation, in all its RGB puke glory.
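As a rough sanity check on those numbers: a model's weight footprint is roughly parameter count times bytes per parameter, plus overhead for the KV cache and runtime buffers. The ~15% overhead factor below is my assumption, and it grows with context length:

```bash
# Rough VRAM need: params (billions) x bytes/param x ~1.15 overhead (assumed)
awk 'BEGIN { printf "32B @ 4-bit: %.1f GB\n", 32 * 0.5 * 1.15 }'    # ~18.4 GB: fits easily
awk 'BEGIN { printf "70B @ 3-bit: %.1f GB\n", 70 * 0.375 * 1.15 }'  # ~30.2 GB: just fits
```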
But a powerful machine sitting in a closet isn't useful until you can actually develop on it. That's where Coder comes in.
Standing Up Coder
I installed Ubuntu on the workstation. Why Ubuntu? It has the most documentation and is often what surfaces first in searches; basically, it's the most agent-friendly distro. I didn't want an agent troubleshooting my deployment to conflate solutions meant for Mint or Pop!_OS. The install was pretty straightforward minus a snafu getting the RTX 5090 drivers: turns out you have to install the open kernel modules, not the NVIDIA proprietary ones. Thankfully my motherboard had a built-in HDMI port I could use with the Ryzen's iGPU in the meantime.
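For anyone hitting the same wall, the fix looked roughly like this (a sketch; the driver series number is illustrative, and `ubuntu-drivers devices` will tell you which open variant your card wants):

```bash
# List detected hardware and the recommended driver packages
sudo ubuntu-drivers devices

# Install the -open variant, not the proprietary build
# (series number shown here is illustrative)
sudo apt install nvidia-driver-590-open
sudo reboot

# Verify the GPU is visible after reboot
nvidia-smi
```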
15 minutes later I connected my Ubuntu workstation via a Coder tunnel. This gives me a full cloud development environment accessible from anywhere, including my phone. Workspaces run as Docker containers on the machine, each with its own isolated environment, tools, and credentials.
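For reference, the setup is Coder's documented install script plus a single command; with no access URL configured, `coder server` opens the `*.try.coder.app` tunnel on its own:

```bash
# Install the Coder binary
curl -fsSL https://coder.com/install.sh | sh

# Start the server; without an --access-url, Coder creates a free
# *.try.coder.app tunnel so the UI is reachable from anywhere
coder server
```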
The goal: anyone who creates a workspace from the template gets GitHub access, a full toolchain, and AI agents that know how to use everything, automatically. No manual setup.
GitHub Auth: The Long Way Around
The first task was connecting Coder workspaces to GitHub so agents could clone repos, commit, push, and create PRs without manual token management.
I explored three options:
- Personal Access Tokens — works but doesn't scale to multiple users
- SSH keys — same problem
- Coder External Auth (OAuth) — configure once on the server, every user authenticates through the browser with their own GitHub account
Chose option 3. Created a GitHub OAuth App, configured the callback URLs, and started fighting with the server configuration.
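For the record, the custom-app route boils down to a few server-side environment variables (Coder's documented `CODER_EXTERNAL_AUTH_*` settings; the ID and secret values are placeholders):

```bash
# /etc/coder.d/coder.env -- external auth wiring for a custom GitHub OAuth App
CODER_EXTERNAL_AUTH_0_ID=github
CODER_EXTERNAL_AUTH_0_TYPE=github
CODER_EXTERNAL_AUTH_0_CLIENT_ID=<your-oauth-app-client-id>
CODER_EXTERNAL_AUTH_0_CLIENT_SECRET=<your-oauth-app-client-secret>
```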
The first struggle: Coder wasn't running in Docker (it only used Docker for workspaces); it was running as a manual `coder server` process. The config file at `/etc/coder.d/coder.env` existed but wasn't being loaded, because the file uses `VAR=value` format without `export`, and `source` reads it into the current shell without exporting anything to child processes. I had to export the variables directly in the shell before running the server.
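The cleaner fix, which I only worked out later, is to have the shell auto-export everything the file defines:

```bash
# set -a marks every subsequently defined variable for export,
# so a plain VAR=value env file reaches child processes like coder server
set -a
source /etc/coder.d/coder.env
set +a
coder server
```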
The plot twist: after all the OAuth App setup, I discovered the Coder version I was running had a built-in default GitHub provider that was already enabled. Navigating to `/external-auth/github` just worked. I didn't even need the custom OAuth App.
Lesson: check `coder server --help` before manually configuring things. Or, realistically, ask your agent to do it for you. The answer was in the flags the whole time.
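In practice that check is one line; the exact flag names vary by Coder version, but the help output is searchable:

```bash
# Scan the server flags for anything GitHub-related
# before building your own OAuth app
coder server --help | grep -i github
```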
Wiring GitHub Into the Workspace Template
Even after authenticating, workspaces didn't automatically have GitHub credentials available. The external auth token existed but nothing told git or gh to use it.
The fix was template changes to `main.tf`:

```hcl
data "coder_external_auth" "github" {
  id = "github"
}
```
Plus injecting `GITHUB_TOKEN` into the agent's environment variables and adding a startup script that configures the git credential helper and installs the GitHub CLI.
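The startup-script half of that change looked roughly like this. It's a sketch, not the exact template: the real template configures a git credential helper, and the `insteadOf` URL rewrite below is a simpler stand-in that achieves the same thing; `gh` reads `GITHUB_TOKEN` from the environment on its own.

```bash
#!/usr/bin/env bash
# Route HTTPS GitHub traffic through the injected OAuth token
# (stand-in for the credential-helper setup described above)
git config --global url."https://oauth2:${GITHUB_TOKEN}@github.com/".insteadOf "https://github.com/"

# Install the GitHub CLI on first boot only (idempotent guard)
if ! command -v gh &> /dev/null; then
  sudo apt-get update && sudo apt-get install -y gh
fi
```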
The template workflow I learned:
```bash
mkdir -p ~/coder-templates/docker
cd ~/coder-templates/docker
coder templates pull docker .
# edit main.tf
coder templates push docker
coder update my-workspace   # critical — stop/start alone reuses the old version
```
That last line is a gotcha worth highlighting: stopping and starting a workspace doesn't update the template version. You must run `coder update` to apply new template changes to an existing workspace.
System Instructions That Actually Work
With GitHub fully wired up, agents still had a problem: they'd ask users to authenticate or provide tokens. They didn't know the environment was pre-configured.
The fix was adding system instructions in the Coder admin panel (Agents > Settings > Behavior) that apply to all users. The key points:
- GitHub access is pre-configured. Never ask users to authenticate.
- Use the `gh` CLI for all GitHub operations.
- Always commit and push. Workspaces are ephemeral; GitHub is the source of truth.
- Bias toward action. Build first, ask questions only when genuinely ambiguous.
- Do the full loop: write code, install deps, test, commit, push.
- Install tools with `sudo` as needed without asking permission.
- Don't ask "would you like me to..." for obvious next steps.
This is the difference between an agent that's technically capable and one that's actually useful. Without these instructions, every session started with five minutes of the agent asking permission to do things it already had access to do.
The Vibe Coding Toolchain
The base Docker image was missing most of what a modern coding session needs. Added to the startup script:
| Tool | Why |
|---|---|
| GitHub CLI (`gh`) | Repo management, PRs, issues from the terminal |
| Node.js + npm | Most web projects need it |
| Vercel CLI | Deploy directly from the workspace |
| uv | Fast Python package manager for new projects |
| zip, unzip, sqlite3 | Common utilities that were missing |
All installs are idempotent (`if ! command -v ... &> /dev/null`) so they only run on first boot.
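Each entry in the startup script follows the same guard pattern (a sketch; the NodeSource and Astral installer URLs are their documented one-liners, and the Node version is illustrative):

```bash
# Only install Node.js if it isn't already on the image
if ! command -v node &> /dev/null; then
  curl -fsSL https://deb.nodesource.com/setup_22.x | sudo -E bash -
  sudo apt-get install -y nodejs
fi

# Same guard for uv, using its official installer
if ! command -v uv &> /dev/null; then
  curl -LsSf https://astral.sh/uv/install.sh | sh
fi
```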
Multi-User Setup
The real test: could my partner use the same server with her own account and GitHub credentials?
Setup was three steps:
- `coder users create` on the host (one command, shown below)
- She logs in and creates a workspace from the Docker template
- She visits `/external-auth/github` once to link her GitHub account
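Step one is a single command on the host (a sketch; the username and email are placeholders, and `coder users create --help` lists the exact flags):

```bash
# Create a second account; Coder prompts for any details not passed as flags
coder users create --username partner --email partner@example.com
```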
Everything else (`gh`, git credentials, Vercel, system instructions) was automatic from the template. That's the whole point of doing this at the template level rather than per-workspace.
The Architecture
Here's what we ended up with:
```
Ubuntu AI Workstation (home lab)
├── coder server (running via tunnel)
│   ├── Built-in GitHub OAuth provider
│   ├── Agents with system instructions
│   └── Docker template
│       ├── GITHUB_TOKEN auto-injected per user
│       ├── gh CLI pre-installed
│       ├── Node.js + npm + Vercel CLI
│       ├── Python 3.12 + uv
│       └── code-server (VS Code in browser)
├── Docker (runs workspace containers)
└── Coder tunnel (*.try.coder.app)
```
Every user gets their own isolated workspace with full GitHub integration, a complete toolchain, and AI agents that know how to use all of it. The server handles auth, templates handle environment setup, and system instructions handle agent behavior.
Gotchas Worth Knowing
A few things that cost us time:
- `source` vs `export`: `source /etc/coder.d/coder.env` reads the file but doesn't export variables to child processes. If your env file doesn't use `export` statements, child processes (like `coder server`) won't see the values.
- Template versioning: stopping and starting a workspace reuses the old template version. You must run `coder update <workspace>` to pick up new template changes. This one bit us three times before it stuck.
- Agents settings vs Deployment settings: they're in completely different places in the Coder UI. Agents settings control AI behavior; deployment settings control server config. Easy to confuse.
- The built-in GitHub provider: we spent time creating a custom OAuth App before discovering Coder ships with a default GitHub provider that was already enabled. The `--help` output had the answer all along.
- Agent session refresh: after template changes that modify environment variables, you need a fresh Agents session. The running session won't pick up the new values.
What's Next: Local LLMs
The hardware is ready. The dev environment is running. But right now, all the AI work is still going through cloud APIs: Claude for blog generation, Claude for coding agents.
Tomorrow, we change that.
The RTX 5090's 32 GB of VRAM is sitting idle, and there's an entire ecosystem of open-source models that can run locally on this hardware. We're going to install Ollama, pull a stack of models purpose-built for different coding tasks, and start wiring local inference into the development workflow.
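The first steps will look something like this (Ollama's documented install script; the model tag is just an example of a coding-focused model that fits in 32 GB of VRAM):

```bash
# Install Ollama and confirm it sees the GPU
curl -fsSL https://ollama.com/install.sh | sh
ollama --version

# Pull a coding-oriented model as a first test (tag is illustrative)
ollama pull qwen2.5-coder:32b
ollama run qwen2.5-coder:32b "Write a bash one-liner that counts TODOs in a repo"
```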
If you've ever wondered what it takes to run a 35-billion-parameter model on consumer hardware, or whether local models can actually keep up with cloud APIs for real coding work, that's what we're testing next.
By the Numbers
- 1 gaming PC purchased from Newegg
- 1 Coder server running via tunnel
- 1 GitHub OAuth integration (built-in, no custom app needed)
- 1 workspace template with 6 pre-installed tools
- 2 users configured
- 3 template pushes to get everything right
- ~15 minutes debugging `export` vs `source`
- 0 lines of code written outside of Coder Agents
- 32 GB of VRAM waiting for local models