This article was originally published on aicoderscope.com
TL;DR: WSL 3, previewed at Microsoft Build 2026 on June 2, swaps WSL 2's heavy Hyper-V virtual machine for a lightweight paravirtualized layer that puts Linux GPU and NPU workloads within 3–5% of bare-metal Linux speed. That matters for running Claude Code, Aider, Cline, and local Ollama models on Windows without dual-booting. The catch: the preview is locked to Copilot+ PCs with Qualcomm Snapdragon X Elite, Intel Meteor Lake, and Lunar Lake NPUs — AMD and most discrete NVIDIA desktop setups aren't on the launch list, and WSL 2 already does NVIDIA CUDA passthrough today.
What you'll be able to do after this guide:
- Decide whether WSL 3 is worth chasing now or whether your existing WSL 2 setup already does the job
- Run a Linux-native AI coding stack (Claude Code, Aider, Cline + Ollama) on Windows with GPU acceleration
- Avoid the two traps that bite developers moving their coding agents into WSL
| | WSL 2 (today) | WSL 3 (preview) | Dual-boot Linux |
|---|---|---|---|
| GPU passthrough | NVIDIA CUDA works; some virtualization overhead | Near-native, ~3–5% overhead; adds NPU | Native, zero overhead |
| Hardware at launch | Any WSL2-capable PC | Copilot+ PCs: Snapdragon X Elite, Intel Meteor/Lunar Lake | Any |
| Setup friction | wsl --install, one reboot | Windows Insider preview build required | Partition, bootloader, driver pain |
| Best for | Most devs running Ollama/CUDA on Windows now | Copilot+ laptops doing local NPU + GPU AI | Squeezing every last token/sec out of a GPU |
Honest take: If you already run an RTX desktop with WSL 2, your CUDA-backed Aider and Cline setup is fine — stay put. WSL 3 is the real upgrade for Copilot+ laptop owners who want their NPU and GPU available to Linux coding agents without dual-booting. Treat it as preview, not production.
Why this is news at all (WSL 2 already runs CUDA)
Worth clearing up first, because the headlines oversell it: WSL 2 has supported NVIDIA CUDA GPU passthrough since 2020. If you have an RTX card and run ollama run qwen2.5-coder:14b inside Ubuntu on WSL 2 today, it already uses your GPU. The driver bridge (/usr/lib/wsl/lib, the dxg kernel interface) has been shipping for years. So WSL 3 is not "finally GPU on Windows Linux." That existed.
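A quick way to confirm the bridge is there on an existing WSL 2 install is to look for the two pieces named above. The following is a minimal sketch, assuming the dxg interface appears as the device node /dev/dxg; check_wsl_gpu_bridge is a hypothetical helper name, and its optional root argument exists only so the check can be tried against a scratch directory.

```shell
# Sketch: report whether the WSL 2 GPU bridge pieces are present.
# check_wsl_gpu_bridge is a hypothetical helper, not part of WSL.
# ROOT (optional) prefixes both paths; leave it empty to check the real ones.
check_wsl_gpu_bridge() {
  root="${1:-}"
  # /dev/dxg: assumed device node for the dxg kernel interface.
  # /usr/lib/wsl/lib: directory holding the driver libraries mapped in from Windows.
  if [ -e "$root/dev/dxg" ] && [ -d "$root/usr/lib/wsl/lib" ]; then
    echo "bridge-present"
  else
    echo "bridge-missing"
  fi
}
```

Calling check_wsl_gpu_bridge with no argument inside the distro checks the real paths. A "bridge-present" result only says the plumbing exists, not that a given model is actually being served from the GPU.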
What WSL 3 actually changes, per Microsoft's Build 2026 preview, is two things. First, it replaces the full Hyper-V VM that backs WSL 2 with a lighter paravirtualized hardware access model, dropping the virtualization tax so CUDA, DirectML, ROCm, ONNX Runtime, and OpenVINO workloads land within 3–5% of a native Linux install. Second — and this is the genuinely new capability — it exposes the NPU to Linux, not just the GPU. On a Copilot+ laptop, that means PyTorch, JAX, llama.cpp, and Ollama running inside Linux can finally reach the neural accelerator that was previously Windows-only.
For an AI coding workflow, the practical read is narrow but real: if you code on a Snapdragon X Elite or Intel Lunar Lake laptop and want a Linux-native agent stack hitting local models on-device, WSL 3 is the first time that's near-native. For everyone on a desktop NVIDIA box, the win is a few percent of overhead you probably won't notice.
What WSL 3 supports at launch
The preview is restricted. From Microsoft's Build 2026 announcement and the coverage that followed:
- Hardware: Copilot+ PCs (they have an NPU) on Qualcomm Snapdragon X Elite, Intel Meteor Lake, and Intel Lunar Lake. AMD support comes later. Discrete NVIDIA desktop GPUs were not called out as part of the launch passthrough story — the headline demo was about Copilot+ NPUs and integrated graphics.
- Frameworks: CUDA, ROCm, DirectML, ONNX Runtime, and OpenVINO are all listed as supported acceleration paths. DirectML is the layer doing the heavy lifting for NPU and integrated-GPU access.
- AI tools demonstrated: Ollama, PyTorch, llama.cpp, and JAX running inside WSL with near-native acceleration.
- Distribution: preview ships through the Windows Insider Program first; Microsoft said it plans to push WSL 3 through Windows Update later, but gave no GA date.
That last point is the one to internalize. As of June 16, 2026, WSL 3 is a Windows Insider preview with no general-availability date. If your machine isn't a recent Copilot+ device, you cannot run it yet regardless of how good your GPU is.
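The launch list above boils down to a three-way allowlist. As a rough sketch only: the platform labels below are made-up shorthand for this example (nothing WSL or Windows reports in this form), and wsl3_preview_eligible is a hypothetical helper that simply encodes the list as stated in this article.

```shell
# Sketch: encode the launch hardware list from this article as an allowlist.
# The labels are illustrative shorthand, not identifiers any tool emits.
wsl3_preview_eligible() {
  case "$1" in
    snapdragon-x-elite|meteor-lake|lunar-lake)
      echo "eligible" ;;
    *)
      # AMD, discrete NVIDIA desktops, and anything else not on the list
      echo "stay-on-wsl2" ;;
  esac
}
```

If Microsoft extends the preview to more hardware, the case pattern is the only line that would need to change.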
Getting on the WSL 3 preview
There's no wsl --upgrade-to-3 button. The path is the Windows Insider Program:
# Check what you're on today (run in PowerShell; from inside WSL, call wsl.exe --version instead)
wsl --version
# Example output on a current WSL 2 install:
# WSL version: 2.6.1.0
# Kernel version: 6.6.87.2-1
# WSLg version: 1.0.66
# Windows version: 10.0.26200.8728
To get the preview build:
- Enroll the machine in the Windows Insider Program (Settings → Windows Update → Windows Insider Program) and pick the channel carrying the WSL 3 preview. Microsoft typically lands this kind of feature in the Dev or Canary channel first.
- Install the flagged build and reboot.
- After the preview lands, wsl --version reports a 3.x WSL version on supported hardware.
If you're not on a Snapdragon X Elite, Meteor Lake, or Lunar Lake machine, stop here — the preview won't activate the new passthrough path, and you gain nothing over WSL 2.
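To script the version check rather than eyeball it, the major version can be pulled out of the wsl --version text. This is a sketch under a few assumptions: the label is the English "WSL version:" shown in the example output, the input may carry NUL bytes and carriage returns (wsl.exe emits UTF-16 by default when piped), and wsl_major_version is a hypothetical helper name.

```shell
# Sketch: extract the major WSL version from `wsl --version` output on stdin.
# wsl_major_version is a hypothetical helper; it assumes the English label.
wsl_major_version() {
  # Strip NUL bytes and carriage returns left over from UTF-16 / CRLF output,
  # then split the "WSL version: X.Y.Z" line on the colon and keep X.
  tr -d '\000\r' |
    awk -F': *' '/^WSL version/ { split($2, v, "."); print v[1]; exit }'
}

# Usage from inside WSL (not run here):
#   wsl.exe --version | wsl_major_version
```

A result of 2 means the machine is still on the WSL 2 path; per the steps above, the preview should report 3 on supported hardware.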
The AI coding stack that benefits
Once you have a GPU- or NPU-accelerated Linux environment, the coding tools that gain the most are the Linux-native and CLI-first ones. Here's what to install and why each one cares about WSL.
Claude Code runs cleanly in any Linux shell, so WSL has always been a reasonable home for it. The agent itself calls Anthropic's API, so the GPU isn't doing inference — but if you pair Claude Code with a local model gateway or run local tooling (linters, test suites, build steps) that the agent drives, the near-native I/O and compute of WSL 3 cut the friction. See our Claude Code review for what it's actually good at.
Aider is terminal-native and a natural WSL citizen. Point it at a local Ollama model and the GPU passthrough is what makes that usable:
# Inside WSL, with Ollama running locally
ollama pull qwen2.5-coder:14b
aider --model ollama/qwen2.5-coder:14b
# Aider confirms the model and shows the repo map:
# Aider v0.86.0
# Model: ollama/qwen2.5-coder:14b
# Git repo: .git with 142 files
# Repo-map: using 1024 tokens
If the model you pulled runs on CPU instead of your accelerator, you'll feel it: single-digit tokens per second instead of usable speed. Our full Aider + Ollama setup guide covers model choice and the context-window trap.
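One way to tell where a loaded model is running is the PROCESSOR column of ollama ps. The sketch below assumes that column reads along the lines of "100% GPU", "100% CPU", or a split such as "48%/52% CPU/GPU"; the exact layout may differ between Ollama releases, and ollama_processor_report is a hypothetical helper name.

```shell
# Sketch: classify each loaded model as gpu, cpu, or split from `ollama ps`.
# ollama_processor_report is a hypothetical helper; the column format is an
# assumption based on typical `ollama ps` output and may vary by release.
ollama_processor_report() {
  awk 'NR > 1 && NF {
    where = "unknown"
    if ($0 ~ /CPU\/GPU/)  where = "split"   # partly offloaded
    else if ($0 ~ /GPU/)  where = "gpu"     # fully on the accelerator
    else if ($0 ~ /CPU/)  where = "cpu"     # no offload at all
    print $1, where
  }'
}

# Usage while a model is loaded (not run here):
#   ollama ps | ollama_processor_report
```

A "cpu" or "split" result for a model that should fit in memory is the cue to check the driver bridge before blaming the coding agent.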
Cline lives in VS Code, which on Windows talks to the WSL backend through the Remote-WSL extension. With a local model served from inside WSL (or LM Studio on the Windows side), Cline's agentic tool-calling runs against your accelerated Ollama instance. The Cline + LM Studio setup and Continue.dev + Ollama guide both apply directly.
The common thread: WSL 3 doesn't make the agent smarter. It makes the local inference the agent depends on fast enough to be worth using on a Windows machine — which is exactly the "local LLM + coding tool" combination most readers are searching for.
The problem I keep seeing: networking between WSL and the host
The single most common failure when wiring an editor-side agent (Cline, Continue.dev, Cursor) on Windows to a model server inside WSL isn't GPU — it's the network boundary. WSL gets its own virtual NIC, so localhost:11434 on the Windows side does not always reach Ollama running inside WSL.
The fix that works reliably: bind Ollama to all interfaces inside WSL, then point the editor at the WSL IP, not localhost.
# Inside WSL — make Ollama listen on all interfaces
export OLLAMA_HOST=0.0.0.0:11434
ollama serve
# Find the WSL IP to use f
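For pointing the editor at the WSL address rather than localhost, one possible approach is to build the base URL from what hostname -I prints inside the distro. This is a sketch, not the article's own method: ollama_base_url is a hypothetical helper, and it assumes the first address listed is the IPv4 one the Windows side can reach.

```shell
# Sketch: build an Ollama base URL from the output of `hostname -I`.
# ollama_base_url is a hypothetical helper; it assumes the first address
# in the list is the one reachable from Windows.
ollama_base_url() {
  addrs="$1"
  port="${2:-11434}"
  # Word-split the address list and keep the first entry.
  set -- $addrs
  [ -n "$1" ] || return 1
  echo "http://$1:$port"
}

# Usage (not run here):
#   Inside WSL:    ollama_base_url "$(hostname -I)"
#   From Windows:  curl <that URL>/api/tags   # lists the models Ollama has pulled
```

The WSL address can change after a restart, so an editor config that hardcodes it may need updating.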