Plandex v2 review 2026: cloud shut down, self-hosted survives — is the terminal agent still worth it?

This article was originally published on aicoderscope.com

If you found Plandex through a 2024 blog post or a "best CLI coding agents" roundup, you probably noticed something odd when you tried to sign up: the cloud service is gone. Plandex AI, the company, announced in October 2025 that it was winding down its managed cloud after the founder took an engineering role at Promptfoo (itself later acquired by OpenAI in March 2026). New user registration closed October 3, 2025. Existing cloud accounts ran until November 7, 2025, then the managed service shut off entirely.

What remains is a 15.4k-star, MIT-licensed Go project: a CLI binary and a Docker-based server you can run yourself. The question for a developer in May 2026 is whether the self-hosted version is still useful, or whether Aider and Claude Code have simply lapped it.

Short answer: Plandex v2 still does one thing more cleanly than those tools. It handles enormous multi-file tasks in a single controlled session, with a cumulative diff sandbox that keeps AI changes quarantined until you explicitly apply them. For privacy-focused teams or developers running air-gapped environments, that combination is hard to replicate. For everyone else, the setup overhead and maintenance uncertainty make it hard to recommend over the actively developed alternatives.

What happened to Plandex Cloud

Plandex launched in early 2024 as a company with a managed cloud and an open-source core. The cloud offered two tiers: an integrated plan ($45/month, models included) and a BYO API Key plan ($30/month, bring your own Anthropic/OpenAI keys). Both tiers are gone.

The project has not been abandoned in the sense of being deleted or relicensed — the GitHub repo at plandex-ai/plandex is still up, the code is MIT, and the README actively documents the self-hosted path. There have been no major feature releases since v2.2.1, and the project is in maintenance mode. The plandex.ai website was returning errors at the time of writing (May 23, 2026), though the GitHub install script worked during testing.

If you depend on a vendor standing behind the product, Plandex is not that product anymore. If you are comfortable running open-source infrastructure yourself and accepting that the project's future is community-dependent rather than backed by a funded team, it is a genuinely capable tool.

What Plandex v2 actually does

Plandex is a terminal-first AI coding agent written in Go, optimized for large projects and long-running multi-step tasks. Where Aider is built around tight git integration and fast iteration on individual files or functions, Plandex is built around what it calls a "plan": a session that accumulates proposed changes across many files and lets you review, revise, or roll back before anything touches your working directory.

The core technical differentiators in v2:

2M token effective context window. With the default model pack (Claude Sonnet or equivalent), Plandex can hold up to 2M tokens of loaded file content across a session. For reference, that is enough to hold the entire source of a mid-sized production application without selective trimming.

Tree-sitter project maps up to 20M tokens. Even if your repo is too large to load fully into context, Plandex builds a syntax-aware map using tree-sitter (30+ languages supported). This gives the model structural awareness of your codebase — class hierarchies, function signatures, import graphs — without burning tokens on full file content.

Cumulative diff review sandbox. Every proposed change lands in a sandbox, not your files. You get a full diff view before anything is applied. If a 20-file refactor produces three good changes and two bad ones, you apply the three and reject the two without touching git. This is meaningfully different from Aider's default behavior, which writes directly to files (though Aider's --dry-run flag exists for similar purposes).

Configurable autonomy. You can run Plandex in full auto mode for straightforward tasks (it will run commands, apply changes, retry on failures) or drop to a fine-grained approval mode where it pauses before each action. The automated debugging loop can read terminal output and browser application state (requires Chrome) to iterate on failures without manual intervention.

Model flexibility. Plandex uses a model pack system where different roles in a task (planning, coding, summarization, error analysis) can use different models. The default pack uses Claude Sonnet for most roles, but you can configure OpenAI, Google, OpenRouter, or Ollama (local, no API key) for any role. There is also failover: if you provide keys for both a direct provider and OpenRouter, Plandex falls back automatically if the first request fails.
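
Of the differentiators above, the cumulative diff sandbox is the easiest to picture in code. The following is a conceptual sketch in plain shell, not Plandex's implementation or its CLI: the helper names (stage_change, review_change, apply_change, reject_change) are hypothetical, and a directory of staged files stands in for the sandbox. It only illustrates the idea that proposed changes sit apart from the project until each one is applied or rejected.

```shell
# Conceptual sketch only: hypothetical helpers, not Plandex commands or internals.

# stage_change <sandbox> <relative-path> <new-content>
# Writes the proposed content into the sandbox, leaving the project untouched.
stage_change() {
  mkdir -p "$1/$(dirname "$2")"
  printf '%s\n' "$3" > "$1/$2"
}

# review_change <sandbox> <project> <relative-path>
# Prints a unified diff of the pending change against the project file.
review_change() {
  diff -u "$2/$3" "$1/$3" || true   # diff exits non-zero when files differ
}

# apply_change <sandbox> <project> <relative-path>
# Copies one approved file into the project and clears it from the sandbox.
apply_change() {
  mkdir -p "$2/$(dirname "$3")"
  cp "$1/$3" "$2/$3" && rm "$1/$3"
}

# reject_change <sandbox> <relative-path>
# Discards a pending change without touching the project.
reject_change() {
  rm -f "$1/$2"
}
```

With two staged files, calling apply_change on one and reject_change on the other leaves the project with exactly one modification and an empty sandbox, with no git operations involved.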

Self-hosting in 2026: what you actually need

The self-hosted path has two modes.

Local mode (simplest): Docker Desktop installed, one command to start the server, then the CLI talks to your local server. Your API keys stay on your machine. This is the setup most individual developers will use.

# Install the CLI
curl -sL https://plandex.ai/install.sh | bash
# (or install from the GitHub releases page if plandex.ai is down)

# Start the local server with Docker
docker run -p 8099:8099 plandexai/plandex-server

# Set up your API key (example: Anthropic)
export ANTHROPIC_API_KEY=sk-ant-...

# Initialize in your project directory
cd your-project
plandex new
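
Before running the steps above, it can help to confirm that the pieces they rely on are present. The following is a minimal pre-flight sketch; missing_commands is a hypothetical helper, and the list of checks (docker, curl, and the Anthropic key from the example) is an assumption based only on the commands shown here.

```shell
# missing_commands <cmd>...
# Prints each command that is not found on PATH, one per line.
missing_commands() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
  done
}

# Report anything the local-mode steps rely on that is not present yet.
missing=$(missing_commands docker curl)
if [ -n "$missing" ]; then
  echo "Not installed: $missing"
fi
if [ -z "${ANTHROPIC_API_KEY:-}" ]; then
  echo "ANTHROPIC_API_KEY is not set"
fi
echo "Pre-flight check finished"
```

The script only reports problems and never exits early, so it is safe to paste into a terminal before starting the server.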

Remote server mode: For teams or multi-user setups, the Plandex server can run on any Linux host with Docker. You point the CLI at the server URL and manage user accounts through the API. This requires more setup (a PostgreSQL database, proper networking, and TLS if exposing the server externally) but is fully documented in the GitHub docs.

Windows is supported only through WSL. There is no native PowerShell or CMD support, and no indication that will change given the current maintenance state.

Model costs: what you actually spend

The software is free. Your API costs are the only ongoing expense.

Using the default model pack with Claude Sonnet 4.6 ($3 input / $15 output per million tokens), a typical mid-complexity task ("refactor this 800-line service to split concerns into three files") runs 50,000 to 150,000 tokens depending on how many planning rounds it takes. Billed entirely at the input rate, that is $0.15 to $0.45 per task; output tokens cost five times as much, so the real figure lands somewhat higher depending on how much code the model writes. Either way, the spend is comparable to what the same task would cost in Aider.
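
The arithmetic behind those figures can be sketched with a small shell function. task_cost is a hypothetical helper that hard-codes the $3 and $15 per-million rates quoted above. The $0.15 to $0.45 range corresponds to billing every token at the input rate, and the 120k/30k split in the last call is a made-up illustration of how output tokens move the total.

```shell
# task_cost <input-tokens> <output-tokens>
# Prints the USD cost at $3 per million input tokens and $15 per million output tokens.
task_cost() {
  awk -v in_tok="$1" -v out_tok="$2" \
    'BEGIN { printf "%.2f\n", (in_tok * 3 + out_tok * 15) / 1000000 }'
}

task_cost 50000 0        # 50k tokens, all billed as input
task_cost 150000 0       # 150k tokens, all billed as input
task_cost 120000 30000   # hypothetical split: 120k input, 30k output
```

The three calls print 0.15, 0.45, and 0.81, which shows that a task with a meaningful share of output tokens can cost roughly double the input-only estimate while still staying under a dollar.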

If you switch the model pack to use OpenRouter's free or low-cost tier models (Mistral, Llama 3, Qwen 2.5 Coder), you can run Plandex at effectively zero API cost with reduced quality. For organizations that cannot send code to cloud providers at all, the Ollama integration lets you run entirely local inference at zero per-token cost.

Context caching is used across Anthropic, OpenAI, and Google providers, which matters for long sessions where the same file content is referenced repeatedly. On Anthropic's API, cache reads are 10% of the base input rate — a 200,000-token context re-read costs $0.06 instead of $0.60.
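
To see how that discount compounds over a long session, here is a rough sketch. reread_cost is a hypothetical helper; it assumes the $3 per million input rate, treats every re-read as either fully cached or fully uncached, and ignores output tokens and any extra charge for writing to the cache.

```shell
# reread_cost <context-tokens> <rereads> <rate-multiplier>
# USD cost of re-sending the same context <rereads> times at $3 per million
# input tokens, scaled by <rate-multiplier> (1 for uncached, 0.10 for cache reads).
reread_cost() {
  awk -v tok="$1" -v n="$2" -v mult="$3" \
    'BEGIN { printf "%.2f\n", tok * n * 3 / 1000000 * mult }'
}

reread_cost 200000 1 1       # one uncached re-read
reread_cost 200000 1 0.10    # one cached re-read
reread_cost 200000 10 1      # ten uncached re-reads
reread_cost 200000 10 0.10   # ten cached re-reads
```

The calls print 0.60, 0.06, 6.00, and 0.60: ten cached re-reads of a 200,000-token context cost about the same as a single uncached one.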

Where Plandex still beats the alternatives

Three specific scenarios where Plandex's self-hosted approach is the right call in 2026:

Large monorepos with strict privacy requirements. If your codebase cannot touch third-party cloud infrastructure (financial services, healthcare, pre-launch IP), self-hosted Plandex with Ollama gives you a fully local agentic coding workflow. Aider can do this too, but Plandex's tree-sitter project maps are better suited to very large codebases where you cannot fit the whole repo in context.

Long multi-step tasks requiring rollback granularity. The diff sandbox is genuinely useful when you are asking an agent to make many changes across many files and you want fine-grained control over what gets applied. Claude Code's agentic mode applies changes as it goes; rolling back requires manual git commands. Plandex's sandbox makes partial acceptance of a multi-file plan str