ChatGPT Plus: $20/month. Midjourney: $10/month. GitHub Copilot: $10/month.
That's $480/year for AI tools that send every keystroke to someone else's server. I cancelled all three and replaced them with a local stack that costs nothing after the initial setup.
Here's exactly what I'm running.
## Chat: Ollama + Qwen 3.5 (replaces ChatGPT)
Qwen 3.5 9B runs on 8 GB VRAM. Qwen 3.5 35B (MoE) runs on 16 GB with 256K context — longer than ChatGPT's 128K. It handles reasoning, analysis, writing, and code generation. For math and logic, it matches GPT-4o on most benchmarks.
Gemma 4 27B is the alternative if you want native vision (describe images, analyze screenshots) without an API call.
ollama pull qwen3.5:9b
Done. Running. Forever. No subscription.
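Once the model is pulled, anything that speaks Ollama's REST API can use it. A minimal sketch against Ollama's default port (11434) and the `qwen3.5:9b` tag from above — the prompt is just an example:

```shell
# Ask the local model a question via Ollama's /api/generate endpoint.
# "stream": false returns one JSON object instead of a token stream.
curl -s http://localhost:11434/api/generate -d '{
  "model": "qwen3.5:9b",
  "prompt": "Explain mutexes in two sentences.",
  "stream": false
}'
```

Point any OpenAI-compatible client at the same server and your scripts never know the cloud is gone.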
## Images: ComfyUI + FLUX (replaces Midjourney)
FLUX.1 Dev generates images that compete with Midjourney v6. FLUX 2 Klein is the newer, faster variant. Z-Image does uncensored generation — no "I can't generate that" refusals.
The catch with ComfyUI has always been setup complexity. Model files in the wrong folder, custom nodes breaking, workflow JSONs that don't load.
I built a wrapper that handles all of it. ComfyUI auto-detection, one-click install if it's missing, model bundles with one-click download, and a Dynamic Workflow Builder that constructs the correct pipeline based on what you have installed. You never touch a workflow JSON.
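For context, the folder layout that trips people up looks roughly like this — a sketch of a stock ComfyUI tree, with exact subfolders varying by model type:

```
ComfyUI/
├── models/
│   ├── checkpoints/   # all-in-one checkpoint files
│   ├── unet/          # FLUX dev/schnell diffusion weights typically go here
│   ├── vae/           # the FLUX autoencoder
│   └── clip/          # text encoders (clip_l, t5xxl)
└── custom_nodes/      # extensions; a frequent source of breakage
```

Drop a FLUX file into `checkpoints/` when it belongs in `unet/` and the workflow silently fails to load — exactly the class of mistake the one-click bundles are meant to eliminate.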
## Code Agent: MCP Tools (replaces Copilot)
Not autocomplete — a full coding agent. It reads your project files, writes code, runs shell commands, executes tests, and iterates on errors. 13 MCP tools: file I/O, shell execution, web search, code execution, screenshots.
The difference from Copilot: it doesn't just suggest the next line. You say "add input validation to this form and write tests" and it reads the code, writes the validation, creates the test file, runs the tests, and fixes failures. Up to 20 tool iterations per task.
Works with any model. Native tool calling for Qwen, Gemma, Llama. XML fallback for everything else.
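Native tool calling, as Ollama exposes it, looks like this under the hood. The `read_file` tool here is a hypothetical illustration, not necessarily one of the app's 13 tools:

```shell
# Offer the model one tool. A tool-calling model replies with a
# structured tool_calls object (which tool, which arguments)
# instead of free-form prose.
curl -s http://localhost:11434/api/chat -d '{
  "model": "qwen3.5:9b",
  "messages": [{"role": "user", "content": "What is in src/main.rs?"}],
  "tools": [{
    "type": "function",
    "function": {
      "name": "read_file",
      "description": "Read a file from the project",
      "parameters": {
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"]
      }
    }
  }]
}'
```

The agent loop is then: execute the requested tool, append the result as a message, and call the endpoint again — up to the 20-iteration cap.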
## The App That Ties It Together
All of this runs through Locally Uncensored — one desktop app, one window, swap between chat, code agent, and image/video generation.
It auto-detects 12 local backends (Ollama, LM Studio, vLLM, KoboldCpp, etc.) and has 20+ cloud provider presets if you occasionally need a frontier model. A/B model comparison lets you test two models side by side with the same prompt.
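Backend auto-detection is mostly probing well-known local ports. A minimal sketch — the two endpoints shown are the real defaults for Ollama and LM Studio's OpenAI-compatible server, but the timeout and the idea that the app works exactly this way are my assumptions:

```shell
# Probe a backend's default local endpoint; a 2xx response means it's up.
check() {
  if curl -sf --max-time 1 "$2" > /dev/null; then
    echo "$1: running"
  else
    echo "$1: not found"
  fi
}

check "Ollama"    http://localhost:11434/api/tags
check "LM Studio" http://localhost:1234/v1/models
```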
Not Electron. Tauri v2 with a Rust backend. The app itself uses ~80 MB of RAM.
## What I Lost
Honestly? Two things.
GPT-4o's creative writing is still noticeably better than local models for fiction and marketing copy. For technical writing, code, and analysis — local models are equal or better.
Midjourney's aesthetic consistency across different prompts is hard to match locally. FLUX is technically more capable, but Midjourney has a "house style" that's effortlessly good. Locally, you need to dial in your prompts more carefully.
Everything else — speed, privacy, availability, cost — is better locally.
## Monthly Cost Comparison
| Service | Cloud | Local |
|---|---|---|
| Chat AI | $20/mo (ChatGPT Plus) | $0 |
| Image Gen | $10/mo (Midjourney) | $0 |
| Code Agent | $10/mo (Copilot) | $0 |
| Total | $40/mo ($480/yr) | $0 |
Hardware requirement: a GPU with 8+ GB VRAM. If you have a gaming PC from the last 5 years, you probably already have one.
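A quick way to check on an NVIDIA card (AMD and Apple Silicon users would use `rocm-smi` or the system report instead):

```shell
# Report total VRAM on the first GPU and whether it clears the 8 GB bar.
vram=$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits | head -n1)
if [ "$vram" -ge 8192 ]; then
  echo "OK: ${vram} MiB VRAM"
else
  echo "Below 8 GB: ${vram} MiB VRAM"
fi
```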
GitHub: PurpleDoubleD/locally-uncensored
License: AGPL-3.0 — fully open source, no telemetry, no cloud dependency.