ChatGPT Plus: $20/month. Midjourney: $10/month. GitHub Copilot: $10/month.
That's $480/year for AI tools that send every keystroke to someone else's server. I cancelled all three and replaced them with a local stack that costs nothing after the initial setup.
Here's exactly what I'm running.
## Chat: Ollama + Qwen 3.5 (replaces ChatGPT)
Qwen 3.5 9B runs on 8 GB VRAM. Qwen 3.5 35B (MoE) runs on 16 GB with 256K context — longer than ChatGPT's 128K. It handles reasoning, analysis, writing, and code generation. For math and logic, it matches GPT-4o on most benchmarks.
Gemma 4 27B is the alternative if you want native vision (describe images, analyze screenshots) without an API call.
ollama pull qwen3.5:9b
Done. Running. Forever. No subscription.
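Once the model is pulled, anything that speaks Ollama's REST API can use it. A minimal sketch against Ollama's default port (11434) and the `qwen3.5:9b` tag from above — the prompt is just an example:

```shell
# Ask the local model a question via Ollama's /api/generate endpoint.
# "stream": false returns one JSON object instead of a token stream.
curl -s http://localhost:11434/api/generate -d '{
  "model": "qwen3.5:9b",
  "prompt": "Explain mutexes in two sentences.",
  "stream": false
}'
```

Point any OpenAI-compatible client at the same server and your scripts never know the cloud is gone.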
## Images: ComfyUI + FLUX (replaces Midjourney)
FLUX.1 Dev generates images that compete with Midjourney v6. FLUX 2 Klein is the newer, faster variant. Z-Image does uncensored generation — no "I can't generate that" refusals.
The catch with ComfyUI has always been setup complexity. Model files in the wrong folder, custom nodes breaking, workflow JSONs that don't load.
I built a wrapper that handles all of it. ComfyUI auto-detection, one-click install if it's missing, model bundles with one-click download, and a Dynamic Workflow Builder that constructs the correct pipeline based on what you have installed. You never touch a workflow JSON.
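For context, the folder layout that trips people up looks roughly like this — a sketch of a stock ComfyUI tree, with exact subfolders varying by model type:

```
ComfyUI/
├── models/
│   ├── checkpoints/   # all-in-one checkpoint files
│   ├── unet/          # FLUX dev/schnell diffusion weights typically go here
│   ├── vae/           # the FLUX autoencoder
│   └── clip/          # text encoders (clip_l, t5xxl)
└── custom_nodes/      # extensions; a frequent source of breakage
```

Drop a FLUX file into `checkpoints/` when it belongs in `unet/` and the workflow silently fails to load — exactly the class of mistake the one-click bundles are meant to eliminate.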
## Code Agent: MCP Tools (replaces Copilot)
Not autocomplete — a full coding agent. It reads your project files, writes code, runs shell commands, executes tests, and iterates on errors. 13 MCP tools: file I/O, shell execution, web search, code execution, screenshots.
The difference from Copilot: it doesn't just suggest the next line. You say "add input validation to this form and write tests" and it reads the code, writes the validation, creates the test file, runs the tests, and fixes failures. Up to 20 tool iterations per task.
Works with any model. Native tool calling for Qwen, Gemma, Llama. XML fallback for everything else.
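Native tool calling, as Ollama exposes it, looks like this under the hood. The `read_file` tool here is a hypothetical illustration, not necessarily one of the app's 13 tools:

```shell
# Offer the model one tool. A tool-calling model replies with a
# structured tool_calls object (which tool, which arguments)
# instead of free-form prose.
curl -s http://localhost:11434/api/chat -d '{
  "model": "qwen3.5:9b",
  "messages": [{"role": "user", "content": "What is in src/main.rs?"}],
  "tools": [{
    "type": "function",
    "function": {
      "name": "read_file",
      "description": "Read a file from the project",
      "parameters": {
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"]
      }
    }
  }]
}'
```

The agent loop is then: execute the requested tool, append the result as a message, and call the endpoint again — up to the 20-iteration cap.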
## The App That Ties It Together
All of this runs through Locally Uncensored — one desktop app, one window, swap between chat, code agent, and image/video generation.
It auto-detects 12 local backends (Ollama, LM Studio, vLLM, KoboldCpp, etc.) and has 20+ cloud provider presets if you occasionally need a frontier model. A/B model comparison lets you test two models side by side with the same prompt.
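Backend auto-detection is mostly probing well-known local ports. A minimal sketch — the two endpoints shown are the real defaults for Ollama and LM Studio's OpenAI-compatible server, but the timeout and the idea that the app works exactly this way are my assumptions:

```shell
# Probe a backend's default local endpoint; a 2xx response means it's up.
check() {
  if curl -sf --max-time 1 "$2" > /dev/null; then
    echo "$1: running"
  else
    echo "$1: not found"
  fi
}

check "Ollama"    http://localhost:11434/api/tags
check "LM Studio" http://localhost:1234/v1/models
```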
Not Electron. Tauri v2 with a Rust backend. The app itself uses ~80 MB of RAM.
## What I Lost
Honestly? Two things.
GPT-4o's creative writing is still noticeably better than local models for fiction and marketing copy. For technical writing, code, and analysis — local models are equal or better.
Midjourney's aesthetic consistency across different prompts is hard to match locally. FLUX is technically more capable, but Midjourney has a "house style" that's effortlessly good. Locally, you need to dial in your prompts more carefully.
Everything else — speed, privacy, availability, cost — is better locally.
## Monthly Cost Comparison
| Service | Cloud | Local |
|---|---|---|
| Chat AI | $20/mo (ChatGPT Plus) | $0 |
| Image Gen | $10/mo (Midjourney) | $0 |
| Code Agent | $10/mo (Copilot) | $0 |
| Total | $40/mo ($480/yr) | $0 |
Hardware requirement: a GPU with 8+ GB VRAM. If you have a gaming PC from the last 5 years, you probably already have one.
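A quick way to check on an NVIDIA card (AMD and Apple Silicon users would use `rocm-smi` or the system report instead):

```shell
# Report total VRAM on the first GPU and whether it clears the 8 GB bar.
vram=$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits | head -n1)
if [ "$vram" -ge 8192 ]; then
  echo "OK: ${vram} MiB VRAM"
else
  echo "Below 8 GB: ${vram} MiB VRAM"
fi
```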
GitHub: PurpleDoubleD/locally-uncensored
License: AGPL-3.0 — fully open source, no telemetry, no cloud dependency.