I tried running Qwen3.6 locally for a week — here's why I went back to a $2/month API
Qwen3.6-35B-A3B is genuinely impressive. 917 points on Hacker News. An open-source model that rivals GPT-4 on several benchmarks. The dream of free, private, local AI — finally real?
I spent a week running it. Here's what actually happened.
The Setup
I have a decent machine — 32GB RAM, an RTX 4070 Ti with 12GB of VRAM. Not top-tier, but respectable. Enough, on paper, to run a 35B model at Q4 quantization.
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
# Pull Qwen3.6
ollama pull qwen3.6:35b-a3b
# Run it
ollama run qwen3.6:35b-a3b
First response: 47 seconds.
For a simple "explain this function" prompt.
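That latency has a likely explanation: a 35B model at Q4 doesn't actually fit in a 4070 Ti's 12GB of VRAM, so Ollama offloads layers to system RAM and the CPU. Here's a rough back-of-envelope check — the ~0.5 bytes per weight at Q4 and the ~15% overhead for KV cache and buffers are ballpark assumptions, not measured figures for this model:

```shell
# Rough Q4 memory footprint: parameters (billions) * ~0.5 bytes/weight, plus ~15% overhead.
# Both ratios are ballpark assumptions, not exact numbers for this model.
PARAMS_B=35
NEEDED_GB=$(awk -v p="$PARAMS_B" 'BEGIN{printf "%.1f", p * 0.5 * 1.15}')
VRAM_GB=12  # RTX 4070 Ti
echo "~${NEEDED_GB} GB needed vs ${VRAM_GB} GB of VRAM"
```

Whatever spills past VRAM runs at system-memory speed, and that gap is where most of those 47 seconds go.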
The Real Costs
Electricity
RTX 4070 Ti at full load: ~285W. Plus CPU, fans, RAM: ~380W total.
A week of 4 hours/day usage:
- 380W × 4h × 7 days = 10.6 kWh
- At $0.12/kWh = $1.27 just in electricity
That's already more than half of a $2/month API bill, from a single week of casual use.
If you're in a country with higher electricity costs (Germany: $0.36/kWh, Australia: $0.29/kWh), that week of local inference costs $3.80 — almost 2 months of a $2/month API subscription.
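The arithmetic above is easy to reproduce. Here it is as a small shell snippet — the wattage, hours, and rate are this post's assumed figures, so swap in your own:

```shell
# Weekly electricity cost of local inference, using this post's assumptions.
WATTS=380          # total system draw under load
HOURS_PER_DAY=4
DAYS=7
RATE=0.12          # $/kWh; try 0.36 for Germany, 0.29 for Australia
KWH=$(awk -v w="$WATTS" -v h="$HOURS_PER_DAY" -v d="$DAYS" 'BEGIN{printf "%.1f", w * h * d / 1000}')
COST=$(awk -v k="$KWH" -v r="$RATE" 'BEGIN{printf "%.2f", k * r}')
echo "${KWH} kWh -> \$${COST}"
```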
Time Tax
Every prompt: 15-47 seconds to first token. Compare:
- $2/month Claude API via SimplyLouie: 1.2 seconds average
- Local Qwen3.6: 15-47 seconds
If you run 20 prompts per day, you're spending roughly 5-16 minutes a day just waiting for responses. That's 35-110 minutes per week.
At even minimum wage, that's real money.
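The waiting-time figures follow directly from the per-prompt latencies — 20 prompts/day and the 15-47 second range are the assumptions from above:

```shell
# Weekly time spent waiting on first tokens, best and worst case.
PROMPTS_PER_DAY=20
DAYS=7
LOW_S=15   # fastest observed time to first response
HIGH_S=47  # slowest observed
LOW_MIN=$(awk -v p="$PROMPTS_PER_DAY" -v s="$LOW_S" -v d="$DAYS" 'BEGIN{printf "%.0f", p * s * d / 60}')
HIGH_MIN=$(awk -v p="$PROMPTS_PER_DAY" -v s="$HIGH_S" -v d="$DAYS" 'BEGIN{printf "%.0f", p * s * d / 60}')
echo "${LOW_MIN}-${HIGH_MIN} minutes/week waiting"
```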
Thermal Impact
My GPU ran at a sustained 83°C. That's not great for the hardware's lifespan: high temperatures held for hours at a time accelerate wear on the GPU core and its VRAM.
The "free" model has hidden hardware depreciation costs.
When Local AI Actually Makes Sense
I'm not anti-local-AI. There are legitimate use cases:
- Sensitive data you cannot send to any cloud (medical records, classified code, NDA'd projects)
- Offline development (no internet, air-gapped systems)
- Research/experimentation (you want to understand the model internals)
- High-volume batch processing (millions of prompts where API costs would be enormous)
But for most developers doing daily coding work? The economics don't hold.
The Actual Math for Daily Dev Work
Let me be direct about what I use AI for daily:
- Code review (explain what this function does)
- Writing tests (generate Jest tests for this module)
- Debugging (why is this throwing a TypeError)
- Documentation (write JSDoc for this API)
- Email drafts (professional response to client complaint)
For these tasks:
| Option | Cost | Speed | Privacy |
|---|---|---|---|
| ChatGPT Plus | $20/month | Fast | Low |
| Local Qwen3.6 | "Free" + electricity + hardware wear | Slow | High |
| $2/month Claude API | $2/month | Fast | Medium |
| Run nothing | $0 | N/A | Perfect |
The $2/month API wins on the cost/speed tradeoff for casual daily use.
What I Actually Use Now
I kept Ollama installed. I use it for things that are genuinely sensitive — API keys in code, client NDA work, anything I wouldn't want trained on.
For everything else, I use SimplyLouie's $2/month Claude API — it's Claude Sonnet via a simple REST API, no rate limit gymnastics, no subscription tier confusion.
curl https://simplylouie.com/api/chat \
-H "Authorization: Bearer YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{"message": "explain this function", "context": "const fn = (x) => x.reduce((a,b) => a+b, 0)"}'
Response in 1.2 seconds. $2/month total. It runs on Anthropic's infrastructure, not my GPU.
The Honest Verdict
Qwen3.6 is a landmark model. The open-source AI community deserves credit — this is genuinely impressive work.
But "open source" and "free" are different things. Running a 35B model locally has real costs: electricity, time, hardware wear, setup complexity.
For developers in markets where $20/month ChatGPT is genuinely unaffordable — Nigeria, Philippines, Indonesia, Kenya, India — a $2/month hosted API is often the better tradeoff than local inference on aging hardware.
- Nigeria: ₦3,200/month (vs ₦32,000 for ChatGPT)
- Philippines: ₱112/month (vs ₱1,120 for ChatGPT)
- Indonesia: Rp32,000/month (vs Rp320,000 for ChatGPT)
- Kenya: KSh260/month (vs KSh2,600 for ChatGPT)
- India: ₹165/month (vs ₹1,600+ for ChatGPT)
If you have the hardware and the use case, run local. If you're doing daily dev work and want fast, cheap, reliable AI — the $2/month API is the more honest choice.
SimplyLouie is a $2/month Claude API. 50% of revenue goes to animal rescue. simplylouie.com