The debate that's been heating up HN this week
You've seen the posts: someone runs a benchmark, a local GPU beats Claude Sonnet on some coding task, and the comments explode.
"Just buy a GPU. You'll save money in the long run."
I wanted to actually do the math. Here's what I found.
The upfront cost nobody talks about
Decent hardware for local inference:
- RTX 4090: ~$1,600
- RTX 3090 (used): ~$500-700
- MacBook Pro M3 Max (if you're using it for AI): ~$3,500
At $2/month for API access, you'd need 250-1,750 months (20-145 years) to break even on hardware costs alone.
Yes, that's ignoring electricity. And cooling. And the time you spend managing the setup.
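To make the break-even math above concrete, here's a minimal sketch. The figures ($500-$3,500 hardware, $2/month API) come from the post; the $15/month electricity number is an illustrative assumption:

```python
# Back-of-the-envelope break-even: months until a local GPU
# pays for itself versus a flat-rate API subscription.

def breakeven_months(hardware_cost, api_per_month, electricity_per_month=0.0):
    """Months until cumulative API spend exceeds hardware + running costs.

    Returns None if the local setup's running costs alone exceed the
    API fee (i.e. you never break even).
    """
    monthly_saving = api_per_month - electricity_per_month
    if monthly_saving <= 0:
        return None  # local costs more every month; no break-even point
    return hardware_cost / monthly_saving

# Hardware alone, ignoring electricity:
print(breakeven_months(500, 2))    # 250.0 months (~20 years)
print(breakeven_months(3500, 2))   # 1750.0 months (~145 years)

# Add even $15/month of electricity and the answer flips:
print(breakeven_months(500, 2, electricity_per_month=15))  # None
```

Note the asymmetry: once local running costs exceed the flat API fee, no amount of time recovers the hardware spend.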
When local models actually win
I'm not being unfair to local inference. It genuinely wins when:
- You're processing millions of requests/day — at scale, API costs compound fast
- You need offline access — rural areas, air-gapped systems, travel
- Your use case requires 100% data privacy — nothing leaves your machine
- You're a researcher who needs to fine-tune or inspect weights
For these cases, local is the right call.
When API wins (most developers)
Here's the thing: most developers aren't running millions of requests. They're:
- Building a side project
- Automating personal workflows
- Learning prompt engineering
- Shipping a product to early users
For these use cases, the math is completely different.
```bash
# The simplest possible Claude API call
curl https://api.simplylouie.com/v1/chat \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "explain async/await in Python"}]
  }'
```
Total cost of that call: fractions of a cent, included in your $2/month.
Total cost to run that same call on a $500 GPU: $500, plus the time to install Ollama, the time to download a model, and the electricity to run it.
The real comparison
| | Local GPU | $2/month API |
|---|---|---|
| Upfront cost | $500-$1,600 | $0 |
| Monthly cost | ~$15-30 (electricity) | $2 |
| Setup time | 2-8 hours | 5 minutes |
| Maintenance | Regular (driver updates, model updates) | None |
| Access from phone | No (unless you self-host) | Yes |
| Model quality | Depends on VRAM | Claude Sonnet |
| Break-even | Never (for light users) | Already won |
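The electricity row in the table can be sanity-checked with a rough estimator. The wattage figures and the $0.15/kWh rate below are assumptions, not from the post; adjust for your hardware and region:

```python
# Rough monthly electricity estimate for a local GPU box.

def monthly_electricity_usd(watts, hours_per_day, usd_per_kwh=0.15, days=30):
    """Convert average draw and daily usage into a monthly dollar cost."""
    kwh = watts / 1000 * hours_per_day * days
    return kwh * usd_per_kwh

# An RTX 3090 under load (~350 W assumed), run 8 hours a day:
print(round(monthly_electricity_usd(350, 8), 2))   # 12.6

# A box left on 24/7 averaging ~200 W between load and idle:
print(round(monthly_electricity_usd(200, 24), 2))  # 21.6
```

Depending on usage pattern and local rates, that lands in or near the $15-30/month band from the table, which is the point: the running cost alone rivals the API fee.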
The hidden cost: opportunity cost
This is the one developers undercount.
Every hour you spend setting up your local inference stack is an hour you're not building. For a side project, that setup time often exceeds the total API cost you'd ever pay.
I built SimplyLouie on a $7/month VPS. My users pay $2/month. The entire infrastructure cost is under $10/month.
If I'd started with "I need to self-host everything" thinking, I'd still be configuring GPU drivers instead of shipping.
Who is the $500 GPU actually for?
Honestly? Enthusiasts, researchers, and developers who enjoy the hardware tinkering as much as the software side. That's completely valid. I'm not knocking it.
But if your goal is to ship something or use AI in your workflow at the lowest possible friction, the API wins by a wide margin at any price point under ~$20/month.
At $2/month, it's not even close.
Try it
If you want the API without the GPU drama:
7-day free trial. No GPU required. 🙂
Built by one developer. 50% of revenue goes to animal rescue. Because good software should do good things.