I Switched to Chinese AI APIs and Cut My Costs by Over 90%

#ai #productivity #api #llm

I'm a developer and creative based in Europe. I use AI daily — for coding, writing, data analysis. My preferred tool is Claude Code.

But recently my monthly API bill crossed $100. GPT-4o costs $10 per million output tokens; Claude Sonnet charges $15. Even "budget" Gemini adds up fast. It became unsustainable.
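
To put those list prices in perspective, here is a rough back-of-the-envelope sketch in Python. The monthly token volume is a made-up illustrative number, not my measured usage, and input-token costs are ignored to keep it simple.

```python
# List prices quoted above, in USD per million output tokens.
PRICE_PER_M_OUTPUT = {
    "gpt-4o": 10.00,
    "claude-sonnet": 15.00,
}


def output_cost(model: str, output_tokens: int) -> float:
    """Return the USD cost of generating `output_tokens` (input tokens ignored)."""
    return PRICE_PER_M_OUTPUT[model] * output_tokens / 1_000_000


# Hypothetical monthly volume, for illustration only.
monthly_output_tokens = 8_000_000

for model in PRICE_PER_M_OUTPUT:
    print(f"{model}: ${output_cost(model, monthly_output_tokens):.2f}")
# prints:
# gpt-4o: $80.00
# claude-sonnet: $120.00
```

At that assumed volume, output tokens alone would cost somewhere between $80 and $120 a month, depending on the model.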

So I started looking for alternatives — and stumbled upon something surprising.

What I did

1. Installed CC Switch (free, open-source, about 20 minutes)
2. Registered on two Chinese LLM platforms (used Google Translate, about 5 minutes each)
3. Copied their API keys into CC Switch

No servers. No complex proxies. Just terminal and keys.
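
Conceptually, switching providers comes down to pointing Claude Code at a different endpoint and key. Below is a minimal Python sketch of that idea. It is not CC Switch's actual implementation. It assumes the provider exposes an Anthropic-compatible endpoint and that Claude Code reads the `ANTHROPIC_BASE_URL` and `ANTHROPIC_AUTH_TOKEN` environment variables. The profile names, URLs, and key variable names are hypothetical placeholders.

```python
import os
from typing import Mapping

# Hypothetical provider profiles. The URLs and key variable names are
# placeholders, not real endpoints.
PROFILES = {
    "provider-a": {
        "base_url": "https://api.provider-a.example/anthropic",
        "key_env": "PROVIDER_A_KEY",
    },
    "provider-b": {
        "base_url": "https://api.provider-b.example/anthropic",
        "key_env": "PROVIDER_B_KEY",
    },
}


def env_for(profile_name: str, environ: Mapping[str, str] = os.environ) -> dict:
    """Build the environment overrides for one provider profile.

    Assumes the API key was stored beforehand in the variable named by
    the profile's `key_env`; a missing key yields an empty token.
    """
    profile = PROFILES[profile_name]
    return {
        "ANTHROPIC_BASE_URL": profile["base_url"],
        "ANTHROPIC_AUTH_TOKEN": environ.get(profile["key_env"], ""),
    }


if __name__ == "__main__":
    # Show which variables would be set, without printing the key itself.
    for name, value in env_for("provider-a").items():
        shown = "<hidden>" if name.endswith("TOKEN") and value else value
        print(f"{name}={shown}")
```

Keeping the keys in environment variables rather than in the profile table means the table itself can be shared or committed without leaking secrets.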

After one week

For about 95% of my work I haven't switched back. Only high-stakes client copy still goes to Claude.

The price difference

I won't post exact numbers — I signed up through regional plans that may not apply to everyone. But here's my experience:

Before: ~$80–$120/month
After: less than a single takeaway meal

Yes, less than lunch. For the same volume and, for nearly all of my work, the same quality.

What I'm wondering now

Am I the only one doing this?

I'm thinking of writing a full setup guide for non-Chinese developers. But first I want to know:

Are you paying too much for AI APIs?
What's stopping you from trying alternatives?
Would you switch if you could slash your bill?

Drop a comment or DM me on X (@minidouuuuuu). Happy to share what I found privately. No product, no pitch — just trying to help.

Top comments (3)

Harjot Singh • May 31

The 90% number tracks - DeepSeek/Qwen/GLM have basically broken the price floor, and for a huge fraction of coding tasks they're genuinely good enough that paying 10x for a US frontier model is just brand tax. The honest caveats worth flagging for readers: data-residency/compliance if you're handling customer data, and occasional English-edge-case quirks - but for the deterministic bulk of generation, the savings are real and hard to argue with.

The framing I'd add: it doesn't have to be all-or-nothing "switch to cheap models." The strongest setup is routing - cheap Chinese models for the mechanical 80%, a frontier model only for the hard 20% where the quality gap actually shows. That's exactly how Moonshift works under the hood: a multi-agent pipeline (prompt to a shipped SaaS on your own GitHub + Vercel) where the router mixes models per step, landing a full build at ~$3 flat. First run's free, no card. Which provider did you settle on, and did you hit any quality cliff on the harder reasoning tasks, or did it hold up across the board?
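
The routing idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Moonshift or any particular gateway works: the model names are placeholders and the keyword list is a hypothetical stand-in for whatever signal you use to decide a task is "hard".

```python
CHEAP_MODEL = "cheap-model"        # placeholder name for a low-cost model
FRONTIER_MODEL = "frontier-model"  # placeholder name for a frontier model

# Hypothetical hints that a task belongs to the hard 20%.
HARD_KEYWORDS = ("architecture", "security review", "client copy")


def pick_model(task: str, high_stakes: bool = False) -> str:
    """Route a task to the frontier model only when it looks hard or high-stakes."""
    text = task.lower()
    if high_stakes or any(keyword in text for keyword in HARD_KEYWORDS):
        return FRONTIER_MODEL
    return CHEAP_MODEL


print(pick_model("Rename variables in utils.py"))         # cheap-model
print(pick_model("Draft client copy for a landing page"))  # frontier-model
print(pick_model("Fix a typo", high_stakes=True))          # frontier-model
```

A keyword check is the crudest possible router. In practice the decision could come from task type, prompt length, or a retry after the cheap model fails, but the shape stays the same: default to cheap, escalate on a signal.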

Dubhe • May 29

Great write-up! I went down a similar path a few months ago but took a slightly different approach — instead of routing through CC Switch, I signed up for a multi-model gateway that already bundles Chinese models (DeepSeek, Qwen, GLM) alongside the standard OpenAI-compatible ones. One API key, no regional signup hassle.

The savings are real: DeepSeek V4 flash costs about $0.30/M tokens vs $10-15/M output tokens for GPT-4o and Claude Sonnet. For daily coding and data work the quality difference is barely noticeable.
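
Taking the per-million-token prices quoted in this thread at face value, the percentage saved is simple arithmetic. A small Python sketch follows; the figures are the ones mentioned above, not independently verified list prices.

```python
def savings_pct(cheap_price: float, expensive_price: float) -> float:
    """Percentage saved by paying `cheap_price` instead of `expensive_price`."""
    return (1 - cheap_price / expensive_price) * 100


cheap = 0.30  # USD per million tokens, as quoted above
for expensive in (10.0, 15.0):
    print(f"vs ${expensive:.0f}/M: {savings_pct(cheap, expensive):.0f}% cheaper")
# prints:
# vs $10/M: 97% cheaper
# vs $15/M: 98% cheaper
```

Both results sit comfortably above the 90% in the post's title, as long as the per-token prices are compared like for like.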

The part about signing up on Chinese platforms through Google Translate was exactly my pain point too. What stopped you from trying a gateway that handles all that integration for you?
