The Open Source Illusion: Why "Free" AI Models Are Getting Expensive

#ai #llm #career #machinelearning

The Open Source Illusion: Why "Free" AI Models Are Getting Expensive

Everyone's watching Chinese open-source models. But the subscription costs are catching up to Western counterparts.

The Z.ai Price Hike

GLM 5.1 — arguably the best open-source model available — just doubled subscription prices. Maximum tier now costs $160/month.

For comparison:

Claude Pro: ~$20/month
ChatGPT Plus: ~$20/month
Mid-tier API access: variable, but often lower

Why This Matters

The narrative around open-source models has been "free alternatives to expensive closed models." But:

Inference costs scale with usage. Running GLM-5 at scale requires serious hardware or API credits.
Chinese providers are monetizing aggressively. The open weights are free; reliable hosting and premium features are not.
Local deployment isn't free either. A 70B+ parameter model needs 2-4x A100s or equivalent. That's $5-15/hour on cloud GPU instances.

The Real Cost Comparison

Model	Access Cost	Inference Cost (1M tokens)
GPT-5.2 API	$0	$10-30
Claude API	$0	$3-15
GLM-5 (Z.ai)	$0-160/mo	Included in subscription
Local 70B	$0	$5-15/hr hardware

The Hidden Value

What you're paying for with premium tiers:

Consistent availability (local GPUs can be flaky)
No setup maintenance (dependencies, updates, drivers)
Multi-modal features (not always available in open weights)
Context window guarantees (local setup may crash on 200K tokens)

My Approach

Hybrid strategy:

Experiment locally — understand model behavior, validate approaches
Production APIs — reliability and scale matter more than marginal cost savings
Monitor burn — token consumption grows non-linearly with adoption

More AI economics, model comparisons, and production insights from inside a bank — follow my Telegram channel:

🚀 https://t.me/ai_tablet (Russian, technical)

Top comments (1)

Felix • May 30

This hits hard. The real cost isn't just compute — it's the engineering time spent managing API keys, rate limits, and billing across providers. We built a relay service partly because of this exact problem. The hidden tax of "free" models is infrastructure complexity.