Yesterday, Anthropic cut off Claude Pro and Max subscriptions from working with third-party tools like OpenClaw. Today, it's getting worse.
Developer Peter Steinberger shared on X that even first-party harness use is now returning 400 errors:
claude -p --append-system-prompt 'A personal assistant running inside OpenClaw.'
→ 400 Third-party apps now draw from your extra usage, not your plan limits.
The message is crystal clear: if you're not using Claude through Anthropic's own interfaces, you pay extra. Your subscription doesn't cover it anymore.
What actually changed
On April 4, 2026 at 12pm PT, Anthropic flipped a switch:
- Claude Pro/Max subscriptions no longer cover usage through third-party agentic tools
- OpenClaw and similar harnesses now require either extra usage bundles (at a discount) or a full API key with pay-per-token pricing
- First-party harness invocations that reference third-party contexts are getting blocked with 400 errors
- The stated reason: "third-party services are not optimized"
This isn't a gradual deprecation. It happened overnight, and a lot of developers woke up to broken setups.
The pattern we keep seeing
This follows a familiar playbook in cloud AI:
- Launch a generous plan to attract users and developers
- Build an ecosystem of tools and workflows around the API
- Once there's lock-in, start restricting access and charging more
We saw it with OpenAI tightening rate limits. We saw it with Google changing Gemini's free tier. Now Anthropic is doing the same thing with Claude.
The difference this time is the speed. Anthropic went from "use Claude everywhere" to "bring your own coin" in less than 24 hours.
Why this matters for developers
If you've built workflows, agents, or automations that reach Claude through any harness other than claude.ai or the official API directly, you need a backup plan. Today it's OpenClaw; tomorrow it could be any integration that Anthropic decides isn't "optimized" enough.
This is exactly why running AI locally is becoming less of a hobby project and more of an infrastructure decision. When you run models on your own hardware:
- No one can change your billing overnight
- No 400 errors because your use case isn't "optimized"
- No dependency on a provider's business model decisions
- Your workflows keep working regardless of policy changes
Tools like Ollama, llama.cpp, and frontends like Locally Uncensored exist precisely for this scenario. Models like Gemma 4, Llama 3.3, and Qwen 3.5 are genuinely competitive now. The gap between cloud and local shrinks with every release.
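To make that concrete, here is a minimal sketch of calling a locally running Ollama server from Python. It assumes Ollama's default endpoint at localhost:11434 and its /api/generate route; the model name "llama3.3" is just an example and assumes you've already pulled it with `ollama pull llama3.3`.

```python
import json
import urllib.request

# Ollama's default local endpoint for single-shot generation.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(model: str, prompt: str) -> dict:
    """Build a non-streaming payload for Ollama's /api/generate route."""
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str, url: str = OLLAMA_URL) -> str:
    """Send a prompt to the local Ollama server and return the response text."""
    payload = json.dumps(build_request(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    # Assumes the Ollama server is running and the model has been pulled.
    print(generate("llama3.3", "Summarize why local fallbacks matter."))
```

No API key, no billing, no remote policy: the request never leaves your machine.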
The real question
How many times does a cloud provider need to change the rules before you start running a local fallback?
You don't need to go fully local overnight. But having Ollama running with a capable model as a backup means the next time a provider decides to "restructure access", your work doesn't stop.
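One way to wire up that backup without rewriting your workflows is a thin fallback wrapper: try the cloud provider first, and route to the local model only when the cloud call fails (say, with a surprise 400). This is a generic sketch, not any provider's SDK; `primary` and `fallback` stand in for whatever client functions you already use.

```python
from typing import Callable

def with_local_fallback(
    primary: Callable[[str], str],
    fallback: Callable[[str], str],
) -> Callable[[str], str]:
    """Return a callable that tries the cloud client first, then the local one.

    Any exception from `primary` (e.g. an HTTP 400 raised by your client
    library after a policy change) triggers the local fallback.
    """
    def run(prompt: str) -> str:
        try:
            return primary(prompt)
        except Exception:
            return fallback(prompt)
    return run

# Hypothetical usage: `cloud_claude` and `local_ollama` are your own wrappers.
def cloud_claude(prompt: str) -> str:
    raise RuntimeError("400: third-party apps now draw from extra usage")

def local_ollama(prompt: str) -> str:
    return "answered locally: " + prompt

ask = with_local_fallback(cloud_claude, local_ollama)
print(ask("keep my pipeline running"))  # falls back to the local model
```

The point isn't this exact wrapper; it's that the fallback decision lives in one place, so the next policy change is a config tweak instead of an outage.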
The tools are there. The models are there. The only question is whether you set it up before or after the next policy change catches you off guard.