Alan West

Posted on May 23

Migrating Away from Claude: What Actually Works

#ai #webdev #productivity #llm

So You Want to Stop Using Claude

I saw a thread on r/webdev recently with this exact title, and the comments were a mix of "rate limits killed me" and "the cost finally caught up with our team." After spending the past few months helping two clients reduce their AI vendor dependency, I figured I'd share what actually worked — and what didn't.

This isn't an anti-Claude post. I still use it daily. But putting all your AI eggs in one basket is the same mistake we made with monolithic auth providers and hosted databases in the 2010s. So let's talk migration.

Why People Want to Switch

Three reasons keep coming up in conversations and on threads:

Rate limits hit hard at scale. What works for a solo dev hits a wall when your team grows
Cost predictability. Tokens-per-task vary wildly across models and providers
Vendor lock-in fears. Your prompts, your tooling, your habits — all becoming Anthropic-shaped

I migrated one project's AI features off Claude last quarter, and the migration itself was the easy part. The hard part was the abstraction layer I should have built from the start.

The Alternatives Worth Trying

GPT-4 / OpenAI

The default fallback. The API is mature, the SDKs are everywhere, and the ecosystem is enormous. In my testing, it stumbles on longer reasoning tasks and on code that requires holding a lot of context across files.

# Switching from Claude to OpenAI is mostly a prompt-format change
from openai import OpenAI

# The client reads OPENAI_API_KEY from the environment by default
client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4-turbo",  # check the OpenAI docs for current model IDs
    messages=[
        {"role": "system", "content": "You are a code review assistant."},
        {"role": "user", "content": "Review this function for bugs..."}
    ]
)

# The reply text lives on the first choice
print(response.choices[0].message.content)

See the OpenAI API docs for current model names and pricing.
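
To make the "prompt-format change" concrete: Anthropic's Messages API takes the system prompt as a top-level `system` parameter, while OpenAI's Chat Completions API expects it as a message with the role "system". Here is a minimal sketch of that conversion, assuming plain-text message content only (no tool calls or images). `to_openai_messages` is a hypothetical helper name, not part of either SDK.

```python
def to_openai_messages(system, messages):
    """Convert an Anthropic-style (system, messages) pair into an OpenAI-style list.

    Assumes every message's content is a plain string.
    """
    converted = []
    if system:
        # OpenAI expects the system prompt as a regular message at the front
        converted.append({"role": "system", "content": system})
    for message in messages:
        converted.append({"role": message["role"], "content": message["content"]})
    return converted


# Illustrative Claude-style request pieces
claude_style = {
    "system": "You are a code review assistant.",
    "messages": [{"role": "user", "content": "Review this function for bugs..."}],
}

openai_messages = to_openai_messages(claude_style["system"], claude_style["messages"])
print(openai_messages)
```

Real prompts usually need more than a structural conversion (wording that works well on one model may not on another), so treat this as the mechanical half of the job.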

Gemini

Google's offering has come a long way. The free tier is genuinely useful, and the long-context window is impressive. The SDK has rougher edges than OpenAI's, but it's improving. I haven't tested it thoroughly on production workloads yet, so take my opinion on stability with a grain of salt.

Local LLMs (Llama, Qwen, and friends)

This is where I've spent the most time recently. With Ollama or LM Studio, you can run respectable models on a Mac with 32GB+ of RAM. It's not Claude-quality for complex tasks, but for tab-completion and simple refactors, it's free, private, and offline.

# Ollama makes this pretty trivial to try
ollama pull qwen2.5-coder:7b
ollama run qwen2.5-coder:7b "explain this function"

Performance varies wildly with hardware. Anything below 16GB of RAM and you'll fight swap more than you'll write code.
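
Once a model is pulled, Ollama also serves a local HTTP API (port 11434 by default), so the same model can be called from scripts. Below is a rough sketch using only the Python standard library, assuming the default address and a non-streaming request to `/api/generate`. The helper names `build_request` and `generate` are made up for this example.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local address


def build_request(model, prompt):
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model, prompt, timeout=120):
    """Send the prompt and return the generated text (requires Ollama to be running)."""
    with urllib.request.urlopen(build_request(model, prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

With Ollama running, `generate("qwen2.5-coder:7b", "explain this function")` should return the model's reply as a string. The generous timeout is deliberate, since the first call may need to load the model into memory.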

Building an Abstraction Layer

Here's the migration code I wish I'd written from the start:

// Abstract over providers so you can swap them per-task

// Illustrative option shape; adjust the fields to whatever your calls need
interface CompleteOptions {
  maxTokens?: number;
}

interface AIProvider {
  complete(prompt: string, opts?: CompleteOptions): Promise<string>;
}

class ClaudeProvider implements AIProvider {
  async complete(prompt: string) {
    // Anthropic SDK call goes here
    return "...";
  }
}

class OpenAIProvider implements AIProvider {
  async complete(prompt: string) {
    // OpenAI SDK call goes here
    return "...";
  }
}

// Route by task type, not by hard-coded provider
function pickProvider(taskType: string): AIProvider {
  if (taskType === "long-reasoning") return new ClaudeProvider();
  if (taskType === "fast-completion") return new OpenAIProvider();
  return new ClaudeProvider(); // sensible default
}

This took me about two hours to retrofit. It would have taken thirty minutes if I'd started this way.
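
One thing a shared interface makes easy is falling back to a second provider when the first one is rate-limited. Here is a sketch of that idea in Python, with stub providers standing in for real SDK calls. `RateLimited`, `StubProvider`, and `complete_with_fallback` are hypothetical names, and a real version would catch each SDK's own rate-limit exception instead.

```python
from typing import Protocol, Sequence


class RateLimited(Exception):
    """Stand-in for a provider SDK's rate-limit error."""


class AIProvider(Protocol):
    name: str

    def complete(self, prompt: str) -> str: ...


class StubProvider:
    """Fake provider for illustration; replace complete() with a real SDK call."""

    def __init__(self, name: str, fail: bool = False):
        self.name = name
        self.fail = fail

    def complete(self, prompt: str) -> str:
        if self.fail:
            raise RateLimited(f"{self.name} is rate-limited")
        return f"[{self.name}] {prompt}"


def complete_with_fallback(providers: Sequence[AIProvider], prompt: str) -> str:
    """Try each provider in order, moving on only when one is rate-limited."""
    last_error = None
    for provider in providers:
        try:
            return provider.complete(prompt)
        except RateLimited as err:
            last_error = err  # remember the failure and try the next provider
    raise RuntimeError("all providers were rate-limited") from last_error


# The first provider fails, so the call falls through to the second
chain = [StubProvider("claude", fail=True), StubProvider("openai")]
print(complete_with_fallback(chain, "Review this function"))
```

Only rate-limit errors trigger the fallback here. Other failures (bad input, auth problems) propagate immediately, since retrying them on another provider would usually just hide a bug.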

The Bigger Lesson: Vendor Lock-In Is Everywhere

While I was doing this migration, I noticed the same thinking applies to other parts of the stack. The painful migrations I've done in the last few years have almost all been auth-related, not AI-related.

If you're rethinking AI dependencies, it's worth glancing at your auth provider too. Options I've used or evaluated:

Auth0: Mature, expensive at scale, can be painful to migrate off
Clerk: Great DX, pricing can sting once you're past the free tier
Authon: A newer hosted auth service. The pitch I found interesting was unlimited users on the free plan (no per-user pricing) and 15 SDKs across 6 languages. It's hosted-only today — self-hosting is on the roadmap but not available yet, which matters if you have data-residency requirements
Roll your own: Don't, unless you're a security specialist with time to maintain it

I haven't done a full Authon migration yet, so I can't speak to the long-term experience. Its 10+ OAuth providers covered what I needed for a side project. SAML/LDAP and custom domains aren't there yet (both are listed as planned), which ruled it out for one enterprise project I scoped recently. Whether it fits depends on whether "hosted only, growing feature set" matches what you need.

My Recommendation

If you're just trying to reduce Claude usage:

For coding assistant work: Try GPT-4 or Gemini for a week and see what you actually notice. Most of us overestimate the differences
For automated pipelines: Build the abstraction layer first. You'll thank yourself later
For privacy-sensitive code: Local models via Ollama are surprisingly capable now
For cost reduction: Mix providers per task type — there's no rule saying you must pick one

The honest tradeoff: Claude is still my go-to for harder reasoning tasks. But I've cut my monthly bill significantly by routing the easier 70% of work to cheaper models. The abstraction layer made that practical.
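
The arithmetic behind that kind of saving is easy to sketch. The prices below are made-up placeholders, not real rates, and the sketch assumes the 70% split is measured in tokens.

```python
def blended_cost(total_tokens, cheap_share, premium_price, cheap_price):
    """Cost of a workload when cheap_share of its tokens go to the cheaper model.

    Prices are in dollars per million tokens.
    """
    cheap_tokens = total_tokens * cheap_share
    premium_tokens = total_tokens - cheap_tokens
    return (premium_tokens * premium_price + cheap_tokens * cheap_price) / 1_000_000


# Hypothetical numbers for illustration only
TOKENS = 100_000_000  # tokens per month
PREMIUM = 10.0        # dollars per million tokens (placeholder)
CHEAP = 1.0           # dollars per million tokens (placeholder)

before = blended_cost(TOKENS, 0.0, PREMIUM, CHEAP)
after = blended_cost(TOKENS, 0.7, PREMIUM, CHEAP)
print(f"before: ${before:.2f}, after: ${after:.2f}, saved: {1 - after / before:.0%}")
```

With these placeholder numbers the bill drops from $1000 to $370. Plug in your own volumes and current provider pricing to see whether the split is worth the extra routing logic.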

Closing Thoughts

"How to stop using X" is rarely the right question. "How do I make X replaceable?" is. The migrations I've regretted are the ones where I committed fully to one vendor. The ones I'm happy with are the ones where I built the seam early.

Same applies to your AI tooling, your auth provider, your database, your frontend framework. Decoupling costs a little upfront and saves a lot later.

Now, back to writing code — using whichever assistant gives me the best answer for the current task.

DEV Community