Jyotiraditya Singh

Posted on Jul 2

Stop Scattering LLM Code Across Your Codebase

#ai #llm #architecture #python

Every AI project starts the same way.

You pick a model, install its SDK, and write something like this:

from openai import OpenAI

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

response = client.responses.create(
    model="gpt-4.1",
    input="Explain recursion."
)

Everything works.

Then a week later someone asks:

"Can we try Claude?"

So you install another SDK.

Now your project has OpenAI code in one file, Anthropic code in another, Gemini somewhere else, and every agent knows exactly which provider it's talking to.

A month later, switching models means touching dozens of files.

I've made this mistake more than once.

The Real Problem Isn't Which Model You Use

It's that your application is tightly coupled to a provider.

Instead of your business logic saying:

answer = llm.generate(prompt)

it says:

client = OpenAI(...)
response = client.responses.create(...)

Your application now depends on:

OpenAI's SDK
OpenAI's response format
OpenAI's streaming API
OpenAI's error handling

Switch providers and all of that changes.

What Clean AI Architecture Looks Like

Instead of every component talking directly to an SDK, give your application a single interface.

Agents
 ↓
Services
 ↓
LLM Interface
 ↓
Provider

Your agents don't know whether the response came from OpenAI, Claude, Gemini, or something else.

They just ask for text.

That's the only thing they should care about.

A Simple Example

Instead of this:

from openai import OpenAI

client = OpenAI()

result = client.responses.create(...)

or this:

from anthropic import Anthropic

client = Anthropic()

result = client.messages.create(...)

your code becomes:

result = await client.generate(
    provider="openai",
    model="gpt-4.1",
    modality="text",
    prompt="Explain recursion."
)

print(result["text"])

The rest of your application stays exactly the same.

Changing providers doesn't require rewriting your business logic.

Why This Matters

As AI projects grow, you'll probably want to:

compare providers
benchmark models
add fallbacks
route requests dynamically
experiment with new releases

If provider-specific code is scattered throughout your project, each of those becomes a refactoring exercise.

If it's isolated behind one interface, they're configuration changes.

The Pattern Scales

This approach also makes it much easier to add things like:

retries
timeout handling
logging
usage tracking
streaming
structured outputs
provider fallbacks

without changing every agent or API route.

Instead, those concerns live in one place.

Why I Built Loom

After repeating this pattern across multiple projects, I wanted a single abstraction that kept provider-specific code isolated.

That's why I built Loom.

The idea is simple:

Your application shouldn't know which LLM provider it's using.

Only one file should interact with provider SDKs.

Everything else should depend on a clean interface.

Loom currently supports multiple providers behind the same API so switching providers doesn't require rewriting your application logic.

If you're interested, you can explore the project here:

Website: https://loom-weaves.vercel.app
GitHub: https://github.com/jyotir07/Loom

I'm also planning to write more about the architecture behind provider abstraction, routing strategies, streaming, and production AI systems.

I'd love to hear how you're structuring LLM integrations in your own projects. Are you calling provider SDKs directly, or have you built an abstraction layer too?