LangChain was the right answer in 2023. It abstracted away a messy ecosystem of half-baked provider APIs, gave you a unified LLM interface, and let you stitch agents together with a few dozen lines of Python. We used it everywhere — including in production on Vettio, our AI recruitment platform.
In April 2026, we ripped it out.
This post is about why we made that call, what replaced it, and the metrics that justified the migration.
## The symptoms
LangChain's abstractions started leaking the moment we went beyond happy-path demos. Three things kept biting us:
- **Stack traces from hell.** A single `AgentExecutor.invoke()` call crossed 14 frames of LangChain internals before reaching our code. Debugging a malformed tool call felt like archaeology.
- **Version churn.** Every minor bump renamed, relocated, or deprecated something we depended on. Our CI was pinned to a specific LangChain SHA for six months just to stay green.
- **Abstracted-away observability.** We couldn't cleanly trace token usage, cache hits, or per-tool latencies without monkey-patching internal classes.
Meanwhile, Anthropic's native SDK was getting better. Native tool calling, prompt caching, extended thinking, streaming — all first-class and documented.
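Prompt caching, for instance, is a single field on the request rather than a framework feature. A minimal sketch of the request shape (the `cache_control` field is from Anthropic's documented API; the prompt text and model choice here are placeholders, not our production values):

```python
# Sketch: enabling Anthropic prompt caching on a large, stable system prompt.
# Built as a plain dict here so the structure is visible; in practice it is
# passed as client.messages.create(**request).
request = {
    "model": "claude-opus-4-7",
    "max_tokens": 4096,
    "system": [
        {
            "type": "text",
            "text": "You are an interview assistant...",  # large, stable prefix
            "cache_control": {"type": "ephemeral"},       # cache this block
        }
    ],
    "messages": [{"role": "user", "content": "Summarize the candidate."}],
}
```

Subsequent calls that share the cached prefix read it back at a fraction of the input-token cost, with no wrapper layer involved.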
## The refactor
The logic we were using LangChain for wasn't complicated:
- Build a system prompt from templates
- Call Claude with a list of tools
- Route tool calls to our internal handlers
- Return the result
We replaced ~800 lines of LangChain glue with this:
```python
from anthropic import Anthropic

client = Anthropic()

def run_agent(user_input: str, tools: list[dict], tool_handlers: dict):
    messages = [{"role": "user", "content": user_input}]
    while True:
        response = client.messages.create(
            model="claude-opus-4-7",
            max_tokens=4096,
            tools=tools,
            messages=messages,
            system=SYSTEM_PROMPT,
        )
        if response.stop_reason == "end_turn":
            return response.content[0].text

        # Handle tool use: route each tool_use block to its handler,
        # then feed the results back as a user turn.
        tool_calls = [b for b in response.content if b.type == "tool_use"]
        messages.append({"role": "assistant", "content": response.content})
        tool_results = []
        for call in tool_calls:
            result = tool_handlers[call.name](**call.input)
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": call.id,
                "content": str(result),
            })
        messages.append({"role": "user", "content": tool_results})
```
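The tool schemas and handlers are plain data. A hypothetical wiring, just to show the shape — the `lookup_candidate` tool and its handler are illustrative, not from Vettio's actual code:

```python
# Hypothetical tool definition in Anthropic's tool-schema format.
tools = [{
    "name": "lookup_candidate",
    "description": "Fetch a candidate profile by ID.",
    "input_schema": {
        "type": "object",
        "properties": {"candidate_id": {"type": "string"}},
        "required": ["candidate_id"],
    },
}]

# Handlers are keyed by tool name; run_agent routes each tool_use block
# to the matching callable with the model-supplied arguments.
tool_handlers = {
    "lookup_candidate": lambda candidate_id: {"id": candidate_id, "stage": "screen"},
}

# answer = run_agent("Where is candidate c-42 in the pipeline?", tools, tool_handlers)
```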
That's it. No `AgentExecutor`, no callbacks, no `ConversationBufferMemory`. Just the model and our code.
## The metrics
We ran the old and new paths side-by-side for two weeks on Vettio's interview-bot service. Results:
- p50 latency: 2.1s → 1.4s (−33%)
- p95 latency: 4.8s → 3.2s (−33%)
- Error rate: 0.9% → 0.2%
- Stack trace depth on errors: 14 → 4 frames
- Lines of integration code: 812 → 187
The latency win came mostly from eliminating LangChain's implicit retry-and-retry-again behavior on tool-use mismatches. With direct SDK calls, a malformed tool schema fails loudly instead of silently retrying three times.
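One way to keep failures loud is to validate tool schemas once at startup rather than discovering problems mid-conversation. A minimal stdlib-only sketch — `validate_tools` is our own helper, not an SDK function; the required fields match Anthropic's tool format:

```python
def validate_tools(tools: list[dict]) -> None:
    """Fail fast on malformed tool schemas before any API call is made."""
    for tool in tools:
        for field in ("name", "description", "input_schema"):
            if field not in tool:
                raise ValueError(f"tool {tool.get('name', '?')!r} missing {field!r}")
        if tool["input_schema"].get("type") != "object":
            raise ValueError(f"tool {tool['name']!r}: input_schema.type must be 'object'")
```

Called once at startup, a typo'd schema raises immediately instead of surfacing as silent retries deep inside an agent loop.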
## When LangChain still makes sense
This isn't a blanket "don't use LangChain" post. It still wins if you need:
- Multi-provider abstraction. Swapping between Claude, GPT-4, and Gemini behind a stable interface.
- LangGraph workflows for graph-based agent topologies you'd otherwise build from scratch.
- LangSmith observability you don't want to rebuild.
For a team that's already committed to one provider (we're all-in on Claude) and wants full control over prompts, tool schemas, and observability — the native SDK is the right tool in 2026.
## The lesson
Abstractions pay for themselves when the underlying APIs are bad. Anthropic's API isn't bad. It's clean, well-documented, and stable. The abstraction tax was real; the abstraction benefit had quietly evaporated.
If you're still on LangChain in a production Claude app, benchmark a direct-SDK rewrite of your hot path. You might be surprised.