stockyard-dev

The LLM proxy landscape in 2026: Helicone acquired, LiteLLM compromised, and what's next

The LLM proxy space has changed fast in 2026. If you're picking infrastructure for routing LLM traffic, two significant events from Q1 are worth understanding. I'll cover those first, then give honest takes on the main options.

What happened in Q1 2026

Helicone acquired by Mintlify (March 3)

Helicone, one of the more established LLM observability and proxy tools, was acquired by Mintlify. The Helicone team announced the acquisition and stated the product would enter maintenance mode. No new feature development. Bug fixes only, timeline unclear.

If you're running Helicone in production, this is a risk signal. Maintenance mode means the product isn't being actively developed against a fast-moving ecosystem. Provider API changes won't get fast fixes. New models won't be prioritized.

LiteLLM PyPI supply chain attack (March 24)

A supply chain attack targeting LiteLLM was discovered on March 24: malicious packages impersonating LiteLLM were published to PyPI, aimed at developers installing or updating LiteLLM via pip.

LiteLLM itself (the legitimate package) was not compromised, but the attack surface here is real: pip's ecosystem has a history of typosquatting and dependency confusion attacks. If your LLM proxy is a Python package installed via pip in production, you have supply chain exposure that a compiled binary doesn't have.

This is not a knock on LiteLLM's code quality. It's a structural observation about Python package distribution.
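To make the typosquatting risk concrete, here's a minimal sketch of an audit that flags installed distribution names suspiciously similar to, but not exactly matching, the packages you actually depend on. The `find_suspects` helper and the 0.85 similarity threshold are my own illustration, not part of any real tooling:

```python
# Basic typosquat check: flag installed package names that are close to,
# but not exactly, a trusted dependency name. Threshold is an assumption.
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def find_suspects(installed: list[str], trusted: list[str],
                  threshold: float = 0.85) -> list[str]:
    return [
        name for name in installed
        if name not in trusted
        and any(similarity(name, t) >= threshold for t in trusted)
    ]

# "lite-llm" is one character away from the real package name:
print(find_suspects(["litellm", "lite-llm", "requests"], ["litellm", "requests"]))
# → ['lite-llm']
```

In practice, pip's hash-checking mode (`pip install --require-hashes -r requirements.txt`) is the stronger defense, since it rejects any artifact whose digest doesn't match your pinned hashes regardless of its name.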

The current landscape

Here are the main alternatives and honest assessments of each:

Portkey

Portkey supports 200+ providers, which is impressive. The routing and fallback features are solid. The catch: observability and the more advanced features are cloud-locked. If you need on-premise observability for compliance reasons, Portkey isn't the answer. An open-source version exists, but the features that matter are in the cloud product.

Langfuse

Langfuse is good at what it does: observability. It's not a proxy. You can self-host it, the tracing is detailed, and it integrates with most frameworks. But if you need request routing, caching, or rate limiting at the proxy layer, Langfuse doesn't do that. It's a complementary tool, not a replacement.

OpenRouter

OpenRouter gives you access to a huge model catalog through one API. It's genuinely useful for trying models without managing individual provider credentials. The tradeoffs: it's cloud-only with no self-hosted option, and it takes a 5.5% fee on top of provider costs. For production workloads sending significant volume, that fee adds up. Your traffic also flows through OpenRouter's infrastructure, which is a compliance consideration for some use cases.
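To put the fee in perspective, a quick back-of-envelope calculation (the monthly spend figures here are hypothetical examples, not measurements):

```python
# What a 5.5% pass-through fee costs at different monthly spend levels.
FEE_RATE = 0.055

for monthly_spend in (1_000, 10_000, 50_000):
    fee = monthly_spend * FEE_RATE
    print(f"${monthly_spend:,}/mo provider spend -> ${fee:,.0f}/mo in fees")
# $1,000/mo  -> $55/mo
# $10,000/mo -> $550/mo
# $50,000/mo -> $2,750/mo
```

At hobby volume the fee is noise; at tens of thousands per month it's a line item worth comparing against the cost of running your own routing layer.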

TensorZero

TensorZero is doing something different: it focuses on the optimization loop, using feedback signals to improve prompts and routing over time. Interesting approach. The limitation is that it doesn't do caching or rate limiting today. If those are requirements, TensorZero isn't a complete solution on its own.

Where Stockyard fits

Stockyard is a self-hosted LLM proxy that ships as a single ~25MB Go binary with embedded SQLite. No external dependencies, no cloud component required. It handles routing, caching, rate limiting, and request logging across 40+ providers.

The honest tradeoffs: the embedded SQLite means no horizontal scaling. If you need to run multiple proxy instances sharing state, that's not supported. Single-instance deployments handle most workloads fine given LLM API latency, but it's a real constraint.
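To make the multi-instance constraint concrete, here's a minimal in-process token bucket. This is an illustration of the general problem, not Stockyard's implementation: when two instances each hold their own bucket, they together admit twice the intended rate, which is exactly why rate limiting needs shared state to scale horizontally.

```python
# Why per-instance state breaks global rate limits: two independent
# token buckets each enforce the limit separately.
import time

class TokenBucket:
    def __init__(self, rate: float, capacity: float):
        self.rate = rate          # tokens refilled per second
        self.capacity = capacity  # burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# Two "proxy instances", each configured for a burst of 5 requests:
a, b = TokenBucket(rate=1, capacity=5), TokenBucket(rate=1, capacity=5)
allowed = sum(bucket.allow() for bucket in (a, b) for _ in range(10))
# Each bucket admits 5, so 10 requests pass where a shared limit of 5 was intended.
```

A shared backend (Redis, Postgres, etc.) fixes this for multi-instance deployments, at the cost of exactly the external dependency the single-binary design avoids.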

The supply chain story: it's a single Go binary. There's no pip install, no package manager in the deployment path, and no transitive dependencies resolved at install time, which removes the vector the PyPI attack relied on. The attack surface is different.
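One practical habit with binary distribution is verifying the release artifact against a published checksum before deploying. A generic sketch, not specific to Stockyard's release process (the file path and expected digest would come from the project's release page):

```python
# Verify a downloaded release binary against a published SHA-256 digest.
import hashlib

def sha256_of(path: str) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large binaries don't load fully into memory.
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def verify(path: str, expected_digest: str) -> bool:
    return sha256_of(path) == expected_digest.lower()
```

Checksums protect against corrupted or tampered downloads; pairing them with a signature (e.g. a signed checksums file) also protects against a compromised download host.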

Stockyard's proxy core is Apache 2.0. The full platform with dashboard and team features is BSL 1.1. Source is at github.com/stockyard-dev/Stockyard.


Full disclosure: I built Stockyard, so I'm biased. But the comparison data is accurate.
