DEV Community: SUNNY ANAND

Why I stopped calling LLM APIs directly and built an Infrastructure Protocol

SUNNY ANAND — Wed, 04 Feb 2026 12:45:59 +0000

Last month, my OpenAI bill hit $520. When I looked at the logs, 30% of that spend came from people asking the same "getting started" questions over and over. I was paying for the same tokens again and again, and my users were waiting 2.5 seconds for a response that I already had in my database. That was my "Aha!" moment.

The $500 Wake-up Call: Why raw API calling is a financial liability.
The "Infrastructure Maturity" Shift: Moving from wrappers to gateways.
The 5ms Victory: How I used Go and Redis to make LLM responses feel like a local file read.
Sovereign Privacy: Why "Sovereign Shield" redaction is a must for any enterprise app.
Universal SDKs: Announcing the official launches of pip install nexus-gateway and npm i nexus-gateway-js.
Conclusion: Why "Tokens as COGS" is the future of AI engineering.

I replaced my standard OpenAI client with the Nexus SDK. The first time I saw a 200 OK - 5ms (CACHE HIT) in my terminal, I realized the 'AI Bubble' isn't about the models—it's about the infrastructure protecting our margins.
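To make the cache-hit idea concrete, here is a minimal sketch of an exact-match cache sitting in front of a provider call. It is not the Nexus implementation: the names `ExactCache` and `cached_chat` are made up for illustration, a plain dict stands in for Redis, and `call_provider` is whatever function actually talks to the model.

```python
import hashlib
import time
from typing import Callable, Dict, Optional, Tuple


class ExactCache:
    """Exact-match response cache keyed by a hash of (model, prompt).

    Assumption: a dict stands in for Redis. With Redis, get/set below
    would become round-trips to the cache server.
    """

    def __init__(self, ttl_seconds: float = 3600.0) -> None:
        self.ttl_seconds = ttl_seconds
        self._store: Dict[str, Tuple[float, str]] = {}

    @staticmethod
    def key(model: str, prompt: str) -> str:
        # Normalise case and whitespace so trivially different prompts share a key.
        normalised = " ".join(prompt.lower().split())
        return hashlib.sha256(f"{model}\n{normalised}".encode("utf-8")).hexdigest()

    def get(self, model: str, prompt: str) -> Optional[str]:
        entry = self._store.get(self.key(model, prompt))
        if entry is None:
            return None
        expires_at, answer = entry
        if time.monotonic() >= expires_at:
            return None  # entry is stale, treat it as a miss
        return answer

    def set(self, model: str, prompt: str, answer: str) -> None:
        expires_at = time.monotonic() + self.ttl_seconds
        self._store[self.key(model, prompt)] = (expires_at, answer)


def cached_chat(
    cache: ExactCache,
    model: str,
    prompt: str,
    call_provider: Callable[[str, str], str],
) -> Tuple[str, bool]:
    """Return (answer, cache_hit). The provider is only called on a miss."""
    hit = cache.get(model, prompt)
    if hit is not None:
        return hit, True
    answer = call_provider(model, prompt)
    cache.set(model, prompt, answer)
    return answer, False
```

The second time the same "getting started" question arrives, `cached_chat` returns the stored answer and never reaches the provider, which is where both the cost saving and the latency saving come from.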

Star us on GitHub: https://github.com/ANANDSUNNY0899/NexusGateway

Why We Built a Self-Healing AI Gateway: Architecting for Provider Instability

SUNNY ANAND — Sun, 01 Feb 2026 13:48:16 +0000

The Fragility of the "Wrapper" Era: Why openai.chat.completions is a single point of failure.
Native Infrastructure vs. Shims: Why we abandoned SDK shims for native Go implementations of Google and Groq protocols.
The Health-Check Loop: How Nexus uses a background goroutine to monitor provider latency and error rates.
Autonomous Re-routing: The logic behind switching from a primary model to a secondary "Speed" model (Groq) when latency spikes.
Conclusion: Why "Sovereign Infrastructure" is the only way to scale AI to the enterprise.
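The health-check and re-routing ideas in the outline above can be sketched roughly like this. It is an illustration only, written in Python rather than Go: the class names, the 20-sample window, and the 2000 ms / 20% thresholds are all assumptions, and the background loop itself is left out (something would call `record` after each probe or request).

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


class ProviderHealth:
    """Rolling window of recent (latency_ms, ok) samples for one provider."""

    def __init__(self, window: int = 20) -> None:
        self.samples: Deque[Tuple[float, bool]] = deque(maxlen=window)

    def record(self, latency_ms: float, ok: bool) -> None:
        self.samples.append((latency_ms, ok))

    def avg_latency_ms(self) -> float:
        if not self.samples:
            return 0.0
        return sum(latency for latency, _ in self.samples) / len(self.samples)

    def error_rate(self) -> float:
        if not self.samples:
            return 0.0
        failures = sum(1 for _, ok in self.samples if not ok)
        return failures / len(self.samples)


class FallbackRouter:
    """Picks the first healthy provider from a priority-ordered list."""

    def __init__(
        self,
        providers: List[str],
        max_latency_ms: float = 2000.0,  # assumed threshold
        max_error_rate: float = 0.2,     # assumed threshold
    ) -> None:
        self.providers = providers
        self.max_latency_ms = max_latency_ms
        self.max_error_rate = max_error_rate
        self.health: Dict[str, ProviderHealth] = {
            name: ProviderHealth() for name in providers
        }

    def record(self, name: str, latency_ms: float, ok: bool) -> None:
        self.health[name].record(latency_ms, ok)

    def is_healthy(self, name: str) -> bool:
        h = self.health[name]
        return (
            h.avg_latency_ms() <= self.max_latency_ms
            and h.error_rate() <= self.max_error_rate
        )

    def pick(self) -> str:
        for name in self.providers:
            if self.is_healthy(name):
                return name
        # Nothing is healthy: fall back to whichever provider fails least often.
        return min(self.providers, key=lambda n: self.health[n].error_rate())
```

With `FallbackRouter(["openai", "groq"])`, traffic stays on the primary until its rolling latency or error rate crosses a threshold, then moves to the secondary, and moves back once fresh samples push the primary's averages under the limits again.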

Why We Stopped Trusting LLM Shims: Architecting a Self-Healing Gateway in Go

SUNNY ANAND — Wed, 21 Jan 2026 19:34:54 +0000

Most AI "middleware" is just a thin wrapper around OpenAI’s SDK. If a provider goes down, or an API schema changes, your app dies. After a 7-day engineering sprint, we’ve moved Nexus Gateway away from "shims" and into Native Infrastructure. We’ve built a self-healing routing engine in Go that treats LLM providers as unreliable commodities, not gods.

How I Built a Golang AI Gateway to Cut OpenAI Costs by 90%

SUNNY ANAND — Tue, 06 Jan 2026 18:19:56 +0000

The $50 Wake-Up Call

Last week, I was building an AI chatbot. It was fun until I saw my OpenAI bill. I was paying for the same questions over and over again.

If 100 users ask "What is the capital of France?", why am I paying OpenAI 100 times to generate the same word: "Paris"?

I realized I needed a Caching Layer. But not just any cache—a Semantic Cache.

The Architecture
I decided to build a middleware that sits between my app and OpenAI.

Language: Go (Golang) for high concurrency.
Cache: Redis (for speed) + Pinecone (for vector search).
Routing: Automatically switches between GPT-4 and Claude 3.

The "Secret Sauce": Vector Caching
A normal cache (key-value) is dumb: it only matches strings that are exactly identical.

User A: "Hello" -> Cache Miss
User B: "Hi there" -> Cache Miss

They mean the same thing, but the strings are different.

I used OpenAI Embeddings to turn text into vectors (lists of numbers). Now, if the cosine similarity between a new prompt and a cached prompt is > 0.9, I serve the cached answer instead of calling the model.

Result: My API bill dropped by 90% overnight.
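Here is a minimal sketch of that lookup. It is a simplified stand-in, not the gateway's code: `SemanticCache` is a made-up name, `embed` is any function that returns a vector for a prompt (in the real setup that would be an embeddings API call), and a linear scan over a list replaces the Pinecone vector search.

```python
import math
from typing import Callable, List, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """cos(a, b) = (a . b) / (|a| * |b|); returns 0.0 for a zero vector."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class SemanticCache:
    """Serves a cached answer when a new prompt is close enough to an old one."""

    def __init__(
        self,
        embed: Callable[[str], Sequence[float]],
        threshold: float = 0.9,
    ) -> None:
        self.embed = embed
        self.threshold = threshold
        self.entries: List[Tuple[Sequence[float], str]] = []

    def store(self, prompt: str, answer: str) -> None:
        self.entries.append((self.embed(prompt), answer))

    def lookup(self, prompt: str) -> Optional[str]:
        query = self.embed(prompt)
        best_score, best_answer = 0.0, None
        for vector, answer in self.entries:
            score = cosine_similarity(query, vector)
            if score > best_score:
                best_score, best_answer = score, answer
        # Only serve the cached answer when similarity is above the threshold.
        return best_answer if best_score > self.threshold else None


if __name__ == "__main__":
    # Toy stand-in for an embedding model: hand-picked 3-d vectors.
    toy_vectors = {
        "Hello": [1.0, 0.1, 0.0],
        "Hi there": [0.95, 0.2, 0.0],
        "Write a poem about Rust.": [0.0, 0.1, 1.0],
    }
    cache = SemanticCache(embed=toy_vectors.__getitem__)
    cache.store("Hello", "Hi! How can I help?")
    print(cache.lookup("Hi there"))                  # hit: vectors are nearly parallel
    print(cache.lookup("Write a poem about Rust."))  # miss: vectors are nearly orthogonal
```

The threshold is the main tuning knob. Set it too low and users get answers to questions they did not ask; set it too high and the cache behaves like an exact-match cache again.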

Building the "Universal Router"

I didn't want to be locked into OpenAI forever. So I built a Provider interface in Go.

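// AIProvider is implemented once per upstream LLM vendor.
type AIProvider interface {
    // Send forwards the prompt to the provider and returns the generated text.
    Send(prompt string) (string, error)
}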

Now, my users can switch models instantly just by changing a JSON parameter:

{
  "model": "claude-3-opus",
  "message": "Write a poem about Rust."
}
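The dispatch behind that request can be sketched as a lookup from the "model" field to a registered provider. This is an assumption-laden illustration in Python, not the gateway's Go code: `UniversalRouter` and `EchoProvider` are hypothetical names, and `EchoProvider` just tags the prompt instead of calling a real API.

```python
import json
from typing import Dict, Protocol


class AIProvider(Protocol):
    """Python counterpart of the Go interface: one method, prompt in, text out."""

    def send(self, prompt: str) -> str: ...


class EchoProvider:
    """Placeholder provider; a real one would call the vendor's API here."""

    def __init__(self, name: str) -> None:
        self.name = name

    def send(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


class UniversalRouter:
    """Maps the "model" field of a request to a registered provider."""

    def __init__(self) -> None:
        self._providers: Dict[str, AIProvider] = {}

    def register(self, model: str, provider: AIProvider) -> None:
        self._providers[model] = provider

    def handle(self, raw_request: str) -> str:
        request = json.loads(raw_request)
        model = request["model"]
        provider = self._providers.get(model)
        if provider is None:
            raise ValueError(f"unknown model: {model!r}")
        return provider.send(request["message"])


if __name__ == "__main__":
    router = UniversalRouter()
    router.register("claude-3-opus", EchoProvider("claude"))
    body = '{"model": "claude-3-opus", "message": "Write a poem about Rust."}'
    print(router.handle(body))
```

Because callers only name a model, adding a new vendor means writing one class with a `send` method and registering it; nothing on the client side changes.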

The Result: Nexus Gateway
I realized other developers have this exact problem. So I polished the code, added Stripe for billing, built a Next.js Dashboard, and open-sourced it.

I even published a Python SDK so you can use it in 3 lines of code:

pip install nexus-gateway

from nexus_gateway import NexusClient

client = NexusClient(api_key="nk-your-key")
response = client.chat("Hello World")

Try it out
It’s live, open-source, and free to try.

Live Demo: https://nexus-gateway.org
GitHub: https://github.com/ANANDSUNNY0899/NexusGateway

If you are building AI apps and want to save money, give it a shot. I’d love your feedback!