correctover

Posted on Jun 25

AI Gateway vs Embedded SDK: Why Your LLM Architecture Needs Verified Failover

#llm #architecture #ai #python

The debate between AI gateways and embedded SDKs is settled by one question: where does your reliability logic run?

Correctover (pip install correctover) takes the embedded approach — reliability runs in your process, not through a third-party proxy.

The Gateway Problem

AI gateways (such as the LiteLLM proxy server, Portkey, and OpenRouter) sit between your application and the provider API as a proxy:

Your App → Gateway Proxy → Provider API
                ↓
          Markup on tokens
          Data passes through
          Latency from proxy hop
          Locked into their routing

Problems:

Latency: Every call makes an extra network hop, typically adding 5-50ms
Cost: Hosted gateways often add a per-token markup or platform fee on top of provider pricing
Data exposure: With a hosted gateway, all prompts and responses pass through third-party servers
Vendor lock-in: You're tied to the gateway's provider list and routing logic

The Embedded SDK Approach

Correctover runs in your application process:

Your App (with Correctover embedded SDK)
  |--→ Provider A (via your key)
  |--→ Provider B (via your key, if A fails)  
  |--→ Provider C (via your key, if B fails)

  All validation happens locally
  No proxy, no markup, no data exposure

Benefits:

Negligible added latency: about 22µs for contract validation vs a 5-50ms proxy hop
BYOK: Your keys, your data, zero markup
Local validation: No third-party sees your prompts
Full control: You choose providers, contracts, and failover logic
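
The in-process failover pattern described above can be sketched in plain Python. This is a minimal illustration of the idea, not Correctover's implementation: `call_with_failover`, the `validate` hook, and the two stub providers are hypothetical names introduced here, and a real provider callable would wrap an API client that uses your own key.

```python
from typing import Callable, Optional, Sequence

# Hypothetical type: a provider is any callable that takes a prompt and returns text.
Provider = Callable[[str], str]


class AllProvidersFailed(Exception):
    """Raised when no provider returns an acceptable response."""


def call_with_failover(
    prompt: str,
    providers: Sequence[tuple[str, Provider]],
    validate: Optional[Callable[[str], bool]] = None,
) -> tuple[str, str]:
    """Try each provider in order and return (provider_name, response_text)."""
    errors = []
    for name, provider in providers:
        try:
            text = provider(prompt)
        except Exception as exc:  # network error, rate limit, timeout, ...
            errors.append(f"{name}: {exc!r}")
            continue
        # Local validation: the response never leaves this process.
        if validate is not None and not validate(text):
            errors.append(f"{name}: failed local validation")
            continue
        return name, text
    raise AllProvidersFailed("; ".join(errors))


# Stub providers standing in for real API clients.
def flaky_provider(prompt: str) -> str:
    raise TimeoutError("simulated outage")


def healthy_provider(prompt: str) -> str:
    return f"echo: {prompt}"


name, text = call_with_failover(
    "ping",
    [("provider_a", flaky_provider), ("provider_b", healthy_provider)],
    validate=lambda t: bool(t.strip()),
)
print(name, text)  # provider_b echo: ping
```

Because the loop and the validation hook both run inside your application, the only network traffic is the direct call to each provider.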

The Missing Piece: Verified Failover

Most gateways offer failover, but only at the transport level: if the backup provider returns HTTP 200, the response is passed along unchecked. Correctover adds verified failover: every response from a backup provider is validated against a 6-dimension contract before delivery.

from correctover import NeuralReliabilityEngine

engine = NeuralReliabilityEngine()

# Multi-provider with verified failover
response = engine.chat_completion(
    messages=[{"role": "user", "content": "Explain quantum computing in 3 sentences"}],
    providers=["openai", "anthropic", "google"],
    contract={
        "max_latency_ms": 8000,
        "max_cost_tokens": 1500,
        "min_completion_ratio": 0.9
    }
)
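
To make the difference between a transport-level check and a contract check concrete, here is a small standalone sketch. It does not use Correctover and makes no claim about how the library evaluates contracts: `ProviderResponse`, `transport_ok`, and `contract_violations` are hypothetical, the interpretation of `max_latency_ms` and `max_cost_tokens` is an assumption, and the completeness rule (non-empty text with a `"stop"` finish reason) is a stand-in chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ProviderResponse:
    """Hypothetical response record; field names are illustrative."""
    status_code: int
    text: str
    latency_ms: float
    total_tokens: int
    finish_reason: str  # e.g. "stop" or "length"


def transport_ok(resp: ProviderResponse) -> bool:
    """Transport-level check: only the HTTP status is inspected."""
    return resp.status_code == 200


def contract_violations(resp: ProviderResponse, contract: dict) -> list[str]:
    """Return the violated contract terms (an empty list means the response passes)."""
    violations = []
    if resp.status_code != 200:
        violations.append("transport")
    if resp.latency_ms > contract.get("max_latency_ms", float("inf")):
        violations.append("max_latency_ms")
    if resp.total_tokens > contract.get("max_cost_tokens", float("inf")):
        violations.append("max_cost_tokens")
    # Assumed completeness rule: empty or truncated output counts as incomplete.
    if not resp.text.strip() or resp.finish_reason != "stop":
        violations.append("completeness")
    return violations


contract = {"max_latency_ms": 8000, "max_cost_tokens": 1500}
truncated = ProviderResponse(200, "Quantum computing uses", 950.0, 1500, "length")

print(transport_ok(truncated))                   # True
print(contract_violations(truncated, contract))  # ['completeness']
```

The truncated response passes the HTTP 200 check but fails the contract, which is the case where transport-level failover would deliver a bad answer.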

Performance Comparison

Metric	Gateway (Proxy)	Correctover (Embedded)
Base latency overhead	5-50ms	22µs
Failover type	Transport (HTTP 200)	Verified (6-dimension)
Validation	None	Schema/Latency/Cost/Completeness
Data exposure	All data through proxy	None (local)
Pricing	Per-token markup	BYOK, zero markup
Deployment	Separate service	pip/npm install
Provider switching	Manual config	Auto with contract validation
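
The latency figures in the table are worth checking on your own hardware. The sketch below times a local check with the standard library and puts the quoted 22µs next to the quoted 5-50ms range. `median_overhead_us`, `overhead_ratio`, and `sample_check` are hypothetical helpers; `sample_check` is only a stand-in for whatever validation call you actually run.

```python
import statistics
import time
from typing import Callable


def median_overhead_us(fn: Callable[[], object], runs: int = 1000) -> float:
    """Median wall-clock time of fn() in microseconds over `runs` calls."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter_ns()
        fn()
        samples.append((time.perf_counter_ns() - start) / 1_000)
    return statistics.median(samples)


def overhead_ratio(proxy_hop_ms: float, local_us: float) -> float:
    """How many times larger the proxy hop is than the local check."""
    return (proxy_hop_ms * 1_000) / local_us


def sample_check() -> bool:
    # Stand-in for a local contract check; replace with your own validation call.
    response = {"latency_ms": 950, "total_tokens": 420}
    return response["latency_ms"] <= 8000 and response["total_tokens"] <= 1500


print(f"local check: {median_overhead_us(sample_check):.2f} µs (median)")
print(f"5 ms hop vs 22 µs: {overhead_ratio(5, 22):.0f}x")
print(f"50 ms hop vs 22 µs: {overhead_ratio(50, 22):.0f}x")
```

Taking the article's numbers at face value, the proxy hop is roughly 227 to 2,273 times the cost of local validation. Both figures are small next to the multi-second latency of a typical completion, so measure against your own end-to-end budget.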

When to Use Each

Use a Gateway When:

You need centralized key management for a team
You want usage analytics across your org
You're okay with per-token markup

Use Correctover (Embedded SDK) When:

You need reliability without latency overhead
You want BYOK with zero data exposure
You need verified failover, not just transport switching
You want a single pip install solution

Getting Started

pip install correctover

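# Your existing OpenAI call...
# Add verified failover in 3 lines:
from correctover import NeuralReliabilityEngine

# The messages list you already pass to your provider
messages = [{"role": "user", "content": "Explain quantum computing in 3 sentences"}]

engine = NeuralReliabilityEngine()
response = engine.chat_completion(messages=messages, providers=["openai", "anthropic"])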

Correctover可瑞沃 — Enterprise AI Reliability Infrastructure. Website: correctover.com | PyPI: pip install correctover

DEV Community