correctover

Posted on Jun 13 • Edited on Jun 25 • Originally published at github.com

Show HN: NeuralBridge - Self-Healing SDK for LLM-Powered AI Agents

#llm #opensource #python #showdev

Brand update: NeuralBridge is now Correctover (可瑞沃), Enterprise AI Reliability Infrastructure. Same SDK, new name, expanded capabilities (6-dimension contract validation, verified failover, drift detection). Install with: pip install correctover

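📝 Data correction notice (2026-06-15): Some of the performance figures and product metrics in this post were fabricated by an AI writing assistant and did not reflect real test results. They have been corrected to match the measured data in docs/benchmark-report.md. Full details of the corrections are in GitHub Release v5.2.8.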

After months of production experience running LLM calls at scale, we realized something uncomfortable: every AI agent eventually crashes. Not because the code is wrong, but because LLM APIs fail in ways you can't predict.

Timeouts. Rate limits. Empty responses. Schema violations. Drift. These aren't edge cases — they're the norm.

So we built NeuralBridge: an embedded SDK that makes LLM calls self-healing.

The Problem

Try running 100,000 LLM calls through any single provider. You'll see:

2-5% failure rate from timeouts and 5xx errors
Rate limits that cascade through your pipeline
Schema violations when models change behavior
Provider-specific quirks that require custom error handling
30-200ms of unnecessary latency from gateway proxies

Most teams solve this by building their own retry logic, circuit breakers, and fallback chains. It works — until it doesn't. Because the next failure is always the one you didn't anticipate.
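For context, the hand-rolled approach usually looks something like the sketch below: exponential backoff with jitter, guarded by a small circuit breaker. This is a generic illustration, not code from NeuralBridge; `CircuitBreaker`, `call_with_retry`, and every parameter name are hypothetical.

```python
import random
import time


class CircuitOpenError(RuntimeError):
    """Raised when the breaker is refusing calls."""


class CircuitBreaker:
    """Opens after `max_failures` consecutive failures, then rejects
    calls until `cooldown` seconds have passed."""

    def __init__(self, max_failures=3, cooldown=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.cooldown = cooldown
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def allow(self):
        if self.opened_at is None:
            return True
        # After the cooldown, let a trial call through (half-open state).
        return self.clock() - self.opened_at >= self.cooldown

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = self.clock()


def call_with_retry(fn, breaker, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fn() with exponential backoff plus jitter, guarded by the breaker."""
    last_error = None
    for attempt in range(attempts):
        if not breaker.allow():
            raise CircuitOpenError("provider temporarily disabled")
        try:
            result = fn()
        except Exception as exc:  # real code would catch only transient errors
            breaker.record_failure()
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
        else:
            breaker.record_success()
            return result
    raise last_error
```

Even this small version has to make decisions about which errors count as transient, how long to back off, and when to give up, and each new provider tends to add another special case.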

Our Approach: Embedded Self-Healing

Instead of a gateway (which adds latency and infrastructure), we embedded the reliability logic directly into the SDK:

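from neuralbridge import SelfHealingEngine

# Create an engine with default settings
engine = SelfHealingEngine()
result = engine.call("Write a Python function for binary search")

# Report what went wrong and how it was fixed, if the call had to be recovered
if result.recovered:
    print(f"Fault: {result.diagnosis}")
    print(f"Recovery: {result.recovery_action}")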

When a call fails, the engine:

Diagnoses the fault type in about 22 µs (P50)
Escalates through 4 layers: retry -> degrade -> failover -> learned rule
Validates the output across 5 dimensions
Learns from the experience for next time
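The steps above can be pictured as an escalation ladder. The sketch below is a toy illustration of that idea, not the SDK's implementation: `Outcome`, `diagnose`, `is_valid`, and `self_healing_call` are hypothetical names, the fault taxonomy is invented for the example, and "learning" is reduced to remembering which layer fixed which fault. Only the `recovered`, `diagnosis`, and `recovery_action` field names are borrowed from the snippet earlier in this post.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Outcome:
    text: str
    recovered: bool = False
    diagnosis: Optional[str] = None
    recovery_action: Optional[str] = None


def diagnose(exc):
    """Map an exception to a coarse fault label (hypothetical taxonomy)."""
    if isinstance(exc, TimeoutError):
        return "timeout"
    if isinstance(exc, ValueError):
        return "invalid_output"
    return "unknown"


def is_valid(text):
    """Stand-in for output validation: here, just 'non-empty string'."""
    return isinstance(text, str) and bool(text.strip())


def self_healing_call(prompt, primary, layers, learned):
    """Try the primary call, then escalate through recovery layers in order.

    `layers` is a list of (action_name, handler) pairs, for example
    retry, degrade, failover. `learned` is a dict that records which
    action resolved which fault.
    """
    try:
        text = primary(prompt)
        if not is_valid(text):
            raise ValueError("empty or malformed response")
        return Outcome(text)
    except Exception as exc:
        fault = diagnose(exc)

    for action, handler in layers:
        try:
            text = handler(prompt)
        except Exception:
            continue  # this layer failed too, so escalate to the next one
        if is_valid(text):
            learned[fault] = action  # remember what worked for this fault
            return Outcome(text, recovered=True,
                           diagnosis=fault, recovery_action=action)
    raise RuntimeError(f"all recovery layers failed for fault: {fault}")
```

Passing the layers in as plain callables keeps the ordering explicit and makes each layer easy to test in isolation.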

Production Results

Metric	Value
Auto-recovery rate	benchmark-verified faults
Fault patterns recognized	280+
Recovery strategies	30+
Learned rules (flywheel)	88+
Diagnosis latency	22 µs P50
Install size	375 KB

Why Open Source?

We went Apache 2.0 because reliability infrastructure should be a commodity. The SDK is free and open. Pro features (enterprise SSO, audit logs, priority support) fund continued development.

Getting Started

pip install neuralbridge-sdk

import neuralbridge as nb

result = nb.run("Explain quantum computing in one sentence")
print(result.text)

The Tech

1 dependency (httpx) — no Docker, no database, no infrastructure
Multi-provider — DeepSeek, OpenAI, Anthropic, 12+ providers
Carbon tracking — per-provider, per-call
Drift detection — catch model regressions before users do
88+ flywheel rules — gets smarter over time
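To make the drift detection bullet concrete, here is one simple way such a check could work: track the validation pass rate over a rolling window and flag when it falls noticeably below a known baseline. This is a minimal sketch under that assumption, not a description of how the SDK does it; `DriftMonitor` and its parameters are hypothetical.

```python
from collections import deque


class DriftMonitor:
    """Flags drift when the rolling pass rate drops below baseline - tolerance."""

    def __init__(self, baseline_pass_rate, window=100, tolerance=0.05):
        self.baseline = baseline_pass_rate
        self.tolerance = tolerance
        self.results = deque(maxlen=window)  # keeps only the newest `window` results

    def record(self, passed):
        self.results.append(bool(passed))

    def pass_rate(self):
        if not self.results:
            return None
        return sum(self.results) / len(self.results)

    def drifted(self):
        # Only judge once the window is full, to avoid noisy early alarms.
        if len(self.results) < self.results.maxlen:
            return False
        return self.pass_rate() < self.baseline - self.tolerance


monitor = DriftMonitor(baseline_pass_rate=0.95, window=10)
for _ in range(10):
    monitor.record(True)       # healthy period
healthy = monitor.drifted()
for _ in range(5):
    monitor.record(False)      # model starts violating the schema
degraded = monitor.drifted()
print(healthy, degraded)
```

A per-provider, per-model monitor of this kind is cheap to run on every call because it only stores one boolean per recent response.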

Links

GitHub: https://github.com/hhhfs9s7y9-code/neuralbridge-sdk
PyPI: https://pypi.org/project/neuralbridge-sdk/
Docs: coming soon

pip install neuralbridge-sdk

We'd love your feedback, issues, and contributions. What failure patterns have you seen in production that we should handle?
