Show HN: NeuralBridge — We Built a Self-Healing SDK for LLM-Powered Agents
After months of production experience running LLM calls at scale, we realized something uncomfortable: every AI agent eventually crashes. Not because the code is wrong, but because LLM APIs fail in ways you can't predict.
Timeouts. Rate limits. Empty responses. Schema violations. Drift. These aren't edge cases — they're the norm.
So we built NeuralBridge: an embedded SDK that makes LLM calls self-healing.
The Problem
Try running 100,000 LLM calls through any single provider. You'll see:
- 2-5% failure rate from timeouts and 5xx errors
- Rate limits that cascade through your pipeline
- Schema violations when models change behavior
- Provider-specific quirks that require custom error handling
- 30-200ms of unnecessary latency from gateway proxies
Most teams solve this by building their own retry logic, circuit breakers, and fallback chains. It works — until it doesn't. Because the next failure is always the one you didn't anticipate.
Our Approach: Embedded Self-Healing
Instead of a gateway (which adds latency and infrastructure), we embedded the reliability logic directly into the SDK:
from neuralbridge import SelfHealingEngine
engine = SelfHealingEngine()
result = engine.call("Write a Python function for binary search")
if result.recovered:
print(f"Fault: {result.diagnosis}")
print(f"Recovery: {result.recovery_action}")
When a call fails, the engine:
- Diagnoses the fault type in ~19us (P50)
- Escalates through 4 layers: retry -> degrade -> failover -> learned rule
- Validates the output across 5 dimensions
- Learns from the experience for next time
Production Results
| Metric | Value |
|---|---|
| Auto-recovery rate | 84.1% of faults |
| Fault patterns recognized | 280+ |
| Recovery strategies | 30+ |
| Learned rules (flywheel) | 88+ |
| Diagnosis latency | 19us P50 |
| Install size | 375 KB |
Why Open Source?
We went Apache 2.0 because reliability infrastructure should be a commodity. The SDK is free and open. Pro features (enterprise SSO, audit logs, priority support) fund continued development.
Getting Started
pip install neuralbridge-sdk
import neuralbridge as nb
result = nb.run("Explain quantum computing in one sentence")
print(result.text)
The Tech
- 1 dependency (httpx) — no Docker, no database, no infrastructure
- Multi-provider — DeepSeek, OpenAI, Anthropic, 12+ providers
- Carbon tracking — per-provider, per-call
- Drift detection — catch model regressions before users do
- 88+ flywheel rules — gets smarter over time
Links
- GitHub: https://github.com/hhhfs9s7y9-code/neuralbridge-sdk
- PyPI: https://pypi.org/project/neuralbridge-sdk/
- Docs: coming soon
pip install neuralbridge-sdk
We'd love your feedback, issues, and contributions. What failure patterns have you seen in production that we should handle?
Top comments (0)