SQEval v1.16.0: Circuit-Breaker AI Failover & Real-Time Token Dashboard — Backed by 500k Benchmark Iterations

#sqeval #ai #reliability #circuitbreaker

By Antonio Oreany · March 29, 2026 · SQEval Engineering Blog

Today we release SQEval PRO v1.16.0 — the most resilient version of our Search Quality audit engine to date. This release introduces automatic AI provider failover with a circuit-breaker pattern, a real-time token usage dashboard, and a comprehensive constants refactoring — all validated against 500,000 benchmark iterations with zero AI tokens consumed.

The Problem: Single-Provider Fragility

Cloud AI providers fail. Rate limits (HTTP 429), quota exhaustion, and network timeouts can each take down an audit mid-flight. In v1.15.x, a Gemini 429 meant a hard failure for the user. We needed a system that adapts instead of crashing.

The Solution: Circuit-Breaker Auto-Failover

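Inspired by the circuit-breaker pattern popularized by Netflix's Hystrix library, we implemented a three-tier failover chain (Gemini, then Perplexity, then a deterministic heuristic engine) guarded by a circuit breaker, directly in the AI analysis pipeline: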

How It Works

Health Tracking — Each provider (Gemini, Perplexity) maintains a failure counter and timestamp
Auto-Suspend — After 3 consecutive failures, provider is suspended for 5 minutes
Smart Chain — Healthy providers are tried first; suspended ones become last resort
Auto-Recovery — After cooldown, circuit goes half-open and retries
Heuristic Fallback — If all AI fails, deterministic heuristic engine takes over
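
The five steps above can be sketched in a few lines of Python. This is a minimal illustration, not the SQEval implementation: the threshold of 3 failures and the 5-minute cooldown come from this post, while the names ProviderHealth, analyze, and the injected clock are hypothetical and chosen only for the example.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Tuple

FAILURE_THRESHOLD = 3        # consecutive failures before suspension (from the post)
COOLDOWN_SECONDS = 5 * 60    # suspension length (from the post)


@dataclass
class ProviderHealth:
    """Health tracking: a failure counter plus the timestamp of the last failure."""
    name: str
    failures: int = 0
    last_failure: Optional[float] = None

    def is_suspended(self, now: float) -> bool:
        # Suspended only while the threshold is met AND the cooldown is running.
        # Once the cooldown elapses the circuit is half-open: a trial call is allowed.
        if self.failures < FAILURE_THRESHOLD or self.last_failure is None:
            return False
        return now - self.last_failure < COOLDOWN_SECONDS

    def record_success(self) -> None:
        # A successful call closes the circuit again.
        self.failures = 0
        self.last_failure = None

    def record_failure(self, now: float) -> None:
        # A failed half-open trial refreshes the timestamp, restarting the cooldown.
        self.failures += 1
        self.last_failure = now


Provider = Tuple[ProviderHealth, Callable[[Any], Any]]


def analyze(payload: Any, providers: List[Provider],
            heuristic: Callable[[Any], Any],
            clock: Callable[[], float] = time.monotonic) -> dict:
    """Try providers in preference order, then fall back to the heuristic engine."""
    now = clock()
    # Smart chain: healthy providers first, suspended ones as a last resort.
    # sorted() is stable, so preference order is kept within each group.
    chain = sorted(providers, key=lambda p: p[0].is_suspended(now))
    primary = providers[0][0] if providers else None
    for health, call in chain:
        try:
            result = call(payload)
        except Exception:
            health.record_failure(clock())
            continue
        health.record_success()
        return {"provider": health.name,
                "fallback": health is not primary,
                "result": result}
    # Every AI provider failed: the deterministic heuristic takes over.
    return {"provider": "heuristic", "fallback": True, "result": heuristic(payload)}
```

The clock is passed in so that the cooldown can be tested without waiting five minutes. The returned provider and fallback fields are the kind of data a provider badge could be rendered from.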

Figure 1: Circuit-Breaker Failover Chain: Gemini → Perplexity → Heuristic. An analyze/submit request goes to Gemini (healthy) first; on a 429 it falls back to Perplexity (standby); if both fail, the Heuristic Fallback produces the result. Every result carries a provider badge.

The UI now shows a provider badge on every result: a colored indicator showing which AI powered the analysis, with a “fallback” tag when the primary provider was bypassed.

Real-Time Token Dashboard

The new admin panel includes a live token usage dashboard that refreshes every 10 seconds:

SVG Gauge — Semi-circle quota meter with animated fill
Burn Rate Sparkline — Real-time SVG chart showing token consumption over time
ETA to Exhaustion — Color-coded countdown (green >24h, yellow <24h, red <1h)
Per-Model Breakdown — Proportional bars for Gemini vs Perplexity usage
Circuit Breaker Health — Live dots that pulse when a provider is suspended
Cost Tracker — Running USD cost estimate based on token pricing
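
To make the burn rate and ETA widgets concrete, here is a small sketch of the arithmetic they imply. It is an assumption-laden illustration rather than the dashboard's code: the function names, the (timestamp, cumulative tokens) sample format, and the handling of the exact 1h and 24h boundaries are all hypothetical; only the green/yellow/red thresholds come from the list above.

```python
from typing import List, Optional, Tuple

HOUR_S = 3600


def burn_rate(samples: List[Tuple[float, int]]) -> float:
    """Tokens per second across a window of (timestamp_s, cumulative_tokens) samples."""
    if len(samples) < 2:
        return 0.0
    (t0, n0), (t1, n1) = samples[0], samples[-1]
    elapsed = t1 - t0
    return (n1 - n0) / elapsed if elapsed > 0 else 0.0


def eta_seconds(quota: int, used: int, rate: float) -> Optional[float]:
    """Seconds until the quota runs out at the current rate, or None if idle."""
    if rate <= 0:
        return None  # nothing is being consumed, so there is no projected exhaustion
    return max(quota - used, 0) / rate


def eta_color(eta: Optional[float]) -> str:
    """Map an ETA to the countdown color: green >24h, red <1h, yellow otherwise."""
    if eta is None or eta > 24 * HOUR_S:
        return "green"
    if eta < HOUR_S:
        return "red"
    return "yellow"
```

For example, two samples ten seconds apart that differ by 100 tokens give a rate of 10 tokens per second; with 900 tokens left in the quota, that is 90 seconds to exhaustion, which falls in the red band.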

Refresh interval: 10s · Dashboard widgets: 6 · Polling when closed: 0

Benchmark Results: 500,000 Iterations, Zero AI Tokens

We validated this release using our SQRG v2 benchmark suite — 500,000 synthetic iterations testing five blending strategies between heuristic and AI scoring. No API tokens were consumed; the benchmark uses pseudo-random profiles against our calibrated ground truth.

Mean Absolute Error by Blend Strategy
500,000 iterations · Lower is better

Pure Logic: 24.8 · Light Hybrid: 18.6 · Balanced: 12.4 · AI-Driven: 6.3 · Pure AI: 1.2

Figure 2: MAE across five blending strategies. Pure AI achieves 20x lower error than Pure Logic.

Detailed Benchmark Table

Blend	Mean (MAE)	Median	Std Dev	P95	P99
B1: Pure Logic	24.80	25.01	14.26	46.97	51.27
B2: Light Hybrid	18.61	18.75	10.70	35.23	38.48
B3: Balanced	12.42	12.49	7.15	23.50	25.81
B4: AI-Driven	6.26	6.23	3.65	12.08	13.53
B5: Pure AI	1.24	1.23	0.72	2.37	2.47
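
For readers who want to see how columns like these can be produced without calling any AI provider, the sketch below runs a seeded synthetic benchmark and summarizes the absolute error per blend. It is not the SQRG v2 suite. The blend weights, the linear blend formula, and the stand-in score generators are assumptions made for the example, so its numbers will not reproduce the table above.

```python
import random
import statistics
from typing import Dict, List

# Hypothetical AI weights per blend: the post names the blends but not their weights.
BLENDS = {
    "B1: Pure Logic": 0.0,
    "B2: Light Hybrid": 0.25,
    "B3: Balanced": 0.5,
    "B4: AI-Driven": 0.75,
    "B5: Pure AI": 1.0,
}


def blend(heuristic: float, ai: float, ai_weight: float) -> float:
    """Linear blend of the two scores (an assumed formula)."""
    return (1 - ai_weight) * heuristic + ai_weight * ai


def percentile(ordered: List[float], pct: int) -> float:
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling of pct*n/100
    return ordered[rank - 1]


def summarize(errors: List[float]) -> Dict[str, float]:
    """Mean, median, population std dev, P95 and P99 of absolute errors."""
    ordered = sorted(errors)
    return {
        "mean": statistics.fmean(ordered),
        "median": statistics.median(ordered),
        "std": statistics.pstdev(ordered),
        "p95": percentile(ordered, 95),
        "p99": percentile(ordered, 99),
    }


def run_benchmark(iterations: int, seed: int = 0) -> Dict[str, Dict[str, float]]:
    """Score pseudo-random profiles under every blend; no API calls are made."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    errors: Dict[str, List[float]] = {name: [] for name in BLENDS}
    for _ in range(iterations):
        truth = rng.uniform(0, 100)        # stand-in for calibrated ground truth
        heuristic = rng.uniform(0, 100)    # stand-in heuristic score
        ai = truth + rng.gauss(0, 1.5)     # stand-in AI score, close to truth
        for name, weight in BLENDS.items():
            errors[name].append(abs(blend(heuristic, ai, weight) - truth))
    return {name: summarize(values) for name, values in errors.items()}
```

Calling run_benchmark(500_000) mirrors the iteration count used in this release; a smaller count is enough to see the error shrink as the AI weight grows under these assumptions.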

Engineering Highlights

Constants Refactoring

We extracted 15+ magic numbers into named constants across the codebase:

Token tracking: AI_CONFIG.TOKEN_TRACKING (HOUR_MS, DAY_MS, retention, TTL, burn rate window)
Circuit breaker: AI_CONFIG.CIRCUIT_BREAKER (failure threshold, cooldown)
Client thresholds: CLIENT.SUSTAINABILITY, CLIENT.QUOTA
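
As an illustration of what such a refactoring can look like, here is a sketch in Python of immutable, named configuration groups. The group names AI_CONFIG, TOKEN_TRACKING, CIRCUIT_BREAKER, HOUR_MS, and DAY_MS come from the list above, and the threshold and cooldown values from the circuit-breaker section. The field names FAILURE_THRESHOLD and COOLDOWN_MS and the helper cooldown_elapsed are assumptions. The retention, TTL, burn rate window, and CLIENT values are not stated in this post, so they are left out rather than guessed.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TokenTracking:
    HOUR_MS: int = 60 * 60 * 1000
    DAY_MS: int = 24 * 60 * 60 * 1000


@dataclass(frozen=True)
class CircuitBreaker:
    FAILURE_THRESHOLD: int = 3          # consecutive failures before suspension
    COOLDOWN_MS: int = 5 * 60 * 1000    # 5-minute suspension


@dataclass(frozen=True)
class AIConfig:
    TOKEN_TRACKING: TokenTracking = field(default_factory=TokenTracking)
    CIRCUIT_BREAKER: CircuitBreaker = field(default_factory=CircuitBreaker)


# Single shared instance; frozen dataclasses reject attribute assignment.
AI_CONFIG = AIConfig()


def cooldown_elapsed(last_failure_ms: int, now_ms: int) -> bool:
    """True once a suspended provider may be retried (hypothetical helper)."""
    return now_ms - last_failure_ms >= AI_CONFIG.CIRCUIT_BREAKER.COOLDOWN_MS
```

Compared with a bare 300000 scattered through the code, a call site that reads AI_CONFIG.CIRCUIT_BREAKER.COOLDOWN_MS states its intent and can be changed in one place.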