<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Sami Alotaibi</title>
    <description>The latest articles on DEV Community by Sami Alotaibi (@sami_alotaibi_0b881f3cfbc).</description>
    <link>https://dev.to/sami_alotaibi_0b881f3cfbc</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3808922%2F915e3c7d-f61d-431a-a07f-48eec37f5b9c.png</url>
      <title>DEV Community: Sami Alotaibi</title>
      <link>https://dev.to/sami_alotaibi_0b881f3cfbc</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/sami_alotaibi_0b881f3cfbc"/>
    <language>en</language>
    <item>
      <title>I built a Multi-Model AI Router to end "Claude-Loyalty" (and I need you to break it)</title>
      <dc:creator>Sami Alotaibi</dc:creator>
      <pubDate>Fri, 06 Mar 2026 02:48:41 +0000</pubDate>
      <link>https://dev.to/sami_alotaibi_0b881f3cfbc/i-built-a-multi-model-ai-router-to-end-claude-loyalty-and-i-need-you-to-break-it-33b</link>
      <guid>https://dev.to/sami_alotaibi_0b881f3cfbc/i-built-a-multi-model-ai-router-to-end-claude-loyalty-and-i-need-you-to-break-it-33b</guid>
      <description>&lt;p&gt;I've noticed a pattern in my workflow: I spend way too much time switching between Claude for code, Gemini for long-context research, and GPT for quick logic.&lt;/p&gt;

&lt;p&gt;In 2026, "Model Loyalty" feels like a tax on productivity. So, I built Alotaibi AI—a lightweight, intent-based router that picks the best API for your prompt so you don't have to.&lt;/p&gt;

&lt;p&gt;The Tech Stack&lt;br&gt;
Frontend: An ultra-lightweight, thin-client HTML/JS page, hosted on Vercel for global edge delivery.&lt;/p&gt;

&lt;p&gt;Backend: A decoupled, private orchestration layer. I opted for a dedicated backend instead of serverless functions to avoid execution timeouts and to handle more complex intent-classification logic.&lt;/p&gt;
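
&lt;p&gt;Roughly, the entry point looks like the sketch below. It assumes FastAPI, and &lt;code&gt;classify_intent&lt;/code&gt; / &lt;code&gt;call_model&lt;/code&gt; are stand-ins for the private logic, not the production code:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Sketch of the orchestration endpoint (FastAPI assumed; names illustrative).
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class PromptIn(BaseModel):
    prompt: str

def classify_intent(prompt: str) -&gt; str:
    # Placeholder for the real zero-shot classifier (see next section).
    return "logic"

async def call_model(intent: str, prompt: str) -&gt; str:
    # Placeholder: would forward the prompt to the provider chosen for this intent.
    return f"[{intent}] routed response"

@app.post("/route")
async def route(body: PromptIn):
    # Classify the prompt, forward it to the best model, return both.
    intent = classify_intent(body.prompt)
    answer = await call_model(intent, body.prompt)
    return {"intent": intent, "answer": answer}
&lt;/code&gt;&lt;/pre&gt;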

&lt;p&gt;Intelligence: I'm using a zero-shot classification layer to score each prompt along four axes: Logic, Creativity, Speed, and Retrieval.&lt;/p&gt;
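
&lt;p&gt;As a minimal sketch of that layer (assuming an off-the-shelf Hugging Face zero-shot pipeline; the real model and labels may differ):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Sketch only: scores a prompt across the four axes with a stock
# zero-shot model. The production classifier may be different.
from transformers import pipeline

classifier = pipeline("zero-shot-classification",
                      model="facebook/bart-large-mnli")

AXES = ["logic", "creativity", "speed", "retrieval"]

def score_prompt(prompt):
    result = classifier(prompt, candidate_labels=AXES)
    # Labels come back sorted by confidence; zip them into a dict.
    return dict(zip(result["labels"], result["scores"]))

print(score_prompt("Prove that the sum of two even numbers is even."))
&lt;/code&gt;&lt;/pre&gt;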

&lt;p&gt;The "Secret Sauce" (and why I need your help)&lt;br&gt;
The core logic evaluates your prompt in real time. Depending on the "intent" detected, it routes the request to the specific model currently performing best for that niche (Claude 4.6, GPT-5, or Gemini 3 Pro).&lt;/p&gt;
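
&lt;p&gt;In simplified form, the routing decision looks something like this (the table and threshold below are illustrative, not the live values):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Illustrative routing table and threshold; not the live values.
MODEL_FOR_AXIS = {
    "logic":      "gpt-5",         # quick logic
    "creativity": "claude-4.6",    # code / writing
    "speed":      "gpt-5",         # illustrative; whatever the fastest tier is
    "retrieval":  "gemini-3-pro",  # long-context research
}

REASONING_THRESHOLD = 0.55  # the knob that's hard to tune

def pick_model(scores):
    top_axis = max(scores, key=scores.get)
    # Bias toward deep reasoning: if "logic" clears the threshold,
    # prefer it even when a "faster" axis scored slightly higher.
    if scores.get("logic", 0.0) &gt;= REASONING_THRESHOLD:
        top_axis = "logic"
    return MODEL_FOR_AXIS[top_axis]
&lt;/code&gt;&lt;/pre&gt;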

&lt;p&gt;The Problem: Tuning these "routing thresholds" is a nightmare. Sometimes the system favors a faster model when I actually need deep reasoning.&lt;/p&gt;

&lt;p&gt;I need a Roast&lt;br&gt;
I'm looking for feedback on three things:&lt;/p&gt;

&lt;p&gt;Routing Accuracy: Try to "trick" the system. Send it a prompt that should go to a high-reasoning model but gets routed to a faster, "dumber" one.&lt;/p&gt;

&lt;p&gt;The UI: Is the "thin-client" approach too minimal, or is the speed of a raw HTML frontend worth it?&lt;/p&gt;

&lt;p&gt;Architecture: If you’ve built decoupled AI apps before, how are you handling the handshake latency between your frontend and private API?&lt;/p&gt;

&lt;p&gt;Check it out here: &lt;a href="https://alotaibiai.vercel.app" rel="noopener noreferrer"&gt;alotaibiai.vercel.app&lt;/a&gt;&lt;/p&gt;
</description>
      <category>ai</category>
      <category>llm</category>
      <category>productivity</category>
      <category>showdev</category>
    </item>
  </channel>
</rss>
