I'm a solo founder from Argentina. I'm not an engineer — I built the backend of NeuralRouting.io almost entirely with Claude.
## The problem
Most teams send every LLM request to GPT-4 even when a smaller model would return an answer of the same quality. The difference between a model that costs $30 per million tokens and one that costs $0.50 is massive at scale.
## What NeuralRouting does
It sits between your app and LLM providers. Each request gets a complexity score, and the router picks the cheapest model that can handle it.
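To make the routing idea concrete, here is a minimal sketch of what "score the request, pick the cheapest capable model" could look like. The model names, prices, thresholds, and the scoring heuristic are all illustrative assumptions, not NeuralRouting's actual implementation:

```python
# Hypothetical routing sketch. Model names, prices, thresholds, and the
# complexity heuristic are illustrative, not NeuralRouting's real logic.

# Models ordered cheapest-first, each with the highest complexity score
# it is trusted to handle (illustrative prices per 1M tokens).
MODELS = [
    {"name": "llama-3.1-8b", "cost_per_m": 0.05, "max_complexity": 0.3},
    {"name": "gpt-4o-mini",  "cost_per_m": 0.15, "max_complexity": 0.6},
    {"name": "gpt-4o",       "cost_per_m": 2.50, "max_complexity": 1.0},
]

def complexity_score(prompt: str) -> float:
    """Toy heuristic: longer prompts and reasoning keywords raise the score."""
    score = min(len(prompt) / 2000, 0.5)
    for kw in ("prove", "step by step", "analyze", "refactor"):
        if kw in prompt.lower():
            score += 0.2
    return min(score, 1.0)

def route(prompt: str) -> str:
    """Return the cheapest model whose trust ceiling covers the score."""
    score = complexity_score(prompt)
    for model in MODELS:  # cheapest first
        if score <= model["max_complexity"]:
            return model["name"]
    return MODELS[-1]["name"]  # fall back to the strongest model
```

The key design choice is that the model list is sorted by cost, so the first model whose ceiling covers the score is automatically the cheapest acceptable one.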
It also has:
- Dual-layer semantic cache — similar queries get served from cache instead of hitting the API again
- Shadow Engine — runs cheaper models in parallel to benchmark quality over time
- PII filtering and rate limiting
- Agent loop detection
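The dual-layer cache can be sketched roughly like this: an exact layer keyed by a prompt hash, plus a semantic layer matched by embedding similarity. The bag-of-words "embedding" below stands in for a real embedding model, and the class name and threshold are assumptions for illustration:

```python
import hashlib
import math
from collections import Counter

# Hypothetical two-layer cache sketch: layer 1 is an exact match on the
# hashed prompt; layer 2 matches semantically similar prompts by cosine
# similarity. A real system would use a proper embedding model here.

def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: bag-of-words counts."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class DualLayerCache:
    def __init__(self, threshold: float = 0.9):
        self.exact = {}      # sha256(prompt) -> response
        self.semantic = []   # (embedding, response) pairs
        self.threshold = threshold

    def _key(self, prompt: str) -> str:
        return hashlib.sha256(prompt.encode()).hexdigest()

    def get(self, prompt: str):
        # Layer 1: exact match on the hashed prompt.
        hit = self.exact.get(self._key(prompt))
        if hit is not None:
            return hit
        # Layer 2: nearest stored embedding above the threshold.
        query = embed(prompt)
        best, best_sim = None, 0.0
        for vec, response in self.semantic:
            sim = cosine(query, vec)
            if sim > best_sim:
                best, best_sim = response, sim
        return best if best_sim >= self.threshold else None

    def put(self, prompt: str, response: str):
        self.exact[self._key(prompt)] = response
        self.semantic.append((embed(prompt), response))
```

The exact layer makes repeated identical prompts free to check, while the semantic layer catches rephrasings; the similarity threshold controls how aggressively near-misses are served from cache.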
## Where it's at
Early. I only support OpenAI and Groq right now. Zero users. I built too much before talking to anyone and I'm fixing that now.
If you work with LLMs and want to try it, I'm looking for honest feedback: neuralrouting.io