DEV Community

Juan

I built an AI Gateway with no technical background. Here's where I'm stuck.

I'm a solo founder from Argentina. I'm not an engineer — I built the backend of NeuralRouting.io almost entirely with Claude.

The problem

Most teams send every LLM request to GPT-4 even when a smaller model would return an answer of the same quality. At scale, the difference between a $30-per-million-token model and a $0.50-per-million-token model is massive.

What NeuralRouting does

It sits between your app and LLM providers. Each request gets a complexity score, and the router picks the cheapest model that can handle it.
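To make that concrete, here's a minimal sketch of what complexity-based routing could look like. All names, the scoring heuristic, the model list, and the prices here are illustrative assumptions, not NeuralRouting's actual implementation.

```python
# Hypothetical router: score a prompt, then pick the cheapest model
# whose capability ceiling covers that score. Models, prices, and the
# heuristic are made up for illustration.

MODELS = [
    # (name, cost per 1M tokens, max complexity it can handle), cheapest first
    ("llama-3.1-8b-instant", 0.05, 3),
    ("gpt-4o-mini", 0.15, 6),
    ("gpt-4o", 2.50, 10),
]

def complexity_score(prompt: str) -> int:
    """Toy heuristic: longer prompts and reasoning keywords score higher."""
    score = min(len(prompt) // 200, 5)
    if any(k in prompt.lower() for k in ("prove", "analyze", "step by step")):
        score += 3
    return min(score, 10)

def route(prompt: str) -> str:
    """Return the cheapest model whose ceiling covers the prompt's score."""
    score = complexity_score(prompt)
    for name, _cost, ceiling in MODELS:
        if ceiling >= score:
            return name
    return MODELS[-1][0]  # fall back to the strongest model
```

A real gateway would use something stronger than keyword matching (a small classifier, or feedback from the Shadow Engine described below), but the routing decision itself stays this simple: score, then walk the price-sorted list.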

It also has:

  • Dual-layer semantic cache — similar queries get served from cache instead of hitting the API again
  • Shadow Engine — runs cheaper models in parallel to benchmark quality over time
  • PII filtering and rate limiting
  • Agent loop detection
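The semantic cache idea can be sketched in a few lines: embed each query, and on lookup return a stored response if some cached query is similar enough. This is an illustrative toy, assuming an `embed` callable you supply (e.g. a sentence-embedding model); the threshold, class, and method names are my own, not NeuralRouting's.

```python
# Toy semantic cache: a hit is any cached query whose embedding is
# cosine-similar to the new query above a threshold. Linear scan for
# clarity; a real cache would use a vector index.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    def __init__(self, embed, threshold=0.92):
        self.embed = embed          # callable: str -> list[float]
        self.threshold = threshold  # similarity required for a hit
        self.entries = []           # list of (vector, response)

    def get(self, query):
        qv = self.embed(query)
        best = max(self.entries, key=lambda e: cosine(qv, e[0]), default=None)
        if best and cosine(qv, best[0]) >= self.threshold:
            return best[1]
        return None

    def put(self, query, response):
        self.entries.append((self.embed(query), response))
```

The threshold is the whole game: too low and users get stale answers to subtly different questions, too high and the cache never hits.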

Where it's at

Early. I only support OpenAI and Groq right now. Zero users. I built too much before talking to anyone and I'm fixing that now.

If you work with LLMs and want to try it, I'm looking for honest feedback: neuralrouting.io
