I didn't mean to build this. Looking for testers anyway.

#ai #webdev #opensource #startup

I'll be honest up front: I'm not a salesman. Everyone has always told me that, and lately even the AI I build with tells me that. So this is not a pitch. This is me asking for help.

I didn't set out to create any of this. It started because I kept having context window issues while building something else entirely, and I went down a rabbit hole. The rabbit hole kept going. Somewhere down there I got tired of watching my AI coding agent send every single request to the most expensive model available, whether the task needed it or not, and I started building a fix for myself.

What's been embarrassing is I thought I had one story. Then the testing led me to another one. Then another. My website has changed multiple times, which has been a lesson learned in public, the hard way. The good news is there are no clients yet, so that's good :)

Where the testing has left me now is this:

As a gateway I am able to send routine coding requests to lower cost models that get verified before you ever see them. The hard ones escalate to a frontier model. On my benchmark runs that was about 1 in 27 requests. The full setup measured around 95 percent on HumanEval+ (n=164) at roughly 8x lower cost per request. I keep re-running these numbers because honestly I didn't quite believe them either. So far they keep holding.

So based on those runs, that works out to making a Fable 5 subscription last up to 27x longer, because most of the work doesn't need that level of cost to be accurate.

What I don't know yet is whether it holds up on YOUR work. Real repos, real deadlines, weird edge cases. That's exactly what I need testers for.

Security, because you should ask: encrypted at rest, TLS in transit, and your code is never training data. I verified the upstream providers do not train on it, and the routine tier runs on open models I host myself. Local models, privacy, security, and a cache that bends the cost curve down even lower over time. That is my goal, my promise, and what I want help testing and proving.

The deal: it's pre-beta, it's free while it is, and I will keep publishing the real numbers whether they flatter me or not. I feel like I have something special here, something people have been asking for. I just need people smarter than me to hammer on it and tell me where it breaks.

DEV Community

I didn't mean to build this. Looking for testers anyway.

Top comments (0)