Jacob Haflett
Solo founder building Rhelm: Recursive High Efficiency Language Models

Hey, I'm Jacob. Solo founder of Rhelm.

10+ years deep in infra, Kubernetes, distributed systems, Go, Python, and AI orchestration.

I got tired of watching API bills stack up fast. Every task, big or small, was getting routed to the most expensive model by default. Didn't matter if it was complex reasoning or fixing a typo. Same model, same price.

So I built recursion into the workflow.

How it works

Rhelm decomposes complex objectives into atomic subtasks. Each one is simple enough for a small model to nail perfectly, then gets routed to the cheapest capable model. Local models at $0/token handle the bulk of the work. The expensive frontier models only get called when the task actually needs them.
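To make the routing idea concrete, here's a minimal sketch. The model names, prices, capability tiers, and the stubbed `decompose` step are all hypothetical — Rhelm's actual decomposition and routing logic isn't shown here. The core idea: pick the cheapest model whose capability meets each subtask's requirement.

```python
from dataclasses import dataclass


@dataclass
class Model:
    name: str
    cost_per_1k_tokens: float  # USD; 0.0 for local models
    capability: int            # rough capability tier (higher = stronger)


# Hypothetical model pool -- names and prices are illustrative only.
MODELS = [
    Model("local-7b", 0.0, 1),
    Model("mid-tier-api", 0.002, 2),
    Model("frontier-api", 0.03, 3),
]


def decompose(objective: str) -> list[tuple[str, int]]:
    """Stand-in for the real decomposition step: split an objective
    into (subtask, required_capability) pairs."""
    return [
        (f"{objective}: gather context", 1),
        (f"{objective}: draft solution", 1),
        (f"{objective}: verify reasoning", 3),
    ]


def route(required_capability: int) -> Model:
    """Cheapest capable model wins; expensive models only run
    when the subtask actually needs them."""
    capable = [m for m in MODELS if m.capability >= required_capability]
    return min(capable, key=lambda m: m.cost_per_1k_tokens)


assignments = [(task, route(need).name)
               for task, need in decompose("fix login bug")]
```

In this toy version, two of the three subtasks land on the free local model and only the verification step hits the frontier API, which is where the cost reduction comes from.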

The result: real AI power in your hands, not rented behind paywalls.

What it looks like in practice

Most AI tools dump everything on you at once. Logs, token counts, model responses, errors, all fighting for your attention. You end up spending more time managing the AI than doing the actual work.

We solved that by putting everything on a kanban board. PMs write objectives in plain language, agents pick them up like team members, and each card only surfaces what matters for that task. Cost, quality, status. No noise. You see what you need when you need it.
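A rough sketch of what a card might carry, assuming a simple schema (field names and the status set are my own guesses, not Rhelm's actual data model) — the point is that only cost, quality, and status surface to the user:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    BACKLOG = "backlog"
    IN_PROGRESS = "in_progress"
    REVIEW = "review"
    DONE = "done"


@dataclass
class TaskCard:
    objective: str                   # plain-language goal written by the PM
    status: Status = Status.BACKLOG
    cost_usd: float = 0.0            # running spend for this card
    quality: Optional[float] = None  # e.g. a 0-1 score once evaluated


card = TaskCard("Summarize last week's error logs")
card.status = Status.IN_PROGRESS   # an agent picks the card up
```

Everything else (raw logs, token counts, intermediate model responses) would stay behind the card rather than competing for attention.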

Early numbers

  • ~90% token cost reduction
  • Output quality goes up, not down
  • Runs on your hardware or in the cloud, your choice

Waitlist is open

If this sounds like a problem you're dealing with, check it out: rhelm.io

I'd love to hear from the community. What's your biggest pain point with AI agent setups right now: cost, drift, security, or tool sprawl?

Drop your thoughts below. I'm building this in public and your feedback shapes the roadmap.

Top comments (3)

Micheal Angelo

Hi, I came across this post and I'm interested in knowing how these small models are being orchestrated. Also, when it comes to running them locally, how much RAM would it require? In any case, it's a pretty interesting post.

Jacob Haflett

Thanks so much, Micheal! Really appreciate you checking it out and joining the waitlist.

Great questions. We'll be dropping YouTube content soon that walks through how the orchestration works, RAM requirements for running models locally, and a lot more. Stay tuned for that.

Glad you found it interesting, and welcome aboard!

Micheal Angelo

Also, joined the waitlist 👍