Lachlan Allen

LLMs have a memory problem. So I built a fix.

The latency of existing memory layers is trash. I’m sorry, but you can’t build a fluid voice conversation when your memory layer takes 500ms+ just to fetch a relevant fact. By the time the AI responds, the user has already hung up or started talking again. And the pricing for high-volume retrieval? A joke.

I got tired of waiting, so I built my own free solution: Orthanc.ai.

It’s a memory layer designed specifically for speed. No bloat, no external API dependencies for the embeddings. It uses optimized local models to handle storage and retrieval instantly.
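To make the idea concrete, here’s a minimal toy sketch of what an in-process memory layer with local embeddings looks like. This is not Orthanc’s actual code: the `embed` function below is a stand-in (feature hashing over tokens) for a real local embedding model, and `MemoryStore` is a hypothetical class name. The point it illustrates is that when the embedding and the index both live in your process, recall is a function call, not a network round trip.

```python
import hashlib
import math

def embed(text: str, dim: int = 256) -> list[float]:
    # Toy stand-in for a real local embedding model:
    # hash each token into a bin of a fixed-size vector (feature hashing).
    vec = [0.0] * dim
    for tok in text.lower().split():
        tok = tok.strip("?.,!'\"")
        h = int(hashlib.md5(tok.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]  # unit-normalized, so dot product = cosine

class MemoryStore:
    """In-process store: no network hop, so recall is just a local scan."""

    def __init__(self) -> None:
        self.items: list[tuple[str, list[float]]] = []

    def add(self, fact: str) -> None:
        # Embed once at write time; reads never re-embed stored facts.
        self.items.append((fact, embed(fact)))

    def recall(self, query: str, k: int = 1) -> list[str]:
        q = embed(query)
        # Rank stored facts by cosine similarity to the query.
        ranked = sorted(
            self.items,
            key=lambda item: -sum(a * b for a, b in zip(q, item[1])),
        )
        return [fact for fact, _ in ranked[:k]]

store = MemoryStore()
store.add("The user's name is Sam and they live in Melbourne")
store.add("The user prefers short answers")
print(store.recall("what is the user's name"))
```

A production version would swap the hashed vectors for a proper local embedding model and the linear scan for an approximate-nearest-neighbor index, but the shape stays the same: embed locally, score locally, return before the caller notices.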

I open-sourced the whole thing. Steal the code here: https://github.com/orthanc-protocol/client-sdk.git

If you want the speed but don't want to manage the servers, I’m giving away free credits for the hosted API/SDK. Hit me up and I'll send you a key, or create a quick profile on the website and grab one yourself.
