We got tired of stuffing our AI agent's entire chat history into every prompt, so we built an API that doesn't

wontopos — Tue, 21 Jul 2026 07:07:27 +0000

Why We Built This

We're two people building WOS, a long-term memory API for AI agents. The problem we kept running into: agents either forget everything between sessions, or you fix that by re-sending the full conversation history on every single call.

That gets expensive fast, and past a certain length, models don't make good use of the full context anyway.

What It Actually Does

You store a user's memories once. On each query, WOS recalls only the relevant ones and hands you a small, bounded context, no matter how much history is behind it.

Semantic Retrieval: No lexical keyword matching such as BM25, so retrieval works the same way across English, Korean, Japanese, and other languages.
Bring Your Own Key: Use your own LLM key for generation; we only handle search and storage.
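
To make the store-once, recall-per-query flow concrete, here is a minimal in-memory sketch. It is not the WOS client or its API: `MemoryStore`, `store`, `recall`, `top_k`, and `max_chars` are hypothetical names, and the hand-assigned vectors in `FAKE_VECTORS` only stand in for a real multilingual embedding model. What it shows is the shape of the idea: embed each memory once, rank by similarity at query time, and cap what comes back.

```python
import math
from collections import defaultdict
from typing import Callable

Vector = list[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class MemoryStore:
    """Hypothetical in-memory store for illustration; not the WOS API."""

    def __init__(self, embed: Callable[[str], Vector]) -> None:
        self._embed = embed
        self._memories: dict[str, list[tuple[str, Vector]]] = defaultdict(list)

    def store(self, user_id: str, text: str) -> None:
        # Embed once at write time; the vector is reused on every later query.
        self._memories[user_id].append((text, self._embed(text)))

    def recall(self, user_id: str, query: str,
               top_k: int = 3, max_chars: int = 200) -> list[str]:
        # Rank this user's memories by similarity to the query.
        q = self._embed(query)
        ranked = sorted(self._memories.get(user_id, []),
                        key=lambda m: cosine(q, m[1]), reverse=True)
        # Bound the result by count and by a crude character budget, so the
        # context stays small no matter how many memories are stored.
        context: list[str] = []
        used = 0
        for text, _ in ranked[:top_k]:
            if used + len(text) > max_chars:
                break
            context.append(text)
            used += len(text)
        return context


# Hand-assigned vectors standing in for a multilingual embedding model
# (an assumption for this demo; a real system would compute these).
FAKE_VECTORS = {
    "The user is vegetarian.": [0.9, 0.1, 0.0],
    "사용자는 매운 음식을 좋아한다.": [0.8, 0.3, 0.1],  # "The user likes spicy food."
    "ユーザーは東京に住んでいる。": [0.0, 0.2, 0.9],    # "The user lives in Tokyo."
    "What should I cook for this user?": [1.0, 0.2, 0.0],
}

store = MemoryStore(embed=FAKE_VECTORS.__getitem__)
for memory in list(FAKE_VECTORS)[:3]:
    store.store("user-1", memory)

# Only the two food-related memories come back, one English and one Korean.
print(store.recall("user-1", "What should I cook for this user?", top_k=2))
```

The returned list is what you would place in the prompt for your own LLM call, in place of the full history. The character budget is a deliberately rough stand-in for a token budget.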

The Number That Mattered Most: 45x Cheaper

For a 100K-token history with 1,000 queries a month:

Standard approach (re-sending the full history on every call): ~$250/month in tokens.
With WOS: ~$5.50/month.

The gap only grows as the history expands.
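
The arithmetic behind those figures can be sketched in a few lines. The post does not state a token price, so the $2.50 per million input tokens used here is an assumption: it is simply the rate implied by ~$250 for a 100K-token history sent 1,000 times. The $5.50 figure is taken as quoted, not derived.

```python
# Assumption: $2.50 per million input tokens. The post does not state a price;
# this is the rate implied by ~$250 for 100K tokens x 1,000 queries.
PRICE_PER_MILLION_INPUT_TOKENS = 2.50
QUERIES_PER_MONTH = 1_000
WOS_MONTHLY_COST = 5.50  # figure quoted in the post for the 100K-token case


def monthly_input_cost(context_tokens: int,
                       queries: int = QUERIES_PER_MONTH,
                       price_per_million: float = PRICE_PER_MILLION_INPUT_TOKENS) -> float:
    """Input-token cost of sending `context_tokens` on every query for a month."""
    return context_tokens * queries * price_per_million / 1_000_000


full_history = monthly_input_cost(100_000)
print(f"Full history: ${full_history:,.2f}/month")
print(f"Ratio vs. quoted WOS cost: {full_history / WOS_MONTHLY_COST:.1f}x")

# Re-sending the full history scales linearly with its length.
for history_tokens in (100_000, 200_000, 400_000):
    print(f"{history_tokens:>7,} tokens -> ${monthly_input_cost(history_tokens):,.2f}/month")
```

Doubling the history doubles the full-history bill, while a bounded context keeps the per-query input size roughly fixed. That is why the gap widens as history accumulates.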

Language-Agnostic by Design

We deliberately left out any lexical/keyword matching. This means retrieval quality doesn't quietly degrade depending on what language someone happens to write in. We test this against a single store holding multiple languages at once, and it holds up.

Try It Out

You can explore the documentation and start using the API here:

Wontopos - Long-term memory for AI agents

WOS is a long-term memory API for AI agents. Store and recall what matters across sessions, in any language, billed only for the tokens you use.

wontopos.com

Get Started with WOS API

Would love feedback!

Especially from anyone who's hit memory or context-length walls building their own agents. How are you handling long-term context today?

DEV Community: wontopos