About a year and a half ago, we were building a proactive AI assistant.
Not just a chatbot, but something that could actually act on your behalf.
It could reply to emails in your tone, move calendar events, organize your inbox, and surface information based on what you actually care about.
The goal was simple:
build something that feels like an extension of how you think.
The part we didn’t expect
To make that work, we started with what most people use today: RAG.
And to be fair, RAG works.
You can go pretty far with chunking, embeddings, and retrieval.
You can build systems that feel smart.
But as the assistant got more complex, something started to break.
Not in an obvious way.
It was more subtle.
The system could retrieve relevant information,
but it didn’t really understand how things were connected.
Everything was based on similarity.
And similarity is not structure.
Building a "brain"
To move forward, we needed something else.
We started building what we internally called a "brain".
A layer responsible for:
- extracting meaning from data
- connecting concepts together
- maintaining a consistent structure over time
At the beginning, it was just a supporting component for the assistant.
But the deeper we went, the more it became clear:
this was the real problem.
About 7 months ago, we made a decision:
we stopped focusing on the assistant itself
and went all-in on this layer.
That became BrainAPI.
From retrieval to structure
The shift can be summarized like this.
Typical RAG pipeline:
chunk -> embed -> retrieve -> generate
What we moved toward:
ingest -> extract -> connect -> graph -> query
Instead of treating data as independent chunks,
we process it into a structured representation of entities and relationships.
In practice:
- documents are parsed into concepts
- relationships are extracted and normalized
- everything is stored in a graph + vector layer
Vectors are still useful,
but they are no longer the primary abstraction.
The graph is.
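As a toy illustration of that ingest -> extract -> connect flow (not BrainAPI's actual implementation — the extraction stub, relation names, and adjacency-list storage below are all invented for the sketch; a real system would use an LLM or NER model for extraction):

```python
# Toy sketch of an extract -> connect -> graph pipeline.
# The extractor is a hard-coded stub so the example runs standalone.

from collections import defaultdict


def extract_triples(document: str) -> list[tuple[str, str, str]]:
    # Stand-in for model-based entity/relation extraction.
    if "Ada Lovelace" in document:
        return [
            ("Ada Lovelace", "worked_with", "Charles Babbage"),
            ("Charles Babbage", "designed", "Analytical Engine"),
            ("Ada Lovelace", "wrote_notes_on", "Analytical Engine"),
        ]
    return []


def build_graph(documents: list[str]) -> dict[str, list[tuple[str, str]]]:
    # Adjacency list: entity -> [(relation, neighbor), ...]
    # Inverse edges are added so the graph can be walked in both directions.
    graph: dict[str, list[tuple[str, str]]] = defaultdict(list)
    for doc in documents:
        for subj, rel, obj in extract_triples(doc):
            graph[subj].append((rel, obj))
            graph[obj].append((f"inverse_{rel}", subj))
    return dict(graph)


docs = ["Ada Lovelace worked with Charles Babbage on the Analytical Engine."]
graph = build_graph(docs)
print(graph["Ada Lovelace"])
```

The point of the sketch: the unit of storage is an entity with typed edges, not a text chunk with an embedding — the vector index can sit alongside this, but it's no longer the thing you query first.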
What changes in practice
This changes how you interact with data.
Instead of asking:
"what text is similar to this query?"
You can ask:
- what entities are involved?
- how are they connected?
- what paths exist between concepts?
- what else is related in this context?
Retrieval becomes navigation.
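A minimal sketch of what "retrieval becomes navigation" can mean in practice: a breadth-first search for a path between two concepts in a small hand-built graph (the entities and edges here are invented for the example):

```python
from collections import deque

# Tiny hand-built concept graph: entity -> list of connected entities.
graph = {
    "invoice": ["customer", "payment"],
    "customer": ["invoice", "account"],
    "payment": ["invoice", "account"],
    "account": ["customer", "payment", "subscription"],
    "subscription": ["account"],
}


def find_path(graph, start, goal):
    """Return one shortest path of entities from start to goal, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph.get(node, ()):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


print(find_path(graph, "invoice", "subscription"))
# → ['invoice', 'customer', 'account', 'subscription']
```

A similarity search can tell you that "invoice" and "subscription" texts look alike; a path query tells you *how* they're connected, which is a different kind of answer.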
Where this approach helps
We found this particularly useful when:
- context spans multiple sources and periods of time
- relationships matter more than keywords
- consistency is important (not just relevance)
Some practical use cases:
- recommendation systems (ecommerce, social)
- search systems that go beyond keyword matching
- persistent memory for agents and chatbots
- more reliable RAG setups in complex domains
Exploring "polarities"
One interesting direction we’ve been exploring is something we call polarities.
Instead of returning a single "best" answer,
the system can surface a range of possible solutions around a problem,
based on how concepts relate in the graph.
It’s less about ranking results,
and more about exploring a solution space.
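The intuition can be sketched (loudly hedged: the graph, the traversal, and the grouping below are an invented toy, not how BrainAPI implements polarities): instead of returning the single top-ranked node, return one cluster of answers per distinct "direction" branching off the query concept.

```python
# Toy "polarities" sketch: group candidates by which branch of the graph
# they sit on relative to the query node, then surface one cluster per
# branch instead of a single top-ranked result.

graph = {
    "caching": ["redis", "cdn", "memoization"],
    "redis": ["in_memory_store"],
    "cdn": ["edge_delivery"],
    "memoization": ["function_level_cache"],
}


def polarities(graph, query):
    # Each direct neighbor of the query opens a distinct direction;
    # collect everything reachable down that branch as one solution cluster.
    clusters = {}
    for branch in graph.get(query, []):
        members = [branch]
        stack = list(graph.get(branch, []))
        while stack:
            node = stack.pop()
            members.append(node)
            stack.extend(graph.get(node, []))
        clusters[branch] = members
    return clusters


for branch, members in polarities(graph, "caching").items():
    print(branch, "->", members)
# redis -> ['redis', 'in_memory_store']
# cdn -> ['cdn', 'edge_delivery']
# memoization -> ['memoization', 'function_level_cache']
```

The output is a map of the solution space around "caching" rather than a ranked list — which is the shift the post is describing.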
Why this matters
At Lumen Labs (our startup), this direction came from a broader observation.
AI systems today are powerful,
but they are also fragile in how they represent knowledge.
They retrieve well.
They generate well.
But they don’t really ground information in a consistent structure.
And that’s where a lot of issues come from,
especially when accuracy actually matters.
If we want systems that people can rely on,
we need something closer to a structured memory layer.
Open sourcing it
We’ve been using this approach in production for a few B2B use cases,
but never exposed it publicly.
Now we’re opening it up.
- the core is open source
- it can run fully locally (we’ve tested it with Ollama + offline setups)
- or be deployed as managed instances in the cloud
- it’s extensible via a plugin system
Closing thoughts
We don’t think this replaces RAG.
But it feels like RAG is one component of a bigger system,
not the system itself.
After spending the last year and a half building on top of AI systems,
this "memory layer" is the piece that felt missing.
Curious to hear how others are approaching this,
especially if you’ve hit similar limitations with chunk-based retrieval.
Links
- Repo: https://github.com/Lumen-Labs/brainapi2
- Website / Video: https://brain-api.dev