# Building Droste: a local structural + semantic code-memory engine for MCP agents
AI coding agents are getting better, but their memory layer is still often too shallow.
Most agent workflows still depend on one of two things:
- blind file reads;
- vector search over chunks.
Both are useful, but both miss something important in codebases: causal structure.
A function may be relevant to a bug even if it shares no words with the user’s query. A SQL function may be called through an RPC string from TypeScript or Dart. A file may matter because it is a caller, callee, handler, migration, or dependency, not because its text looks similar.
That is the reason I started building Droste.
GitHub:
https://github.com/lorismascio17/droste-memory
PyPI:
https://pypi.org/project/droste-memory/
## What Droste is
Droste is a local code-memory engine for AI coding agents.
It indexes a repository into a hybrid structural + semantic graph:
- folders;
- files;
- symbols;
- functions;
- classes;
- methods;
- caller/callee links;
- import/dependency edges;
- cross-language links;
- local embeddings.
Then it exposes that memory through:
- a CLI;
- an MCP server;
- a visual graph viewer.
The goal is simple: give an agent the causal slice of code it needs, instead of forcing it to repeatedly scan files or rely only on semantic similarity.
## Why vector search alone is not enough
Vector search is good at finding code that sounds similar to a query.
But code often matters because of relationships, not wording.
For example:
- a controller calls a service;
- a service calls a repository;
- a frontend invokes an RPC function;
- an edge function touches a database table;
- a migration defines something used indirectly from app code;
- a test reveals behavior not obvious in the implementation.
A vector index may miss those links if the words are different.
A graph can preserve them.
That is the main design idea behind Droste: combine semantic retrieval with structural retrieval.
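To make the cross-language case concrete, here is a minimal sketch of how an RPC string in client code could be linked to the SQL function it names. The regular expressions, the `link_rpc_calls` helper, and the sample file paths are all hypothetical, chosen for illustration; Droste's own extraction is Tree-sitter based and may work quite differently.

```python
import re

# Hypothetical patterns for illustration only; not Droste's actual extractor.
SQL_FUNCTION = re.compile(
    r"create\s+(?:or\s+replace\s+)?function\s+(?:\w+\.)?(\w+)", re.IGNORECASE
)
RPC_CALL = re.compile(r"""\.rpc\(\s*['"](\w+)['"]""")


def link_rpc_calls(sql_sources, client_sources):
    """Return (client_path, sql_path, function_name) edges for RPC strings
    that name a SQL function defined in one of the SQL sources."""
    defined = {}
    for path, text in sql_sources.items():
        for name in SQL_FUNCTION.findall(text):
            defined[name.lower()] = path

    edges = []
    for path, text in client_sources.items():
        for name in RPC_CALL.findall(text):
            target = defined.get(name.lower())
            if target is not None:
                edges.append((path, target, name))
    return edges


# Made-up sample inputs: a migration and a TypeScript caller.
sql = {
    "migrations/001.sql": (
        "CREATE OR REPLACE FUNCTION public.settle_cart(cart_id uuid) "
        "RETURNS void AS $$ BEGIN END; $$ LANGUAGE plpgsql;"
    )
}
ts = {"src/checkout.ts": "await client.rpc('settle_cart', { cart_id: id });"}

edges = link_rpc_calls(sql, ts)
print(edges)
```

A query about "checkout" would likely never match the migration by wording alone, but the edge from `src/checkout.ts` to `migrations/001.sql` keeps the two connected.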
## Local-first by design
Droste is local-first.
No cloud database is required. No account is required. No API key is required.
Install:
python -m pip install --upgrade droste-memory
Index a repository:
droste index .
Ask for context:
droste context "checkout flow" --budget 1500
Run it as an MCP server:
droste mcp
An AI agent can then call Droste as a local code-memory backend instead of doing blind file reads.
## MCP usage
For MCP clients, the basic configuration is:
{
  "mcpServers": {
    "droste": {
      "command": "droste",
      "args": ["mcp"]
    }
  }
}
For serious multi-repo work, Droste also supports using one database per project:
{
  "mcpServers": {
    "droste": {
      "command": "droste",
      "args": [
        "--db",
        "/absolute/path/to/droste_memory_db.json",
        "mcp"
      ]
    }
  }
}
That avoids mixing context from different repositories.
## What happens internally
Droste builds a local graph of the project.
At a high level:
- It extracts symbols from source files.
- It maps functions, classes, methods, files, and folders.
- It builds dependency edges where possible.
- It computes local embeddings.
- It stores the graph in local sharded JSON files.
- It retrieves context using both semantic similarity and graph relationships.
- It packs the result into a token budget for an LLM.
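The last step, packing results into a token budget, can be sketched as a greedy selection over ranked snippets. This is an illustration under stated assumptions, not Droste's actual packer: `estimate_tokens` uses a rough four-characters-per-token heuristic, and `pack_context` and the sample scores are made up.

```python
def estimate_tokens(text):
    # Assumption: roughly four characters per token. A real packer would
    # more likely use the target model's tokenizer.
    return max(1, len(text) // 4)


def pack_context(ranked_snippets, budget):
    """Greedily keep the highest-scored snippets that still fit the budget.

    ranked_snippets is a list of (score, text) pairs.
    Returns (kept_texts, tokens_used).
    """
    packed, used = [], 0
    ordered = sorted(ranked_snippets, key=lambda item: item[0], reverse=True)
    for _score, snippet in ordered:
        cost = estimate_tokens(snippet)
        if used + cost > budget:
            continue  # too large; a smaller, lower-ranked snippet may still fit
        packed.append(snippet)
        used += cost
    return packed, used


snippets = [
    (0.9, "a" * 40),   # about 10 tokens
    (0.8, "b" * 400),  # about 100 tokens, will not fit
    (0.5, "c" * 20),   # about 5 tokens
]
packed, used = pack_context(snippets, budget=20)
print(len(packed), used)
```

Skipping an oversized snippet instead of stopping lets smaller, lower-ranked snippets fill the remaining budget; whether that trade-off is right depends on how much ranking order should dominate.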
The important part is that retrieval is not only “which chunks are similar?”
It also asks:
- what calls this?
- what does this call?
- what file owns this symbol?
- what related nodes are connected by the graph?
- what cross-language links exist?
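The questions above can be sketched as a two-stage lookup: pick seed nodes by embedding similarity, then pull in their callers and callees from the graph. The function name `hybrid_retrieve`, the `neighbor_weight` parameter, and the toy two-dimensional vectors are assumptions for illustration; Droste's real ranking is not described here and may differ.

```python
import math
from collections import defaultdict


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_retrieve(query_vec, embeddings, call_edges, top_k=1, neighbor_weight=0.5):
    """Seed by semantic similarity, then expand along caller/callee edges."""
    callers, callees = defaultdict(set), defaultdict(set)
    for caller, callee in call_edges:
        callees[caller].add(callee)
        callers[callee].add(caller)

    semantic = {node: cosine(query_vec, vec) for node, vec in embeddings.items()}
    seeds = sorted(semantic, key=semantic.get, reverse=True)[:top_k]

    scores = {seed: semantic[seed] for seed in seeds}
    for seed in seeds:
        # Neighbors inherit a discounted share of the seed's score, so they
        # surface even when their own text is dissimilar to the query.
        for neighbor in callers[seed] | callees[seed]:
            inherited = semantic[seed] * neighbor_weight
            scores[neighbor] = max(scores.get(neighbor, 0.0), inherited)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data: charge_card shares no "wording" with the query but is called
# by the best semantic match; format_date is unrelated.
embeddings = {
    "checkout_controller": [1.0, 0.0],
    "charge_card": [0.0, 1.0],
    "format_date": [0.0, 1.0],
}
call_edges = [("checkout_controller", "charge_card")]

results = hybrid_retrieve([1.0, 0.0], embeddings, call_edges)
print(results)
```

In this toy setup `charge_card` and `format_date` have identical similarity to the query, yet only `charge_card` is returned, because the call edge carries it in.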
## Sharded local storage
Droste does not store the whole graph as one giant JSON file.
It uses sharded local storage, with one shard per source path. This keeps incremental saves faster and avoids rewriting the entire database after every change.
It also uses a lightweight seqlock-style consistency model so a reader does not assemble a torn snapshot while another process is writing shards.
This matters because the intended workflow is live:
- the engine indexes;
- the visualizer reads;
- the MCP server answers agent requests;
- the developer keeps coding.
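For readers unfamiliar with the seqlock idea mentioned above, here is a minimal single-writer sketch on top of plain files. The file layout (a `seq` counter file next to one `.json` file per shard), the function names, and the retry parameters are hypothetical; this is not Droste's on-disk format, only an instance of the general technique.

```python
import json
import os
import tempfile
import time
from pathlib import Path


def _write_seq(db_dir, value):
    # Write to a temp file and rename so readers never see a half-written counter.
    tmp = db_dir / "seq.tmp"
    tmp.write_text(str(value))
    os.replace(tmp, db_dir / "seq")


def read_seq(db_dir):
    try:
        return int((db_dir / "seq").read_text())
    except FileNotFoundError:
        return 0


def write_shards(db_dir, shards):
    """Writer (assumed to be the only one): odd counter means 'write in progress'."""
    seq = read_seq(db_dir)
    _write_seq(db_dir, seq + 1)  # odd: readers should back off
    for name, payload in shards.items():
        (db_dir / f"{name}.json").write_text(json.dumps(payload))
    _write_seq(db_dir, seq + 2)  # even: snapshot is stable again


def read_snapshot(db_dir, retries=50, delay=0.01):
    """Reader: accept a snapshot only if the counter was even and unchanged."""
    for _ in range(retries):
        before = read_seq(db_dir)
        if before % 2 == 1:
            time.sleep(delay)
            continue
        try:
            snapshot = {
                p.stem: json.loads(p.read_text()) for p in db_dir.glob("*.json")
            }
        except json.JSONDecodeError:
            time.sleep(delay)  # caught a shard mid-write; retry
            continue
        if read_seq(db_dir) == before:
            return snapshot
        time.sleep(delay)
    raise RuntimeError("could not read a consistent snapshot")


db = Path(tempfile.mkdtemp())
write_shards(db, {"src_app_py": {"symbols": ["main"]}})
snapshot = read_snapshot(db)
print(snapshot)
```

The reader never takes a lock: it simply discards any read that overlapped a write, which suits a workflow where the visualizer and MCP server read far more often than the indexer writes.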
## Visual graph
Droste also includes a visualizer.
The idea is to represent the codebase as a zoomable graph: project, folders, files, symbols, and causal edges.
It is not meant to replace the CLI or MCP server. It is meant to make coupling and blast radius visible.
In practice, I wanted something closer to a “code universe” than a flat file tree.
## Current status
Droste is still early, but usable.
Current features include:
- Python CLI;
- MCP server;
- local indexing;
- Tree-sitter based symbol extraction;
- semantic embeddings through local models;
- sharded storage;
- project-root isolation;
- query-aware ranking;
- token-budgeted context packing;
- visual graph viewer.
To install or upgrade to the latest version:
python -m pip install --upgrade droste-memory
## Example commands
droste index .
droste status
droste context "authentication flow" --budget 2000
droste zoom "SomeFunction"
droste view
droste mcp
## What I am looking for feedback on
I would especially appreciate feedback from people building or using AI coding agents.
Useful feedback would be:
- Does the MCP interface make sense?
- Is the install flow clean?
- Does retrieval feel better than normal file search in real projects?
- Are the returned context slices useful to an LLM?
- What graph relationships would matter most in your stack?
- What would make this more useful in day-to-day coding?
## Links
GitHub:
https://github.com/lorismascio17/droste-memory
PyPI:
https://pypi.org/project/droste-memory/
Install:
python -m pip install --upgrade droste-memory
Droste is open source and MIT licensed.