DEV Community

Alex Towell

Posted on • Originally published at metafunctor.com

The MCP Pattern: SQLite as the AI-Queryable Cache

I keep building the same thing.

Not the same product — the products are different. One indexes a Hugo blog. One indexes AI conversations. One consolidates medical records from three hospitals. One catalogs a hundred git repositories. But underneath, they all have the same skeleton. After the fifth time, I think the skeleton deserves a name.

The pattern

```
Domain files (ground truth)
    ↓ index
SQLite database (read-only cache, FTS5)
    ↓ expose
MCP server (tools + resources → AI assistant)
```

That's it. Three layers. The domain files are always canonical — the database is a disposable cache you can rebuild from them at any time. SQLite gives you structured queries, full-text search, and JSON extraction over data that was previously trapped in flat files. MCP exposes it to an AI assistant that can write SQL, retrieve content, and (in some cases) create new content.

Here's the inventory:

| Project | Domain | Ground truth | What the MCP exposes |
|---|---|---|---|
| hugo-memex | Blog content | Markdown files with YAML front matter | 951 pages, FTS5 search, taxonomy queries, JSON front matter extraction |
| memex | AI conversations | ChatGPT/Claude/Gemini exports | Conversation trees, FTS5 message search, tags, enrichments |
| chartfold | Medical records | Epic, MEDITECH, athenahealth exports | Labs, meds, encounters, imaging, pathology, cross-source reconciliation |
| arkiv | Personal archives | JSONL files from various sources | Unified SQL over heterogeneous personal data |
| repoindex | Git repositories | Local git repos + GitHub/PyPI/CRAN metadata | Repository catalog with activity tracking, publication status |

Five projects. Five completely different domains. One architecture.

Why SQLite

SQLite is the most deployed database in history. It's on every phone, every browser, every Python installation. But that's not why I use it.

I use it because it solves three problems at once:

Structured queries over unstructured data. Hugo front matter is YAML trapped inside markdown files. Medical records are scattered across three incompatible EHR export formats. AI conversations are JSON trees with branching paths. SQLite turns all of these into tables you can JOIN, GROUP BY, and aggregate. json_extract() handles the long tail of fields that don't fit a fixed schema.
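A minimal sketch of that long tail, assuming your SQLite build includes the JSON functions (built in since SQLite 3.38, and present in most earlier builds via the JSON1 extension). The `pages`/`front_matter` schema here is illustrative, not the actual hugo-memex schema:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE pages (path TEXT, front_matter TEXT)")  # one JSON blob per page
con.execute(
    "INSERT INTO pages VALUES (?, ?)",
    ("posts/rl-intro.md",
     '{"title": "Intro to RL", "tags": ["python", "reinforcement-learning"]}'),
)

# Pull a scalar field out of the JSON without committing to a fixed schema.
row = con.execute(
    "SELECT json_extract(front_matter, '$.title') FROM pages"
).fetchone()
print(row[0])  # Intro to RL

# json_each() unnests arrays, so tags become rows you can JOIN and GROUP BY.
tags = [r[0] for r in con.execute(
    "SELECT value FROM pages, json_each(front_matter, '$.tags')"
)]
print(tags)  # ['python', 'reinforcement-learning']
```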

Full-text search. FTS5 with porter stemming and unicode61 tokenization gives you relevance-ranked search across any text corpus. No Elasticsearch, no external service, no running daemon. Just a virtual table that lives in the same database file.
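The whole setup is a few lines, assuming your Python's bundled SQLite was compiled with FTS5 (standard in modern CPython builds). Table and column names here are illustrative:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Porter stemming layered on unicode61 tokenization, as described above.
con.execute(
    "CREATE VIRTUAL TABLE docs USING fts5(title, body, tokenize='porter unicode61')"
)
con.executemany(
    "INSERT INTO docs VALUES (?, ?)",
    [
        ("Caching with SQLite", "Indexing blog posts into a disposable cache."),
        ("Shell tips", "Pipes, grep, and sort."),
    ],
)
# Stemming means 'index' matches 'Indexing'. bm25() is the relevance
# score; SQLite's convention is lower = more relevant.
rows = con.execute(
    "SELECT title FROM docs WHERE docs MATCH 'index' ORDER BY bm25(docs)"
).fetchall()
print(rows[0][0])  # Caching with SQLite
```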

Read-only enforcement. SQLite's authorizer callback lets you whitelist specific SQL operations at the statement level. My MCP servers allow SELECT, READ, and FUNCTION — everything else gets SQLITE_DENY. This isn't PRAGMA query_only (which can be disabled by the caller). It's engine-level enforcement that cannot be bypassed via SQL.
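In Python's sqlite3 this is `Connection.set_authorizer`. A minimal sketch of the whitelist described above — allow the three read actions, deny everything else:

```python
import sqlite3

# Allow only the operations a read-only MCP server needs; deny the rest.
ALLOWED = {sqlite3.SQLITE_SELECT, sqlite3.SQLITE_READ, sqlite3.SQLITE_FUNCTION}

def read_only(action, arg1, arg2, db_name, trigger):
    return sqlite3.SQLITE_OK if action in ALLOWED else sqlite3.SQLITE_DENY

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (x)")          # set up before locking down
con.execute("INSERT INTO t VALUES (1)")
con.set_authorizer(read_only)

value = con.execute("SELECT x FROM t").fetchone()[0]   # reads still work

denied = False
try:
    con.execute("DROP TABLE t")            # refused at prepare time, inside the engine
except sqlite3.DatabaseError:
    denied = True
```

The denial happens when the statement is compiled, before any row is touched, which is why no SQL trick can route around it.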

And the operational properties are free: WAL mode for concurrent readers, a single file you can back up with cp, zero configuration, zero running processes.

Why MCP

The Model Context Protocol is the thin layer that makes SQLite useful to an AI assistant. An MCP server exposes tools (functions the AI can call) and resources (reference material the AI can read). That's the whole API surface.

For each project, the MCP layer follows the same shape:

execute_sql — The power tool. Read-only SQL with exemplar queries in the docstring. The docstring is critical: it's the AI's primary reference for writing correct SQL. Ten well-chosen example queries teach the model more than a schema diagram.

get_<things> — Bulk retrieval. Instead of one execute_sql call to find IDs followed by N individual fetches, a single call returns full content for a filtered set. This matters when you're sharing a context window across multiple MCP servers.

<domain>://schema — A resource containing the full DDL, relationship documentation, and query patterns. The AI reads this once, then writes SQL against it for the rest of the session.
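A sketch of the execute_sql shape, stripped of any MCP SDK plumbing. The schema in the docstring is hypothetical; the point is that the exemplar queries live in the docstring the AI actually reads:

```python
import sqlite3

def execute_sql(db_path: str, query: str) -> list[dict]:
    """Run read-only SQL against the index and return rows as dicts.

    Exemplar queries (hypothetical schema — adapt to your own DDL):

        -- All pages carrying a given tag:
        SELECT p.path FROM pages p
        JOIN page_tags t ON t.page_id = p.id
        WHERE t.tag = 'python';

        -- Full-text search, best matches first:
        SELECT path FROM pages_fts
        WHERE pages_fts MATCH 'sqlite cache' ORDER BY rank;
    """
    # mode=ro refuses writes at open time; pair it with an authorizer
    # callback for engine-level depth.
    con = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    con.row_factory = sqlite3.Row
    try:
        return [dict(r) for r in con.execute(query).fetchall()]
    finally:
        con.close()
```

In a real server this function is registered as an MCP tool, and the docstring becomes the tool description the model sees.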

The database is a cache

This is the most important architectural decision, and it's easy to get wrong.

The database is not the source of truth. The files are. The database is a materialized index that can be rebuilt from the files at any time. This means:

  • No migrations. If the schema changes, drop the database and re-index. For a 951-page Hugo site, full re-indexing takes six seconds. Why maintain migration code for a disposable cache?
  • No write conflicts. The files are edited by humans (or by AI tools that write to the filesystem). The database is updated by the indexer. There's exactly one write path.
  • No backup strategy. You already back up your files. The database is derived from them. Lose the database? Rebuild it.
  • Incremental sync is an optimization, not a requirement. SHA-256 content hashes + file mtimes make re-indexing fast. But if incremental sync has a bug, force a full rebuild. The cache being disposable means you can always recover.
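The hash-plus-mtime check is small enough to show. A sketch, with `seen` standing in for whatever table the indexer uses to remember the last run:

```python
import hashlib
import os

def needs_reindex(path: str, seen: dict) -> bool:
    """True if `path` changed since the last run; `seen` maps path -> (mtime, sha256)."""
    mtime = os.stat(path).st_mtime
    prev = seen.get(path)
    if prev is not None and prev[0] == mtime:
        return False                      # cheap check: timestamp unchanged
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    seen[path] = (mtime, digest)          # record what we saw either way
    if prev is not None and prev[1] == digest:
        return False                      # touched but content identical
    return True
```

And because the cache is disposable, a bug here is never fatal: wipe `seen` (or the whole database) and the next run is a full rebuild.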

What large context changes

With a million-token context window, you might think this pattern is obsolete. Why index into SQLite when you can just load everything into context?

The math says otherwise. My Hugo blog is 951 pages, ~480K words, ~1.9M tokens. It doesn't fit. And that's one data source. Add AI conversations (memex), medical records (chartfold), and repository metadata (repoindex), and you're well past the limit.

But even if it did fit, the pattern would still be useful. Loading 480K words into context to answer "which posts are tagged 'reinforcement-learning'?" is like loading an entire database into memory to run a SELECT with a WHERE clause. SQLite does it in microseconds. Context loading costs seconds and tokens.

The right model is: MCP for navigation, context for understanding. Use execute_sql to find the five relevant posts, then use get_pages to load their full content into context. One tool call for discovery, one for deep reading.

The tools that earn their keep

After building five of these, certain tools prove their worth and others don't.

The tools that matter:

  • execute_sql with good docstring examples. This is 80% of the value.
  • Bulk retrieval (get_pages, get_conversations, get_clinical_summary). One call instead of N+1.
  • Schema/stats resources. Quick orientation without burning a tool call.

The tools that surprised me:

  • suggest_tags in hugo-memex. Uses FTS5 similarity to find pages like your draft, then returns their most common tags with canonical casing. Solved a real problem: my blog had 40 case-duplicate tag pairs (Python/python, AI/ai).
  • get_timeline in chartfold. Merges encounters, procedures, labs, imaging, and notes into a single chronological stream.
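The idea behind suggest_tags can be sketched in a few lines. This is not the real implementation — the `pages_fts`/`page_tags` schema is hypothetical, and the "similarity" here is just a crude OR-query over the draft's distinctive words — but it shows the shape: find FTS5-matched pages, then count their tags:

```python
import sqlite3
from collections import Counter

def suggest_tags(con: sqlite3.Connection, draft: str, k: int = 5) -> list[str]:
    """Suggest tags by finding FTS5-similar pages and counting their tags."""
    # Crude similarity query: OR together the draft's longer words.
    words = sorted({w for w in draft.lower().split() if w.isalpha() and len(w) > 3})
    if not words:
        return []
    rows = con.execute(
        "SELECT t.tag FROM pages_fts "
        "JOIN page_tags t ON t.page_id = pages_fts.rowid "
        "WHERE pages_fts MATCH ? ORDER BY pages_fts.rank LIMIT 50",
        (" OR ".join(words),),
    )
    # Most common tags among the best-matching pages, canonical casing preserved.
    return [tag for tag, _ in Counter(r[0] for r in rows).most_common(k)]
```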

The Unix connection

This pattern is the Unix philosophy applied to AI tooling:

  1. Small tools that do one thing well. Each MCP server handles one domain.
  2. Text as the universal interface. SQL in, JSON out.
  3. Composition over integration. Five independent MCP servers, each ignorable, each replaceable.
  4. Files as ground truth. The oldest pattern in computing.

The difference from classical Unix pipes is the composition layer. Instead of grep | sort | uniq, the AI is the orchestrator.

What I'd tell you if you're building one

Start with execute_sql and a schema resource. That's enough to be useful.

Make the database disposable. If you're writing migration code, you've made it too important.

Put the exemplar queries in the tool docstring, not in a separate document. The docstring is the one thing the AI definitely reads.

Use FTS5. The marginal cost is one virtual table. The marginal benefit is that the AI can search your content by meaning, not just by exact column values.

Enforce read-only at the engine level, not the application level. SQLite's authorizer callback is the right mechanism. PRAGMA query_only is a suggestion, not a wall.

Build bulk retrieval tools early. The N+1 pattern (find IDs, then fetch one at a time) is the biggest efficiency problem in MCP servers.


The projects: hugo-memex (PyPI: hugo-memex), memex (PyPI: py-memex), chartfold, arkiv, repoindex.
