Richard Baxter

SQLite as an MCP context saver: stop cramming raw API data into your LLM

Most MCP servers dump raw API responses into the conversation. I've been using SQLite inside my MCP server to sync data from my Google Search Console account locally and query it with SQL - here's the pattern and a working implementation for Google Search Console.

I'm fascinated by the utility and insight an LLM can provide, provided the data is presented correctly. You're not going to get great results from consumer-grade AI with 100,000 lines of JSON. A SQL query against a populated local database, however, is a different matter. Today's post covers why and how I use SQLite to give Claude a fighting chance of doing analysis accurately, without the context bloat or hallucination.

I've been shipping MCP servers since late 2025. Every time I hook one up to an API with any volume of data, the same problem shows up unless you manage paging and chunking. You call the API, and if you're not limiting the request you get all the rows back. Claude, or your LLM of choice, dutifully analyses the data as if it's the whole story. But a decent site generates tens of thousands of query/page combinations per month - so that 1,000-row response is actually a small sample, and even that might be too much to deal with efficiently.

MCP gives AI models access to APIs - we know this. But most APIs spit back volumes of data that don't fit in a conversation, or much of the data is irrelevant to answering the question. I found that after a couple of GSC API calls, about 80% of the token use was wasted on raw data, leaving maybe 20% for the model to actually think with. The analysis that comes back is built on incomplete data, but it reads with full confidence - and that's a tricky combination to work with.

My approach

The fix is less about prompting and more about architecture.

Two phases:

Comparison of the usual API-to-LLM approach versus the SQLite pattern

  1. Sync - pull your complete dataset from the API into a local SQLite database. No pagination caps; rate limits, retries, and deduplication are handled in the background. This becomes your local data warehouse.
  2. Query - your MCP tools translate prompts into SQL queries, guided by the tool descriptions. The MCP server returns aggregated results, not raw dumps.

So in my example, when you ask "which pages are losing traffic?", Claude runs a targeted query that returns maybe 20 rows instead of trying to work through 50,000 raw data points crammed into the conversation. It's like handing someone the analyst's summary rather than the raw spreadsheet.

You get accuracy because your SQL hits the complete dataset, not a 1,000-row sample. And you get efficiency because you're sending 20 aggregated rows into the context window instead of 50,000 raw ones.

I reckon this approach has legs beyond just Search Console. Sync locally, query with SQL, return aggregated results. It addresses the context window problem at an architectural level rather than hoping your API response fits.

The thing that clicked for me building these is that your tools don't have to be thin API wrappers. You can build something that syncs an entire dataset in the background, builds indexes, prunes stale data, and then exposes simple query endpoints. Your model couldn't care less about pagination or rate limits or auth tokens - it asks a question and gets something useful back. Once you reach that point, it's not a party trick - it's just infrastructure.

MCP is a lot more powerful than some of us might realise!

Under the hood: Better Search Console

I built Better Search Console as a working implementation of this pattern, specifically for Google Search Console data.

Architecture diagram showing how Better Search Console syncs GSC data through SQLite

The sync engine

The Google Search Console API caps you at 1,000 rows per request and 25,000 rows per date range. For anything beyond a hobby blog, that's nowhere near enough. So the sync engine breaks your date range into 90-day chunks and fetches them in parallel - three concurrent streams, each paginating through its chunk until the API returns fewer rows than requested.
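The actual server is TypeScript, but the chunk-and-paginate loop is easy to sketch in Python. The `fetch_page` callable here is a stand-in for the real GSC API call, not part of the project:

```python
from datetime import date, timedelta

def date_chunks(start: date, end: date, days: int = 90):
    """Split [start, end] into consecutive chunks of at most `days` days."""
    cursor = start
    while cursor <= end:
        chunk_end = min(cursor + timedelta(days=days - 1), end)
        yield cursor, chunk_end
        cursor = chunk_end + timedelta(days=1)

def fetch_chunk(fetch_page, chunk_start, chunk_end, page_size=1000):
    """Paginate through one chunk until the API returns a short page."""
    rows, offset = [], 0
    while True:
        page = fetch_page(chunk_start, chunk_end, offset, page_size)
        rows.extend(page)
        if len(page) < page_size:  # short page => chunk exhausted
            return rows
        offset += page_size
```

In the real server, three of those chunk fetches run concurrently, each with its own offset cursor.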

For particularly large jobs, you can also trigger the sync as a CLI command.

Writes are serialised though. SQLite doesn't love concurrent writes, so all three streams funnel into a single write queue. Every row hits the database via INSERT OR REPLACE keyed on date + query + page + device + country.
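A minimal Python sketch of that idempotent write path, assuming the composite key is declared as the table's primary key (the real implementation uses TypeScript with better-sqlite3):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE search_analytics (
        date TEXT, query TEXT, page TEXT, device TEXT, country TEXT,
        clicks INTEGER, impressions INTEGER, ctr REAL, position REAL,
        PRIMARY KEY (date, query, page, device, country)
    )
""")

UPSERT = """
    INSERT OR REPLACE INTO search_analytics
        (date, query, page, device, country, clicks, impressions, ctr, position)
    VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)
"""

# Re-syncing the same key updates the row in place instead of duplicating it.
key = ("2025-06-01", "mcp sqlite", "/blog/mcp", "DESKTOP", "gbr")
conn.execute(UPSERT, key + (10, 200, 0.05, 4.2))
conn.execute(UPSERT, key + (12, 240, 0.05, 4.1))  # refreshed figures, same key
```

That `INSERT OR REPLACE` is what makes re-running a sync safe: overlapping date ranges just overwrite rows with fresher numbers.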

I threw 800K rows at it (six months from a client's property) and the initial sync took about four minutes. After that, syncs are incremental - it picks up from where it left off.
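The post doesn't detail the resume logic, but "picks up from where it left off" can be as simple as reading the latest synced date - a hypothetical sketch:

```python
import sqlite3
from datetime import date, timedelta

def next_sync_start(conn: sqlite3.Connection, default: date) -> date:
    """Resume an incremental sync from the day after the latest stored date."""
    (latest,) = conn.execute("SELECT MAX(date) FROM search_analytics").fetchone()
    if latest is None:
        return default  # empty database: do a full sync from the default start
    return date.fromisoformat(latest) + timedelta(days=1)
```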

SQLite config

Default SQLite settings are fairly conservative, so for this workload I've adjusted a few things:

  • WAL mode - write-ahead logging so reads don't block writes during sync
  • 64MB cache - keeps hot pages in memory instead of hitting disk
  • 4GB memory-mapped I/O - lets SQLite memory-map the database file directly
  • 10 covering indexes - one for each common query pattern (query lookups, page lookups, date ranges, country breakdowns, the lot)
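Those settings map onto three PRAGMAs plus index DDL. A sketch using Python's built-in sqlite3 module - the index columns are my guess at what one of the covering indexes looks like, not the project's actual DDL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # use a file path, e.g. "gsc.db", in practice
conn.execute("""CREATE TABLE IF NOT EXISTS search_analytics (
    date TEXT, query TEXT, page TEXT, device TEXT, country TEXT,
    clicks INTEGER, impressions INTEGER, ctr REAL, position REAL)""")

# Tune for read-heavy analytical queries running alongside background syncs.
conn.execute("PRAGMA journal_mode = WAL")      # write-ahead logging
conn.execute("PRAGMA cache_size = -65536")     # negative value = size in KiB (64 MB)
conn.execute("PRAGMA mmap_size = 4294967296")  # 4 GB memory-mapped I/O

# One covering index per common query pattern, e.g. per-query lookups:
conn.execute("""CREATE INDEX IF NOT EXISTS idx_query_date
    ON search_analytics (query, date, clicks, impressions, position)""")
```

A covering index holds every column the query touches, so SQLite can answer from the index alone without visiting the table.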

The result is that queries against 800K rows come back in under a second. SQLite with proper indexes is surprisingly fast for this kind of analytical work.

Data retention

Left unchecked, six months of GSC data for a busy site can balloon to millions of rows. The server auto-prunes after every sync: keeps the last 90 days in full, then drops zero-click rows from older data. The thinking is - if a query/page combination got zero clicks three months ago, it's probably noise, but anything with actual clicks stays around indefinitely.
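The pruning rule translates to a single DELETE. A minimal sketch - the real server runs this automatically after each sync:

```python
import sqlite3

def prune_old_zero_click_rows(conn: sqlite3.Connection) -> int:
    """Keep the last 90 days in full; drop older rows that never earned a click."""
    cur = conn.execute("""
        DELETE FROM search_analytics
        WHERE date < date('now', '-90 days')
          AND clicks = 0
    """)
    conn.commit()
    return cur.rowcount  # number of rows pruned in this pass
```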

Query sandboxing

I spent longer on this than anything else. The query_gsc_data tool lets Claude write freeform SQL against your data, which is powerful - but you obviously don't want it running DROP TABLE or DELETE FROM. So every query gets regex-checked before execution, and anything that matches INSERT, UPDATE, DELETE, DROP, ALTER, CREATE, or ATTACH gets blocked. There's also a 10,000-row automatic LIMIT on queries that don't specify one, so a careless SELECT * can't eat your entire context window.
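A Python sketch of that gate - the real check lives in TypeScript, but the regex below mirrors the blocked keywords and the automatic LIMIT described above:

```python
import re

BLOCKED = re.compile(r"\b(INSERT|UPDATE|DELETE|DROP|ALTER|CREATE|ATTACH)\b",
                     re.IGNORECASE)
DEFAULT_LIMIT = 10_000

def sanitise(sql: str) -> str:
    """Reject write statements and cap result size for unbounded SELECTs."""
    if BLOCKED.search(sql):
        raise ValueError("Only read-only SELECT queries are allowed")
    if not re.search(r"\bLIMIT\b", sql, re.IGNORECASE):
        sql = f"{sql.rstrip().rstrip(';')} LIMIT {DEFAULT_LIMIT}"
    return sql
```

A regex check is deliberately blunt - it will reject a harmless SELECT that mentions the word "create" in a string literal - but for a read-only analytics tool, false positives are the cheap failure mode.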

The tools

Twelve tools in total. Four get about 90% of my use. The first two trigger an MCP app that renders charts with Vite (really cool if you're using Claude Desktop).

  • get_overview - shows all your properties at a glance with clicks, impressions, CTR, average position, and sparkline trends.
  • get_dashboard - deep dive on one property. Hero metrics, comparison periods, top queries, country breakdown, ranking distribution, new/lost queries, branded vs non-branded.
  • get_insights - sixteen pre-built analytical queries. The one I keep reaching for is opportunities: it surfaces queries where you're ranking 5-20 with decent impression counts. You're close enough to page one that a title tweak or a content refresh could push you over. These tend to be the lowest-effort wins in SEO.
  • query_gsc_data - freeform SQL against the search_analytics table with the sandboxing I described above.

That last one kicked the whole project off for me - being able to ask anything of the data and get a real answer.

SQL queries I've been running for SEO analysis

The schema is nice and simple:

search_analytics: date, query, page, device, country, clicks, impressions, ctr, position

I ran every one of these on a client's ecommerce site last Tuesday - 340K rows in the database - so they're well tested.

Find long-tail referring keywords (5+ words):

This is probably my favourite query in the set. Longer phrases tend to be conversational - they read like something you'd type into ChatGPT or ask Google's AI Overview. Finding which of these already send you traffic tells you exactly what questions your content is answering, and more importantly, which questions it isn't.

SELECT query, SUM(clicks) as clicks, SUM(impressions) as impressions,
  ROUND(AVG(position), 1) as avg_position
FROM search_analytics
WHERE date >= date('now', '-90 days')
  AND LENGTH(query) - LENGTH(REPLACE(query, ' ', '')) >= 4
GROUP BY query
ORDER BY impressions DESC LIMIT 30

That LENGTH trick counts spaces to filter for five-or-more-word phrases. Not pretty SQL, but it works in SQLite, which has no built-in word-count function.

Spot keyword cannibalisation (multiple pages ranking for the same query):

SELECT query, COUNT(DISTINCT page) as pages, SUM(clicks) as clicks
FROM search_analytics
WHERE date >= '2025-01-01'
GROUP BY query HAVING pages > 1
ORDER BY clicks DESC LIMIT 20

I find cannibalisation on almost every client audit I do. Two pages scrapping over the same query, neither ranking properly, and nobody on the team has noticed. This surfaces the worst offenders so you can decide whether to consolidate or differentiate.

Content decay detection (pages losing traffic month-over-month):

SELECT page,
  SUM(CASE WHEN date >= date('now', '-28 days') THEN clicks END) as recent,
  SUM(CASE WHEN date BETWEEN date('now', '-56 days') AND date('now', '-29 days') THEN clicks END) as prior
FROM search_analytics
GROUP BY page HAVING prior > 10
ORDER BY (recent * 1.0 / NULLIF(prior, 0)) ASC LIMIT 20

Questions people are asking about your content:

SELECT query, SUM(clicks) as clicks, SUM(impressions) as impressions,
  ROUND(AVG(position), 1) as avg_position
FROM search_analytics
WHERE date >= date('now', '-90 days')
  AND (query LIKE 'how %' OR query LIKE 'what %' OR query LIKE 'why %'
       OR query LIKE 'can %' OR query LIKE 'does %' OR query LIKE 'is %')
GROUP BY query
ORDER BY impressions DESC LIMIT 25

I've used this one on sites when building FAQ sections. If you're getting impressions for "how to configure X" but no clicks, your existing content probably doesn't answer that question clearly enough - or at all. Once you can see them, they're straightforward fixes.

Country breakdown for international targeting:

SELECT country, SUM(clicks) as clicks, SUM(impressions) as impressions,
  ROUND(AVG(position), 1) as avg_position
FROM search_analytics
WHERE date >= date('now', '-90 days')
GROUP BY country
ORDER BY clicks DESC LIMIT 15

The nice thing about all of these is that you're querying the complete dataset. Everything sits in SQLite and every query hits every row - no 1,000-row cap, no pagination limits.

Interactive dashboards with ext-apps

Something else I got to explore on this project: MCP's ext-apps framework, which lets you render interactive HTML directly inside Claude Desktop. Actual clickable dashboards with real data, living right there in the conversation as embedded iframes.


The overview, the property drilldowns, the sparkline trends - they're all interactive. It feels like the beginning of something interesting to me: MCP-native SaaS, where you build the backend as an MCP server, the frontend as ext-apps, and distribute via npx. No hosting to maintain, no auth flows to build, no deployment pipeline to manage. And your data stays on your machine, which for SEO data is a nice bonus.

Setting it up

Three steps. The Google credentials bit takes the longest but you only do it once.

1. Google service account

  • Head to Google Cloud Console, create a new project and enable the Search Console API
  • Credentials → Create service account → Keys tab → Add Key → JSON
  • Grab the client_email value from that JSON file
  • In Search Console → Settings → Users and permissions → Add the email with Full permission

That last step is the one everyone misses (I did too, first time). The service account is basically a separate Google identity - it can't see your properties until you explicitly share them. Skip it and you'll hit "No properties found" errors and be left wondering what went wrong.

2. Claude Desktop config

Add this to your claude_desktop_config.json:

"better-search-console": {
  "command": "npx",
  "args": ["-y", "@houtini/better-search-console"],
  "env": {
    "GOOGLE_APPLICATION_CREDENTIALS": "/path/to/your-service-account.json"
  }
}

3. Restart and sync

Restart Claude Desktop and ask it to set up Better Search Console. The setup tool discovers your properties, syncs them in the background, and shows an overview. Initial sync depends on data volume - 30 seconds for a small site, a few minutes for one with millions of rows. After that, syncs are incremental.

The stack

Built in TypeScript, distributed via npm. Uses better-sqlite3 for storage, full pagination against the Google Search Console API, and MCP ext-apps for the interactive dashboard bits. Apache-2.0 licence - fork it, break it, whatever works for you.

I reckon the sync-locally-query-with-SQL approach has legs well beyond Search Console. Better answers from less context - that's genuinely all there is to it.

Pull requests welcome - I'm especially interested in SQL queries you've found useful for SEO analysis, and I'll add the good ones to the pre-built insights.
