Andrea Debernardi

Google Sheets as a Database: SELECT * FROM 'Sheet1' LIMIT 5

Here's a thing I can do now:

```sql
SELECT name, revenue FROM "Q1 2026" WHERE revenue > 10000 ORDER BY revenue DESC;
```

That's not a Postgres query. That's a Google Spreadsheet — the same one I was editing in a browser tab five minutes ago. Same query editor as Postgres. Same result grid. INSERT, UPDATE, DELETE work too.

I was staring at a spreadsheet the other week and it hit me: this is already a database. It has tables (sheets). It has rows. It has columns. It even has implicit types. Why am I exporting CSVs and importing them into Postgres just to run a SELECT?

So I stopped. I built it instead — as a plugin for Tabularis, a SQL client I work on that has a plugin system. Twenty minutes of actual typing, and Sheets became a first-class database driver.

This post is the "what I learned" version. If you want the full step-by-step with every line of code, the long-form tutorial is on our blog. If you just want the tricks that made it work, keep reading.


The shape of the problem

Tabularis plugins are JSON-RPC binaries. The host speaks JSON-RPC 2.0 over stdin/stdout, the plugin answers. That's the whole protocol. If you can answer ~15 methods — get_tables, get_columns, execute_query, and friends — you're a database driver.
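The loop at the heart of a plugin is small enough to sketch. This is not the real Tabularis host code — the response id is hardcoded, the payloads are canned, and the string scan stands in for a real JSON parser like serde_json — but it shows the shape of the protocol:

```rust
use std::io::{self, BufRead, Write};

/// Pull the "method" field out of a JSON-RPC request line.
/// (Sketch only: assumes compact JSON with no space after the colon.)
fn method_of(line: &str) -> Option<&str> {
    let key = "\"method\":\"";
    let start = line.find(key)? + key.len();
    let end = line[start..].find('"')? + start;
    Some(&line[start..end])
}

/// Answer one request. A real driver echoes the request id and implements
/// all ~15 required methods; these payloads are illustrative.
fn handle(line: &str) -> String {
    let result = match method_of(line) {
        Some("get_tables") => r#"[{"name":"Sheet1"}]"#.to_string(),
        Some("get_columns") => r#"[{"name":"_row","type":"INTEGER"}]"#.to_string(),
        Some(m) => format!(r#"{{"error":"unimplemented: {}"}}"#, m),
        None => r#"{"error":"bad request"}"#.to_string(),
    };
    format!(r#"{{"jsonrpc":"2.0","id":1,"result":{}}}"#, result)
}

/// The plugin's whole lifetime: read a request line from stdin,
/// write a response line to stdout, repeat until the host closes the pipe.
fn serve() {
    let stdin = io::stdin();
    let mut out = io::stdout();
    for line in stdin.lock().lines().flatten() {
        writeln!(out, "{}", handle(&line)).ok();
    }
}
```

Everything else — OAuth, caching, SQL parsing — hangs off whatever `handle` dispatches to.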

A scaffolder (npm create @tabularis/plugin) gives you all 33 RPC stubs, a manifest.json, a build matrix for five platforms, and a manifest schema. So the interesting work is only ever: how do you map "this thing" onto SQL?

For Google Sheets, "this thing" is a REST API with three endpoints I care about:

  • GET /v4/spreadsheets/{id} — list the sheet tabs and their IDs.
  • GET /v4/spreadsheets/{id}/values/{range} — read a rectangle of cells.
  • POST /v4/spreadsheets/{id}/values/{range}:append — write a row.

That's it. Every table operation compiles down to one of those.
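That mapping can be written down directly. A sketch (the `endpoint` helper and operation names are mine, not the plugin's API; the URLs follow the list above):

```rust
/// Which Sheets endpoint a table operation compiles to.
/// `id` is the spreadsheet ID, `range` an A1 range like "Sheet1!A1:D100".
fn endpoint(op: &str, id: &str, range: &str) -> String {
    let base = "https://sheets.googleapis.com/v4/spreadsheets";
    match op {
        "tables" => format!("{base}/{id}"),                       // list sheet tabs
        "read"   => format!("{base}/{id}/values/{range}"),        // read a rectangle
        "append" => format!("{base}/{id}/values/{range}:append"), // write a row
        other    => panic!("no endpoint for {other}"),
    }
}
```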


Mapping sheets onto SQL

The mapping writes itself once you start looking at it:

SQL concept   | Google Sheets reality
--------------+---------------------------------------------
Database      | One spreadsheet (identified by its URL/ID)
Table         | One sheet tab inside that spreadsheet
Column names  | Row 1 of the tab
Column types  | Inferred by sampling rows 2..102
Rows          | Rows 2..N
Primary key   | There isn't one. (← the interesting problem)

Four of those are trivial. The fifth is where it gets fun.

The synthetic _row primary key

Google Sheets has no surrogate ID. You can't UPDATE customers SET email = ? WHERE id = 42 because there is no id unless the user decided to put one in column A — and even then, they'll re-sort the sheet and your IDs are suddenly referring to different rows.

The fix: synthesize one. Every table I expose has an extra column prepended:

```text
_row  |  name       |  email            |  revenue
------+-------------+-------------------+---------
2     |  Acme Inc.  |  hi@acme.io       |  12400
3     |  Globex     |  ops@globex.com   |  8900
4     |  Initech    |  sales@initech.co |  22150
```

_row is the spreadsheet row number. It's stable within a session — between get_columns and execute_query, the user hasn't reordered the sheet from their browser (usually). UPDATE and DELETE use it:

```sql
UPDATE "Customers" SET email = 'new@acme.io' WHERE _row = 2;
DELETE FROM "Customers" WHERE _row = 3;
```

Yes, this breaks if someone's live-editing the sheet elsewhere. I decided that's their problem, not mine. Document the footgun, move on.
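Resolving `_row` back to the API is a one-liner: an UPDATE becomes a values write against the A1 range for that row. A sketch, assuming at most 26 columns (A..Z) — the helper name is mine, and note that row deletion goes through a separate batchUpdate endpoint in the real API:

```rust
/// The A1 range that `UPDATE ... WHERE _row = n` writes to, given the tab
/// name and its data-column count. Sketch: only handles columns A..Z.
fn row_range(tab: &str, row: usize, cols: usize) -> String {
    assert!(cols >= 1 && cols <= 26, "sketch handles A..Z only");
    let last = (b'A' + cols as u8 - 1) as char;
    format!("'{tab}'!A{row}:{last}{row}")
}
```

So `WHERE _row = 2` on a four-column tab targets `'Customers'!A2:D2`.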

Type inference from samples

get_columns returns column names and types. Sheets doesn't give you types — cells are either number-formatted or string-formatted, that's it. So I sample:

```rust
use serde_json::Value;

/// Infer a column type from a sample of cell values.
pub fn infer_type(values: &[Value]) -> &'static str {
    let mut saw_int = false;
    let mut saw_real = false;
    for v in values {
        match v {
            Value::Number(n) if n.is_i64() => saw_int = true,
            Value::Number(_)              => saw_real = true,
            _                             => return "TEXT",
        }
    }
    if saw_real { "REAL" } else if saw_int { "INTEGER" } else { "TEXT" }
}
```

One non-number in the sample → TEXT. Any floats → REAL. All integers → INTEGER. Three types is enough. The host renders them correctly in the grid, right-aligns numbers, and that's 95% of what "types" means to a human.


The regex SQL parser (don't do this at home)

execute_query takes a raw SQL string. The user types whatever they want. You need to parse it.

Reader, I regexed it.

```rust
pub enum Query {
    Select { columns: Vec<String>, from: String, where_clause: Option<String>,
             order_by: Option<String>, limit: Option<usize>, offset: Option<usize> },
    Insert { into: String, columns: Vec<String>, values: Vec<String> },
    Update { table: String, set: Vec<(String, String)>, where_clause: String },
    Delete { from: String, where_clause: String },
}

pub fn parse(raw: &str) -> Result<Query> { /* ~320 lines of compiled regexes */ }
```

The parser handles SELECT ... FROM ... WHERE ... ORDER BY ... LIMIT ... OFFSET, INSERT, UPDATE WHERE _row = N, DELETE WHERE _row = N, COUNT(*), and WHERE expressions with AND / = / LIKE / > / <. No joins. No subqueries. No CTEs. No window functions.
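To give a feel for the SELECT arm, here is a toy version with plain string splitting standing in for the compiled regexes. The function name and return shape are illustrative, not the real parser, and it only handles uppercase keywords:

```rust
/// Toy SELECT parser: (columns, table, optional WHERE clause).
/// Splits on the fixed keyword order; uppercase keywords only.
fn parse_select(raw: &str) -> Option<(Vec<String>, String, Option<String>)> {
    let s = raw.trim().trim_end_matches(';');
    let rest = s.strip_prefix("SELECT ")?;
    let (cols, rest) = rest.split_once(" FROM ")?;
    let (from, where_clause) = match rest.split_once(" WHERE ") {
        Some((f, w)) => (f, Some(w.to_string())),
        None => (rest, None),
    };
    let columns = cols.split(',').map(|c| c.trim().to_string()).collect();
    Some((columns, from.trim().trim_matches('"').to_string(), where_clause))
}
```

Add ORDER BY / LIMIT / OFFSET splits in the same style and you have most of the real thing.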

It's bad. I know it's bad. Here's why it shipped anyway:

  1. The whole dataset fits in RAM. Sheets caps a spreadsheet at 10 million cells, and the tabs people actually query are far smaller. I fetch the entire tab, apply WHERE in-memory in Rust with a tiny expression evaluator, and paginate the result. No query planner needed.
  2. The surface is small enough to regex correctly. I tested the failure modes. When parsing fails, the user sees "Unsupported SQL construct: <your query>" — clear, not cryptic.
  3. The whole driver is 1000 lines. Pulling in sqlparser would more than double that for a driver where nobody is writing joins.

When the driver grows up, swap in sqlparser. Not before.
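The in-memory WHERE pass from point 1 needs nothing fancier than a predicate check per row. A sketch of the comparison core — the function name is mine, and the real evaluator also chains predicates with AND:

```rust
/// Evaluate one `column OP literal` predicate against a cell value (as text).
/// Numeric comparison when both sides parse as numbers; LIKE only handles
/// the common `%substr%` form in this sketch.
fn eval_pred(value: &str, op: &str, literal: &str) -> bool {
    match op {
        "=" => value == literal,
        ">" | "<" => match (value.parse::<f64>(), literal.parse::<f64>()) {
            (Ok(a), Ok(b)) => if op == ">" { a > b } else { a < b },
            _ => false,
        },
        "LIKE" => value.contains(literal.trim_matches('%')),
        _ => false,
    }
}
```

Filtering is then `rows.retain(|r| eval_pred(...))` and pagination is a slice.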


OAuth without a web server

Google's OAuth2 expects a redirect URI. On the web, that's a route on your server. In a desktop app with no server? You use the "desktop" OAuth flow: Google redirects to http://127.0.0.1 with the code in the query string, the browser shows ERR_CONNECTION_REFUSED, and the user copy-pastes the URL back into your app.

It's ugly. It works. The whole exchange:

```javascript
async function exchangeCode(clientId, clientSecret, code) {
  const body = new URLSearchParams({
    code,
    client_id: clientId,
    client_secret: clientSecret,
    redirect_uri: "http://127.0.0.1",
    grant_type: "authorization_code",
  });
  const resp = await fetch("https://oauth2.googleapis.com/token", {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: body.toString(),
  });
  if (!resp.ok) throw new Error(await resp.text());
  return resp.json();  // { access_token, refresh_token, expires_in, ... }
}
```

The refresh token goes into plugin settings (encrypted on disk by the host). On every API call, the Rust side checks token_expiry and hits /token with grant_type: refresh_token if needed. The user sees the OAuth wizard once, then never again.
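The refresh decision itself is a pure comparison, so it can be sketched without any HTTP. Function names here are hypothetical, and a real client must URL-encode the form values:

```rust
/// Refresh when the access token is within 60 s of expiring,
/// to absorb clock skew and request latency. Unix-second timestamps.
fn needs_refresh(now: u64, expiry: u64) -> bool {
    now + 60 >= expiry
}

/// Form body for Google's token endpoint with grant_type=refresh_token.
/// (Sketch: a real client URL-encodes each value.)
fn refresh_body(client_id: &str, client_secret: &str, refresh_token: &str) -> String {
    format!(
        "client_id={client_id}&client_secret={client_secret}\
         &refresh_token={refresh_token}&grant_type=refresh_token"
    )
}
```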

The UI extension part

The wizard is React, not Rust. Tabularis loads plugin UI as IIFE bundles — self-contained .js files that assign a React component to window.__tabularis_plugin__ when evaluated. The manifest declares which slot the component mounts into:

```json
"ui_extensions": [
  { "slot": "settings.plugin.before_settings",
    "module": "ui/dist/google-auth.js" },
  { "slot": "connection-modal.connection_content",
    "module": "ui/dist/google-sheets-db-field.js",
    "driver": "google-sheets" }
]
```

Two slots. One puts the OAuth wizard above the plugin's settings form. The other replaces the default host/port/user/pass grid in the "new connection" modal with a single "Spreadsheet URL" input — because a Sheets connection has exactly one field and none of the usual ones.

The neat trick is that React is not bundled into these IIFE files. The host injects it as a global:

```javascript
rollupOptions: {
  external: ["react", "react/jsx-runtime", "@tabularis/plugin-api"],
  output: {
    globals: {
      react: "React",
      "react/jsx-runtime": "ReactJSXRuntime",
      "@tabularis/plugin-api": "__TABULARIS_API__",
    },
  },
}
```

That's the whole protocol contract: the host injects the globals, the bundle consumes them. No duplicate React in memory. The two bundles together gzip to ~4 KB.

The TypeScript package @tabularis/plugin-api gives you typed slot contracts — if you try to read context.tableName in a slot that doesn't have it, the build fails. Same package exposes hooks like usePluginSetting, usePluginModal, openUrl. Standard React, just with a host.


What actually ships

```console
$ cargo build --release
    Finished release [optimized] target(s) in 34s
$ ls -lh target/release/google-sheets-plugin
-rwxr-xr-x  1 me me 3.1M  target/release/google-sheets-plugin
$ pnpm --dir ui build
  dist/google-sheets-db-field.js  1.18 kB │ gzip: 0.62 kB
  dist/google-auth.js             8.47 kB │ gzip: 3.40 kB
```

A 3 MB binary plus 10 KB of JavaScript. Copy them to ~/.local/share/tabularis/plugins/google-sheets/, restart the app, and Google Sheets is in the driver picker next to Postgres, MySQL, and SQLite.

Then:

  1. Plugin settings → OAuth wizard → paste Client ID + Secret → authorize in browser → paste redirect URL back.
  2. New Connection → Driver: Google Sheets → paste spreadsheet URL.
  3. Sidebar lists every tab as a table. Click one. Row 1 becomes the column header.
  4. Editor: SELECT * FROM "Sheet1" LIMIT 5. Results render in the grid. Sortable, filterable, exportable.

Total time from "I want this" to "it works": about 20 minutes once you know the pattern. The full tutorial walks through every file.


Why this pattern generalises

Google Sheets isn't special. Anything with rows and columns can be a SQL database if you're willing to write the mapping:

  • Hacker News — I did this one too. Stories, comments, and users each become a table. SELECT * FROM top_stories WHERE score > 500.
  • A folder of CSVs — one file per table. Already exists as a community plugin.
  • Your Notion workspace — databases map 1:1. Properties become columns. Probably the next one I build.
  • A GitHub repo — issues, PRs, commits. SELECT author, count(*) FROM commits GROUP BY author.

None of these are databases. All of them have tables if you squint. The plugin protocol is the squint.

The trick that makes it tractable: you don't need to support all of SQL. You need to support the SQL your users will actually write against your data source. Nobody's writing window functions against a Google Sheet. Ship the 80% that works, error cleanly on the rest, let the surface grow with demand.


Takeaways

  • A "database driver" is just an RPC stub that can answer get_tables, get_columns, execute_query. Everything else is optional. Ship the minimum, grow from there.
  • Synthetic primary keys solve "this source has no IDs." Prepend a _row column that maps back to whatever positional addressing your source uses. Document the concurrency footgun.
  • Sample, don't schematise. When your source has no declared types, infer from a sample of rows. Three types (TEXT/INTEGER/REAL) is plenty.
  • Don't write a real SQL parser until you need one. Regex is fine for small surfaces. Swap to sqlparser when joins show up.
  • Externalise React in plugin UI bundles. Your 30 KB bundle becomes 3 KB and runs in the host's React instance. No hook-rules violations, no duplicate providers.

The finished driver lives here: tabularis-google-sheets-plugin

If you want to build one yourself: npm create @tabularis/plugin@latest scaffolds the whole thing in 10 seconds. The full step-by-step tutorial has everything this post skipped. Tabularis itself is on GitHub.

What would you want to query with SQL that isn't a database today? I'm genuinely asking — the plugin registry is open, and the weirder the source, the more interesting the mapping.
