DEV Community

Mukhtar
Turn Anything Into a Queryable SQLite Database

grep, jq, ETL, and forensic indexing collapsed into one local SQL primitive


Your audit trail should be a database you own, not a SaaS UI you rent.

surveilr turns your files, emails, and APIs into standard SQLite databases you can query forever, with any tool, offline, on your machine.

No cloud. No dashboards. Just SQL.


See It Work in 2 Minutes

Install surveilr and immediately query your filesystem:

# Install (macOS/Linux)
brew tap surveilr/tap && brew install surveilr

# Create a database and scan your Documents folder
surveilr admin init -d my-data.db
surveilr ingest files -r ~/Documents -d my-data.db

# Open SQL shell and query
surveilr shell -d my-data.db

Now run surprisingly powerful queries:

-- Find every PDF modified after a specific date
SELECT file_path, last_modified, size_bytes
FROM files
WHERE extension = 'pdf'
  AND last_modified > '2024-05-01';

-- Track file changes over time
SELECT file_basename, COUNT(*) as versions
FROM files
GROUP BY file_basename
HAVING versions > 1;

-- Find large files (over 10 MB), biggest first
SELECT file_path, size_bytes / 1024 / 1024 AS size_mb
FROM files
WHERE size_bytes > 10485760
ORDER BY size_bytes DESC;

That's it. You just turned your filesystem into a queryable database you can keep investigating.
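
The queries above assume a table named `files` with columns such as `file_path` and `size_bytes`. The tables a given surveilr version actually creates may be named differently, so it helps to look before you query. Here is a minimal Python sketch using only the standard-library `sqlite3` module. The `describe_database` helper and the stand-in table are hypothetical, for illustration only:

```python
import sqlite3


def describe_database(conn: sqlite3.Connection) -> dict[str, list[str]]:
    """Map each user table or view to the list of its column names."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type IN ('table', 'view') AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        )
    ]
    schema = {}
    for name in names:
        # PRAGMA table_info returns one row per column; index 1 is the column name.
        quoted = name.replace('"', '""')
        columns = conn.execute(f'PRAGMA table_info("{quoted}")').fetchall()
        schema[name] = [column[1] for column in columns]
    return schema


# Stand-in database so the sketch runs anywhere. In practice, connect to the
# file surveilr produced, for example sqlite3.connect("my-data.db").
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE files (file_path TEXT, size_bytes INTEGER)")
for table, columns in describe_database(demo).items():
    print(table, columns)
```

Run it against your own database first, then adjust the table and column names in the queries to match what it prints.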


What is surveilr?

surveilr is a local-first universal SQL layer that ingests operational data and outputs standard SQLite databases.

It's not a platform. It's not a dashboard. It's an ingestion layer that speaks SQL.

Core Capabilities

  • 📂 File system indexing: Turn any directory into queryable metadata
  • 🔄 Content transformation: CSV → SQL tables, HTML → JSON (CSS selectors), Markdown/XML → queryable data
  • 📧 Email ingestion: IMAP to SQLite (Gmail, Outlook, any server)
  • 🔌 API extraction: 600+ Singer taps (GitHub, Jira, Salesforce, databases)
  • 🔍 Standard SQL: No custom DSL, just SQLite
  • 🏠 Local-first: Everything runs on your machine, offline-capable
  • 🔒 You own the data: Portable .db files you control forever

Why This Matters

Most operational data ends up in one of three dead ends:

  1. SaaS dashboards you can't query your way
  2. One-off scripts that become unmaintainable
  3. CSV exports that lose context over time

surveilr gives you a different option: permanent, queryable SQLite databases.

Years from now, you'll still be able to open that .db file with any SQLite client and run new queries you haven't thought of yet.


The SQLite Advantage

surveilr doesn't invent a new database format. It uses SQLite, the world's most widely deployed database.

You already trust SQLite. It's in your phone, your browser, your laptop's OS.

What surveilr adds is disciplined ingestion patterns that turn messy operational data into clean, queryable tables.

Inspectable: Open any .db file with DB Browser, VS Code, or sqlite3
Durable: SQLite files last decades
Interoperable: Works with Datasette, DuckDB, pandas, Observable, Grafana
Portable: One file, zero dependencies
Permanent: Your data doesn't disappear when a vendor shuts down

This is the opposite of SaaS data lock-in.


Queries That Make You Go "Wait, I Can Do THAT?"

The power comes from cross-domain queries you can't run anywhere else.

Find documents mentioned in emails after they were modified

SELECT f.file_path, e.subject, e.date, f.last_modified
FROM files f
JOIN emails e ON e.subject LIKE '%' || f.file_basename || '%'
WHERE f.last_modified < e.date
ORDER BY e.date DESC;
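
To see how this join behaves, here is a small self-contained Python sketch that runs the same query against toy data in an in-memory SQLite database. The table layout follows the query above; the rows are invented for illustration, and a real surveilr database may use different table and column names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE files (file_path TEXT, file_basename TEXT, last_modified TEXT);
    CREATE TABLE emails (subject TEXT, date TEXT);

    INSERT INTO files VALUES
        ('/docs/budget.xlsx', 'budget.xlsx', '2024-05-01 09:00:00'),
        ('/docs/notes.txt',   'notes.txt',   '2024-05-10 12:00:00');
    INSERT INTO emails VALUES
        ('Please review budget.xlsx', '2024-05-03 08:30:00'),
        ('Re: notes.txt draft',       '2024-05-02 17:45:00');
    """
)

# Same shape as the article's query: match a file's basename inside an email
# subject, keeping only emails sent after the file was last modified.
rows = conn.execute(
    """
    SELECT f.file_path, e.subject, e.date, f.last_modified
    FROM files f
    JOIN emails e ON e.subject LIKE '%' || f.file_basename || '%'
    WHERE f.last_modified < e.date
    ORDER BY e.date DESC
    """
).fetchall()

for row in rows:
    print(row)
```

Only `budget.xlsx` is returned, because the email about `notes.txt` was sent before that file's last modification. Two caveats: the `<` comparison is a plain text comparison, so both columns need the same timestamp format, and `LIKE` treats `%` and `_` inside a basename as wildcards (`INSTR(e.subject, f.file_basename) > 0` avoids that).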

Track all GitHub commits made within 24 hours of a production incident

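-- SQLite date arithmetic uses datetime() modifiers.
-- Assumes both timestamp columns are stored as 'YYYY-MM-DD HH:MM:SS' text.
SELECT c.commit_sha, c.author, c.message, c.timestamp
FROM github_commits c
WHERE c.timestamp BETWEEN
  (SELECT datetime(incident_time, '-24 hours') FROM incidents WHERE id = 'INC-123')
  AND
  (SELECT incident_time FROM incidents WHERE id = 'INC-123');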
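
SQLite has no INTERVAL type; date arithmetic goes through modifiers on its date and time functions, such as `datetime(x, '-24 hours')`. Below is a minimal Python sketch of the 24-hour window using toy `incidents` and `github_commits` tables. The rows are invented, and the incident ID is passed as a parameter rather than hard-coded:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE incidents (id TEXT PRIMARY KEY, incident_time TEXT);
    CREATE TABLE github_commits (
        commit_sha TEXT, author TEXT, message TEXT, timestamp TEXT
    );

    INSERT INTO incidents VALUES ('INC-123', '2024-06-02 15:00:00');
    INSERT INTO github_commits VALUES
        ('a1b2c3', 'dana', 'Tune connection pool', '2024-06-02 09:30:00'),
        ('d4e5f6', 'lee',  'Update README',        '2024-05-30 11:00:00'),
        ('0718ab', 'sam',  'Hotfix after outage',  '2024-06-02 16:10:00');
    """
)

# datetime(value, '-24 hours') returns 'YYYY-MM-DD HH:MM:SS' text, so the
# stored timestamps must use that same format for BETWEEN to compare correctly.
WINDOW_QUERY = """
    SELECT c.commit_sha, c.author, c.message, c.timestamp
    FROM github_commits c
    JOIN incidents i ON i.id = ?
    WHERE c.timestamp BETWEEN datetime(i.incident_time, '-24 hours')
                          AND i.incident_time
    ORDER BY c.timestamp
"""

commits = conn.execute(WINDOW_QUERY, ("INC-123",)).fetchall()
for commit in commits:
    print(commit)
```

Only the `a1b2c3` commit falls inside the window: one commit is days older and the other landed after the incident. If your timestamps are stored in ISO 8601 form with a `T` separator, wrap them in `datetime()` on both sides before comparing.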

Find invoices discussed in email without matching purchase orders in Jira

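-- "from" is a reserved word in SQL, so the column name must be quoted
SELECT e.subject, e."from", e.date
FROM emails e
WHERE e.subject LIKE '%invoice%'
  AND INSTR(e.subject, 'INV-') > 0  -- only subjects that carry an invoice number
  AND NOT EXISTS (
    SELECT 1 FROM jira_issues j
    WHERE j.summary LIKE '%' || SUBSTR(e.subject, INSTR(e.subject, 'INV-'), 10) || '%'
  );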
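
The anti-join pattern (`NOT EXISTS`) is easier to trust once you have seen it run. Here is a self-contained Python sketch with toy `emails` and `jira_issues` tables. The data is invented, and the sketch assumes invoice numbers look like `INV-` plus four digits, which is why it extracts 8 characters; adjust the width to your own numbering scheme:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE emails ("from" TEXT, subject TEXT, date TEXT);
    CREATE TABLE jira_issues (summary TEXT);

    INSERT INTO emails VALUES
        ('ap@example.com',  'Invoice INV-1001 overdue',  '2024-04-02'),
        ('ops@example.com', 'Invoice INV-2002 received', '2024-04-05'),
        ('hr@example.com',  'Invoice policy reminder',   '2024-04-07');
    INSERT INTO jira_issues VALUES ('PO for INV-1001 approved');
    """
)

# Keep emails whose invoice number never appears in any Jira summary.
# The INSTR(...) > 0 guard skips subjects that contain no invoice number.
UNMATCHED = """
    SELECT e.subject, e."from", e.date
    FROM emails e
    WHERE e.subject LIKE '%invoice%'
      AND INSTR(e.subject, 'INV-') > 0
      AND NOT EXISTS (
        SELECT 1 FROM jira_issues j
        WHERE j.summary LIKE '%' || SUBSTR(e.subject, INSTR(e.subject, 'INV-'), 8) || '%'
      )
    ORDER BY e.date
"""

unmatched = conn.execute(UNMATCHED).fetchall()
for row in unmatched:
    print(row)
```

Only the INV-2002 email is reported: INV-1001 has a matching issue, and the policy reminder carries no invoice number. Fixed-width extraction is fragile, so for mixed-length invoice numbers it may be simpler to extract them in application code with a regular expression and store them in their own column.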

These aren't theoretical examples. These are real forensic workflows you can build.


Three Practical Guides

Guide 1: Query Your File System

Read the full guide →

Scan directories and query file metadata with SQL. Plus: Transform CSVs into SQL tables, extract data from HTML with CSS selectors, parse Markdown and XML into queryable JSON.

2-minute win: Find all PDFs modified in the last 30 days
Bonus: Turn CSV files into queryable SQL tables automatically


Guide 2: Turn Email Into SQL

Read the full guide →

Ingest Gmail/Outlook via IMAP. Query conversations, track threads, extract attachments, search across years of email history.

2-minute win: Find all emails from a specific sender mentioning "invoice"


Guide 3: Extract API Data

Read the full guide →

Use Singer taps to pull data from GitHub, Jira, GitLab, Salesforce, or 600+ other sources. Join across platforms.

2-minute win: Query all GitHub commits from the last 7 days


Why Not Just Write a Script?

You could. But:

Scripts become dead ends.
Six months later, you can't remember what format you used or where you saved the output.

surveilr outputs permanent, standard SQLite databases.
Open them years later with any SQL tool and keep querying.

Scripts don't compose.
You can't easily join your email script's output with your file scan script's output.

surveilr stores everything in one queryable database.
Cross-domain joins just work.

Scripts have no schema.
You're parsing JSON with jq and hoping the structure doesn't change.

surveilr normalizes data into stable SQL tables.
Query with confidence.


Ecosystem Integration

surveilr isn't the destination. It's the ingestion layer.

Once your data is in SQLite, you can use any tool in the SQLite ecosystem, such as Datasette, DuckDB, pandas, Observable, Grafana, DB Browser, or the sqlite3 command line.

You own the database. Use whatever tools you want.


Local-First. No Cloud. You Own It.

This is a big deal in 2026.

Most "compliance platforms" hide your own data behind proprietary dashboards.

surveilr gives you raw SQL access to everything.

  • ✅ All data stays on your machine: No upload, no sync, no cloud dependency
  • ✅ Works completely offline: Internet not required
  • ✅ Inspectable: Open the .db file with any SQLite client
  • ✅ Portable: Copy the file, query it anywhere
  • ✅ No vendor lock-in: Standard SQLite format, not proprietary
  • ✅ Privacy-first: Your data never leaves your control

If you're tired of SaaS platforms that:

  • Charge per seat
  • Lock your data behind APIs
  • Disappear when the startup shuts down
  • Require constant internet connectivity

...then surveilr is for you.


Oh, By the Way: Compliance Teams Love This Too

If you work in healthcare, finance, or regulated industries, surveilr happens to be perfect for:

  • HIPAA audits: Track where PHI files are stored
  • SOX compliance: Maintain 7-year email records with queryable evidence
  • GDPR requests: Respond to "right to access" with SQL queries
  • SOC 2 audits: Show complete change management history

But that's a side effect of the real value: permanent, queryable operational data you control.


Forensic Curiosity

Once you start using surveilr, you'll find yourself asking new questions:

  • What files were deleted but are still referenced in emails?
  • Which documents were modified right before a deployment?
  • What code changes correlate with customer support tickets?
  • Which team members touched files in a specific directory over the last year?
  • What attachments were sent externally that match internal file hashes?

These questions are impossible to answer with dashboards.

But with SQL, they're just queries.
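
As a sketch of the last question in the list, the following Python example joins attachments to files on a content hash. Everything about the schema is an assumption made for illustration: the `email_attachments` table, the `content_digest` columns, and the `corp.example` internal domain are hypothetical, and a real database may expose hashes under different names:

```python
import hashlib
import sqlite3


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE files (file_path TEXT, content_digest TEXT);
    CREATE TABLE email_attachments (
        recipient TEXT, filename TEXT, content_digest TEXT
    );
    """
)

roadmap = b"internal roadmap v3"
conn.executemany(
    "INSERT INTO files VALUES (?, ?)",
    [
        ("/shared/roadmap.docx", sha256_hex(roadmap)),
        ("/shared/menu.txt", sha256_hex(b"lunch menu")),
    ],
)
conn.executemany(
    "INSERT INTO email_attachments VALUES (?, ?, ?)",
    [
        # Renamed copy of the roadmap, sent to an outside address.
        ("partner@vendor.example", "plan.docx", sha256_hex(roadmap)),
        # Same content, but the recipient is internal.
        ("colleague@corp.example", "roadmap.docx", sha256_hex(roadmap)),
        # External attachment that matches no internal file.
        ("partner@vendor.example", "logo.png", sha256_hex(b"logo bytes")),
    ],
)

INTERNAL_DOMAIN = "corp.example"  # hypothetical internal mail domain
matches = conn.execute(
    """
    SELECT a.recipient, a.filename, f.file_path
    FROM email_attachments a
    JOIN files f ON f.content_digest = a.content_digest
    WHERE a.recipient NOT LIKE '%@' || ?
    ORDER BY a.filename
    """,
    (INTERNAL_DOMAIN,),
).fetchall()

for match in matches:
    print(match)
```

Joining on the digest rather than the filename is the point: the renamed `plan.docx` is still matched to `/shared/roadmap.docx` because the bytes are identical.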


Installation

macOS / Linux

brew tap surveilr/tap && brew install surveilr

Verify

surveilr --version
surveilr doctor  # Check environment

For other platforms, see the installation guide.


Run This Now

Pick your 2-minute win:

Option 1: Query your filesystem

surveilr admin init -d fs.db
surveilr ingest files -r ~/Documents -d fs.db

Option 2: Query your email

surveilr admin init -d email.db
surveilr ingest imap -u you@gmail.com -p "app-password" -a imap.gmail.com -d email.db

Option 3: Query GitHub (Ingest Singer taps)

surveilr admin init -d github.db
surveilr ingest files -r ./github-tap-script.py -d github.db
surveilr orchestrate adapt-singer  # convert the ingested Singer data to views

Then open the .db file in any SQLite tool and start exploring.


The Bottom Line

Your operational data (files, emails, API responses) should be:

  1. Queryable (SQL, not grep)
  2. Permanent (SQLite, not CSVs)
  3. Yours (local, not SaaS)
  4. Composable (join across domains)
  5. Inspectable (open in any tool)

That's what surveilr gives you.

It's not compliance software. It's not enterprise governance.

It's grep, jq, ETL, and forensic indexing collapsed into one local SQL primitive you can build on forever.


Ready to own your data? Install surveilr and query something surprising.

Get Started →
