Part 1 of 6 in the "Surveilr Deep Dive" series
Picture this: It's Monday morning, and your organization is facing a compliance audit. The auditors want to know about every file that contains sensitive data, every email with external attachments, every system configuration change in the last quarter. Your team scrambles across dozens of tools, exports from multiple systems, and spends weeks manually correlating data that should be instantly queryable.
What if I told you there's a tool that turns ANY file, email, or system resource into queryable SQL data in minutes? Meet Surveilr β the open-source platform that's revolutionizing how organizations approach compliance, security monitoring, and resource surveillance.
The Compliance Data Problem
Before we dive into the solution, let's acknowledge the problem. Modern organizations generate digital evidence across countless systems:
- Files scattered across file systems with varying permissions and metadata
- Emails buried in different mailboxes with attachments and complex threading
- Project management data spread across GitHub, Jira, GitLab
- System configurations and logs in proprietary formats
- Network resources and databases with their own query languages
When audit time comes, teams spend 80% of their effort just finding and correlating data, leaving only 20% for actual analysis. This is backwards.
Enter Surveilr: The SQL-First Revolution
Surveilr flips this equation. Its core philosophy is elegantly simple:
Walk resources. Capture everything. Query anything.
Instead of learning dozens of tools and APIs, you learn SQL once and query everything. Instead of fragmented data silos, you get unified Resource Surveillance State Databases (RSSDs) that contain all your organizational evidence in a single, queryable format.
Here's what makes Surveilr different:
π― SQL-First Architecture
Every piece of data β files, emails, system info, project data β becomes rows and columns in SQLite databases. If you know SQL, you can analyze anything.
π Local-First, Edge-Based
Your data stays on your infrastructure. Surveilr creates local SQLite databases that can be queried without internet connectivity or cloud dependencies.
π Tool-Independent Results
Once created, RSSDs are standard SQLite files. You can query them with any SQL tool, language, or platform. Surveilr doesn't hold your data hostage.
π§© Universal Resource Processing
Files, emails, APIs, executables, documents β if it exists on your systems, Surveilr can ingest it and make it queryable.
The RSSD: Your Universal Evidence Database
At the heart of Surveilr is the Resource Surveillance State Database (RSSD). Think of it as a universal translator that converts any resource into structured, queryable data.
Here's the magic: every RSSD contains a standard schema with tables like:
-
uniform_resource
β Every file, email, or resource becomes a row -
device
β Information about the systems being surveilled -
ur_ingest_session
β Audit trail of when and how data was collected -
uniform_resource_transform
β Processed and enriched data - Plus dozens more for specialized data types
The beauty is consistency. Whether you're analyzing a Windows server, a Linux container, or a MacBook, the RSSD schema remains the same. Your SQL queries work everywhere.
Your First Surveilr Experience
Let's get hands-on. I'll walk you through creating your first RSSD and running your first surveillance queries.
Installation
Surveilr provides a one-liner installation script:
curl -sL https://raw.githubusercontent.com/opsfolio/releases.opsfolio.com/main/surveilr/install.sh | sh
Or if you prefer a custom installation path:
SURVEILR_HOME="$HOME/bin" curl -sL https://raw.githubusercontent.com/opsfolio/releases.opsfolio.com/main/surveilr/install.sh | sh
Verify your installation:
surveilr --version
surveilr --help
Your First Ingestion
Let's start simple β surveilling your current directory:
# See what would be ingested (dry run)
surveilr ingest files --dry-run
# Actually ingest the files
surveilr ingest files
# Check what we created
ls -la resource-surveillance.sqlite.db
That's it! You've just created your first RSSD. Surveilr walked through your current directory, captured metadata about every file, and stored it all in a SQLite database.
Your First Queries
Now comes the magic. Let's query our surveillance data:
# Connect to the database
sqlite3 resource-surveillance.sqlite.db
# See what tables were created
.tables
# Look at the uniform_resource table structure
.schema uniform_resource
# Find all files modified in the last 7 days
SELECT
uri,
size_bytes,
datetime(last_modified_at, 'unixepoch') as modified_date
FROM uniform_resource
WHERE last_modified_at > strftime('%s', 'now', '-7 days')
ORDER BY last_modified_at DESC;
# Find the largest files
SELECT
uri,
size_bytes,
ROUND(size_bytes/1024.0/1024.0, 2) as size_mb
FROM uniform_resource
WHERE size_bytes IS NOT NULL
ORDER BY size_bytes DESC
LIMIT 10;
# Look for configuration files
SELECT uri, nature, size_bytes
FROM uniform_resource
WHERE uri LIKE '%.conf%'
OR uri LIKE '%.config%'
OR uri LIKE '%.json%'
OR uri LIKE '%.yaml%'
OR uri LIKE '%.yml%';
What Just Happened?
In less than 5 minutes, you:
- Installed a powerful surveillance platform
- Ingested file system metadata from your current directory
- Queried that data using standard SQL
- Generated insights about file modifications, sizes, and types
But this is just the beginning. That RSSD you just created is a standard SQLite database. You can:
- Query it from any programming language with SQLite support
- Import it into business intelligence tools
- Share it with team members
- Merge it with other RSSDs
- Archive it for compliance purposes
The Architecture Behind the Magic
Before we wrap up this introduction, let's peek under the hood. Surveilr's architecture is elegantly modular:
Core Components
-
surveilr ingest
β The data collection engine that walks resources and populates RSSDs -
surveilr admin
β Administrative commands for managing and merging databases -
surveilr notebooks
β Built-in SQL notebooks for common queries and reports -
surveilr shell
β Interactive SQL interface with enhanced capabilities
Resource Types
Surveilr doesn't just handle files. It can ingest:
- File systems with full metadata and content analysis
- Email systems via IMAP (Gmail, Outlook, Microsoft 365)
- Project management platforms (GitHub, GitLab, Jira, OpenProject)
- Executable scripts that generate dynamic data
- Cloud storage systems via OpenDAL integration
- Custom data sources through capturable executables
SQL-Centric Philosophy
Everything in Surveilr revolves around SQL:
- Configuration is stored in SQL tables
- Business logic is implemented as SQL queries
- Reports are SQL result sets
- Extensions are SQL functions and procedures
- Migrations are SQL scripts
This means you can understand, modify, and extend Surveilr using the same SQL knowledge you already have.
What's Coming Next in This Series
We've just scratched the surface. Over the next five articles, we'll build on this foundation:
Article 2: "From Chaos to Queries" β We'll expand our RSSD with email ingestion, project management data, and capturable executables. You'll see how multiple data sources become a unified, queryable dataset.
Article 3: "Talk to Your Data" β We'll add AI-powered natural language querying, turning your RSSD into a conversational interface. Ask questions in plain English, get SQL results.
Article 4: "Zero-Code Dashboards" β We'll build web-based compliance dashboards using SQLPage, creating visual interfaces that non-technical stakeholders can use.
Article 5: "Breaking SQLite Barriers" β We'll enable remote team access using Surveilr's PostgreSQL wire protocol implementation, making your local RSSD accessible to distributed teams.
Article 6: "Enterprise Mastery" β We'll explore advanced features like document processing, data deidentification, and building compliance automation workflows.
Try It Yourself
The best way to understand Surveilr is to use it. Here are some exercises to try before the next article:
- Install Surveilr and create your first RSSD
-
Experiment with different directories β try surveilling
/etc
(on Unix systems) or your Documents folder -
Explore the schema β run
.schema
in sqlite3 to see all available tables - Write custom queries β look for files by type, size, or modification date
-
Check out the help β run
surveilr ingest files --help
to see advanced options
The SQL-First Future
Surveilr represents a fundamental shift in how we think about organizational data. Instead of learning new tools for each data source, we leverage the universal language of SQL. Instead of fragmented compliance efforts, we build unified evidence databases. Instead of vendor lock-in, we create portable, standards-based data assets.
In a world where data is the new oil, Surveilr is the refinery that turns raw organizational resources into compliance gold.
Ready to dive deeper? In the next article, we'll expand our surveillance capabilities beyond simple files to include emails, project management data, and dynamic executable content. Your RSSD is about to become a lot more interesting.
Coming up in Article 2: "From Chaos to Queries: Ingesting Files, Emails, and Everything Else"
Top comments (0)