Published a SQL Linter to PyPI Because I Was Tired of Bad Queries Hitting Production

#sql #ai #devops #python

Food manufacturing ERPs run on SQL Server. SSRS reports, stored procedures, ad-hoc queries — often written by people who learned SQL from Stack Overflow.

A DELETE without WHERE against a staging table is a wake-up call. sql-sop catches these patterns before they reach the database.

sql-sop: 18 rules, 55 tests, 0.08 seconds

pip install sql-sop
sql-sop check .

That is it. Point it at a directory and it scans every .sql file in 0.08 seconds. No config file needed. No database connection. Just pattern matching against compiled regex and sqlparse AST analysis.

The rules

5 errors (block commits):

Rule	What it catches
E001	DELETE without WHERE
E002	DROP without IF EXISTS
E003	GRANT/REVOKE in application code
E004	String concatenation in WHERE (SQL injection)
E005	INSERT without explicit column list

10 warnings (advisory):

Rule	What it catches
W001	SELECT *
W002	Missing LIMIT on large result sets
W003	Functions on indexed columns (kills index usage)
W004	Multi-table JOIN without aliases
W005	Subquery in WHERE that could be a JOIN
W006	ORDER BY without LIMIT
W007	Hardcoded magic numbers in WHERE
W008	Inconsistent keyword casing
W009	Missing semicolons
W010	Commented-out code blocks

3 structural rules (v0.3.0, sqlparse AST):

Rule	What it catches
S001	Implicit cross join (comma-separated tables in FROM)
S002	Subquery nested more than 2 levels deep
S003	CTE defined but never referenced

The fluent API

v0.2.0 added a chainable Python API:

from sql_guard import SqlGuard

result = SqlGuard().enable("E001", "W001").scan("DELETE FROM users")
print(result.passed)     # False
print(result.summary())  # "1 error in 1 statement"

This lets you use sql-sop programmatically - in test suites, CI pipelines, or other tools. The CLI is for humans; the API is for code.

Pre-commit hook

repos:
  - repo: https://github.com/Pawansingh3889/sql-guard
    rev: main
    hooks:
      - id: sql-sop
        args: [--severity, error]

Every SQL file gets checked before every commit. Dangerous patterns are caught before they reach the PR, let alone production.

Structural rules with sqlparse

The regex-based rules catch surface patterns. But some bad SQL looks fine line-by-line:

SELECT * FROM orders, customers WHERE orders.id = customers.order_id

This is an implicit cross join. It works, but it is fragile and unclear. The structural rule S001 catches it by parsing the FROM clause rather than matching text.

For S002 (deeply nested subqueries), sqlparse builds an actual token tree. I walk it recursively, counting parenthesis depth. More than 2 levels deep gets flagged with a suggestion to use CTEs.

Notes on publishing to PyPI

Hatchling is the simplest build backend. pyproject.toml with [build-system] requires = ["hatchling"] — no setup.py, no setup.cfg.
Test matrix matters. Python 3.10 through 3.13 each have slightly different regex behaviour. CI catches what local testing misses.
195 monthly downloads is modest but meaningful. Most PyPI packages get zero. Each download is someone protecting their database.
The pre-commit hook drives adoption. More usage comes via pre-commit than the CLI. Meeting users where they already work matters more than features.