Everyone's building "chat with your data" apps right now. Connect an LLM to your database, let it write SQL, execute the results. Magic.
Until it isn't.
## The Wake-Up Call
I was testing a GPT-4 agent connected to a staging database. Asked it to "clean up the test data from last week."
It started generating DELETE statements.
I caught it in time. But it got me thinking — there's literally nothing between the LLM and my database. One hallucination, one prompt injection, one bad day, and production data is gone.
## The Problem
When you connect an AI to a database, you're giving it the keys to everything. Most setups look like this:
```
User → LLM → SQL → Database
```
No validation. No guardrails. The LLM can generate anything, and it gets executed.
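Concretely, the naive version is just a few lines. Here's a hypothetical sketch of that pattern (`ask_llm` and the SQLite connection are placeholders, not any particular stack):

```python
import sqlite3

def ask_llm(prompt: str) -> str:
    # Stand-in for a real LLM call that returns raw SQL.
    ...

db = sqlite3.connect("app.db")

def chat_with_data(question: str):
    sql = ask_llm(f"Write SQL to answer: {question}")
    # Whatever the model produced runs directly -- DELETE, DROP, anything.
    return db.execute(sql).fetchall()
```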
## The Solution
I built a simple validation layer that sits between the SQL source and your database:
```
User → LLM → SQL → ProxQL → Database
                      ↓
                  (blocked)
```
It's called ProxQL. Here's how it works:
```python
import proxql

# Safe queries pass through
proxql.is_safe("SELECT * FROM users")   # True

# Destructive queries get blocked
proxql.is_safe("DROP TABLE users")      # False
proxql.is_safe("DELETE FROM logs")      # False

# You can also restrict to specific tables
result = proxql.validate(
    "SELECT * FROM employees",
    allowed_tables=["products", "orders"]
)
result.is_safe  # False
result.reason   # "Table 'employees' is not in allowed tables list"
```
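In practice, the check goes right in front of the execute call. Here's a minimal sketch of that wiring using a DB-API connection — the `run_guarded` helper is mine for illustration, not part of ProxQL:

```python
import sqlite3  # any DB-API connection works the same way
import proxql

db = sqlite3.connect("app.db")

def run_guarded(sql: str):
    """Execute LLM-generated SQL only if ProxQL clears it first."""
    result = proxql.validate(sql, allowed_tables=["products", "orders"])
    if not result.is_safe:
        raise PermissionError(f"Blocked query: {result.reason}")
    return db.execute(sql).fetchall()
```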
## What It Catches
Beyond basic statement types, it detects SQL injection patterns:
**Hex encoding:**

```sql
SELECT 0x44524F50 -- This is "DROP" in hex
```

**CHAR() abuse:**

```sql
SELECT CHAR(68) || CHAR(82) || CHAR(79) || CHAR(80) -- Spells "DROP"
```

**File access:**

```sql
SELECT pg_read_file('/etc/passwd')
SELECT LOAD_FILE('/etc/passwd')
SELECT * FROM users INTO OUTFILE '/tmp/dump.txt'
```

**Unicode homoglyphs:**

```sql
SELECT * FROM users WHERE nаme = 'admin'
-- That 'а' is Cyrillic, not ASCII
```
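If you're curious how a homoglyph check might work, one common approach is to flag identifiers that mix Unicode scripts. This is just a sketch of the idea, not ProxQL's actual implementation:

```python
import unicodedata

def has_mixed_script(identifier: str) -> bool:
    """Flag identifiers that mix scripts, e.g. Latin and Cyrillic letters."""
    scripts = set()
    for ch in identifier:
        if ch.isalpha():
            # Unicode names lead with the script: "CYRILLIC SMALL LETTER A"
            scripts.add(unicodedata.name(ch).split(" ")[0])
    return len(scripts) > 1

has_mixed_script("name")  # False -- all Latin
has_mixed_script("nаme")  # True  -- the 'а' is Cyrillic
```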
## Three Modes
| Mode | What's Allowed |
|---|---|
| `read_only` | SELECT only (default) |
| `write_safe` | SELECT, INSERT, UPDATE |
| `custom` | You define the rules |
```python
from proxql import Validator

# Read-only for analytics
analytics = Validator(mode="read_only")

# Allow writes but block destructive ops
api = Validator(mode="write_safe")

# Custom rules
admin = Validator(
    mode="custom",
    allowed_statements=["SELECT", "INSERT", "UPDATE", "DELETE"],
    blocked_statements=["DROP", "TRUNCATE"]
)
```
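Assuming each `Validator` mirrors the module-level `is_safe` API (the post only shows construction, so this is my guess at the usage), checking queries against a mode would look something like:

```python
# Assumes Validator exposes is_safe like the module-level function.
analytics.is_safe("SELECT count(*) FROM orders")     # True
analytics.is_safe("UPDATE orders SET status = 'x'")  # False: read_only

api.is_safe("INSERT INTO logs VALUES ('ok')")  # True under write_safe
api.is_safe("TRUNCATE TABLE logs")             # False
```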
## Works with Any SQL Source
Not just LLMs. Use it for:
- User-submitted queries in BI tools
- Automated pipelines
- API endpoints that accept SQL (see the sketch after this list)
- Any text-to-SQL workflow
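For that API-endpoint case, a hypothetical Flask route (the route, table names, and `run_sql` stub are made up for illustration) might look like:

```python
from flask import Flask, jsonify, request
import proxql

app = Flask(__name__)

def run_sql(sql: str):
    # Stand-in for your real database call.
    raise NotImplementedError

@app.post("/query")
def query():
    sql = request.get_json()["sql"]
    result = proxql.validate(sql, allowed_tables=["products", "orders"])
    if not result.is_safe:
        # Reject anything ProxQL flags before it reaches the database.
        return jsonify(error=result.reason), 403
    return jsonify(rows=run_sql(sql))
```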
## Available in Python and TypeScript
```bash
pip install proxql
npm install proxql
```
The API is identical in both:
```typescript
import proxql from 'proxql';

proxql.isSafe("SELECT * FROM users"); // true
proxql.isSafe("DROP TABLE users");    // false
```
## Try It
GitHub: github.com/Zeredbaron/proxql
It's open source (Apache 2.0). Would love feedback — especially on injection patterns I might be missing.
What's the sketchiest SQL you've seen an LLM generate? Drop it in the comments.
## Top comments (2)
My first question is: why would people let an AI control a database? I don't see the problem with just learning SQL.
If I were to let people chat with the database, I would only expose a limited set of commands to execute, and provide a time-limited rollback so people can review the executed changes.
Letting AI write queries is a no-go for me.
Pretty nice approach.
I used to create roles and assign permissions, which is manual and annoying; I had to redo the whole query script in every database instance lol