We have tools for checking whether a query is injectable. We have linters, scanners, ORMs, parameterized queries, and database policies. But after the database returns rows, most applications simply trust that the result set matches the operation that asked for it.
queryguard starts there.
The query may be safe. The result may still be wrong.
SQL injection taught us to distrust query construction. Parameterized queries answered the question:
Did the user control the query structure?
That question is well understood. The tooling is mature. But it is a different question from the one queryguard asks:
Did this operation receive only the rows and fields it was allowed to receive?
Those two questions are not the same. A perfectly safe parameterized query can still return the wrong row — because a predicate was dropped, a join widened the result, a developer selected a column they shouldn't have, or a query was rewritten without updating its scope contract.
queryguard is not a database firewall. It is not a SQL injection scanner. It is not an ORM plugin. It is a contract check for observed result sets.
Where it sits
The hook position is the core design decision. queryguard sits immediately after cursor execution — before any result shaping, filtering, serialization, or response mapping.
cursor = conn.execute(sql, bindings)
rows = [dict(row) for row in cursor.fetchall()]
evidence = queryguard.run_check(contract, {
"contract_id": "user_profile_lookup",
"contract_version": "0.1.0",
"params": {"user_id": user_id},
"session": {"tenant_id": tenant_id},
"result": rows,
})
if evidence["verdict"] != "PASS":
raise QueryguardViolation(evidence)
return rows
Not at the HTTP layer. Not inside the ORM. Not at the API gateway. Immediately after the cursor returns rows — while the result is still raw, before anything shapes or discards it.
This is intentional. If rows are shaped before queryguard sees them, queryguard cannot detect violations in the discarded or modified data. The adapter-risk demo in the live lab shows this explicitly.
The contract
Each named operation has a contract that declares what its result set is allowed to look like.
id: user_profile_lookup
version: 0.1.0
operation: read
description: fetch one user profile by id
result:
cardinality:
max_rows: 1
fields:
allowed:
- id
- name
- email
- avatar_url
- created_at
required:
- id
- name
- email
forbidden:
- password_hash
- reset_token
- mfa_secret
row_constraints:
required:
- field: id
operator: equals
value_from: params.user_id
The contract declares intent, not SQL. queryguard never sees the query that ran. It only sees the contract and the result, and it asks: does the result match the declared scope?
The checks cover:
- Cardinality — did the operation return the expected number of rows?
- Field allowlist — does every row contain only declared fields?
- Required fields — is every required field present?
- Forbidden fields — did any sensitive field leak into the result?
- Row constraints — does every row satisfy the declared predicates?
Each executed check produces a finding. Gateway checks can stop evaluation early when later checks would not be meaningful, but once the input envelope, contract, and result shape are valid, row and field checks run completely so multiple violations can be reported together. The evidence includes a contract_hash — a SHA-256 of the full contract body — so you can prove not just which contract identity was claimed, but which exact policy was applied. The evidence_hash then covers the contract body hash, the contract identity, the params, the session, and the result — so the record ties the observed rows to the exact policy used and changes when the verification inputs change.
The live DB proof
The real test is against real SQL. The live lab creates an in-memory SQLite database, seeds it with users, orders, and invoices, executes actual queries, and passes the raw cursor results to queryguard.
Nine cases:
| Case | SQL | Verdict |
|---|---|---|
| Login clean | WHERE email = ? |
PASS |
| Login tautology | WHERE email = ? OR 1=1 |
FAIL — cardinality, row_constraints |
| Profile clean | WHERE id = ? |
PASS |
| Profile wrong row |
WHERE id = 2 with params.user_id = 1
|
FAIL — row_constraints |
| Profile forbidden column | SELECT ... password_hash ... |
FAIL — field_allowlist, forbidden_fields |
| Orders clean | WHERE customer_id = ? |
PASS |
| Orders missing predicate | SELECT id, customer_id, status, total, created_at, updated_at FROM orders |
FAIL — row_constraints |
| Invoice clean | WHERE id = ? AND tenant_id = ? |
PASS |
| Invoice missing tenant predicate |
WHERE id = ? only |
FAIL — row_constraints |
Case 2 is the one worth pausing on. The query WHERE email = ? OR 1=1 returns every user in the database. The login contract says max_rows: 1 and email must equal params.email. queryguard catches the widened result on cardinality and row_constraints — before the application ever sees it.
Case 4 shows something different: the query is safe, the parameterization is correct, but the hardcoded predicate WHERE id = 2 returns the wrong row for params.user_id = 1. No injection. Clean SQL. Wrong result. queryguard catches it.
The adapter-risk demo shows the inverse: if SQL returns password_hash but an adapter strips it before queryguard sees the rows, queryguard returns PASS. The violation was hidden upstream. The rule is absolute — queryguard must sit before any shaping.
The honest limits
queryguard v0.1 is narrow by design.
Adapter shaping can hide violations. queryguard only verifies the rows it receives. If an adapter filters fields or rows before verification, queryguard cannot detect what was removed. The checker must run immediately after query execution and before any result shaping, filtering, serialization, or response mapping.
Row uniqueness is not enforced. Three identical authorized rows produce PASS, because every row satisfies the declared constraints. Detection of duplicate rows requires either a max_rows cardinality constraint or a future unique_by primitive.
Nested result rows are not verified. A row containing a nested object produces UNKNOWN. Flat rows only in v0.1.
Write operations are out of scope. No INSERT, UPDATE, DELETE. No result set, nothing to verify.
Aggregates are out of scope. COUNT, SUM, GROUP BY — row-level predicate enforcement requires individual rows.
Row order affects the evidence hash. The same logical result in a different order produces a different evidence_hash. Use a deterministic ORDER BY when reproducible hashes matter.
These are not oversights. They are the boundary of the one claim.
The test suite
The embedded suite has 59 cases across five batches:
- Documented — clean reads, tautology injection, cross-user rows, forbidden columns, nested objects
- Adversarial batches 1–4 — malformed contracts, hostile envelopes, null/bool coercion bypass, contract body substitution, cyclic structures, depth-exceeded values
- Meta-tests — determinism, hash sensitivity to result changes, hash sensitivity to contract body changes, row-order behavior, duplicate-row known limitation, shared Python reference detection
The adversarial suite exists because a checker that catches bad results but crashes on bad inputs is not a reliable checker. The bounded JSON-domain walk handles depth > 64, node count > 10,000, and cyclic Python structures without crashing. Every failure returns structured evidence.
Why this matters — and where it should go next
SQL injection gave us decades of tooling for one side of the database boundary: the query going in. queryguard is an attempt at the other side: the result coming out.
It turns "I think this query returns the right data" into a testable, evidence-producing assertion. But v0.1 is narrow by design, and there are open questions I haven't fully answered yet.
What does the contract lifecycle look like at scale? Right now contracts are simple versioned operation definitions, treated as immutable once published. That works for a lab. It gets harder when a schema changes and fifty contracts need updating. How do teams manage that without the manifest going stale?
Where should this live in a real pipeline? I've placed it at the cursor boundary, immediately after rows are returned. In real applications, the nearest practical hook might be a repository layer, database access wrapper, or ORM-adjacent interception point — but the rule stays the same: verify before shaping, filtering, serialization, or response mapping.
Is the one-thing discipline the right call for v0.2? Row uniqueness, nested rows, and write authorization are explicitly out of scope. Each could become its own tool in the same family. But maybe the right move is a more complete result verifier rather than three separate tools. I genuinely don't know yet.
If you've hit similar gaps — results you couldn't trust, predicates that got dropped, columns that leaked into responses — I'd like to hear how you've handled it. And if you run the suite or the live lab and find something that should fail but doesn't, open an issue.
The scope is narrow. The claim is specific. That's where v0.1 has to start.
queryguard v0.1 — github.com/rodrigo-areyzaga/queryguard
Single file. Zero dependencies. python queryguard.py to run the suite. python live_db_lab.py for the SQLite validation.
Top comments (0)