How does a linter know your column doesn't exist

#ai #postgres #devops #go

You write a query that SELECTs ghost_status from the orders table. Your code compiles. Your tests pass. But ghost_status was never created in any migration. In production, that query crashes.
Valk Guard catches this at PR time - with no database connection.
This post walks through exactly how. Not hand-waving. The actual code path, from source file to finding.
The setup
Here's a Go file using Goqu to build a query:
func ListBrokenUserOrderStatus(ctx context.Context) error {
_, _, err := goqu.From("users").
LeftJoin(
goqu.T("orders"),
goqu.On(goqu.I("orders.user_id").Eq(goqu.I("users.id"))),
).
Select("users.id", "users.email", "orders.ghost_status").
Where(goqu.I("orders.missing_flag").Eq("pending")).
ToSQL()
return err
}
And here's the migration that created the orders table:
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
user_id INTEGER NOT NULL REFERENCES users(id),
total NUMERIC(10,2) NOT NULL,
status TEXT NOT NULL DEFAULT 'pending',
created_at TIMESTAMP DEFAULT now()
);
Notice: the query references orders.ghost_status. The migration never created that column. There is no ghost_status. Valk Guard reports:
VG105: projection column "ghost_status" not found in table "orders" schema; check SELECT list and schema/model mappings
How does it know?
Let's walk through each phase.
Phase 1: Query extraction
The Goqu scanner doesn't look for SQL strings. It walks the Go AST looking for method chains rooted in goqu.From().
When it finds one, it flattens the chain into a list of method calls: From("users") → LeftJoin(...) → Select(...) → Where(...). Each method gets parsed: From gives the base table, LeftJoin gives the join target, Select gives the projection columns, Where gives the predicates.
From these parts, the scanner synthesizes a SQL statement:
SELECT users.id, users.email, orders.ghost_status
FROM users LEFT JOIN orders ON orders.user_id = users.id
WHERE orders.missing_flag = 'pending'
This SQL never existed in your source code. Valk Guard constructed it from the AST of your Go code. That's the key difference from regex-based tools - regex can't walk a method chain and reconstruct what the query builder will produce.
Phase 2: Schema snapshot
Separately, Valk Guard finds all .sql files under your migration paths. Each file gets parsed through postgresparser, and every DDL statement gets applied to a Snapshot - an in-memory representation of your schema's current state.
The snapshot builder processes DDL actions in order:
CREATE TABLE orders (id, user_id, total, status, created_at) → registers the table with five columns
ALTER TABLE orders ADD COLUMN shipped_at TIMESTAMP → adds a sixth column
ALTER TABLE orders DROP COLUMN shipped_at → removes it

The end result is a map of table names to column definitions. For orders, that's: id, user_id, total, status, created_at. Five columns. No ghost_status.
This is the same principle as running all your migrations on an empty database - except it happens in memory, with no database, in microseconds.
Phase 3: Rule evaluation
Now VG105 runs. It takes the synthesized SQL (already parsed into a structured IR by postgresparser) and the schema snapshot, and does a straightforward lookup:
For each column in the SELECT list with usage type "projection", resolve which table it belongs to (using the alias or the single-table shortcut)
Look up that table in the snapshot
Check if the column exists in the table's column map
If not → finding

For ghost_status, the column usage says it belongs to orders (from the orders.ghost_status qualifier). The snapshot has an orders table. But orders.ghost_status is not in the column map. Finding.
The same logic powers VG106 (unknown filter column - catches WHERE orders.missing_flag = 'pending' from the same query) and VG107 (unknown table reference).
It also works with ORM models
The same snapshot system powers schema-drift rules (VG101–VG104). Instead of checking queries against migrations, these rules check ORM models against migrations.
Say you have a Go struct:
type Order struct {
ID int db:"id"
UserID int db:"user_id"
Total string db:"total"
Status string db:"status"
GhostStatus string db:"ghost_status"
}
Valk Guard's Go model extractor walks the AST, reads the db struct tags, and produces a ModelDef with columns: id, user_id, total, status, ghost_status.
VG101 then compares each model column against the migration snapshot. ghost_status isn't in the orders table → finding:
VG101: model "orders" references column "ghost_status" not found in table "orders" schema; check migration DDL or update model mapping
Two different rules, two different input paths (query vs. model), same schema snapshot, same answer.
What this means in practice
You don't need a running database. You don't need to run migrations. You don't need to connect to staging. Valk Guard reads your source code and your migration files, builds everything in memory, and cross-references them statically.
This runs in CI in seconds. It catches the kind of bug that usually shows up as a column "ghost_status" does not exist error in your logs at 2am - and moves it to a PR comment at 2pm instead.
go install github.com/valkdb/valk-guard/cmd/valk-guard@latest
valk-guard scan .
Repo: github.com/ValkDB/valk-guard

DEV Community

How does a linter know your column doesn't exist

Top comments (0)