Eitamos Ring

Posted on Feb 9

Building a Pure Go PostgreSQL SQL Parser (No CGO, No Server, No Runtime Dependencies)

#postgres #go #opensource #programming

Why we built this

We needed PostgreSQL SQL parsing in environments where CGO was not an option:

Alpine containers
AWS Lambda
Distroless images
Scratch builds
ARM deployments
Anywhere CGO_ENABLED=0 is required

Existing approaches tend to hit at least one of these problems:

Depend on native Postgres parser bindings
Require CGO
Require running a Postgres server
Are too heavy for infrastructure tooling

So we built a pure Go PostgreSQL parser.

The goal

Not to replace Postgres parsing.

Not to be 100% server-compatible.

The goal was simple:

Give infrastructure and tooling systems structured query data safely and deterministically.

What it extracts

The parser outputs an intermediate representation (IR) with:

Tables (with aliases)
Columns
Joins
WHERE filters
GROUP BY
ORDER BY
CTEs
Subqueries

Example

result, err := postgresparser.ParseSQL(`
    SELECT u.name, COUNT(o.id) AS order_count
    FROM users u
    LEFT JOIN orders o ON o.user_id = u.id
    WHERE u.active = true
    GROUP BY u.name
    ORDER BY order_count DESC
`)
if err != nil {
    log.Fatal(err) // stop here if the SQL fails to parse
}

fmt.Println(result.Command)       // "SELECT"
fmt.Println(result.Tables)        // users, orders with aliases
fmt.Println(result.Columns)       // u.name, COUNT(o.id) AS order_count
fmt.Println(result.Where)         // ["u.active=true"]
fmt.Println(result.JoinConditions) // ["o.user_id=u.id"]
fmt.Println(result.GroupBy)       // ["u.name"]
fmt.Println(result.ColumnUsage)   // each column with its role: filter, join, projection, group, order

Now tooling can answer:

What tables does this query touch?
What joins exist?
What filters are applied?
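
As a rough illustration of answering those questions from extracted data, here is a minimal sketch. `TableRef`, `QueryInfo`, `TableNames`, and `Describe` are hypothetical stand-ins written for this example, not the library's types; the field names follow the example above, and their Go types are assumptions.

```go
package main

import (
	"fmt"
	"sort"
)

// TableRef and QueryInfo are hypothetical stand-ins for the parser's output.
// Field names mirror the example above; the concrete types are assumptions.
type TableRef struct {
	Name  string
	Alias string
}

type QueryInfo struct {
	Command        string
	Tables         []TableRef
	Where          []string
	JoinConditions []string
}

// TableNames returns the distinct table names a query touches, sorted.
func TableNames(q QueryInfo) []string {
	seen := map[string]bool{}
	var names []string
	for _, t := range q.Tables {
		if !seen[t.Name] {
			seen[t.Name] = true
			names = append(names, t.Name)
		}
	}
	sort.Strings(names)
	return names
}

// Describe renders a one-line summary: tables touched, join count, filter count.
func Describe(q QueryInfo) string {
	return fmt.Sprintf("%s tables=%v joins=%d filters=%d",
		q.Command, TableNames(q), len(q.JoinConditions), len(q.Where))
}

func main() {
	// Hand-built values standing in for a real parse result.
	q := QueryInfo{
		Command:        "SELECT",
		Tables:         []TableRef{{"users", "u"}, {"orders", "o"}},
		Where:          []string{"u.active=true"},
		JoinConditions: []string{"o.user_id=u.id"},
	}
	fmt.Println(Describe(q)) // SELECT tables=[orders users] joins=1 filters=1
}
```

In real use, the `QueryInfo` value would be filled from the parser's result rather than written by hand.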

Why ANTLR + Pure Go

We evaluated:

libpg_query bindings
WASM approaches
regex / string parsing
custom parsers

Tradeoffs we cared about

Requirement	Why
Pure Go	Simpler deploy, fewer runtime risks
No CGO	Works in restricted environments
Deterministic behavior	Important for tooling / analysis
Performance	Needed for production workloads

ANTLR gave us:

Mature grammar ecosystem
Strong parsing guarantees
Good performance with SLL mode

Performance

Most real-world queries parse in roughly 70–350 microseconds using SLL prediction mode.

Where this is useful

Typical use cases:

CI SQL validation
Query lineage hints
Migration safety checks
Static query analysis before deploy
“What tables does this service touch?” automation
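
A migration safety check in CI could look something like the sketch below. `ParsedQuery` and `Check` are hypothetical and written only for illustration. The sketch assumes the parser reports commands such as "UPDATE" and "DELETE" in the same style as the "SELECT" shown earlier, and the two rules are examples, not a recommended policy.

```go
package main

import "fmt"

// ParsedQuery is a hypothetical stand-in for the parser's result,
// reduced to the fields this check needs.
type ParsedQuery struct {
	Command string   // "SELECT" as in the example; "UPDATE"/"DELETE" are assumed values
	Tables  []string // names of the tables the statement touches
	Where   []string // extracted filter expressions
}

// Check returns human-readable findings for a single statement.
func Check(q ParsedQuery, protected map[string]bool) []string {
	var findings []string

	// Rule 1: writes with no filter affect every row.
	if (q.Command == "UPDATE" || q.Command == "DELETE") && len(q.Where) == 0 {
		findings = append(findings, q.Command+" without WHERE")
	}

	// Rule 2: any non-SELECT statement touching a protected table is flagged.
	if q.Command != "SELECT" {
		for _, t := range q.Tables {
			if protected[t] {
				findings = append(findings, q.Command+" touches protected table "+t)
			}
		}
	}
	return findings
}

func main() {
	protected := map[string]bool{"payments": true}
	queries := []ParsedQuery{
		{Command: "SELECT", Tables: []string{"payments"}, Where: []string{"id=1"}},
		{Command: "DELETE", Tables: []string{"payments"}},
	}
	for _, q := range queries {
		for _, f := range Check(q, protected) {
			fmt.Println("FAIL:", f)
		}
	}
}
```

Because the rules run on structured fields, not raw SQL text, they are not thrown off by formatting, aliases, or comments in the query.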

Open Source

We’ve been using this internally for months and decided to open source it.

If you break it with weird SQL, please open an issue; that’s how coverage improves.

👉 https://github.com/ValkDB/postgresparser

DEV Community