Eitamos Ring

Building a Pure Go PostgreSQL SQL Parser (No CGO, No Server, No Runtime Dependencies)

Why we built this

We needed PostgreSQL SQL parsing in environments where CGO was not an option:

  • Alpine containers
  • AWS Lambda
  • Distroless images
  • Scratch builds
  • ARM deployments
  • Anywhere CGO_ENABLED=0 is required

Most existing approaches have at least one of these drawbacks:

  • Depend on native Postgres parser bindings
  • Require CGO
  • Require running a Postgres server
  • Are too heavy for infrastructure tooling

So we built a pure Go PostgreSQL parser.


The goal

Not to replace Postgres parsing.

Not to be 100% server-compatible.

The goal was simple:

Give infrastructure and tooling systems structured query data safely and deterministically.


What it extracts

The parser outputs an intermediate representation (IR) with:

  • Tables (with aliases)
  • Columns
  • Joins
  • WHERE filters
  • GROUP BY
  • ORDER BY
  • CTEs
  • Subqueries

Example

result, err := postgresparser.ParseSQL(`
    SELECT u.name, COUNT(o.id) AS order_count
    FROM users u
    LEFT JOIN orders o ON o.user_id = u.id
    WHERE u.active = true
    GROUP BY u.name
    ORDER BY order_count DESC
`)
if err != nil {
    log.Fatal(err) // the query could not be parsed
}

fmt.Println(result.Command)        // "SELECT"
fmt.Println(result.Tables)         // users, orders with aliases
fmt.Println(result.Columns)        // u.name, COUNT(o.id) AS order_count
fmt.Println(result.Where)          // ["u.active=true"]
fmt.Println(result.JoinConditions) // ["o.user_id=u.id"]
fmt.Println(result.GroupBy)        // ["u.name"]
fmt.Println(result.ColumnUsage)    // each column with its role: filter, join, projection, group, order

Now tooling can answer:

  • What tables does this query touch?
  • What joins exist?
  • What filters are applied?
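
As a rough illustration of the first question, here is a minimal sketch that collects the distinct tables touched by a set of already-parsed queries. `QueryIR` and `TableRef` are hypothetical stand-ins defined here so the example runs on its own; they are not the library's actual types, and the real result struct may be shaped differently.

```go
package main

import (
	"fmt"
	"sort"
)

// TableRef is a hypothetical stand-in for one table entry in the parser's IR.
type TableRef struct {
	Name  string
	Alias string
}

// QueryIR is a hypothetical, trimmed-down stand-in for a parse result.
type QueryIR struct {
	Command string
	Tables  []TableRef
}

// TablesTouched returns the sorted, de-duplicated table names
// referenced across all of the given queries.
func TablesTouched(queries []QueryIR) []string {
	seen := make(map[string]struct{})
	for _, q := range queries {
		for _, t := range q.Tables {
			seen[t.Name] = struct{}{}
		}
	}
	names := make([]string, 0, len(seen))
	for name := range seen {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	queries := []QueryIR{
		{Command: "SELECT", Tables: []TableRef{{Name: "users", Alias: "u"}, {Name: "orders", Alias: "o"}}},
		{Command: "UPDATE", Tables: []TableRef{{Name: "orders"}}},
	}
	fmt.Println(TablesTouched(queries))
}
```

In real use, the slice would be filled from parse results rather than written by hand.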

Why ANTLR + Pure Go

We evaluated:

  • libpg_query bindings
  • WASM approaches
  • regex / string parsing
  • custom parsers

Tradeoffs we cared about

| Requirement            | Why                                  |
|------------------------|--------------------------------------|
| Pure Go                | Simpler deploy, fewer runtime risks  |
| No CGO                 | Works in restricted environments     |
| Deterministic behavior | Important for tooling / analysis     |
| Performance            | Needed for production workloads      |

ANTLR gave us:

  • Mature grammar ecosystem
  • Strong parsing guarantees
  • Good performance with SLL mode

Performance

Most real-world queries parse in roughly 70–350 microseconds using ANTLR's SLL prediction mode.
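
To check a figure like this against your own queries, a small timing harness is enough. The sketch below is an assumption-laden example, not the project's benchmark: `ParseFunc` and `AverageParseTime` are hypothetical names, and the stub parser exists only so the code runs without external dependencies. For serious measurements, Go's `testing.B` benchmarks are the more usual tool.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ParseFunc is a hypothetical signature standing in for a parser entry point.
type ParseFunc func(sql string) error

// AverageParseTime calls parse n times on sql and returns the mean
// wall-clock duration per call. It stops at the first parse error.
func AverageParseTime(parse ParseFunc, sql string, n int) (time.Duration, error) {
	if n <= 0 {
		return 0, errors.New("n must be positive")
	}
	start := time.Now()
	for i := 0; i < n; i++ {
		if err := parse(sql); err != nil {
			return 0, err
		}
	}
	return time.Since(start) / time.Duration(n), nil
}

func main() {
	// Stub parser used only so the sketch is self-contained;
	// replace it with a wrapper around a real parse call.
	stub := func(sql string) error {
		if sql == "" {
			return errors.New("empty query")
		}
		return nil
	}

	avg, err := AverageParseTime(stub, "SELECT 1", 10000)
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	fmt.Println("average per parse:", avg)
}
```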


Where this is useful

Typical use cases:

  • CI SQL validation
  • Query lineage hints
  • Migration safety checks
  • Static query analysis before deploy
  • “What tables does this service touch?” automation
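
As one example of a migration safety check, the sketch below flags UPDATE and DELETE statements that carry no WHERE filter. The `QueryIR` struct is a hypothetical stand-in that loosely mirrors the `Command` and `Where` fields shown earlier; the assumption that the command is reported as "UPDATE" or "DELETE" is ours and should be checked against the library.

```go
package main

import "fmt"

// QueryIR is a hypothetical, trimmed-down stand-in for a parse result.
type QueryIR struct {
	Command string   // e.g. "SELECT"; "UPDATE" and "DELETE" are assumed values
	Tables  []string // table names referenced by the statement
	Where   []string // extracted filter expressions, empty if none
}

// CheckUnfilteredWrite returns an error when an UPDATE or DELETE
// has no WHERE filters, and nil for every other statement.
func CheckUnfilteredWrite(q QueryIR) error {
	switch q.Command {
	case "UPDATE", "DELETE":
		if len(q.Where) == 0 {
			return fmt.Errorf("%s on %v has no WHERE filter", q.Command, q.Tables)
		}
	}
	return nil
}

func main() {
	statements := []QueryIR{
		{Command: "DELETE", Tables: []string{"orders"}},
		{Command: "UPDATE", Tables: []string{"users"}, Where: []string{"id=1"}},
		{Command: "SELECT", Tables: []string{"users"}},
	}
	for _, q := range statements {
		if err := CheckUnfilteredWrite(q); err != nil {
			fmt.Println("blocked:", err)
		}
	}
}
```

A CI job could run a check like this over every statement in a migration file and fail the build on the first error.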

Open Source

We’ve been using this internally for months and decided to open source it.

If you break it with weird SQL, please open an issue; that’s how coverage improves.

👉 https://github.com/ValkDB/postgresparser