I publish open-source Go libraries.
Not many people use most of them, and I've spent a fair amount of time trying to figure out why. Some of it is distribution. Some of it is the unsexy truth that nobody needed the thing I built. But a real chunk of it — bigger than I want to admit — is that the API was designed for me, the author, and not for the developer arriving cold from a Google search at 2am with a deadline.
This post is three rules I now apply when designing a Go SDK. They come from publishing postgresparser — a pure-Go PostgreSQL parser — and watching where new users got stuck. The examples are from that library, but the rules aren't about parsers. They're about what the surface of a Go package should look like if you want strangers to use it.
I'll also flag one place I broke my own rule, because the post would be dishonest without it.
Rule 1: Expose answers, not nodes
The single biggest mistake I see in Go SDKs (and that I've made myself) is shipping the internal data model as the public API. The author has built an AST, or a state machine, or a config tree, and they think: "great, I'll let the caller walk it." The caller does not want to walk it. The caller wants an answer to a specific question.
Here's what "expose the nodes" looks like in a SQL parsing context:
// What other Go SQL parsers tend to give you
tree, _ := parser.Parse(sql)
for _, stmt := range tree.Statements {
    if sel, ok := stmt.(*ast.SelectStmt); ok {
        for _, from := range sel.From {
            if rv, ok := from.(*ast.RangeVar); ok {
                tables = append(tables, rv.Relname)
            }
            // ...also handle JoinExpr, Subquery, RangeFunction,
            // RangeTableSample, RangeTableFunc, CTERef...
        }
    }
}
The user came to your library to find out which tables a query touches. You handed them a tree-walking exercise and a list of node types they have to learn. Every caller of your library now has to write — and maintain — the same boilerplate, with the same bugs, in slightly different ways.
Compare:
// What postgresparser gives you
result, _ := postgresparser.ParseSQL(sql)
fmt.Println(result.Tables)
That's it. Two lines. CTEs, subqueries, set operations, joins — all flattened into the same field, with aliases preserved. The IR (the actual AST-equivalent) still exists internally, but it's not what the caller binds to.
The principle: for every question your SDK answers, there should be a single field or function whose name is the question. "Which tables?" → Tables. "Which columns are filtered?" → ExtractWhereConditions. "How is each column used?" → ColumnUsage. If a user has to traverse three levels of struct to get an answer, the answer wasn't really exposed.
The objection I hear: but what if the caller wants something custom that we didn't anticipate? Fine — keep the IR public for the 5% case. But default to answering the 95% case in one line, and only fall back to the IR when the typed accessor doesn't cover the question.
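As a sketch of that shape — typed answer fields for the 95% case, the IR left public as the escape hatch — here is a minimal illustration. The type and field names here are illustrative only, not postgresparser's actual API:

```go
package main

import "fmt"

// Table is a hypothetical answer type: name plus alias, already flattened.
type Table struct {
	Name  string
	Alias string
}

// Result exposes answers as named fields. IR stays public for the 5%
// of callers who genuinely need to walk the tree themselves.
type Result struct {
	Tables []Table     // answer to "which tables does this query touch?"
	IR     interface{} // escape hatch: the internal tree, for custom walks
}

// tableNames is the kind of one-liner callers get when the answer is a field.
func tableNames(r Result) []string {
	names := make([]string, 0, len(r.Tables))
	for _, t := range r.Tables {
		names = append(names, t.Name)
	}
	return names
}

func main() {
	r := Result{Tables: []Table{{Name: "orders", Alias: "o"}, {Name: "users"}}}
	fmt.Println(tableNames(r)) // [orders users]
}
```

The design point: the caller binds to `Tables`, so the internal tree can change shape without breaking anyone who only asked the common questions.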
Rule 2: Name the common case after the common case, and mark the variants
Most Go SDKs I see treat all of their entry points as peers. Parse, ParseStrict, ParseAll, ParseWithOptions, ParseFromReader — all listed in pkg.go.dev with the same visual weight, and the user has to read every one to figure out which they want.
This is the "tyranny of options" failure. The author thought of every variant; the user has to think about it too.
The fix is sequencing. Pick the version 80% of users want. Give that the short name. Make the other variants explicitly named after the thing that makes them different.
postgresparser's parsing entry points:
// 80% case — parses one statement, gives you a result.
result, _ := postgresparser.ParseSQL(sql)
// "I might pass multiple statements and want all of them."
batch, _ := postgresparser.ParseSQLAll(sql)
// "I want an error if more than one statement was passed."
result, _ := postgresparser.ParseSQLStrict(sql)
ParseSQL is the default. ParseSQLAll and ParseSQLStrict are explicitly named after the property that makes them different (handling all statements, strict-on-multi). A user reading the package docs sees ParseSQL first, tries it, and only goes looking for the variants if they hit a case it doesn't cover.
The wrong version of the same API:
// Don't do this
postgresparser.ParseSQL(sql, ParseOptions{Strict: true, AllStatements: false})
postgresparser.ParseSQL(sql, ParseOptions{Strict: false, AllStatements: true})
You've moved the decision from the function name (where it's documented and grep-able) to a config struct (where it's not). New users have to read the options struct just to call the function. Existing code has to be re-read every time someone wants to know what mode it's in.
The principle: the most common call should be the shortest call. Variants get names that describe how they differ. Config structs are for things that don't fit in a name, not for things that do.
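One clean way to implement this naming pattern is a single internal function with thin, explicitly named public wrappers. The sketch below uses a deliberately naive stand-in for the parser (it splits on semicolons, which breaks on quoted semicolons — a real parser would not do this); what it demonstrates is the wrapper structure, not postgresparser's implementation:

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// parseAll is a stand-in for the real parser. Splitting on ";" is
// naive and for illustration only.
func parseAll(sql string) ([]string, error) {
	var stmts []string
	for _, s := range strings.Split(sql, ";") {
		if s = strings.TrimSpace(s); s != "" {
			stmts = append(stmts, s)
		}
	}
	return stmts, nil
}

// ParseSQL is the 80% case and gets the shortest name: first statement only.
func ParseSQL(sql string) (string, error) {
	stmts, err := parseAll(sql)
	if err != nil {
		return "", err
	}
	if len(stmts) == 0 {
		return "", errors.New("no statement found")
	}
	return stmts[0], nil
}

// ParseSQLAll is named after what differs: it returns every statement.
func ParseSQLAll(sql string) ([]string, error) {
	return parseAll(sql)
}

// ParseSQLStrict is named after what differs: it errors on more than one.
func ParseSQLStrict(sql string) (string, error) {
	stmts, err := parseAll(sql)
	if err != nil {
		return "", err
	}
	if len(stmts) != 1 {
		return "", fmt.Errorf("expected 1 statement, got %d", len(stmts))
	}
	return stmts[0], nil
}

func main() {
	s, _ := ParseSQL("SELECT 1; SELECT 2")
	fmt.Println(s) // SELECT 1
	_, err := ParseSQLStrict("SELECT 1; SELECT 2")
	fmt.Println(err != nil) // true
}
```

Because every variant delegates to the same internal function, adding a new mode means adding a new name, not a new flag on an options struct.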
Rule 3: Return structured data, not strings the caller has to re-parse
This one I see less often in writing about SDK design, but it's the one that bites users hardest in practice.
If your SDK has done work to extract structured information from unstructured input, don't throw the structure away on the way out. Returning []string when you could have returned []struct{...} is a tax you charge every caller forever.
postgresparser extracts WHERE conditions. The naive return type would be:
// Bad: caller has to re-parse what you already parsed
conditions, _ := analysis.ExtractWhereConditions(sql)
// returns: ["status = 'active'", "total > 100"]
// Now every caller writes a regex. They get it wrong.
// They handle = and != but forget IS NULL. They miss BETWEEN.
// They re-introduce the bug your library was built to solve.
What it actually returns:
type Condition struct {
    Column   string
    Operator string
    Value    interface{}
}

conditions, _ := analysis.ExtractWhereConditions(
    "SELECT * FROM orders WHERE status = 'active' AND total > 100",
)
for _, c := range conditions {
    fmt.Printf("%s %s %v\n", c.Column, c.Operator, c.Value)
}
// status = active
// total > 100
Now the caller can ask c.Column == "tenant_id" directly. They can switch on c.Operator. They can type-assert c.Value. None of them have to write a regex, and none of them re-introduce parsing bugs at the boundary of your library.
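To make that concrete, here is a sketch of the kind of check a caller can write once the conditions are structured. The `Condition` type mirrors the shape shown above; the filtering logic is an example of caller code, not part of the library:

```go
package main

import "fmt"

// Condition mirrors the structured return shape shown above.
type Condition struct {
	Column   string
	Operator string
	Value    interface{}
}

// hasTenantFilter answers a real caller question with field access —
// no regex over SQL text, no operator edge cases to forget.
func hasTenantFilter(conds []Condition) bool {
	for _, c := range conds {
		if c.Column == "tenant_id" && c.Operator == "=" {
			return true
		}
	}
	return false
}

func main() {
	conds := []Condition{
		{Column: "status", Operator: "=", Value: "active"},
		{Column: "tenant_id", Operator: "=", Value: 42},
	}
	fmt.Println(hasTenantFilter(conds)) // true
}
```

This is the check that, done with string returns, would be a regex with IS NULL and BETWEEN bugs waiting in it.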
The principle: if the structure exists internally, expose the structure. Strings are for things that have no structure, or for things the user is going to print. Stringly-typed return values are how libraries become impossible to use correctly at scale.
The reverse also holds: if you find yourself writing a long regex inside a library you depend on, that library failed Rule 3.
Where I broke my own rule
In the spirit of not pretending I have all this figured out: postgresparser violates Rule 2 with ParseSQLWithOptions(sql, opts). It exists alongside ParseSQL(sql), takes a config struct with extraction flags like IncludeCreateTableFieldComments, and is exactly the "tyranny of options" pattern I just told you to avoid.
The honest reason it exists: comment extraction is expensive and most callers don't need it, but I didn't want to design a separate ParseSQLWithComments function because the option might evolve. So I shipped a WithOptions escape hatch and told myself it was fine. It's not fine — it's a slow leak that will get bigger as more options accrete. The right move would have been a separate named function for the one option that exists today, and a real opt-in API design when the second option arrives.
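For what the named-function alternative might have looked like, here is a hypothetical sketch. Neither signature below is postgresparser's real API today, and the parsing internals are stubs:

```go
package main

import "fmt"

// Result is a stand-in, not postgresparser's actual result type.
type Result struct {
	Tables   []string
	Comments []string
}

// parse is a stub standing in for the real work. withComments gates
// the expensive extraction step.
func parse(sql string, withComments bool) (*Result, error) {
	r := &Result{Tables: []string{"orders"}}
	if withComments {
		r.Comments = []string{"-- field comment"} // the costly step, opted into
	}
	return r, nil
}

// ParseSQL stays the cheap default.
func ParseSQL(sql string) (*Result, error) {
	return parse(sql, false)
}

// ParseSQLWithComments makes the expensive mode visible and grep-able
// at every call site, instead of hiding it in an options struct.
func ParseSQLWithComments(sql string) (*Result, error) {
	return parse(sql, true)
}

func main() {
	r, _ := ParseSQL("SELECT * FROM orders")
	fmt.Println(len(r.Comments)) // 0
}
```

The cost of this shape is a new exported name per option, which is exactly why it stops scaling once options multiply — at which point a deliberate opt-in API is the move, not a grab-bag struct.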
I'm flagging it so you can see what the wrong choice looks like even when the author knew the rule.
The point of including this isn't self-deprecation. It's that you will violate your own rules. The goal isn't a perfect API on day one; it's noticing the violation, naming it, and fixing it before the wrong shape hardens into a public contract you can't change.
TL;DR
If you can't remember three rules, remember the question they all answer: what does the user have to learn before they can use this library?
- Rule 1 says: don't make them learn your AST.
- Rule 2 says: don't make them learn your option matrix.
- Rule 3 says: don't make them re-parse what you already parsed.
Every line of documentation a user has to read before their first successful call is friction. Some of it is unavoidable. A lot of it isn't, and that's where the design work is.
postgresparser is on GitHub at github.com/ValkDB/postgresparser if you want to see what these rules look like applied (and, per the section above, where they aren't yet). Issues and PRs welcome — particularly the kind that point out a rule I missed.