Hong Minhee

Posted on Aug 27

Optique: Type-Safe CLI Parser Combinators

#cli #typescript #parser #javascript

Recently, I developed a somewhat experimental CLI parser library called Optique. It has already reached version 0.2.0, and I think the idea behind it is interesting enough to share in this article.

Optique was influenced by two very different libraries. One is Haskell's optparse-applicative library, which taught me that CLI parsers can be parser combinators, and that this approach is incredibly useful. The other is Zod, which TypeScript users are already familiar with. While I got the core idea from optparse-applicative, Haskell and TypeScript are such different languages that there are significant differences in how APIs are structured. So for API design, I referenced various validation libraries including Zod.

Optique expresses what a CLI should look like by assembling small parsers and parser combinators like LEGO blocks. For example, one of the smallest building blocks is option():

const parser = option("-a", "--allow", url());

To execute this parser, use the run() function, which implicitly reads process.argv.slice(2):

const allow: URL = run(parser);

While I've explicitly annotated the URL type in the code above, it's automatically inferred as URL even without the annotation. This parser only accepts the -a/--allow=URL option. It will error if given other options or arguments. It will also error if the -a/--allow=URL option is not provided.
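
To make the idea of "a parser as a value" more concrete, here is a toy model of how a building block like option() could work in principle. It is not Optique's implementation: the Result and Parser types and the toyOption and toyRun names are made up for this sketch, the arguments are passed in explicitly instead of being read from process.argv, and only the "-a VALUE" form is handled (not "--allow=VALUE").

```typescript
// Toy model only: a parser takes the remaining arguments and either fails
// or returns a value plus the arguments it did not consume.
type Result<T> =
  | { ok: true; value: T; rest: readonly string[] }
  | { ok: false; error: string };
type Parser<T> = (args: readonly string[]) => Result<T>;

// Hypothetical stand-in for option(): matches `NAME VALUE` at the front of
// the argument list and converts VALUE with `parse`.
function toyOption<T>(
  names: readonly string[],
  parse: (raw: string) => T,
): Parser<T> {
  return (args) => {
    const [name, raw, ...rest] = args;
    if (name === undefined || names.indexOf(name) === -1) {
      return { ok: false, error: `Expected one of ${names.join(", ")}.` };
    }
    if (raw === undefined) {
      return { ok: false, error: `${name} requires a value.` };
    }
    try {
      return { ok: true, value: parse(raw), rest };
    } catch {
      return { ok: false, error: `Invalid value for ${name}: ${raw}` };
    }
  };
}

// Hypothetical stand-in for run(): the whole input must be consumed.
function toyRun<T>(parser: Parser<T>, args: readonly string[]): T {
  const result = parser(args);
  if (!result.ok) throw new Error(result.error);
  if (result.rest.length > 0) {
    throw new Error(`Unexpected argument: ${result.rest[0]}`);
  }
  return result.value;
}

// The value parser decides the result type: here, URL.
const allowParser = toyOption(["-a", "--allow"], (raw) => new URL(raw));
const allow: URL = toyRun(allowParser, ["-a", "https://example.com/"]);
```

Because the result type is carried by the Parser<T> type parameter, the annotation on allow is redundant here as well, which mirrors the inference behavior described above.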

What if we want to make the -a/--allow=URL option optional instead of required? In that case, we wrap the option() parser with the optional() combinator.

const parser = optional(option("-a", "--allow", url()));

What type does this parser produce when executed?

const allow: URL | undefined = run(parser);

Yes, it becomes a URL | undefined type.

Alternatively, let's make it possible to accept multiple -a/--allow=URL options. We want to be able to write:

prog -a https://example.com/ -a https://hackers.pub/

To allow an option to be used multiple times like this, we use the multiple() combinator instead of the optional() combinator:

const parser = multiple(option("-a", "--allow", url()));

Can you start to predict what the result type will be?

const allowList: readonly URL[] = run(parser);

Yes, it becomes a readonly URL[] type.
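
The way optional() and multiple() change the result type can be illustrated with another toy model. Again, this is only a sketch under my own assumptions, not how Optique is implemented: the Parser type and the toyOption, toyOptional, and toyMultiple names are hypothetical, and values are kept as plain strings to keep the example short.

```typescript
// Toy model only, not Optique's code. A parser consumes a prefix of the
// arguments and returns a value plus the rest, or null on failure.
type Parser<T> = (
  args: readonly string[],
) => { value: T; rest: readonly string[] } | null;

// Hypothetical minimal option parser: matches `NAME VALUE` at the front.
const toyOption = (name: string): Parser<string> => (args) =>
  args.length >= 2 && args[0] === name
    ? { value: args[1], rest: args.slice(2) }
    : null;

// optional(): failure of the inner parser becomes success with undefined,
// so the result type widens from T to T | undefined.
function toyOptional<T>(inner: Parser<T>): Parser<T | undefined> {
  return (args) => inner(args) ?? { value: undefined, rest: args };
}

// multiple(): apply the inner parser until it stops matching, so the
// result type becomes readonly T[].
function toyMultiple<T>(inner: Parser<T>): Parser<readonly T[]> {
  return (args) => {
    const values: T[] = [];
    let rest = args;
    // Stop when the inner parser fails or makes no progress.
    for (
      let r = inner(rest);
      r !== null && r.rest.length < rest.length;
      r = inner(rest)
    ) {
      values.push(r.value);
      rest = r.rest;
    }
    return { value: values, rest };
  };
}

const maybeAllow = toyOptional(toyOption("-a"))([]);
const allowList = toyMultiple(toyOption("-a"))(["-a", "x", "-a", "y"]);
```

In both cases the wrapper only transforms the inner parser's type parameter, which is why the result types above follow mechanically from how the parser is assembled.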

Now, what if we want to add a mutually exclusive -d/--disallow=URL option that cannot be used together with -a/--allow=URL? Only one of these options should be usable at the same time. In this case, we use the or() combinator:

const parser = or(
  multiple(option("-a", "--allow", url())),
  multiple(option("-d", "--disallow", url())),
);

This parser correctly accepts commands like these:

prog -a https://example.com/ --allow    https://hackers.pub/
prog -d https://example.com/ --disallow https://hackers.pub/

But it will error when -a/--allow=URL and -d/--disallow=URL options are mixed:

prog -a https://example.com/ --disallow https://hackers.pub/

So what type does this parser's result have?

const result: readonly URL[] = run(parser);

Oh, since both parsers wrapped by the or() combinator produce readonly URL[] values, we get readonly URL[] | readonly URL[], which ultimately becomes just readonly URL[]. We'd like to change this to a proper discriminated union format. Something like this type would be ideal:

type Result =
  | { mode: "allowList"; allowList: readonly URL[] }
  | { mode: "blockList"; blockList: readonly URL[] };

When we want to create an object-shaped structure like this, we use the object() combinator:

const parser = or(
  object({
    mode: constant("allowList"),
    allowList: multiple(option("-a", "--allow", url())),
  }),
  object({
    mode: constant("blockList"),
    blockList: multiple(option("-d", "--disallow", url())),
  }),
);

We also used the constant() parser to add a discriminator. This parser is a bit special: it doesn't read anything and just produces the given value. In other words, it's a parser that always succeeds. While it's mainly used for constructing discriminated unions, it can be used in other creative ways.

Now this parser produces the result value with the type we want:

const result:
  | { readonly mode: "allowList"; readonly allowList: readonly URL[] }
  | { readonly mode: "blockList"; readonly blockList: readonly URL[] }
  = run(parser);
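
The payoff of the discriminator shows up where the result is consumed. The following sketch writes the result type out by hand so that it stands alone without the parser; the FilterConfig name and the isPermitted function are hypothetical, and comparing by origin is just one possible policy chosen for illustration.

```typescript
// Same shape as the result type above, written out by hand so that this
// snippet does not depend on the parser itself.
type FilterConfig =
  | { readonly mode: "allowList"; readonly allowList: readonly URL[] }
  | { readonly mode: "blockList"; readonly blockList: readonly URL[] };

// Hypothetical consumer: decides whether a URL may be fetched.
function isPermitted(config: FilterConfig, target: URL): boolean {
  switch (config.mode) {
    case "allowList":
      // Narrowed: `config.allowList` exists here, `config.blockList` does not.
      return config.allowList.some((u) => u.origin === target.origin);
    case "blockList":
      // Narrowed the other way: only `config.blockList` is accessible.
      return !config.blockList.some((u) => u.origin === target.origin);
  }
}
```

Checking config.mode is enough for TypeScript to narrow the union, so accessing config.blockList in the "allowList" branch would be a compile-time error.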

The or() and object() combinators aren't just for mutually exclusive options. Subcommands can be implemented using the same principle. Let me introduce the command() parser that matches a single command and the argument() parser that matches positional arguments:

const parser = command(
  "download",
  object({
    targetDirectory: optional(
      option(
        "-t", "--target",
        file({ metavar: "DIR", type: "directory" })
      )
    ),
    urls: multiple(argument(url())),
  })
);

This parser matches commands like:

prog download --target=out/ https://example.com/ https://example.net/

The parser's result type is:

const result: {
  readonly targetDirectory: string | undefined;
  readonly urls: readonly URL[];
} = run(parser);

How would we add an upload subcommand here? That's right, we connect them with the or() combinator:

const parser = or(
  command(
    "download",
    object({
      action: constant("download"),
      targetDirectory: optional(
        option(
          "-t", "--target",
          file({ metavar: "DIR", type: "directory" })
        )
      ),
      urls: multiple(argument(url())),
    })
  ),
  command(
    "upload",
    object({
      action: constant("upload"),
      url: option("-d", "--dest", "--destination", url()),
      files: multiple(
        argument(file({ metavar: "FILE", type: "file" })),
        { min: 1 },
      ),
    })
  ),
);

This parser can now accept commands like:

prog upload ./a.txt ./b.txt -d https://example.com/
prog download -t ./out/ https://example.com/ https://hackers.pub/

The parser's result type is:

const result:
  | {
      readonly action: "download";
      readonly targetDirectory: string | undefined;
      readonly urls: readonly URL[];
    }
  | {
      readonly action: "upload";
      readonly url: URL;
      readonly files: readonly string[];
    }
  = run(parser);
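
A result of this shape is typically dispatched on its action field. Here is a minimal sketch of such a dispatcher, with the result type written out by hand so the snippet stands alone; the CliCommand name, the summarize function, and the "." fallback for a missing target directory are my own assumptions for illustration.

```typescript
// Same shape as the result type above, written out by hand.
type CliCommand =
  | {
      readonly action: "download";
      readonly targetDirectory: string | undefined;
      readonly urls: readonly URL[];
    }
  | {
      readonly action: "upload";
      readonly url: URL;
      readonly files: readonly string[];
    };

// Hypothetical dispatcher: returns a description of what would be done.
function summarize(cmd: CliCommand): string {
  switch (cmd.action) {
    case "download": {
      // Assumption: fall back to the current directory when -t is omitted.
      const dir = cmd.targetDirectory ?? ".";
      return `download ${cmd.urls.length} URL(s) into ${dir}`;
    }
    case "upload":
      return `upload ${cmd.files.length} file(s) to ${cmd.url.href}`;
    default: {
      // Exhaustiveness check: adding a third subcommand to CliCommand
      // without handling it here makes this assignment a type error.
      const unreachable: never = cmd;
      throw new Error(`Unhandled command: ${JSON.stringify(unreachable)}`);
    }
  }
}
```

The never assignment in the default branch is a common TypeScript idiom: it turns a forgotten subcommand into a compile-time error instead of a runtime surprise.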

The same approach can be applied to implement nested subcommands, right?

So I've shown you how Optique expresses CLIs. What do you think? Does Optique's approach seem suitable for expressing complex CLIs?

Of course, Optique's approach isn't perfect. It does require more work to define very typical and simple CLIs. Also, since Optique only serves as a CLI parser, it doesn't provide many of the features that general CLI application frameworks offer. (Though I do plan to add more features to Optique in the future…)

If you're still interested in Optique's approach, please check out the introduction documentation and tutorial.
