Better argument parsing with Getopt::Long

Leon Timmermans

The problem

Raku has a built-in argument parser. This is a really good idea, given how common argument parsing is, but I still ended up writing my own. To explain why, I should first explain what the built-in parser does.

Raku's built-in parser is a blind parser. This means that it converts the input arguments into a Capture without any knowledge of the MAIN sub, and then tries to call MAIN with that capture. This has several implications.

The first is that the input syntax has to be context-free. It allows for -foo/--foo (meaning :foo), -/foo/--/foo (meaning :!foo) and -foo=bar/--foo=bar (meaning :foo(val("bar"))). It does not allow traditional Unix syntaxes such as -j2 or --jobs 2, as that would require knowing in advance that those two options take an argument.
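
To make this concrete, here is a minimal sketch of a script that relies only on the built-in parser. The file name jobs.raku, the --jobs option and the @targets parameter are made up for illustration.

```raku
# jobs.raku: hypothetical script using only Raku's built-in MAIN handling.
# The option and parameter names are illustrative assumptions.
sub MAIN(:$jobs = 1, *@targets) {
    say "jobs: $jobs";
    say "targets: @targets[]";
}

# Context-free syntax, understood by the built-in parser:
#   raku jobs.raku --jobs=2 build
#
# Traditional Unix syntax, not read the way the user intended:
#   raku jobs.raku --jobs 2 build
# The parser cannot know that --jobs takes a value, so it treats
# --jobs as a plain flag and 2 as just another positional argument.
```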

The second issue is that it fails in very confusing ways. To explain this I will give some examples using zef (chosen because of its ubiquity).

$ zef instal
Usage:
  zef [--force|--force-fetch] [--timeout|--fetch-timeout=<Int>] [--degree|--fetch-degree=<Int>] [--update=<Any>] fetch [<identities> ...] -- Download specific distributions
  zef [--force|--force-test] [--timeout|--test-timeout=<Int>] test [<paths> ...] -- Run tests
  zef [--force|--force-build] [--timeout|--build-timeout=<Int>] build [<paths> ...] -- Run Build.pm
  zef [--fetch] [--build] [--test] [--depends] [--test-depends] [--build-depends] [--force] [--force-resolve] [--force-fetch] [--force-extract] [--force-build] [--force-test] [--force-install] [--timeout=<Int>] [--fetch-timeout=<Int>] [--extract-timeout=<Int>] [--build-timeout=<Int>] [--test-timeout=<Int>] [--install-timeout=<Int>] [--degree=<Int>] [--fetch-degree=<Int>] [--test-degree=<Int>] [--dry] [--upgrade] [--deps-only] [--serial] [--contained] [--update=<Any>] [--exclude=<Any>] [--to|--install-to=<Any>] install [<wants> ...] -- Install
  zef [--from|--uninstall-from=<Any>] uninstall [<identities> ...] -- Uninstall
  zef [--wrap=<Int>] [--update=<Any>] search [<terms> ...] -- Get a list of possible distribution candidates for the given terms
  zef [--max=<Int>] [--update=<Any>] [-i|--installed] list [<at> ...] -- A list of available modules from enabled repositories
  zef [--fetch] [--build] [--test] [--depends] [--test-depends] [--build-depends] [--force] [--force-resolve] [--force-fetch] [--force-extract] [--force-build] [--force-test] [--force-install] [--timeout=<Int>] [--fetch-timeout=<Int>] [--extract-timeout=<Int>] [--build-timeout=<Int>] [--test-timeout=<Int>] [--install-timeout=<Int>] [--degree=<Int>] [--fetch-degree=<Int>] [--test-degree=<Int>] [--dry] [--update] [--serial] [--exclude=<Any>] [--to|--install-to=<Any>] upgrade [<identities> ...] -- Upgrade installed distributions (BETA)
  zef [--depends] [--test-depends] [--build-depends] depends <identity> -- View dependencies of a distribution
  zef [--depends] [--test-depends] [--build-depends] rdepends <identity> -- View direct reverse dependencies of a distribution
  zef [--sha1] locate <identity> -- Lookup locally installed distributions by short-name, name-path, or sha1 id
  zef [--update=<Any>] [--wrap=<Int>] info <identity> -- Detailed distribution information
  zef [--open] browse <identity> <url-type> -- Browse a distribution's available support urls (homepage, bugtracker, source)
  zef look <identity> -- Download a single module and change into its directory
  zef [--fetch] [--build] [--test] [--depends] [--test-depends] [--build-depends] [--force] [--force-resolve] [--force-fetch] [--force-extract] [--force-build] [--force-test] [--force-install] [--timeout=<Int>] [--fetch-timeout=<Int>] [--extract-timeout=<Int>] [--build-timeout=<Int>] [--test-timeout=<Int>] [--install-timeout=<Int>] [--degree=<Int>] [--fetch-degree=<Int>] [--test-degree=<Int>] [--update] [--upgrade] [--dry] [--serial] [--exclude=<Any>] [--to|--install-to=<Any>] smoke -- Smoke test
  zef update [<names> ...] -- Update package indexes
  zef [--confirm] nuke [<names> ...] -- Nuke module installations (site, home) and repositories from config (RootDir, StoreDir, TempDir)
  zef [--version] -- Detailed version information
  zef [-h|--help]

The problem here is a simple typo in the word "install", but nothing in this wall of output actually hints at that.

If it can match a subcommand, the error message is shorter, but still not terribly helpful.

$ zef install
Usage:
  zef [--fetch] [--build] [--test] [--depends] [--test-depends] [--build-depends] [--force] [--force-resolve] [--force-fetch] [--force-extract] [--force-build] [--force-test] [--force-install] [--timeout=<Int>] [--fetch-timeout=<Int>] [--extract-timeout=<Int>] [--build-timeout=<Int>] [--test-timeout=<Int>] [--install-timeout=<Int>] [--degree=<Int>] [--fetch-degree=<Int>] [--test-degree=<Int>] [--dry] [--upgrade] [--deps-only] [--serial] [--contained] [--update=<Any>] [--exclude=<Any>] [--to|--install-to=<Any>] install [<wants> ...] -- Install

Instead of telling us to give a module to install, it lists all the possible arguments for this subcommand (though to be fair, this one is largely zef's fault for making that first MAIN argument mandatory).

The same error message is given for zef install Foo --timeout=3.5 (because a Rat is not an Int) and zef install Foo --timeout 10 --timeout 10 (it passes a two value list instead of an Int to :$timeout).
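
Both failures come down to types. As a small sketch, this is what val (the routine the built-in parser applies to option values) makes of such strings; the variable names are only for illustration.

```raku
# val() turns a string into an allomorph when it looks like a number.
my $whole   = val("10");    # an IntStr: both an Int and a Str
my $decimal = val("3.5");   # a RatStr: both a Rat and a Str

say $whole   ~~ Int;   # True
say $decimal ~~ Int;   # False, so it cannot bind to an Int parameter
say $decimal ~~ Rat;   # True

# A repeated option arrives as a list, and a list is not an Int either.
say ($whole, $whole) ~~ Int;   # False
```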

These error messages are not helpful (and in some cases, neither is the fact that it is an error at all). The problem here is simple: Raku knows it can't dispatch the capture to any of the MAIN candidates, but it doesn't know why. Figuring out why requires exactly the sort of introspection that it tries so hard to avoid.

But the most confusing way the argument parsing fails has to be the way it handles enums. It will interpret any known enum literal in scope as an enum value instead of as a string, e.g.:

$ zef install True
Cannot resolve caller new(Zef::Identity:U: Bool:D); none of these signatures match:
    (Zef::Identity: Str :$name!, :ver(:$version), :$auth, :$api, :$from, *%_)
    (Zef::Identity: Str $id, *%_)

It's impossible to pass any of the strings True, False, Less, More, Same, a bunch of others, or any enum literal you've defined in your script, as a string to a Raku program that uses the built-in argument parsing, because they'll be converted into something else entirely.
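
A quick way to observe this yourself is a script that reports the type of whatever it receives. This is only a sketch; the file name show-type.raku is hypothetical.

```raku
# show-type.raku: hypothetical script using only the built-in parser.
sub MAIN($value) {
    # Report the type MAIN actually received for its positional argument.
    say "{$value.^name}: $value";
}
```

Given the behaviour described above, running it as raku show-type.raku hello should report a Str, while raku show-type.raku True would be expected to report a Bool rather than the string the user typed.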

The solution

So instead of using the built-in argument parser, I wrote my own argument parsing module: Getopt::Long. Unlike the built-in one, it is contextual. It will first look at the sub MAIN to know what arguments to expect, then parse based on that, and then call MAIN. This way, one can parse -j2 to :j(2), and --jobs 2 to :jobs(2). This results in a far more unixish interface than what is possible using the default parsing.
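
As a minimal sketch of the MAIN-based approach, assuming the module is installed: the script name build.raku and the option names are illustrative, not taken from any real program.

```raku
use Getopt::Long;   # assumes the module is installed

# build.raku: hypothetical script; the option names are illustrative.
# The signature is what tells the parser that --jobs takes an integer
# argument and that --verbose is a plain flag.
sub MAIN(Int :$jobs = 1, Bool :$verbose, *@targets) {
    say "jobs: $jobs";
    say "verbose: {so $verbose}";
    say "targets: @targets[]";
}
```

With this in place, an invocation such as raku build.raku --jobs 2 all should end up calling MAIN with :jobs(2), since the parser now knows that --jobs expects a value.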

But there's a second advantage: better error messages. It will try its hardest to give an informative error message. For example:

  • Unknown option --foo
  • Option --foo doesn't take arguments
  • Cannot convert --foo argument "10a" to number: trailing characters after number
  • Invalid Date '20-02-20' given as --foo argument; use yyyy-mm-dd instead

It tries very hard to detect any potential issue before dispatching is done, so that it can give an informative error message.

This should make it much easier for the user of a Raku program to figure out what they did wrong, while at the same time offering a much more standard interface to your users. It offers a series of options to make the external interface either more unixish (the default) or more like Raku's argument parsing (for backwards compatibility).

Getopt::Long offers both a functional interface (much like the Perl module that inspired it), and a MAIN wrapper that can function as a drop-in replacement for the existing argument parser: all you have to do to benefit from this is add a use Getopt::Long; to your script!
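
For the functional interface, a rough sketch might look like the following. It follows the module's synopsis as I understand it, so treat the exact specification strings as an assumption and check the module's documentation; the option names are made up.

```raku
use Getopt::Long;   # assumes the module is installed

# Each pair maps an option specification to the variable that receives
# the value. As in the Perl module, "=i" asks for an integer argument,
# "=s" for a string, and no suffix means a plain flag.
get-options("jobs=i"  => my $jobs,
            "file=s"  => my $file,
            "verbose" => my $verbose);

say "jobs: {$jobs // 'unset'}";
say "file: {$file // 'unset'}";
say "verbose: {so $verbose}";
```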
