Welcome to the second part of my rolling blog, where I start building CLIAR from the ground up, beginning with the simplest possible parser, and see where it breaks.
You can find the first part here and current snapshots of the code on my GitHub page.
Getting started with Short Options
To keep the interface simple, we start with a factory method:
public static Cliar from(String[] args) { ... }
With that in place, we can begin parsing the simplest possible arguments: single-character options.
for (String arg : args) {
    // The length check keeps a lone "-" away from charAt(1), which would throw
    if (arg.length() > 1 && arg.startsWith("-") && !arg.startsWith("--")) {
        char option = arg.charAt(1);
        // ...
    }
}
This allows basic usage like:
myApp -a
myApp -a -b -c
However, passing each option separately is verbose. A common convention is to group them:
myApp -abc
This requires only a small change:
for (String arg : args) {
    if (arg.startsWith("-") && !arg.startsWith("--")) {
        for (char chr : arg.substring(1).toCharArray()) {
            // ...
        }
    }
}
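To make the grouped loop concrete, here is a minimal sketch that stores every short option it sees in a set. The class name `ShortOptionSketch` and the method `shortOptions` are hypothetical names chosen for this illustration and are not part of CLIAR.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper for illustration only, not CLIAR's actual code.
class ShortOptionSketch {

    // Collects every short option in args, expanding groups such as -abc.
    static Set<Character> shortOptions(String[] args) {
        Set<Character> options = new LinkedHashSet<>();
        for (String arg : args) {
            if (arg.startsWith("-") && !arg.startsWith("--")) {
                // substring(1) drops the leading hyphen; a lone "-" yields no characters
                for (char chr : arg.substring(1).toCharArray()) {
                    options.add(chr);
                }
            }
        }
        return options;
    }

    public static void main(String[] args) {
        // Equivalent to: myApp -a -bc
        System.out.println(shortOptions(new String[] {"-a", "-bc"})); // prints [a, b, c]
    }
}
```

Because a set is used, repeating an option (`-a -a`) has no additional effect. Whether repeated flags should count for something is a design decision this sketch does not try to answer.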
Long Options for Clarity
An option character should reflect its meaning, like -a for all or -v for verbose. But what if you have options that start with the same letter, like cut and copy?
For reasons like this—and for better readability—we introduce long options. To distinguish them from short options, they start with a double hyphen --:
for (String arg : args) {
    if (arg.startsWith("-") && !arg.startsWith("--")) {
        // ...
    } else if (arg.startsWith("--") && (2 < arg.length())) {
        arg = arg.substring(2);
        // ...
    }
}
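However, a long option also starts with -. In the loop above, the first branch only skips it because of the explicit !arg.startsWith("--") guard; without that guard, the second branch would never be reached.
To avoid depending on that guard alone, we check for long options first: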
for (String arg : args) {
    if (arg.startsWith("--") && (2 < arg.length())) {
        arg = arg.substring(2);
        // ...
    } else if (arg.startsWith("-") && !arg.startsWith("--")) {
        // ...
    }
}
Even something as simple as checking prefixes already introduces subtle edge cases.
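The effect of branch order is easiest to see with a short-option test that has no `--` guard at all. The sketch below compares such a naive variant with the long-first order. Both methods and the class name `BranchOrderSketch` are hypothetical and exist only for this comparison.

```java
// Hypothetical comparison for illustration only, not CLIAR's actual code.
class BranchOrderSketch {

    // Naive order: the short-option test has no "--" guard and runs first,
    // so it also matches long options and the second branch is never reached.
    static String naive(String arg) {
        if (arg.startsWith("-")) {
            return "short";
        } else if (arg.startsWith("--") && arg.length() > 2) {
            return "long";
        }
        return "positional";
    }

    // Long options are tested first; the guard on the short branch stays in place.
    static String longFirst(String arg) {
        if (arg.startsWith("--") && arg.length() > 2) {
            return "long";
        } else if (arg.startsWith("-") && !arg.startsWith("--")) {
            return "short";
        }
        return "positional";
    }

    public static void main(String[] args) {
        System.out.println(naive("--verbose"));     // prints "short"
        System.out.println(longFirst("--verbose")); // prints "long"
        System.out.println(longFirst("-v"));        // prints "short"
    }
}
```

With the long-first order, the guard on the short branch is still needed for a bare `--`, which fails the length check and would otherwise be treated as a short option.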
Not everything is optional
Options, as the name implies, do not have to be passed to a program—but certain information must be, and it may not follow a format we can easily validate.
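These required arguments are called positional arguments: they are not prefixed, and their meaning is determined by the order in which they are passed to the program. Anything that is not an option therefore ends up in a final else branch: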
for (String arg : args) {
    if (arg.startsWith("--") && (2 < arg.length())) {
        arg = arg.substring(2);
        // ...
    } else if (arg.startsWith("-") && !arg.startsWith("--")) {
        // ...
    } else {
        // TODO: process positional arguments
    }
}
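One possible way to fill in the TODO is to collect positional arguments in a list so that their order is preserved. This is only a sketch: `PositionalSketch`, `positionals`, and the file names in the example are assumptions made for illustration, and the option branches are deliberately left empty.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only, not CLIAR's actual code.
class PositionalSketch {

    // Returns the arguments that are neither long nor short options,
    // in the order in which they were passed.
    static List<String> positionals(String[] args) {
        List<String> result = new ArrayList<>();
        for (String arg : args) {
            if (arg.startsWith("--") && (2 < arg.length())) {
                // long option: ignored in this sketch
            } else if (arg.startsWith("-") && !arg.startsWith("--")) {
                // short option(s): ignored in this sketch
            } else {
                result.add(arg);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Equivalent to: myApp -v input.txt --force output.txt
        List<String> files = positionals(new String[] {"-v", "input.txt", "--force", "output.txt"});
        System.out.println(files); // prints [input.txt, output.txt]
    }
}
```

Two side effects of the prefix checks are worth knowing: a bare `--` fails both option tests and is collected as a positional argument, while a lone `-` matches the short-option branch as an empty group.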
Parameterized options
So far, all options are simple flags—they are either present or not. But what if an option needs a value?
For example, we might want to specify a color:
myApp --color red
With our current approach, red would be treated as a positional argument, introducing ambiguity: is it a value for --color, or just the next positional parameter?
One way to solve this is to enforce ordering rules—but that would make the interface less flexible.
Instead, we can make the relationship explicit:
myApp --color=red
By using =, we clearly separate option and value, and parsing becomes straightforward: everything before = is the option, everything after is the value.
This works well for long options—but what about short options? For single-character options it might work, but short options can be grouped (-abc), making it unclear which option a value belongs to.
To keep the parser simple, we introduce a rule:
Only long options can have values.
This keeps the implementation predictable while still supporting common use cases.
for (String arg : args) {
    if (arg.startsWith("--") && (2 < arg.length())) {
        int index = arg.indexOf("=");
        if (-1 != index) {
            // Start at 2 to skip the leading "--", so the key is "color" rather than "--color"
            String key = arg.substring(2, index);
            String val = arg.substring(index + 1);
            // ...
        } else {
            // Plain flag without a value
            arg = arg.substring(2);
            // ...
        }
    } else if (arg.startsWith("-") && !arg.startsWith("--")) {
        // ...
    } else {
        // TODO: process positional arguments
    }
}
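As a rough sketch of where the key and value could go, the following example stores long options in a map. The class `LongOptionValueSketch`, the method `longOptions`, and the choice of `null` as the value of a plain flag are all assumptions made for this illustration, not decisions taken by CLIAR.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only, not CLIAR's actual code.
class LongOptionValueSketch {

    // Maps each long option name to its value; plain flags map to null.
    static Map<String, String> longOptions(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        for (String arg : args) {
            if (arg.startsWith("--") && arg.length() > 2) {
                int index = arg.indexOf('=');
                if (index != -1) {
                    // Split at the first '=' so the value itself may contain '='
                    options.put(arg.substring(2, index), arg.substring(index + 1));
                } else {
                    options.put(arg.substring(2), null);
                }
            }
        }
        return options;
    }

    public static void main(String[] args) {
        // Equivalent to: myApp --color=red --verbose
        System.out.println(longOptions(new String[] {"--color=red", "--verbose"}));
        // prints {color=red, verbose=null}
    }
}
```

Splitting at the first `=` means `--expr=a=b` yields the key `expr` and the value `a=b`. The sketch does not reject an empty key such as `--=red`, so some validation of option names is still needed.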
All Arguments Are Equal, but Some Are More Equal
Some tools treat - as a special positional argument representing standard input, and -- as a marker for the end of options.
We will ignore these cases for now—they will return once our parser is more complete.
Anything Goes? Not Quite.
So far, we have defined how command-line arguments are parsed, but not what valid arguments actually look like.
As a result, arguments such as:
---
-?*/
--n00p
-3
--color=(&3
are all accepted as options by our parser, but none of them is sensible.
A common remedy is to restrict option names to letters only, while keeping values more flexible.
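A letters-only rule for option names could be checked with a regular expression, as in the minimal sketch below. The class `OptionNameSketch`, the method `isValidName`, and the restriction to ASCII letters are assumptions made for this example; the actual rules are still open at this point.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only, not CLIAR's actual code.
class OptionNameSketch {

    // One or more ASCII letters; digits, symbols, and empty names are rejected.
    private static final Pattern LETTERS_ONLY = Pattern.compile("[A-Za-z]+");

    static boolean isValidName(String name) {
        return LETTERS_ONLY.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidName("color")); // prints true
        System.out.println(isValidName("n00p"));  // prints false
        System.out.println(isValidName("-"));     // prints false
        System.out.println(isValidName("3"));     // prints false
    }
}
```

Applied to the examples above, `---` (name `-`), `-?*/`, `--n00p`, and `-3` would all be rejected, while `--color=(&3` would pass because only the name `color` is checked and the value stays unrestricted.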
So far, all parsing logic is handled inside a single loop—and it continues to accumulate responsibilities.
As the number of cases grows, this approach becomes harder to maintain and extend.
In the next part, we will start to separate concerns and delegate specific tasks—such as parsing short options, long options, and values—to dedicated methods.
This article was written with the help of an LLM for structuring and wording. All technical content reflects my own understanding and decisions.