clig.dev is a set of guidelines for building command-line interfaces that are consistent, predictable, and human-friendly. Most teams treat these as aspirational — a checklist someone reviews once during a code review.
We automated it. A test walks our entire command tree and verifies every guideline at CI time. If someone adds a new command without a Long description, without examples, without SilenceUsage, or without exit code documentation, the build fails.
Here's how each guideline maps to concrete Go code, and the test that enforces it.
The Compliance Test
func TestCligCompliance(t *testing.T) {
root := getRootCmd()
// Walk the full command tree (including nested subcommands).
var commands []*cobra.Command
var walk func(cmd *cobra.Command)
walk = func(cmd *cobra.Command) {
commands = append(commands, cmd)
for _, child := range cmd.Commands() {
walk(child)
}
}
for _, child := range root.Commands() {
walk(child)
}
for _, cmd := range commands {
name := cmd.CommandPath()
if cmd.RunE == nil && cmd.Run == nil {
continue // Skip group headers
}
t.Run(name, func(t *testing.T) {
t.Run("has_long_description", func(t *testing.T) { ... })
t.Run("long_starts_with_verb", func(t *testing.T) { ... })
t.Run("documents_exit_codes", func(t *testing.T) { ... })
t.Run("has_examples", func(t *testing.T) { ... })
t.Run("silence_usage_set", func(t *testing.T) { ... })
t.Run("silence_errors_set", func(t *testing.T) { ... })
t.Run("format_flag_if_data_command", func(t *testing.T) { ... })
})
}
}
Seven checks per command. The test discovers commands dynamically — no manual list to maintain. When someone adds stave inspect newcommand, the test automatically verifies it meets all seven criteria.
Run it:
make clig-check
# or: go test ./cmd/ -run "TestCligCompliance|TestCligGlobalFlags" -count=1
Let's look at each guideline and how it's implemented.
1. Help Text: Long Description Starting With a Verb
clig.dev says: "Provide a detailed help text for every command. Start with a verb phrase that describes what the command does."
The Test
t.Run("long_starts_with_verb", func(t *testing.T) {
long := strings.TrimSpace(cmd.Long)
if long == "" {
t.Skip("no Long description")
}
first := strings.SplitN(long, " ", 2)[0]
if first != "" && first[0] >= 'a' && first[0] <= 'z' {
t.Errorf("%s: Long description should start with a capitalized verb, got %q", name, first)
}
})
The Implementation
Every command follows a template:
Long: `Gate applies a CI failure policy and returns exit code 3 when the policy fails.
Supported policies:
- fail_on_any_violation
- fail_on_new_violation
- fail_on_overdue_upcoming
Inputs:
--policy CI failure policy mode (default: from project config)
--in Path to evaluation JSON
--format, -f Output format: text or json (default: text)
Outputs:
stdout Gate result summary (text or JSON)
stderr Error messages (if any)
Exit Codes:
0 - Policy passed; no violations detected
2 - Invalid input or configuration error
3 - Policy failed; violations detected
130 - Interrupted (SIGINT)`,
The template has four sections: what it does (verb phrase), inputs (flags), outputs (stdout/stderr), and exit codes. This structure is enforced by the test — the "documents exit codes" check verifies the word "exit" appears in the Long text.
2. Exit Codes: Semantic, Not Binary
clig.dev says: "Use exit codes to indicate what happened. Don't just use 0 and 1."
The Implementation
const (
ExitSuccess = 0 // No issues
ExitSecurity = 1 // Security-audit gating failure
ExitInputError = 2 // Invalid input, flags, or schema validation
ExitViolations = 3 // Evaluation completed — findings detected
ExitInternal = 4 // Unexpected internal error (bug)
ExitInterrupted = 130 // Interrupted by SIGINT (Ctrl+C)
)
Exit code 3 is the critical design decision. The tool succeeded — it ran correctly and found violations. That's not an error (exit 1 or 2). It's a policy signal. CI pipelines can branch:
stave apply --controls controls --observations observations
case $? in
0) echo "Clean" ;;
2) echo "Bad input" ; exit 1 ;;
3) echo "Violations found" ; exit 1 ;;
4) echo "Bug — report it" ; exit 1 ;;
esac
The Test
t.Run("documents_exit_codes", func(t *testing.T) {
if cmd.RunE == nil {
t.Skip("group command")
}
if !strings.Contains(strings.ToLower(cmd.Long), "exit") {
t.Errorf("%s: Long description should document exit codes", name)
}
})
Every leaf command must mention "exit" in its Long description. The test doesn't check the specific codes — just that exit code documentation exists.
3. --format on Data Commands
clig.dev says: "If your command produces structured data, offer --format for machine-readable output."
The Test
t.Run("format_flag_if_data_command", func(t *testing.T) {
if !isDataCommand(cmd) {
t.Skip("not a data command")
}
f := cmd.Flags().Lookup("format")
if f == nil {
f = cmd.InheritedFlags().Lookup("format")
}
if f == nil {
t.Errorf("%s: data-producing command lacks --format flag", name)
}
})
isDataCommand maintains an explicit list of commands that produce multi-format output:
func isDataCommand(cmd *cobra.Command) bool {
multiFormatCommands := map[string]bool{
"stave apply": true,
"stave diagnose": true,
"stave validate": true,
"stave report": true,
"stave security-audit": true,
"stave ci gate": true,
"stave snapshot diff": true,
"stave snapshot quality": true,
"stave snapshot upcoming": true,
"stave controls list": true,
}
return multiFormatCommands[cmd.CommandPath()]
}
Adding a new data command to this map and forgetting the --format flag fails the test.
4. SilenceUsage and SilenceErrors
clig.dev says: "Don't dump usage information on every error. It's noise."
The Problem
By default, Cobra prints the full usage text whenever a command returns an error. A missing file produces 50 lines of flag documentation followed by a one-line error message. The user has to scroll up to find the actual problem.
The Implementation
Every leaf command sets:
cmd := &cobra.Command{
Use: "gate",
SilenceUsage: true,
SilenceErrors: true,
RunE: func(cmd *cobra.Command, _ []string) error {
// ...
},
}
-
SilenceUsage: true— Cobra doesn't print usage on error. The error message alone is shown. -
SilenceErrors: true— Cobra doesn't print the error. The executor handles error rendering with structuredErrorInfo(code, title, message, action, URL).
The Test
t.Run("silence_usage_set", func(t *testing.T) {
if cmd.RunE == nil {
t.Skip("no RunE")
}
if !cmd.SilenceUsage {
t.Errorf("%s: SilenceUsage should be true", name)
}
})
t.Run("silence_errors_set", func(t *testing.T) {
if cmd.RunE == nil {
t.Skip("no RunE")
}
if !cmd.SilenceErrors {
t.Errorf("%s: SilenceErrors should be true", name)
}
})
5. --version, --quiet, --no-color, --verbose
clig.dev says: "Provide global flags for version, quiet mode, verbose mode, and color control."
The Test
func TestCligGlobalFlags(t *testing.T) {
root := getRootCmd()
if root.Version == "" {
t.Error("root command has empty Version")
}
requiredGlobals := []struct {
name string
why string
}{
{"quiet", "CLIG: --quiet suppresses non-essential output"},
{"verbose", "CLIG: -v enables verbose/debug output"},
{"no-color", "CLIG: --no-color disables ANSI output"},
}
for _, rf := range requiredGlobals {
t.Run(rf.name, func(t *testing.T) {
f := root.PersistentFlags().Lookup(rf.name)
if f == nil {
t.Errorf("root command missing --%s flag (%s)", rf.name, rf.why)
}
})
}
}
All four are persistent flags on the root command — available to every subcommand.
6. NO_COLOR and TTY Detection
clig.dev says: "Respect the NO_COLOR environment variable. Don't output ANSI codes when stdout is not a terminal."
The Implementation
A five-level cascade determines whether to use color:
func (r *Runtime) CanColor(out io.Writer) bool {
// 1. Explicit --no-color flag
if r != nil && r.NoColor {
return false
}
// 2. NO_COLOR environment variable (https://no-color.org)
if _, noColor := os.LookupEnv("NO_COLOR"); noColor {
return false
}
// 3. TERM=dumb
if strings.EqualFold(os.Getenv("TERM"), "dumb") {
return false
}
// 4. Test override
if r != nil && r.IsTTY != nil {
return *r.IsTTY
}
// 5. Actual TTY detection (cached per file descriptor)
return detectTTY(out)
}
Priority order: explicit flag > environment > terminal type > test override > runtime detection. The result is cached per file descriptor via sync.Map — checking the same os.Stdout twice doesn't make two syscalls.
7. stderr for Diagnostics, stdout for Data
clig.dev says: "Write data to stdout. Write messages, logs, and progress to stderr."
The Implementation
Progress spinners write to stderr:
func (r *Runtime) BeginProgress(label string) func() {
errOut := r.stderr() // Always stderr, never stdout
if !r.isTerminal(errOut) {
fmt.Fprintf(errOut, "Running: %s...\n", label)
// ...
}
// Spinner frames go to errOut
}
Hints and next-step suggestions write to stderr:
func (r *Runtime) PrintNextSteps(steps ...string) {
if r.Quiet || len(steps) == 0 {
return
}
fmt.Fprintln(r.stderr(), "\nNext steps:")
for _, s := range steps {
fmt.Fprintf(r.stderr(), " %s\n", s)
}
}
Data (findings, reports, JSON) writes to stdout:
RunE: func(cmd *cobra.Command, _ []string) error {
result, err := evaluate(...)
return jsonutil.WriteIndented(cmd.OutOrStdout(), result)
}
This means stave apply --format json | jq .findings works — progress messages and hints go to stderr and don't contaminate the JSON on stdout.
8. "Did You Mean?" Suggestions
clig.dev says: "If the user makes a typo, suggest the closest valid command or flag value."
The Implementation
For unknown commands:
func SuggestCommandError(err error, commandNames []string) error {
unknown := extractUnknownCommand(err.Error())
if unknown == "" {
return err
}
suggestion := ClosestToken(unknown, commandNames)
if suggestion == "" || suggestion == unknown {
return err
}
return fmt.Errorf("unknown command %q\nDid you mean %q?", unknown, suggestion)
}
For invalid --format values:
func ResolveFormatValue(raw string, valid []string) (OutputFormat, error) {
normalized := NormalizeToken(raw)
for _, v := range valid {
if normalized == v {
return OutputFormat(v), nil
}
}
if suggestion := ClosestToken(normalized, valid); suggestion != "" {
return "", fmt.Errorf("invalid --format %q\nDid you mean %q?", raw, suggestion)
}
return "", fmt.Errorf("invalid --format %q (use %s)", raw, enumList(valid))
}
The ClosestToken function uses Levenshtein edit distance — if the input is within 2 edits of a valid value, it's suggested. stave apply --format jso → Did you mean "json"?
9. Graceful SIGINT Handling
clig.dev says: "Handle Ctrl+C gracefully. Clean up resources. Use exit code 130."
The Implementation
func (a *App) installInterruptHandler() func() {
sigCh := make(chan os.Signal, 1)
signal.Notify(sigCh, os.Interrupt, syscall.SIGTERM)
go func() {
select {
case <-sigCh:
fmt.Fprintln(os.Stderr, "Interrupted")
if a.cancel != nil {
a.cancel() // Cancel context, operations unwind
}
case <-done:
return
}
}()
// ...
}
Context cancellation, not os.Exit. Deferred cleanup runs. Exit code 130 (128 + SIGINT signal number 2) — the Unix convention that CI tools understand.
10. Examples: Realistic, Copy-Pasteable
clig.dev says: "Show examples that work. Use realistic values, not 'foo' and 'bar'."
The Implementation
Example: ` # Standard evaluation
stave apply --controls controls --observations observations --max-unsafe 168h
# JSON output for automation
stave apply --format json > evaluation.json
# Quiet mode with exit code only
stave apply --quiet --max-unsafe 7d
# Deterministic evaluation for CI
stave apply --now 2026-01-11T00:00:00Z --format json`,
Real flag values (168h, 7d), real paths (controls, observations), real workflows (pipe to file, CI with --now). A user can copy-paste any example and it works with the test data that ships with the tool.
The Test
t.Run("has_examples", func(t *testing.T) {
if cmd.RunE == nil {
t.Skip("group command")
}
if strings.TrimSpace(cmd.Example) == "" {
t.Errorf("%s: missing Example (CLIG: show realistic usage)", name)
}
})
The Full Checklist
| clig.dev Guideline | Implementation | Enforced By |
|---|---|---|
| Detailed help text |
Long: field with verb-phrase + template |
TestCligCompliance/has_long_description |
| Help starts with verb | Capitalized verb phrase | TestCligCompliance/long_starts_with_verb |
| Document exit codes | Exit codes section in Long text | TestCligCompliance/documents_exit_codes |
| Realistic examples |
Example: field with copy-pasteable commands |
TestCligCompliance/has_examples |
| Don't dump usage on error | SilenceUsage: true |
TestCligCompliance/silence_usage_set |
| Custom error rendering |
SilenceErrors: true + ErrorInfo |
TestCligCompliance/silence_errors_set |
| --format on data commands |
-f flag on multi-format commands |
TestCligCompliance/format_flag_if_data_command |
| --version | Root command Version field |
TestCligGlobalFlags |
| --quiet | Persistent flag, resolves io.Writer | TestCligGlobalFlags |
| --verbose | Persistent flag, controls slog level | TestCligGlobalFlags |
| --no-color | Persistent flag + NO_COLOR env + TTY detection | TestCligGlobalFlags |
| stderr for diagnostics | Progress, hints, errors → stderr | Code review (not automated) |
| "Did you mean?" | Edit distance suggestions for commands and flags | Code review (not automated) |
| SIGINT handling | Context cancellation, exit 130 | Code review (not automated) |
10 guidelines enforced by automated tests. 4 enforced by code review. The test runs in CI on every commit — a new command without proper help text, examples, or exit code documentation breaks the build.
The TestCligCompliance test walks the entire command tree of Stave, a Go CLI with 30+ commands. Adding a new command automatically triggers 7 compliance checks. make clig-check runs in under a second.
Top comments (0)