Dean Hamstead

Manage the health of your CLI tools at scale

Your services have dashboards, tracing, and alerting. Your CLI tools print to STDOUT and exit. When something breaks, debugging starts at the API gateway -- everything upstream is a black box. This makes no sense.

If your CLI talks to an API, it's part of the request path. Instrument it like any other participant.

This post describes how we instrumented an internal Perl CLI -- the same mycli tool from our earlier post on fatpacking -- with syslog logging, StatsD metrics, and correlation IDs. The post is strongly biased towards tooling internal to an organisation, which has the luxury of being opinionated: you control the deployment targets, you know where syslog goes, and you can lean on solved infrastructure rather than building your own. The principles generalise to any language and any CLI that talks to an API.

Why observability matters in CLI tools

Web services get dashboards as a matter of course[1]. Error rates, latency percentiles, request counts -- these are table stakes for any production service. CLI tools rarely get the same treatment, even when they're used just as heavily.

Once your CLI emits metrics, you can build per-tool dashboards that show error rates broken down by command, by user cohort, by API version, by CLI version, by deployment target. This is the same dimensional analysis you'd do for a web service, applied to a tool that runs on someone's laptop.

This integrates naturally with operational practices you're probably already using:

  • Continuous deployment. When you ship a new CLI version, the dashboard shows whether error rates changed. If command.device_list.errors spikes after a release, you know immediately -- not when someone files a ticket three days later.
  • Rollback decisions. If error rates climb after a release, the dashboard tells you in minutes -- roll back now, debug later. Without metrics, you're guessing whether the new version is the cause or a coincidence.
  • Canary deployments. Roll the new version to 10% of jumpboxes. Compare http.timing and http.errors between the canary and the stable cohort. The same deployment strategy that works for services works for CLI tools, but only if you have the metrics to compare.
  • Feature flags. If a new feature is gated behind a flag, metrics tell you whether the flagged code path is slower, more error-prone, or unused. Without instrumentation, feature flag decisions are based on "nobody complained".
  • Incident management. During a site event, the CLI dashboard shows whether the tool is contributing to or affected by the problem. A spike in http.status.503 from the CLI tells the incident commander that the API is rejecting requests before users report it. Conversely, if the CLI error rate is flat during an incident, you can rule it out as a contributing factor.
  • Adoption and deprecation. Metrics answer "is anyone still using the v1 endpoint?" and "has the team migrated to the new auth flow?" without surveys or guesswork.

The point is not that CLI tools are special -- it's that they're not. They're participants in the same distributed system as your services, and they deserve the same observability treatment. The investment is small: a correlation ID, a handful of counters, and a logging lifecycle. The return is that your CLI becomes a first-class citizen in your operational tooling rather than a blind spot.

[1] Yours does, right?

The three layers

We instrument at three levels, each serving a different audience and persistence model:

| Layer | Audience | Persistence |
| --- | --- | --- |
| Verbose mode | Developer at terminal | Ephemeral (STDERR) |
| Syslog | Ops / incident review | Durable (centralised logs) |
| StatsD | Dashboards / alerting | Aggregated (time-series) |

A developer debugging their own command uses --verbose. An on-call engineer investigating a reported issue searches syslog by invocation ID. A platform team monitors command usage and error rates on dashboards. Same underlying data, different consumers, different retention.

Each layer is controlled independently and opt-in:

```shell
# Syslog only
MYCLI_LOG=1 mycli device list

# Verbose only (no syslog, no metrics)
mycli device list --verbose

# Everything
MYCLI_LOG=1 mycli device list --verbose
```

StatsD metrics are emitted whenever a statsd_host is configured, and are no-ops otherwise. Syslog requires MYCLI_LOG=1 -- deliberately opt-in, since CLI tools run on personal machines, and writing to syslog on every invocation without consent would be surprising.
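The no-op behaviour costs almost nothing to build. Here is a minimal sketch of the pattern, assuming a plain UDP StatsD collector; the class name, constructor keys, and prefix are illustrative, not our actual module:

```perl
use strict;
use warnings;

package MyCLI::Metrics;    # hypothetical name
use IO::Socket::INET;

sub new {
    my ($class, %args) = @_;
    my $self = bless { prefix => $args{prefix} // 'mycli' }, $class;
    # Only open a socket when a host is configured; otherwise every
    # call below falls through to a cheap no-op.
    if ($args{host}) {
        $self->{sock} = IO::Socket::INET->new(
            Proto    => 'udp',
            PeerAddr => $args{host},
            PeerPort => $args{port} // 8125,
        );
    }
    return $self;
}

# StatsD line protocol: counters |c, timers |ms, gauges |g
sub increment { my ($self, $name) = @_;      $self->_send("$name:1|c") }
sub timing    { my ($self, $name, $ms) = @_; $self->_send("$name:$ms|ms") }
sub gauge     { my ($self, $name, $v) = @_;  $self->_send("$name:$v|g") }

sub _send {
    my ($self, $payload) = @_;
    return unless $self->{sock};    # no host configured: no-op
    $self->{sock}->send("$self->{prefix}.$payload");
    return;
}

1;
```

Because the guard lives in one private method, callers never need to check whether metrics are enabled.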

The verbose layer itself has depth. --verbose shows the shape of the HTTP conversation -- method, URL, status, timing -- but deliberately omits headers and bodies to keep the output scannable. When that isn't enough, plugging in LWP::ConsoleLogger::Everywhere via perl -M gives a full HTTP trace without the CLI needing to build one. More on this in the debugging spectrum section below.

Invocation ID: the correlation key

Every mycli invocation generates a random 8-character hex ID at startup:

```perl
my @chars = ('0' .. '9', 'a' .. 'f');
my $id = join '', map { $chars[ int(rand @chars) ] } 1 .. 8;
```

This ID appears in three places:

  1. Every syslog message -- prefixed as [f7a3b1c2]
  2. Every HTTP request -- sent as the X-Invocation-Id header
  3. Verbose STDERR output -- printed at startup
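Attaching the header is a one-time setup on the HTTP client. Our tool uses LWP, but the idea is client-agnostic; here is a sketch using core Perl's HTTP::Tiny, with an illustrative version string:

```perl
use strict;
use warnings;
use HTTP::Tiny;

# Generate the ID once at startup, then attach it to every request.
my @chars = ('0' .. '9', 'a' .. 'f');
my $invocation_id = join '', map { $chars[ int(rand @chars) ] } 1 .. 8;

my $http = HTTP::Tiny->new(
    agent           => 'mycli/1.2.3',    # illustrative version string
    default_headers => { 'X-Invocation-Id' => $invocation_id },
);
```

Every request made through `$http` now carries the correlation header and a version-stamped User-Agent, with no per-call effort.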

The server-side API logs this header alongside its own request ID. To trace a failing command end-to-end:

```shell
# Find the CLI side
grep 'f7a3b1c2' /var/log/mycli.log

# Find the server side
grep 'f7a3b1c2' /var/log/api.log
```

One string, full picture. No timestamps to correlate, no guessing which request came from which terminal.

User-Agent

In addition to the invocation ID, set the User-Agent header to mycli/<version>. This is trivial and gives the server side a way to filter by CLI version without any custom header support -- useful for canary deployment analysis and for spotting users running outdated versions.

Two-way correlation

The API returns its own request ID in a response header (X-Request-Id). The CLI logs this too:

```
[f7a3b1c2] http: 200 OK (142ms, application/json, 8431 bytes) req=a1b2c3d4
```

This gives you a join key in both directions: from the CLI's invocation ID you can find the server's request ID, and vice versa. When a user reports "mycli gave me an error", the request ID in the error message leads straight to the server-side trace.

What the server needs to do

The correlation only works if the server participates. The requirements are minimal:

  1. Log the X-Invocation-Id header from incoming requests. Most API frameworks can do this with a single middleware or access log configuration change.
  2. Return a request ID in every response (e.g., X-Request-Id). Many frameworks generate this by default.
  3. Propagate both IDs into the server's own tracing and logging. If the API uses structured logging or distributed tracing, attach the invocation ID as a field or span attribute so it appears in the same search results.

If the server doesn't log the invocation ID, the CLI-side correlation still works (you can grep your CLI logs by invocation ID), but you lose the end-to-end join. If the server doesn't return a request ID, the CLI can still log its own invocation ID, but the user can't hand a request ID to the API team and say "look this up".

The ideal state is both: the CLI sends its ID, the server sends its ID, and both sides log both. This is a two-line change on the server and it makes every future debugging session faster.
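What that change looks like depends on the server framework. As one hedged illustration, here is a plain PSGI-style wrapper; the function name and log format are invented for this sketch:

```perl
use strict;
use warnings;

# Hypothetical PSGI wrapper: log the client's invocation ID and
# return our own request ID on every response.
sub with_correlation {
    my ($app) = @_;
    return sub {
        my ($env) = @_;
        my $inv_id = $env->{HTTP_X_INVOCATION_ID} // '-';
        my @chars  = ('0' .. '9', 'a' .. 'f');
        my $req_id = join '', map { $chars[ int(rand @chars) ] } 1 .. 8;

        # 1. Log the incoming invocation ID next to our request ID.
        warn "req=$req_id inv=$inv_id $env->{REQUEST_METHOD} $env->{PATH_INFO}\n";

        my $res = $app->($env);

        # 2. Hand the request ID back so the CLI can log it too.
        push @{ $res->[1] }, 'X-Request-Id' => $req_id;
        return $res;
    };
}
```

Most frameworks have an equivalent middleware hook; the point is that both halves of the join are a few lines each.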

Structured syslog

Every invocation logs a structured lifecycle to syslog:

Startup

```
[f7a3b1c2] startup: cli: mycli device list --status Active
[f7a3b1c2] startup: perl: 5.36.0 on linux
[f7a3b1c2] startup: env: API_KEY=ab12****, SERVER_URL=https://api.internal
[f7a3b1c2] config: key source: file (~/.config/mycli/api-key)
[f7a3b1c2] config: format: table, fields: all, tty
```

The API key is masked -- first four characters, then ****. Enough to identify which key is in use without leaking it to logs.
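The masking logic itself is only a few lines. A sketch -- the four-character prefix matches what we log, and the sub name is illustrative:

```perl
use strict;
use warnings;

# Keep enough of the key to tell keys apart, never enough to use it.
sub mask_secret {
    my ($secret) = @_;
    return '****' unless defined $secret && length($secret) > 4;
    return substr($secret, 0, 4) . '****';
}

# mask_secret('ab12cdef9876') -> 'ab12****'
```

Note the short-key guard: a key of four characters or fewer is masked entirely, since showing all of it would defeat the purpose.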

HTTP requests

```
[f7a3b1c2] http: GET https://api.internal/v1/devices
[f7a3b1c2] http: 200 OK (142ms, application/json, 8431 bytes) req=a1b2c3d4
```

Every request/response pair is logged with method, URL, status, elapsed time, content type, response size, and the server's request ID.

Shutdown

```
[f7a3b1c2] device_list: done (387ms, 24 results, 2 requests, cache 3/1)
```

One line summarising the entire command: wall-clock time, result count, number of HTTP requests made, and resolve cache statistics (3 items cached across 1 resource type).

Always format, conditionally emit

A subtle design choice: the logger always formats every message, even when logging is disabled. Only the syslog() call is conditional:

```perl
sub _emit {
    my ($self, $priority, $context, $detail) = @_;
    my $msg = sprintf '[%s] %s: %s', $self->{_id}, $context, $detail;
    syslog($priority, '%s', $msg) if $self->{_enabled};
    return $msg;
}
```

This means formatting bugs surface during normal development, not only when someone enables logging in production. The cost is negligible -- sprintf is fast.

A note on philosophy: when syslog is enabled, all levels are transmitted -- info, debug, error. There is no runtime knob to suppress debug messages. The belief behind this is that logging should always be on in production, not enabled after a problem is suspected. The time you most need debug-level detail is exactly the time you can't reproduce the issue. You can never have too much log detail, with the obvious exception of user or employee personal data, which should never be logged at any level.

What not to log

The API key masking (ab12****) is one example of a broader principle: log enough to identify, not enough to exploit.

  • Credentials and secrets -- mask API keys, tokens, and passwords. Show enough characters to distinguish between keys (we show four), then mask the rest. Apply the same caution to environment variables and URL query parameters that may carry tokens.
  • Request and response bodies -- don't log them. They may contain customer data, PII, or sensitive business logic. Log metadata (status, timing, size) but never content. Body inspection is what LWP::ConsoleLogger is for -- interactive, ephemeral, on-demand.

StatsD metrics

Every command emits a standard set of metrics to StatsD:

Per-command metrics

| Metric | Type | Description |
| --- | --- | --- |
| mycli.command.<cmd>.calls | counter | Command invocations |
| mycli.command.<cmd>.timing | timing | Wall-clock duration (ms) |
| mycli.command.<cmd>.results | gauge | Items returned |
| mycli.command.<cmd>.errors | counter | Unhandled exceptions |

The command name is derived from the class hierarchy: MyCLI::App::Command::device::list becomes device_list.
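The derivation is a couple of substitutions. A sketch, assuming the class layout described above:

```perl
use strict;
use warnings;

# MyCLI::App::Command::device::list -> device_list
sub metric_name {
    my ($class) = @_;
    (my $name = $class) =~ s/^MyCLI::App::Command:://;
    $name =~ s/::/_/g;
    return $name;
}
```

Deriving names from the class hierarchy (rather than from user input) is also what keeps metric cardinality bounded -- the set of possible names is exactly the set of commands that exist.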

Per-HTTP metrics

| Metric | Type | Description |
| --- | --- | --- |
| mycli.http.calls | counter | Total HTTP requests |
| mycli.http.timing | timing | Per-request duration (ms) |
| mycli.http.errors | counter | Non-2xx responses |
| mycli.http.status.<code> | counter | Per-status-code breakdown |

Operational metrics

| Metric | Type | Description |
| --- | --- | --- |
| mycli.auth.key_source.<src> | counter | Where the API key came from |
| mycli.auth.url_source.<src> | counter | Where the server URL came from |
| mycli.config.file.found | counter | Config file was loaded |
| mycli.config.file.none | counter | No config file found |
| mycli.output.format.<name> | counter | Output format selection |

What this tells you

The metrics answer questions that logs can't:

  • What commands are people actually using? -- sort command.*.calls by count. If nobody uses crossconnect list, don't spend time improving it.
  • Is the API getting slower? -- http.timing percentiles over time. The CLI is seeing the same latency as your users, including TLS negotiation and DNS.
  • Are auth errors increasing? -- http.status.401 spike means keys are being rotated or revoked.
  • How are people authenticating? -- auth.key_source.env vs auth.key_source.file tells you whether your team has adopted the recommended credential flow.
  • What output formats matter? -- if 90% of usage is output.format.json, your table renderer is mostly aesthetic.

Metric naming conventions

Prefix every metric with the tool name (mycli.*) to avoid collisions in a shared StatsD instance. Use a consistent dot-separated hierarchy (mycli.command.<cmd>.calls) rather than flat names -- this makes metrics discoverable by browsing the tree. Watch cardinality: derive command names from a fixed set (like the class hierarchy) rather than user input, and keep dynamic segments like http.status.<code> to naturally bounded sets.

Verbose mode and the debugging spectrum

The three layers above cover durable observability -- data that outlives the terminal session. But the most common debugging scenario is someone at a keyboard wondering why their command isn't working. For this, the CLI has three levels of HTTP visibility:

Level 1: Silent (default)

No HTTP output. The user sees formatted results only. Syslog and metrics still capture everything in the background.

Level 2: --verbose

```
--> GET https://api.internal/v1/devices?status=Active
<-- 200 OK (142ms, application/json, 8431 bytes)
```

Printed to STDERR so it doesn't interfere with STDOUT piping. Shows method, URL, status, timing, and size. This is enough for "is my request hitting the right endpoint?" and "why is this slow?".

The design choice here is restraint. Verbose mode shows the shape of the conversation -- what was asked, what came back, how long it took. It deliberately omits headers and bodies. This keeps the output scannable when a command makes multiple requests.

Level 3: LWP::ConsoleLogger::Everywhere

When --verbose isn't enough -- when you need to see request headers, response headers, and full bodies -- plug in LWP::ConsoleLogger::Everywhere:

```shell
# From source
perl -MLWP::ConsoleLogger::Everywhere -Ilib bin/mycli device get 42

# Fatpacked binary (with API key redaction)
LWPCL_REDACT_HEADERS=Authorization \
  PERL5OPT="-MLWP::ConsoleLogger::Everywhere" \
  ./mycli-packed device get 42
```

This is a full HTTP trace: every header, every byte of the request and response body, formatted and syntax-highlighted. It's invaluable for debugging serialisation issues, unexpected headers, or auth failures.

The reason we don't build this into --verbose is that it's a different tool for a different job. Verbose mode is for operators; full HTTP tracing is for developers debugging the CLI itself. The -M flag means the capability is always available without cluttering the option namespace or adding a dependency that most users will never need.

Error reporting and surfacing correlation IDs

When the API returns an error, the CLI needs to show the user enough information to report the problem without overwhelming them with internals. Our error output includes the server's request ID:

```
Error: 403 Forbidden
  The API key does not have permission to access this resource.
  Request ID: a1b2c3d4
```

The request ID is the bridge between the user and the operations team. "It gave me a 403, request ID a1b2c3d4" is a complete bug report. The on-call engineer greps the server logs for a1b2c3d4, finds the full request context (authenticated user, requested resource, policy that denied access), and resolves the issue -- without asking the user to reproduce it, enable verbose mode, or paste terminal output.

The invocation ID doesn't appear in normal error output -- it's an internal correlation key for log analysis, not a user-facing artifact. If syslog is enabled, the invocation ID is already in the logs alongside the request ID, providing the join in both directions.
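Rendering that error is deliberately plain. A sketch -- the sub name, field names, and wording are illustrative, not our exact implementation:

```perl
use strict;
use warnings;

# Three lines: what failed, why in human terms, and the join key
# the user can hand to the API team.
sub render_error {
    my (%e) = @_;
    my $out = "Error: $e{status} $e{reason}\n  $e{message}\n";
    # Only show a request ID when the server actually returned one.
    $out .= "  Request ID: $e{request_id}\n" if defined $e{request_id};
    return $out;
}
```

The conditional matters: if the server is down entirely, there is no request ID, and printing an empty field would only confuse the report.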

The execution wrapper

All of this comes together in the base command's execute() method, which wraps every leaf command:

```perl
sub execute {
    my ($self, $opt, $args) = @_;
    my $cmd     = $self->_metric_name;
    my $start   = Time::HiRes::time();    # Time::HiRes loaded at package level

    $self->logger->info($cmd, 'start');
    $self->metrics->increment("command.$cmd.calls");

    eval { $self->_execute($opt, $args) };

    # $MS_PER_SEC (1000) is a package-level constant
    my $elapsed_ms   = int((Time::HiRes::time() - $start) * $MS_PER_SEC);
    my $requests     = $self->client->request_count;
    my $result_count = $self->{_result_count};

    $self->metrics->timing("command.$cmd.timing", $elapsed_ms);
    $self->metrics->gauge("command.$cmd.results", $result_count)
        if defined $result_count;

    if (my $err = $@) {
        $self->metrics->increment("command.$cmd.errors");
        $self->logger->error($cmd, $err);
        die $err;
    }

    $self->logger->info($cmd, sprintf 'done (%dms, %s results, %d requests)',
        $elapsed_ms, $result_count // 'n/a', $requests);
}
```

Leaf commands implement _execute() and don't think about observability at all. They call $self->client->get(...), render results, and return. The wrapper handles timing, logging, metrics, and error reporting. This is the single place where the observability contract is enforced -- no leaf command can accidentally skip it.

Design principles

A few principles that guided these choices:

  1. Zero cost when off. Logging and metrics are lazy-initialised. If you never enable syslog or configure StatsD, the modules aren't even loaded.

  2. Instrument the framework, not the features. Leaf commands don't contain observability code. The base command wrapper and HTTP client handle everything. New commands get full instrumentation for free.

  3. Correlate by default. The invocation ID requires no opt-in. Every request carries it. The server just has to log it.

  4. Separate concerns by audience. Verbose mode is for the person at the terminal. Syslog is for the person investigating after the fact. Metrics are for the person watching trends. Don't conflate them.

  5. Don't build what you can plug in. Full HTTP tracing via LWP::ConsoleLogger is better than anything we'd build ourselves. Keep verbose mode lean and let the specialist tool handle the rest.
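Principle 1 in practice: a sketch of a lazy accessor, with hypothetical module names, where the real backend is only pulled in when logging is actually enabled:

```perl
use strict;
use warnings;

# No-op stand-in used when MYCLI_LOG is not set (hypothetical names).
package MyCLI::Logger::Null;
sub new   { bless {}, shift }
sub info  { }
sub error { }
sub debug { }

package MyCLI::Command;
sub new { bless {}, shift }

# Lazy accessor: the real logger module (and Sys::Syslog with it) is
# only require'd when MYCLI_LOG=1; disabled invocations load nothing.
sub logger {
    my ($self) = @_;
    return $self->{_logger} //= $ENV{MYCLI_LOG}
        ? do { require MyCLI::Logger; MyCLI::Logger->new }
        : MyCLI::Logger::Null->new;
}
```

The `//=` caches the choice after the first call, so the environment check and the `require` happen at most once per process.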

Testing observability

Instrumentation code is easy to write and easy to break silently. If nobody notices that the invocation ID stopped appearing in syslog, it might be months before an incident reveals the gap. A few testing strategies:

  • Unit test the logger's formatting. The _emit method returns the formatted message even when syslog is disabled. Assert that the invocation ID, context, and detail appear in the expected format.
  • Unit test metric emissions. Mock the StatsD client and assert that command.<cmd>.calls is incremented, command.<cmd>.timing receives a value, and command.<cmd>.errors fires on exception. These are contract tests -- they verify that the execution wrapper keeps its promises.
  • Assert the invocation ID propagates. Mock the HTTP client and verify that outgoing requests carry the X-Invocation-Id header with the same value the logger is using.
  • Integration test the full lifecycle. Run a command against a mocked API, capture STDERR with --verbose, and assert the --> / <-- lines appear with the expected method, URL, and status.

The "always format, conditionally emit" pattern helps here: the logger exercises all formatting code paths in every test run, even when syslog isn't available in the test environment.
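As a concrete illustration of the first bullet, here is a sketch using core Test::More against a minimal stand-in for the logger -- a stub with syslog disabled, not our real module:

```perl
use strict;
use warnings;
use Test::More;

# Stand-in logger: formats unconditionally, never transmits.
package Stub::Logger;
sub new { bless { _id => $_[1], _enabled => 0 }, $_[0] }
sub _emit {
    my ($self, $priority, $context, $detail) = @_;
    my $msg = sprintf '[%s] %s: %s', $self->{_id}, $context, $detail;
    # syslog($priority, '%s', $msg) would run here when _enabled is true
    return $msg;
}

package main;
my $log = Stub::Logger->new('f7a3b1c2');
is $log->_emit('info', 'http', 'GET https://api.internal/v1/devices'),
   '[f7a3b1c2] http: GET https://api.internal/v1/devices',
   'invocation ID, context, and detail in the expected format';
done_testing;
```

Because `_emit` returns the formatted string regardless of the enabled flag, the test needs no syslog daemon and no mocking of Sys::Syslog at all.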

Tracing an incident: a walkthrough

Here's how the instrumentation plays out during a real debugging scenario. This walkthrough exercises every layer described above: error output with a request ID, the metrics dashboard, syslog correlation, and two-way ID join.

A user reports: "mycli device list is failing intermittently." They include the error message:

```
Error: 503 Service Unavailable
  The API is temporarily unable to handle the request.
  Request ID: e4f5a6b7
```

Step 1: Find the server side. The on-call engineer greps the API logs for e4f5a6b7 and finds the request hit a backend that was in the middle of a deployment. The 503 was a transient error from a rolling restart.

Step 2: Assess the blast radius. But is it just this one user? The engineer checks the CLI dashboard: mycli.http.status.503 shows a spike over the last 20 minutes, coinciding with the deployment window. It's not one user -- it's everyone hitting that backend.

Step 3: Find the CLI side. The server log for e4f5a6b7 also contains the X-Invocation-Id: c8d9e0f1. Grepping the centralised CLI logs for c8d9e0f1 shows the full client-side context: which command was run, which user ran it, what arguments were passed, and that the request took 12 seconds before returning 503 (suggesting the backend was hanging, not failing fast).

Step 4: Verify the fix. After the deployment completes, the 503 counter drops to zero. The engineer confirms on the dashboard that error rates are back to baseline across all commands.

Total debugging time: minutes. Without instrumentation, this would have been a ticket saying "it's broken sometimes" followed by back-and-forth to reproduce, enable verbose mode, and collect output.

Summary

```
 +-------------------+    X-Invocation-Id    +-------------------+
 | mycli             |-----------------------| API               |
 |                   |    X-Request-Id       |                   |
 | - syslog [id]     |<----------------------| - access log [id] |
 | - StatsD metrics  |                       | - request trace   |
 | - verbose STDERR  |                       |                   |
 +-------------------+                       +-------------------+
         |                                           |
         v                                           v
 +-------------------+                       +-------------------+
 | Centralised logs  |<--- grep by ID ------>| Centralised logs  |
 | Metrics dashboard |                       | APM / tracing     |
 +-------------------+                       +-------------------+
```

Key takeaways:

  1. Your CLI is part of the distributed system. If it talks to an API, it's a participant in the request path -- treat it like a service, not a script.
  2. A correlation ID is the single most valuable thing you can add. One random string, sent as an HTTP header, ties client logs to server logs. Everything else builds on this.
  3. Separate layers by audience. Verbose mode for the developer at the terminal, structured logs for the on-call engineer after the fact, metrics for dashboards and alerting. Same data, different consumers, different lifetimes.
  4. Instrument the framework, not the features. A single execution wrapper gives every command logging, metrics, and error reporting for free. Leaf commands shouldn't contain observability code.
  5. The server needs to participate. Log the client's invocation ID, return your own request ID. Without this, correlation is one-sided.
  6. Log everything except secrets and personal data. Mask credentials, never log request bodies, and keep logging always on -- the time you need debug detail is the time you can't reproduce the issue.
  7. Start simple, keep the door open. Wrap your logging backend so the rest of the codebase never touches it directly. Start with whatever works for your deployment targets today -- Sys::Syslog, Fluent::Logger, a file. When your infrastructure is ready for OpenTelemetry or wide events (see Appendix A), the swap is localised.

The investment is small: a correlation ID, a handful of counters, and a logging lifecycle. The return is that your CLI becomes a first-class citizen in your operational tooling rather than a blind spot.

Getting started

If you want to add observability to an existing CLI tool, here's a practical order of operations. Each step is independently useful -- you don't need to do all five before any of them pay off.

  1. Generate a random invocation ID at startup. Eight hex characters is enough. Send it as an X-Invocation-Id header on every HTTP request. This single change makes every future debugging session easier.
  2. Set User-Agent to <tool>/<version>. Trivial, and it lets the server side filter by CLI version without any custom header support.
  3. Log three lifecycle events. Startup (command line, environment, config source), each HTTP request/response (method, URL, status, timing), and shutdown (duration, result count). Even logging to STDERR behind a --debug flag is better than nothing.
  4. Emit one counter per command invocation. If you have StatsD or a metrics collector, mycli.command.<cmd>.calls is the single most useful metric -- it tells you what people are actually using. If you don't have a metrics pipeline, a cheap alternative is to emit key=value pairs in your log lines (e.g. command=device_list duration_ms=387 status=ok) -- most log aggregation tools, including Grafana itself, can extract fields from these lines and build charts and dashboards without a separate metrics stack.
  5. Wrap your command entry point. Move timing, logging, and metric emission into a single wrapper around leaf command execution. New commands get instrumentation for free, and no leaf command can accidentally skip it.
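Step 4's log-line alternative can be as simple as the following sketch (field names match the example above; the sub name is illustrative):

```perl
use strict;
use warnings;

# One line per invocation: key=value pairs in a stable (sorted) order
# so downstream field extraction is predictable.
sub kv_line {
    my (%fields) = @_;
    return join ' ', map { "$_=$fields{$_}" } sort keys %fields;
}

# kv_line(command => 'device_list', duration_ms => 387, status => 'ok')
#   -> "command=device_list duration_ms=387 status=ok"
```

Sorting the keys is a small touch that pays off: identical invocations produce identical field order, which makes the lines diffable and the extraction rules simpler.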

Appendix A: Wide events

Our implementation uses separate syslog lines for each lifecycle phase (startup, HTTP, shutdown) and separate StatsD counters for aggregation. This works, but it means correlating data across multiple log lines at query time -- you need the invocation ID to join them together.

An increasingly popular alternative is the wide event (or what Stripe called a canonical log line in 2019): a single, information-dense structured record emitted once per unit of work, containing every attribute you collected along the way. Instead of five syslog lines and ten StatsD counters, you emit one event with fields like command=device_list duration_ms=387 results=24 http_requests=2 http_status=200 auth_source=file output_format=table cache_hits=3.

The advantages are significant:

  • Faster queries -- all the data is colocated in one record. No joins, no correlation by ID.
  • Ad hoc analysis -- during an incident you can group by any combination of fields without having pre-defined a metric for it.
  • Simpler pipeline -- one event replaces multiple log lines and multiple metric emissions. Less code, fewer failure modes.

We didn't take this approach because our logging infrastructure is syslog-based and doesn't support high-cardinality structured queries. If you have access to a columnar store (Honeycomb, ClickHouse, a data warehouse), wide events are the stronger choice. The execution wrapper already collects all the data in one place -- the change would be emitting it as a single structured record instead of spreading it across syslog and StatsD.
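Concretely, that change might look like the following sketch using core JSON::PP -- field values are the example ones from above, and the emit target (stdout here) would be whatever your pipeline ingests:

```perl
use strict;
use warnings;
use JSON::PP;

# One wide event per invocation: everything the wrapper collected.
my %event = (
    invocation_id => 'f7a3b1c2',
    command       => 'device_list',
    duration_ms   => 387,
    results       => 24,
    http_requests => 2,
    http_status   => 200,
    auth_source   => 'file',
    output_format => 'table',
    cache_hits    => 3,
);

# canonical() sorts keys so the output is stable and diffable.
print JSON::PP->new->canonical->encode(\%event), "\n";
```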

For more on wide events, see A Practitioner's Guide to Wide Events and All You Need Is Wide Events, Not Metrics.

Appendix B: Why Sys::Syslog and not a logging framework?

Perl has several mature logging frameworks -- Log::Any, Log::Dispatch, Log::Log4perl -- any of which would be a fine choice here. We went with Sys::Syslog directly. This is an opinionated trade-off worth explaining.

What Sys::Syslog gives you

Syslog is a solved problem on servers and jumpboxes. The local syslog daemon (rsyslog, syslog-ng, journald) handles buffering, rotation, compression, and forwarding to a central log aggregator. The CLI doesn't need to know where the logs go, how to authenticate to a remote endpoint, or what to do when the network is down. It calls syslog(); the daemon takes it from there. This is a clean separation of concerns: the application produces structured messages, the infrastructure handles transmission.

There are no extra dependencies beyond core Perl. No configuration files, no adapter registration, no output plugin selection. The logger module is ~50 lines. For a fatpacked binary where every dependency has a cost, this matters.

What a framework would give you

A framework like Log::Any or Log::Dispatch provides output abstraction: you write $log->info(...) and configure the destination at deployment time -- syslog, a file, STDERR, a network endpoint, or multiple at once. The application code doesn't change when the destination does. This is a genuine advantage when the tool runs in environments with different logging infrastructure, or when libraries you depend on already use Log::Any.

Where the trade-off bites

The opinionated choice of Sys::Syslog works well when every target machine runs a syslog daemon. It falls apart on developer laptops and desktops.

macOS ships with a syslog-compatible interface via Apple System Log, but the log viewer has moved to Console.app and the unified logging system. Messages from syslog() end up in a different place than most macOS users expect, and the retention policy may discard them quickly. On Windows, there is no syslog daemon at all.

You have two choices here:

Accept the gap. Detect the platform at startup and disable syslog on macOS and Windows. The CLI still has --verbose for interactive debugging, and StatsD metrics still flow if a collector is configured. You lose durable logging on developer machines, but you avoid adding complexity to the CLI itself. This is the approach we took -- the primary deployment targets are Linux servers and jumpboxes where syslog is reliable.

Solve logging everywhere. Use a framework like Log::Dispatch with pluggable outputs: syslog on Linux, a file on macOS, a network endpoint everywhere. This means the CLI now owns the full logging pipeline: transport selection, buffering when the destination is unavailable, possibly TLS for log data in transit, possibly client-side authentication to a log aggregator. Each of these is individually tractable, but collectively they add configuration surface, failure modes, and dependencies that the syslog approach avoids entirely.

There is a middle ground: in an organisation with tight control of staff laptops and desktops (as is increasingly common), solving logging in the CLI itself or running a local logging daemon on those machines is entirely feasible.

Another opinionated choice: Fluent::Logger

If your infrastructure runs Fluentd or Fluent Bit, Fluent::Logger is worth considering as an equally opinionated alternative to Sys::Syslog. It sends structured events directly to a Fluent collector over a local socket or TCP, which then handles routing, buffering, and delivery to whatever backend you use (Elasticsearch, S3, a data warehouse). Like Sys::Syslog, it delegates transport to purpose-built infrastructure. Unlike syslog, the events are natively structured -- key-value pairs rather than format strings -- which makes the path to wide events shorter.

The advantage of making an opinionated backend choice -- whether that's Sys::Syslog, Fluent::Logger, or something else entirely -- is that it removes abstraction layers that aren't adding value. If you know where your logs go, a framework like Log::Any is indirection without a benefit. You pay for adapter registration, output plugin configuration, and an extra dependency, but you only ever use one backend. An abstraction earns its keep when requirements are genuinely uncertain; when they're known, it's just ceremony.

The elephant in the room: OpenTelemetry

Of course, the industry is converging on OpenTelemetry as the standard answer to all of the above. Perl has solid support via the OpenTelemetry distribution on CPAN. If your organisation already runs an OTel collector, plumbing it into your CLI from the start is the right long-term bet.

Keeping the door open

The important thing is that the rest of the codebase never touches Sys::Syslog directly. Every module calls $self->logger->info(...), ->error(...), or ->debug(...). The actual syslog calls are isolated to two private methods in the logger class: _emit (which formats and transmits) and _open_syslog (which calls openlog). Swapping Sys::Syslog for Log::Dispatch, Fluent::Logger, or an OpenTelemetry log bridge would mean changing those two methods and nothing else.

This is the pragmatic middle path: start with the simplest backend that works for your deployment targets, but wrap it so the choice is easy to revisit. For a server-side CLI deployed to a controlled fleet, Sys::Syslog is a sensible default -- zero-config, zero-dependency, and delegates the hard problems to purpose-built infrastructure. If the tool later needs to run on developer laptops as a primary deployment target, the logging framework swap is a localised change rather than a rewrite.


Discussion

Have you plumbed observability into a CLI tool? I'd love to hear what worked and what didn't -- whether you went with OpenTelemetry traces, wide events from day one, or bolted logging on after the fact. What was the moment that made you invest in CLI instrumentation? Was it an incident that was hard to trace, a question about adoption you couldn't answer, or just good hygiene? And if you haven't done it yet -- what's holding you back?
