
Reflecting on 7+ Years of Crafting CLI Tools in Rust

Introduction

I (@ynqa) have been focusing on developing CLI tools using Rust for the past 7 years. I'd like to reflect on what tools I've actually built and what technical challenges I've tackled during this journey.

kubernetes-rust

My Rust journey began about 7 years ago (around 2018) when I decided to learn Rust by writing a Kubernetes API client from scratch.

Through developing the Kubernetes client, I learned fundamental Rust concepts firsthand.

Fortunately, this project caught the attention of other developers. In particular, @clux forked my repository and made significant improvements. It has since evolved into a new crate called kube and has been adopted as an official CNCF (Cloud Native Computing Foundation) sandbox project. My initial commits remain in the history, and I'm moved to see my contribution living on in a transformed way.

promkit

After developing the Kubernetes client in Rust, I was working primarily as an infrastructure engineer.

What I noticed during this time was that most of the work involved understanding the state of various systems through command-driven trial and error: log monitoring, system status checks, API response investigation, and so on. Such work requires not just executing commands, but interactive iteration, like adjusting filter conditions based on results and extracting log patterns. I wanted tools to streamline these processes.

This led me to work on promkit. promkit serves as the foundation for the TUI design of the JSON filter and log search tools I'll describe later: I actively incorporated it into the development of jnv, sig, logu, and empiriqa mentioned below. This gave the projects consistent interaction and UI design while improving development efficiency.

Examples

First, let me introduce the basic usage of promkit. Here's a simple example of using promkit for interactive input. This code displays a prompt that asks the user for input and validates the input content.

use promkit::{preset::readline::Readline, suggest::Suggest, Prompt};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let ret = Readline::default()
        // Set the prompt title
        .title("Hi!")
        // Set suggestion candidates
        .enable_suggest(Suggest::from_iter([
            "apple",
            "applet",
            "application",
            "banana",
        ]))
        // Validate input values
        // Here we require the input length to exceed 10 characters
        // If validation fails, an error message is displayed
        .validator(
            |text| text.len() > 10,
            |text| format!("Length must be over 10 but got {}", text.len()),
        )
        .run()
        .await?;
    println!("result: {:?}", ret);
    Ok(())
}

When you run this code, an interactive prompt is displayed: suggestion candidates appear as you type, and if the input fails validation, the error message is shown inline.

As you can see, promkit makes it easy to build interactive input UIs.

Concepts

promkit adopts a modularized architecture and is internally divided into several crates:

  • promkit-core
    • Provides basic functionality like terminal control and pane management
  • promkit-widgets
    • Various UI component groups like text, listbox, and tree display
  • promkit
    • Provides high-level preset UIs and utilities (API that users directly use)
  • promkit-derive
    • Derive macros to simplify input form construction

The promkit crate in particular provides rich interactive input UI components as presets (text input fields, Yes/No confirmation, password input, multi-select checkboxes, tree display, and more), so developers can build practical prompts simply by combining them.

For example:

[Screenshots: the preset components in action]

Many diverse components are provided.
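As a quick taste of another preset, here is a minimal sketch of the Yes/No confirmation prompt, written in the same async style as the Readline example above. Take it as an illustration only: the exact constructor and method names may differ between promkit versions.

use promkit::preset::confirm::Confirm;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Ask a Yes/No question; the preset handles Yes/No style input
    let ret = Confirm::new("Do you like programming?").run().await?;
    println!("result: {:?}", ret);
    Ok(())
}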

Furthermore, a framework for creating custom components is also provided, allowing you to add widgets with your own input forms or display logic. For example, here is an example of building a prompt that includes spinner animations using async.

If you want to know more about the implementation concepts of promkit, please check here.

jnv

After releasing promkit, the first thing I worked on was jnv, a CLI tool for interactively browsing and filtering JSON data.

[Demo: jnv in action]

jnv removes the trial and error from exploring huge JSON data and crafting queries. Previously, you had to rewrite and re-run jq commands over and over; with jnv, you can navigate the JSON in a tree view while editing the jq filter and previewing results on the same screen.

It also has input completion. For completion, it analyzes the current input and displays possible tokens as candidates (such as object key names following a dot, or array indices like [0]). While it doesn't cover every jq feature, dynamic completion works for the basic patterns: the identity filter (.), object keys, and array indices. This lets you build queries intuitively, without typos in key names, even against huge JSON.
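Invoking it is as simple as pointing it at a file or piping JSON in (the file name and URL here are illustrative):

# Read a JSON file directly
jnv large.json

# Or pipe JSON in, e.g. from an API response
curl -s https://httpbin.org/json | jnv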

Furthermore, settings like key bindings are managed in a TOML file, allowing users to customize them to their preferences.

A technical challenge I faced during jnv development was eliminating the dependency on C libraries. Initially, I used an FFI crate that called C's libjq from Rust (I later created my own such library, j9). However, for ease of building, I eventually switched to jaq, a jq clone implemented in Rust.

jnv received an unexpectedly positive response immediately after release and rapidly gained stars on GitHub.

This success made me realize how much demand there is for developer tools, and it motivated me even more in my subsequent OSS development.

sig

Next was the development of a tool called sig. Following the success of jnv, I aimed for a more general-purpose streaming data search tool.

sig is an interactive tool for real-time searching (grep) of streaming data like logs or standard output. The name comes from the initials of "streaming interactive grep". It receives command output sequentially on the terminal, and when you input keywords, matching lines are highlighted on the spot and filter results are automatically updated. It's designed as an "interactive grep" that's convenient for log monitoring.

[Demo: sig in action]

The inspiration for sig came from the need to adjust filter conditions while watching logs flow by. Traditionally, when hunting for keywords while following logs with tail -f, you have to redo the grep every time you change conditions, and you can't go back to logs you missed. Likewise, in Kubernetes environments, tools like kubectl logs and stern stream logs but suffer the same difficulty with filter trial and error. sig was created precisely to improve this, realizing incremental search over real-time log streams.

There are two notable features.

The first is the command retry function. While sig itself assumes pipe input, when you specify an external command (such as a log retrieval command) with the --cmd option, sig launches that command and captures its output. Then, when you press Ctrl+R, sig re-executes the command to fetch past logs again. This makes it possible to recover and re-search logs that flowed past while you were adjusting filters. Real-time streams often leave you thinking "I missed that log..." after changing search conditions, but sig neatly solves this by replaying the past logs.

The second is archived mode. In normal pipe processing you can't seek within a stream, so content that has already flowed past can't be searched. sig keeps the most recent N entries (1000 by default) in an internal buffer, and when you press Ctrl+F it switches to archived mode. In this mode, stream input is temporarily paused and you can search offline through the accumulated logs: scroll back through past lines or search them with new terms, so missing something in the stream is no longer fatal. Furthermore, by passing -a/--archived at startup, sig can also be used to grep static text files.
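Putting the options above together, typical invocations look like this (the log commands themselves are illustrative):

# Filter a live stream interactively
kubectl logs deployment/api -f | sig

# Let sig own the command so Ctrl+R can re-execute it
sig --cmd "kubectl logs deployment/api --tail=1000"

# Search a static file in archived mode
cat error.log | sig -a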

Thus, sig is a powerful tool for developers that balances real-time search of streaming data with re-search of past logs.

logu

logu is a CLI tool that automatically extracts patterns from unstructured log messages. Given a set of log files or a streaming log, it learns common message structures, identifies the variable parts, and displays the logs grouped by type (pattern). It is a Rust implementation of Drain3, a log-parsing algorithm from the machine-learning side of log analysis, and is useful for surfacing the recurring error-message patterns buried in large volumes of logs.

[Demo: logu in action]

While sig searches stream logs microscopically, logu analyzes entire logs macroscopically to extract patterns. For example, when "similar error messages appear repeatedly" in millions of lines of log files, it automatically detects those patterns and displays them with variable parts (like timestamps or IDs) replaced. This allows you to grasp log trends and frequent errors at a glance.
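Usage follows the same pipe-friendly style as sig (the log source here is illustrative):

# Cluster streaming logs into recurring message patterns
kubectl logs deployment/api -f | logu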

Log parsers other than Drain3 (mainly Python implementations) are surveyed in logpai/logparser.

empiriqa

empiriqa is a TUI tool for interactively building and experimenting with Unix command pipelines (the command name is epiq). It replaces the usual developer cycle of rewriting and re-executing pipelines in the shell with intuitive TUI operations.

[Demo: empiriqa in action]

In empiriqa, you can edit pipelines stage by stage on screen. Each stage's command can be individually edited, added, deleted, or disabled, and pressing Enter executes the entire pipeline at that point and displays results.

The biggest advantage is being able to experiment stage by stage. When you want to insert a filter in the middle of a pipeline, you can add a new stage, write the command, and execute it immediately. And since each stage can be disabled, you can compare results with and without a given command in a single keystroke.
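For example, each segment of a pipeline like this (a made-up log-analysis one-liner) would become one editable, toggleable stage in epiq:

cat access.log | grep " 500 " | awk '{print $7}' | sort | uniq -c | sort -rn

Toggling the grep stage off, for instance, instantly shows the aggregation over all requests instead of only the errors.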

kuqu

kuqu is a tool for treating various resources in Kubernetes clusters (Pods, Nodes, Services, custom resources, etc.) like database tables and executing SQL-like queries. It enables aggregation and joins that are difficult with kubectl, and can be called a "Kubernetes version SQL query engine".

kuqu "SELECT pod.metadata.name, pod.spec.nodeName
     FROM pod JOIN node ON pod.spec.nodeName == node.metadata.name"
+--------------------------------------------+--------------------+
| pod.metadata[name]                         | pod.spec[nodeName] |
+--------------------------------------------+--------------------+
| coredns-6f6b679f8f-8rkh9                   | kind-control-plane |
| coredns-6f6b679f8f-kgbjs                   | kind-control-plane |
| etcd-kind-control-plane                    | kind-control-plane |
| kindnet-khqtr                              | kind-control-plane |
| kube-apiserver-kind-control-plane          | kind-control-plane |
| kube-controller-manager-kind-control-plane | kind-control-plane |
| kube-proxy-ccth7                           | kind-control-plane |
| kube-scheduler-kind-control-plane          | kind-control-plane |
+--------------------------------------------+--------------------+

While kubectl, the standard Kubernetes CLI, excels at retrieving individual resources and filtering them (label selectors and so on), it's hard to run advanced queries or aggregations across multiple resources. For example, operations like counting the Pods that meet certain conditions, or joining Pods with the Nodes they run on, are not straightforward with kubectl alone. I focused on this gap: if cluster state could be queried with SQL, cross-sectional analysis of Kubernetes resources would become dramatically easier. That led to kuqu, which takes the approach of dynamically schematizing cluster resource information, loading it into data frames, and querying it with a SQL engine.
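For example, the "count Pods that meet a condition" case above becomes a short aggregation query (a sketch, using the same dot notation as the examples shown below):

SELECT spec.nodeName,
       count(*) AS running_pods
FROM pods
WHERE status.phase = 'Running'
GROUP BY spec.nodeName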

At the core of kuqu is Apache DataFusion, a database query engine from the Apache Arrow project. DataFusion is a fast data frame/SQL execution engine written in Rust, covering everything from the SQL parser to the query optimizer and execution.

kuqu leverages DataFusion by registering the resource objects retrieved from the Kubernetes client as tables. For example, a pods table contains the Pods in the current namespace as rows, and a nodes table contains Node information. The interesting part is dynamic schema inference: Kubernetes resources have different fields per type, but kuqu infers them from each resource's JSON representation and maps them to queryable DataFusion columns. This allows direct access to nested JSON fields like spec.nodeName or status.phase using dot notation.
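To illustrate the underlying mechanism (this is not kuqu's actual code, just a minimal DataFusion sketch with hard-coded data standing in for cluster JSON), registering an in-memory table and querying it looks roughly like this:

use std::sync::Arc;

use datafusion::arrow::array::StringArray;
use datafusion::arrow::datatypes::{DataType, Field, Schema};
use datafusion::arrow::record_batch::RecordBatch;
use datafusion::prelude::*;

#[tokio::main]
async fn main() -> datafusion::error::Result<()> {
    // A tiny stand-in for the schema kuqu infers dynamically from resource JSON
    let schema = Arc::new(Schema::new(vec![
        Field::new("name", DataType::Utf8, false),
        Field::new("phase", DataType::Utf8, false),
    ]));
    let batch = RecordBatch::try_new(
        schema,
        vec![
            Arc::new(StringArray::from(vec!["pod-a", "pod-b"])),
            Arc::new(StringArray::from(vec!["Running", "Pending"])),
        ],
    )?;

    // Register the batch as a table named "pods", then query it with SQL
    let ctx = SessionContext::new();
    ctx.register_batch("pods", batch)?;
    ctx.sql("SELECT name FROM pods WHERE phase = 'Running'")
        .await?
        .show()
        .await?;
    Ok(())
}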

As actual usage examples, kuqu enables queries like:

  • Pod listing
SELECT metadata.name,
       metadata.namespace
FROM pods
WHERE status.phase = 'Running'
  • JOIN example (combining Pod and Node)
    • This query combines each Pod's host Node with that Node's instance type label for display
SELECT pod.metadata.name,
       pod.spec.nodeName,
       node.metadata.labels.'node.kubernetes.io/instance-type'
FROM pod JOIN node 
ON pod.spec.nodeName = node.metadata.name

In the future, I'm considering features like a REPL (Read-Eval-Print Loop) for interactive query execution and an option to output query results as JSON.

Conclusion

Above, I've reflected on my portfolio of Rust OSS projects and explained the technical characteristics and development background of each. Starting with the JSON filter viewer jnv, continuing with the stream grep sig, the log-pattern analyzer logu, the pipeline experimentation tool empiriqa, and the Kubernetes query engine kuqu, and finally promkit, the foundation supporting them all, my Rust journey has expanded into tool development across diverse domains. What unifies them is a consistent philosophy: supporting people's "try and understand" work through interactive tools.

That wraps up seven years of OSS development woven with Rust, and the technical essence behind it. I hope readers will visit each project's repository, try the tools out, and contribute (and of course, GitHub Sponsors are welcome too!). Thank you for reading this far.

May you all have a wonderful Rust journey!
