This post walks you through building wc-rs, a command-line utility written in Rust that mirrors the functionality of the classic wc tool.
Introduction
The motivation to build wc-rs comes from John Crickett's "build your own wc tool" coding challenge. Solving challenges like this is, in my opinion, a great way to learn new concepts. So, here we are, starting with the first one.
The challenge is to build your own version of the Unix command-line tool wc. The functional requirements for wc are concisely described by its man page. Give it a go in your local terminal now:
man wc
The TL;DR version is: wc – word, line, character, and byte count.
So, let's get Rusty!!
Code Walkthrough
The wc-rs program is designed to feel familiar to anyone who has used the original wc command, while being implemented in Rust with explicit error handling and the clap crate for argument parsing.
The complete code is available at gauravgahlot/getting-rustywc-rs.
Dependencies
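use clap::Parser;
use std::fs::{self, File};
use std::io::{self, BufRead};
We use the clap crate to simplify command-line argument parsing, making it straightforward to define and handle the flags and options our program accepts. The standard-library imports cover the file access and buffered I/O used later in main and process_lines.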
The CLI Struct
#[derive(Parser)]
#[command(name = "wc-rs")]
#[command(version = "0.1.0")]
#[command(about = "The wc-rs utility displays the count of lines, words, characters, and bytes contained in each input file", long_about = None)]
struct CLI {
    /// The number of bytes in each input file is written to the
    /// standard output. This will cancel out any prior usage of the
    /// -m option.
    #[arg(short = 'c')]
    bytes: bool,

    /// The number of lines in each input file is written to the
    /// standard output.
    #[arg(short)]
    lines: bool,

    /// The number of words in each input file is written to the
    /// standard output.
    #[arg(short)]
    words: bool,

    /// The number of characters in each input file is written to the
    /// standard output.
    #[arg(short = 'm')]
    chars: bool,

    files: Option<Vec<String>>,
}
Our CLI struct provides the skeleton for the command-line interface of wc-rs. With clap, beyond just defining the struct, we annotate it with the program's name, version, and a brief description.
Flags and Options
Each field in the CLI struct represents a command-line flag or option, complete with doc comments that clap uses to generate help messages:
- -c for byte count
- -l for line count
- -w for word count
- -m for character count
The files field holds an optional list of files to process. If no files are given, wc-rs reads from standard input.
The Output Struct
#[derive(Default)]
struct Output {
    bytes: u64,
    lines: u64,
    words: u64,
    chars: u64,
    file: Option<String>,
}
As we prepare to crunch numbers, we store our results in the Output struct. This is where the counts of bytes, lines, words, and characters are accumulated, along with an optional filename for display purposes.
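The main function shown later calls Output::new(file), but the constructor itself isn't listed in this post. Here's a minimal sketch of what it could look like, assuming it only records the file name and leaves every count at zero; the real implementation in the repository may differ.

```rust
#[derive(Default)]
struct Output {
    bytes: u64,
    lines: u64,
    words: u64,
    chars: u64,
    file: Option<String>,
}

impl Output {
    /// Assumed constructor: zeroed counts plus the file name for display.
    fn new(file: &str) -> Self {
        Output {
            file: Some(file.to_string()),
            ..Default::default()
        }
    }
}

fn main() {
    let out = Output::new("test.txt");
    println!(
        "{} {} {} {} {:?}",
        out.lines, out.words, out.bytes, out.chars, out.file
    );
}
```

Leaning on #[derive(Default)] keeps the constructor short, and Output::default() stays available for the standard-input case where there is no file name.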
The Main Event
The main function is where we tie everything together. We parse the command-line arguments, iterate over the files (or read standard input), and calculate the statistics. Failures to open a file or read its metadata are propagated with the ? operator, so main returns an io::Result and the program exits with an error instead of panicking.
fn main() -> io::Result<()> {
    let cli = CLI::parse();
    let mut output: Vec<Output> = vec![];

    if let Some(files) = &cli.files {
        for file in files.iter() {
            let f = File::open(file)?;
            let reader = io::BufReader::new(f);

            let mut out = Output::new(file);
            // For files, the byte count comes from metadata, not from reading.
            out.bytes = fs::metadata(file)?.len();
            process_lines(reader, false, &mut out);
            output.push(out);
        }
    } else {
        // No files given: read from standard input.
        let stdin = io::stdin();
        let reader = stdin.lock();

        let mut out = Output::default();
        process_lines(reader, true, &mut out);
        output.push(out);
    }

    print_output(&cli, output);
    Ok(())
}
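One thing to notice: because of the ? after File::open, main stops at the first file it can't open, whereas the classic wc reports the problem and carries on with the remaining files. Here's a rough sketch of that alternative, using only the standard library. The helper names count_lines and count_all are made up for illustration and aren't part of wc-rs.

```rust
use std::fs::File;
use std::io::{self, BufRead, BufReader};

/// Count the lines in one file, returning any I/O error to the caller.
fn count_lines(path: &str) -> io::Result<u64> {
    let reader = BufReader::new(File::open(path)?);
    let mut lines = 0;
    for line in reader.lines() {
        line?; // propagate read errors instead of skipping them
        lines += 1;
    }
    Ok(lines)
}

/// Hypothetical helper: process every path and keep each result,
/// so one bad file doesn't abort the whole run.
fn count_all(paths: &[&str]) -> Vec<(String, io::Result<u64>)> {
    paths
        .iter()
        .map(|p| (p.to_string(), count_lines(p)))
        .collect()
}

fn main() {
    for (path, result) in count_all(&["Cargo.toml", "does-not-exist.txt"]) {
        match result {
            Ok(n) => println!("\t{}\t{}", n, path),
            Err(e) => eprintln!("wc-rs: {}: {}", path, e),
        }
    }
}
```

Whether to abort or continue is a design choice. Propagating with ? keeps main short, while collecting a Result per file is closer to how wc behaves with a mix of good and bad paths.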
Processing the Input
The process_lines function is at the heart of wc-rs. It takes a reader (anything that implements the BufRead trait), a flag saying whether the input comes from standard input, and an Output struct by mutable reference, updating the counts as it iterates over the lines of the input.
fn process_lines<T: BufRead>(reader: T, from_stdin: bool, out: &mut Output) {
    // lines() yields each line with its terminator stripped.
    for line in reader.lines() {
        match line {
            Ok(input) => {
                out.lines += 1;
                if from_stdin {
                    // No metadata for stdin, so add up bytes line by line.
                    out.bytes += input.len() as u64;
                }

                // Split on single spaces; chars sums the words' byte lengths.
                let line_words: Vec<_> = input.split_terminator(" ").collect();
                out.words += line_words.len() as u64;
                line_words.iter().for_each(|w| out.chars += w.len() as u64);
            }
            Err(e) => eprintln!("{}", e),
        }
    }
}
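Bytes are handled differently depending on the source: for a file, the count comes from its metadata, while for standard input, where no size is known up front, we add up the length of each line. Because lines() strips line terminators, the standard-input byte count doesn't include them. Words are obtained by splitting each line on single spaces, and the character count is the sum of the words' byte lengths, so it excludes whitespace and can differ from wc -m, especially for multi-byte text or runs of spaces.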
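The process_lines function above works on lines with their terminators stripped and splits on single spaces. As a point of comparison, here's a sketch of a counting loop that stays closer to what wc reports. It assumes UTF-8 input, and the Counts struct and count function are illustrative names, not part of wc-rs.

```rust
use std::io::{self, BufRead, Cursor};

#[derive(Default, Debug, PartialEq)]
struct Counts {
    lines: u64,
    words: u64,
    chars: u64,
    bytes: u64,
}

/// Count lines, words, characters, and bytes from any buffered reader.
fn count<R: BufRead>(mut reader: R) -> io::Result<Counts> {
    let mut counts = Counts::default();
    let mut line = String::new();
    loop {
        line.clear();
        // read_line keeps the trailing '\n', so it is included in the counts.
        let n = reader.read_line(&mut line)?;
        if n == 0 {
            break; // end of input
        }
        counts.bytes += n as u64;
        counts.chars += line.chars().count() as u64;
        // split_whitespace treats runs of spaces, tabs, and newlines as one gap.
        counts.words += line.split_whitespace().count() as u64;
        if line.ends_with('\n') {
            counts.lines += 1;
        }
    }
    Ok(counts)
}

fn main() -> io::Result<()> {
    let counts = count(Cursor::new("héllo  wörld\nbye\n"))?;
    println!("{:?}", counts);
    Ok(())
}
```

With this approach the same function serves files and standard input, and bytes and characters only differ when the text contains multi-byte characters. Note that read_line returns an error on invalid UTF-8, so a byte-oriented loop would be needed for arbitrary binary input.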
Showing the Numbers
Finally, print_output is responsible for displaying the collected counts. Following the flags provided, it prints the requested counts, or defaults to lines, words, and bytes if no flags are specified. When more than one input was processed, it also prints a total row.
fn print_output(cli: &CLI, output: Vec<Output>) {
    let mut print_all = false;
    let mut total = Output::default();

    for out in &output {
        if !cli.lines && !cli.words && !cli.chars && !cli.bytes {
            print_all = true;
            if let Some(f) = &out.file {
                println!("\t{}\t{}\t{}\t{}", out.lines, out.words, out.bytes, f);
            } else {
                println!("\t{}\t{}\t{}", out.lines, out.words, out.bytes);
            }
        } else {
            if cli.lines {
                print!("\t{}", out.lines);
            }
            if cli.words {
                print!("\t{}", out.words);
            }
            if cli.bytes {
                print!("\t{}", out.bytes);
            } else if cli.chars {
                print!("\t{}", out.chars);
            }
            if let Some(f) = &out.file {
                println!("\t{}", f);
            } else {
                println!();
            }
        }

        total.lines += out.lines;
        total.words += out.words;
        total.bytes += out.bytes;
        total.chars += out.chars;
    }

    if output.len() > 1 {
        if print_all {
            println!("\t{}\t{}\t{}\ttotal", total.lines, total.words, total.bytes);
        } else {
            if cli.lines {
                print!("\t{}", total.lines);
            }
            if cli.words {
                print!("\t{}", total.words);
            }
            if cli.bytes {
                print!("\t{}", total.bytes);
            } else if cli.chars {
                print!("\t{}", total.chars);
            }
            println!("\ttotal");
        }
    }
}
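The per-file rows and the total row in print_output follow the same rules, so the formatting logic appears twice. One possible way to share it is sketched below. The Flags struct is a stand-in for the clap-derived CLI struct so the example compiles on its own, and format_row is a hypothetical helper, not something wc-rs defines.

```rust
/// Stand-in for the clap-derived `CLI` flags (assumption for this sketch).
#[derive(Default, Clone, Copy)]
struct Flags {
    bytes: bool,
    lines: bool,
    words: bool,
    chars: bool,
}

#[derive(Default)]
struct Output {
    bytes: u64,
    lines: u64,
    words: u64,
    chars: u64,
}

/// Build one tab-separated row, for a single input or for the total.
fn format_row(flags: Flags, out: &Output, label: Option<&str>) -> String {
    // No flags at all means lines, words, and bytes, as in print_output.
    let show_all = !(flags.lines || flags.words || flags.chars || flags.bytes);
    let mut row = String::new();
    if show_all || flags.lines {
        row.push_str(&format!("\t{}", out.lines));
    }
    if show_all || flags.words {
        row.push_str(&format!("\t{}", out.words));
    }
    if show_all || flags.bytes {
        row.push_str(&format!("\t{}", out.bytes));
    } else if flags.chars {
        // -c takes precedence over -m, matching the original branches.
        row.push_str(&format!("\t{}", out.chars));
    }
    if let Some(name) = label {
        row.push_str(&format!("\t{}", name));
    }
    row
}

fn main() {
    let out = Output { bytes: 22, lines: 5, words: 5, chars: 16 };
    println!("{}", format_row(Flags::default(), &out, Some("test.txt")));
    let lm = Flags { lines: true, chars: true, ..Flags::default() };
    println!("{}", format_row(lm, &out, Some("test.txt")));
}
```

With a helper like this, the total row is just format_row(flags, &total, Some("total")), and the standard-input case passes None as the label.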
Getting wc-rs Up and Running
To try out wc-rs, compile and install it with cargo, Rust's build system and package manager. Once installed, running the program is just like using the traditional wc.
Installation
You can install the CLI by running the following command from the project directory:
cargo install --path .
Help
Use the --help option to print the usage help for the CLI:
wc-rs --help
The wc-rs utility displays the count of lines, words, characters, and bytes contained in each input file
Usage: wc-rs [OPTIONS] [FILES]...
Arguments:
[FILES]...
Options:
-c The number of bytes in each input file is written to the standard output. This will cancel out any prior usage of the -m option
-l The number of lines in each input file is written to the standard output
-w The number of words in each input file is written to the standard output
-m The number of characters in each input file is written to the standard output
-h, --help Print help
-V, --version Print version
Examples
- Getting details of a single file:
wc-rs test.txt
5 5 22 test.txt
#lines #words #bytes
- Getting details for multiple files:
wc-rs -wm test.txt Cargo.toml
5 16 test.txt
25 138 Cargo.toml
30 154 total
- Getting details for data from standard input:
wc-rs
data from std input
1 4 19
- Displaying the number of lines and characters only:
wc-rs -lm test.txt
5 16 test.txt
Conclusion
wc-rs might be a simple tool, but it shows how well Rust suits command-line applications. Along the way, we've seen explicit error handling with io::Result, the convenience of clap for argument parsing, and how straightforward it can be to work with files and strings in Rust.
Whether you're an experienced developer or new to the command line, wc-rs is a good example of how Rust can be used to rebuild classic tools with a modern and reliable twist.