12.3.0 Before We Begin
In Chapter 12, we will build a real project: a command-line program. The program is a simple version of grep (Global Regular Expression Print), a tool that searches text with regular expressions and prints the matches. Our version's job is to search for the specified text in the specified file.
This project has several steps:
- Receive command-line arguments
- Read files
- Refactor: improve modules and error handling (this article)
- Use TDD (test-driven development) to develop library functionality
- Use environment variables
- Write error messages to standard error instead of standard output
12.3.1 Why Refactor
The purpose of refactoring is to improve modularity and error handling.
Here is all the code written up to the previous article:
use std::env;
use std::fs;
fn main() {
    let args: Vec<String> = env::args().collect();
    let query = &args[1];
    let filename = &args[2];
    println!("search for {}", query);
    println!("In file {}", filename);
    let contents = fs::read_to_string(filename)
        .expect("Something went wrong while reading the file");
    println!("With text:\n{}", contents);
}
This code has four problems:
- The main function is doing too much: it handles both command-line parsing and file reading. A guiding principle of program design is that each function should have a single responsibility, so main should be split up.
- The variables query and filename store program configuration, while contents stores file contents. As code and variables accumulate, it becomes harder to track what each variable means. The configuration values should be grouped in a struct.
- File-reading errors are handled with expect, which prints the same message and panics no matter what went wrong. That is not ideal, because a failed read might mean the file does not exist, or it might be a permissions problem. The panic message "Something went wrong while reading the file" does not help the user diagnose the issue.
- If errors are handled ad hoc with expect and unchecked indexing throughout the program, users will see messages coming from Rust internals, such as "index out of bounds" when too few arguments are supplied, which makes it hard to understand what actually caused the problem. It is better to centralize error handling so future maintainers have only one place to look when changing the logic, and so the error messages shown to users are understandable.
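To make the third problem concrete, here is a minimal sketch of how the cause of a failed read could be told apart. It relies on the standard library's std::io::ErrorKind; the helper name describe_read_error, its messages, and the file name are made up for illustration and are not part of this series' code.

```rust
use std::fs;
use std::io::ErrorKind;

// Hypothetical helper: turns a failed read into a message that names the cause.
fn describe_read_error(filename: &str) -> Option<String> {
    match fs::read_to_string(filename) {
        Ok(_) => None, // the read succeeded, nothing to report
        Err(e) => Some(match e.kind() {
            ErrorKind::NotFound => format!("{filename}: file does not exist"),
            ErrorKind::PermissionDenied => format!("{filename}: permission denied"),
            // Any other kind falls back to the error's own description.
            _ => format!("{filename}: {e}"),
        }),
    }
}

fn main() {
    // Placeholder file name that is assumed not to exist.
    if let Some(msg) = describe_read_error("no_such_file_for_this_demo.txt") {
        println!("{msg}");
    }
}
```

A message built this way tells the user which of the possible causes occurred, which the single fixed string passed to expect cannot do.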
12.3.2 A Guiding Principle for Separating Concerns in Binary Programs
Many Rust binary projects run into the same organizational problem: they put too much functionality and too many responsibilities into main. The Rust community has a guiding principle for separating concerns in binary programs:
- Split the program into main.rs and lib.rs, and put business logic in lib.rs
- If the logic is small, keeping it in main.rs is fine
- As the logic becomes more complex, extract it from main.rs into lib.rs
After this split, the responsibilities that should remain in main in this example are:
- Call the command-line parsing logic using the argument values
- Perform other configuration
- Call the run function in lib.rs
- Handle any problems that run may return
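The responsibilities listed above could be laid out roughly as follows. This is only a sketch of the shape, not the code this series ends up with: the Config fields, the signature of run, the error type Box<dyn Error>, and the hard-coded values are all assumptions. run is kept in the same file so the example is self-contained, whereas in a real split it would live in lib.rs.

```rust
use std::error::Error;
use std::fs;

// Assumed configuration type; in a real split this would live in lib.rs.
struct Config {
    query: String,
    filename: String,
}

// Assumed signature: run does the real work and reports failure to its caller
// instead of panicking.
fn run(config: &Config) -> Result<(), Box<dyn Error>> {
    let contents = fs::read_to_string(&config.filename)?;
    println!("Searching for {} in:\n{}", config.query, contents);
    Ok(())
}

fn main() {
    // In the real program these values would come from the command line.
    let config = Config {
        query: String::from("needle"),
        filename: String::from("example.txt"), // placeholder file name
    };

    // main only calls run and decides what to do if it fails.
    if let Err(e) = run(&config) {
        eprintln!("Application error: {e}");
    }
}
```

With this shape, every failure inside run travels back to one place in main, which is what makes centralized error handling possible.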
12.3.3 Separating Logic
Take another look at the code:
use std::env;
use std::fs;
fn main() {
    let args: Vec<String> = env::args().collect();
    let query = &args[1];
    let filename = &args[2];
    println!("search for {}", query);
    println!("In file {}", filename);
    let contents = fs::read_to_string(filename)
        .expect("Something went wrong while reading the file");
    println!("With text:\n{}", contents);
}
First, extract the command-line argument handling:
fn parse_config(args: &[String]) -> (&str, &str) {
    let query = &args[1];
    let filename = &args[2];
    (query, filename)
}
- &[String] is a slice of String values; passing &args borrows the whole Vec<String> as such a slice
- There is no need to print query and filename here, so that part is removed
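Because parse_config takes a slice rather than reading env::args itself, it can be tried out with a hand-built argument list. The sketch below does that; the argument values are made up, and the first element stands in for the program name that normally occupies index 0.

```rust
fn parse_config(args: &[String]) -> (&str, &str) {
    let query = &args[1];
    let filename = &args[2];
    (query, filename)
}

fn main() {
    // Made-up arguments; index 0 plays the role of the program name.
    let args: Vec<String> = ["program", "needle", "haystack.txt"]
        .iter()
        .map(|s| s.to_string())
        .collect();

    // The returned string slices borrow from `args`, so nothing is copied.
    let (query, filename) = parse_config(&args);
    println!("query = {query}, filename = {filename}");
}
```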
Then change main to call parse_config:
fn main() {
    let args: Vec<String> = env::args().collect();
    let (query, filename) = parse_config(&args);
    let contents = fs::read_to_string(filename)
        .expect("Something went wrong while reading the file");
    println!("With text:\n{}", contents);
}
12.3.4 Using a Struct
parse_config returns query and filename together as a tuple, and then main splits those two tuple values back into two variables. This back-and-forth splitting and combining shows that the abstraction in the program is not ideal.
query and filename are both part of the configuration and are related to each other, so putting them in a tuple does not express that relationship well enough. A struct is a better fit:
struct Config {
    query: String,
    filename: String,
}
fn main() {
    let args: Vec<String> = env::args().collect();
    let config = parse_config(&args);
    let contents = fs::read_to_string(config.filename)
        .expect("Something went wrong while reading the file");
    println!("With text:\n{}", contents);
}
fn parse_config(args: &[String]) -> Config {
    let query = args[1].clone();
    let filename = args[2].clone();
    Config {
        query,
        filename,
    }
}
In parse_config, pay attention to ownership: the parameter args has type &[String], a borrowed slice that does not own its elements, so args[1] and args[2] cannot be moved out of it; at best we could take references to them. But Config expects String, not &String, so we call clone to make owned copies of the two values.
Cloning uses more time and memory than storing references directly, but it saves us from dealing with lifetimes and makes the code more direct and simpler. In some scenarios, giving up a bit of performance in exchange for simplicity is well worth considering.
Of course, using String::from to wrap the values also works:
fn parse_config(args: &[String]) -> Config {
    let query = &args[1];
    let filename = &args[2];
    Config {
        query: String::from(query),
        filename: String::from(filename),
    }
}
There are other valid ways to write this code too, but here I will use the cloning approach.
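For comparison, one of those other ways is to let the configuration borrow from args instead of cloning, which is where the lifetimes mentioned earlier come in. This is a sketch of that trade-off only, not the approach used in this series; the name BorrowedConfig, the lifetime parameter 'a, and the argument values are introduced purely for illustration.

```rust
// Hypothetical borrowing variant: the struct holds string slices that point
// into the argument list, so it cannot outlive `args`.
struct BorrowedConfig<'a> {
    query: &'a str,
    filename: &'a str,
}

fn parse_config(args: &[String]) -> BorrowedConfig<'_> {
    BorrowedConfig {
        query: &args[1],
        filename: &args[2],
    }
}

fn main() {
    // Made-up arguments; index 0 plays the role of the program name.
    let args: Vec<String> = vec![
        String::from("program"),
        String::from("needle"),
        String::from("haystack.txt"),
    ];

    // No strings are copied here, but `config` is only valid while `args` lives.
    let config = parse_config(&args);
    println!("{} in {}", config.query, config.filename);
}
```

The borrowing version avoids two small allocations at the cost of a lifetime parameter on the struct and on every function that passes it around, which is the complexity the cloning approach sidesteps.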
12.3.5 Turning a Function Into a Struct Method
Since parse_config creates a Config instance, it is effectively a constructor. A constructor can be written like this:
impl Config {
    fn new(args: &[String]) -> Config {
        let query = args[1].clone();
        let filename = args[2].clone();
        Config {
            query,
            filename,
        }
    }
}
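Just move the function into an impl Config block (for details on methods, see 5.3. Methods on Structs). Strictly speaking, new is an associated function rather than a method, because it does not take self, which is why it is called as Config::new. I also renamed parse_config to new, because I am treating it as a constructor (constructors are conventionally named new).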
After this change, main also needs to be updated:
let config = Config::new(&args);
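A quick way to check the constructor is to call it with a hand-built argument list, as in the sketch below. The argument values are made up; the rest mirrors the Config::new shown above.

```rust
struct Config {
    query: String,
    filename: String,
}

impl Config {
    fn new(args: &[String]) -> Config {
        let query = args[1].clone();
        let filename = args[2].clone();
        Config { query, filename }
    }
}

fn main() {
    // Made-up arguments; index 0 plays the role of the program name.
    let args: Vec<String> = vec![
        String::from("program"),
        String::from("needle"),
        String::from("haystack.txt"),
    ];

    // Config owns clones of the two strings, so it does not borrow from `args`.
    let config = Config::new(&args);
    println!("{} in {}", config.query, config.filename);
}
```

Note that with fewer than three elements in args, the indexing inside new still panics, so this constructor does not yet address the error-handling problems listed earlier.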
12.3.6 The Full Code
Here is all the code written up to this article:
use std::env;
use std::fs;
struct Config {
    query: String,
    filename: String,
}
fn main() {
    let args: Vec<String> = env::args().collect();
    let config = Config::new(&args);
    let contents = fs::read_to_string(config.filename)
        .expect("Something went wrong while reading the file");
    println!("With text:\n{}", contents);
}
impl Config {
    fn new(args: &[String]) -> Config {
        let query = args[1].clone();
        let filename = args[2].clone();
        Config {
            query,
            filename,
        }
    }
}