Continuing my journey with Rust, I recently completed a suggested exercise from "The Book":
Using a hash map and vectors, create a text interface to allow a user to add employee names to a department in a company. For example, “Add Sally to Engineering” or “Add Amir to Sales.” Then let the user retrieve a list of all people in a department or all people in the company by department, sorted alphabetically.
Having completed the exercise in Rust, I wondered what differences I would note if I implemented the same thing in Go, a language that I am much more comfortable with. To make the comparison easier, I wrote a similar implementation (with a similar code structure) in Go. Code for both the Rust and the Go versions is at the bottom of this article.
Lines of code
The first observation is that the Rust code comes in at 131 lines, and the Go code at 158. While the Rust code is a little shorter, I suspect that more knowledge of and experience with Rust would enable further simplifications. Rust seems to have a lot more language features than Go. While language features can be helpful, they do have a cost: there's a lot more you need to know and keep in your head in order to read other people's code (and sometimes your own). There are some advantages to a smaller feature set, even if it means you sometimes write a handful of extra lines when implementing certain things. It can be easier to read and interpret a few extra lines of code that make the behaviour explicit than to read code that uses some clever convenience feature I only see once every few months. I think Go could do with a small number of extra language features, but I worry that Rust has too many. It remains to be seen whether Rust really has too much, or whether with practice the burden will seem light.
Control over use
Another thing I experienced when writing the Rust code is how it affords me a much greater ability to define exactly how my code is to be used. One problem that exists with Go code is that you can misuse packages in ways that break things. For example, in the Go implementation for this task, nothing stops a developer from declaring org := Organisation{} and then having the program crash when they try to add someone to a department, because the map was not initialised (writing to a nil map panics). In the Rust version, the same code would fail to compile unless you initialise the HashMap when you instantiate the struct. Furthermore, it seems that if we wanted in Rust to stop people from initialising the struct themselves at all -- e.g., to ensure the struct is filled out with some default data -- then we could separate it into a module with appropriate members being private so that external consumers cannot construct the struct directly.
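As a minimal sketch of that module idea (the module name `company` and the `members` helper are assumptions made for illustration, not part of the exercise code), keeping the field private means code outside the module has to go through the constructor:

```rust
mod company {
    use std::collections::HashMap;

    // The struct is public, but its field is private to this module,
    // so code outside `company` cannot write `Organisation { .. }` itself.
    pub struct Organisation {
        departments: HashMap<String, Vec<String>>,
    }

    impl Organisation {
        // The only way for outside code to obtain an Organisation.
        pub fn new() -> Self {
            Self {
                departments: HashMap::new(),
            }
        }

        pub fn add(&mut self, department: &str, name: &str) {
            self.departments
                .entry(department.to_string())
                .or_default()
                .push(name.to_string());
        }

        // Hypothetical helper, added only so the example has something to show:
        // returns the sorted names in a department (empty if it does not exist).
        pub fn members(&self, department: &str) -> Vec<String> {
            let mut names = self
                .departments
                .get(department)
                .cloned()
                .unwrap_or_default();
            names.sort();
            names
        }
    }
}

fn main() {
    // let org = company::Organisation { departments: Default::default() };
    // ^ rejected by the compiler: the field `departments` is private.

    let mut org = company::Organisation::new();
    org.add("engineering", "sally");
    org.add("engineering", "amir");
    println!("{:?}", org.members("engineering")); // prints ["amir", "sally"]
}
```

Since the private field can only be named inside `company`, the constructor becomes the single place where the map gets created.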
You can have private members in Go, but there are more hoops to jump through. To achieve something similar to Rust, we would likewise put 'Organisation' into its own package, and keep the departments field private. Unfortunately, developers can still instantiate the struct without first initialising the map. For example, suppose we created the following package to import:
package company
type Organisation struct {
departments map[string][]string
}
func (o *Organisation) Add(department string, name string) {
if o.departments[department] == nil {
o.departments[department] = make([]string, 0)
}
o.departments[department] = append(o.departments[department], name)
}
...
func NewOrganisation() Organisation {
return Organisation{departments: make(map[string][]string)}
}
Anyone who uses the NewOrganisation constructor will have a properly initialised Organisation struct. However, a programmer could still avoid the constructor like so:
org := company.Organisation{}
And then any call to org.Add() will panic because the departments map is not initialised. We can work around this in a couple of ways. The most direct way is to have a check inside EVERY function that makes use of departments, which ensures that the field has been initialised. This means the creator of the package needs to be vigilant and check, in every function that accesses those fields, that all required fields are initialised as they should be.
Another option is to use an interface. We can define a new interface that specifies the kinds of methods we want to be able to call. We then make the Organisation struct private, so that the only way for an external package to obtain an instance is via the constructor. E.g.:
package main
type companyOrganisation interface {
Add(string, string)
...
}
package company
type organisation struct {
departments map[string][]string
}
func (o *organisation) Add(department string, name string) {
if o.departments[department] == nil {
o.departments[department] = make([]string, 0)
}
o.departments[department] = append(o.departments[department], name)
}
...
func NewOrganisation() organisation {
return organisation{departments: make(map[string][]string)}
}
The main package cannot access organisation to directly initialise it, and all calls are made through the interface. If we made the struct private without then declaring an interface, we couldn't write functions in our main package that have organisation listed in the function signature. The interface allows us to still do so, and also makes it easier to trade one implementation of an organisation for another. You can read more detail about this in "Bypassing Golang's lack of constructors".
Of course, these enhancements still don't protect the programmer in the company package from creating a new organisation struct without initialising it. In Rust, this doesn't happen: a struct literal has to give a value for every field, and because the field is a plain HashMap rather than an Option, there is no "empty" value to fall back on. The compiler complains if you create a new instance of the struct without initialising the HashMap.
In Go we can achieve a great deal of the safety Rust has in this case, but not all of it, and only with a bit more effort. I don't necessarily mind if my code is more verbose, but even using interfaces in Go doesn't give you the same compile-time safety as Rust does in this instance.
Referencing a HashMap entry
One of the things that Rust seems to be praised for is that it helps programmers avoid many common mistakes. I was therefore a little surprised to see that the following code will compile:
use std::collections::HashMap;
fn main() {
let hm: HashMap<String, String> = HashMap::new();
let msg = &hm[&String::from("does not exist")];
println!("{}", msg);
}
Since "does not exist" does not exist as a key, it panics:
Finished dev [unoptimized + debuginfo] target(s) in 0.54s
Running `target/debug/play`
thread 'main' panicked at 'no entry found for key', src/main.rs:6:16
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
This happens because fetching a value by its key using [key] either returns the value, or panics. What I thought might happen in such an instance is that the HashMap would return an Option as a way to either provide the value, or tell us that there is None. Rust does let you fetch a value via the get method, which returns an Option that then needs to be checked for Some or None. Using get will help you avoid requesting a value for a key that does not exist. The implementation for the index, that is, fetching by [key], appears to just be a call itself to get, returning the value wrapped by Some, or panicking if not:
fn index(&self, key: &Q) -> &V {
self.get(key).expect("no entry found for key")
}
To avoid making mistakes like this, we can just avoid fetching by index, and instead use get every time. If there are cases where we want to panic if the key doesn't exist, then we can explicitly use get(key).expect("no entry found for key") to make it clearer that we want to panic.
In short, it seems a little odd to me at the moment that the default behaviour for fetching a value by index, in the way you do in many other languages, leads you to miss out on valuable compile-time checks. A potential trap for newer programmers coming from other languages?
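To make the difference concrete, here is a small sketch of a lookup that goes through get and handles the missing key explicitly. The `greeting_for` helper and its fallback text are assumptions made for this example only:

```rust
use std::collections::HashMap;

// Hypothetical helper: look up a greeting, falling back to a placeholder
// instead of panicking when the key is missing.
fn greeting_for(greetings: &HashMap<String, String>, lang: &str) -> String {
    // `get` returns Option<&String>, so the compiler makes us deal with None.
    match greetings.get(lang) {
        Some(msg) => msg.clone(),
        None => String::from("(no greeting)"),
    }
}

fn main() {
    let mut greetings: HashMap<String, String> = HashMap::new();
    greetings.insert(String::from("en"), String::from("hello"));

    println!("{}", greeting_for(&greetings, "en")); // key exists
    println!("{}", greeting_for(&greetings, "fr")); // key missing, no panic

    // By contrast, `&greetings["fr"]` would compile and then panic at runtime.
}
```

The cost is a few extra lines at each lookup; the benefit is that the missing-key case is visible in the code rather than hidden behind the index operator.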
No redundant return values
One of the nice features about Go is that you can return multiple values without having to define a new object or struct that can carry all the values you want to return. In Go code, we frequently return both the value that is desired, as well as an error value that may or may not be nil.
For most functions, when we receive an error, we discard any returned value as possibly incorrect, or corrupt, or nil, or just the default value. That means we typically only care about two cases: the value when error is nil, or the error when error is not nil. In Go, however, four possible combinations of values are returned (a similar point is made in Aiming for correctness with types):
- Value is provided, error is nil
- Value is provided, error is not nil
- Value is not provided, error is nil
- Value is not provided, error is not nil
In Rust, with the Result enum we can cover exactly the two cases we care about: Ok with the value provided, or Err with an error, and thereby avoid redundant returned values. In the particular programming exercise that this blog post was inspired by, I made use of enums to describe the possible return values from executions of various commands:
enum Action {
Failure(String),
Done,
Quit,
}
For Done and Quit, no value needs to be returned from the function, but for Failure we want to know what the returned message is so we can do something about it -- in this case, print the failure message for the user.
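Returning to the Result point for a moment, the two-case shape can be sketched like this. The `parse_add` function is a hypothetical stand-in and not part of the exercise code below, which uses the Action enum instead:

```rust
// Hypothetical parser for "add <name> to <department>".
// The caller gets either the parsed pair or an error message, never both.
fn parse_add(input: &str) -> Result<(String, String), String> {
    let parts: Vec<&str> = input.split_whitespace().collect();
    if parts.len() != 4 || parts[0] != "add" || parts[2] != "to" {
        return Err(String::from(
            "Expect command of form: add <name> to <department>",
        ));
    }
    Ok((parts[1].to_string(), parts[3].to_string()))
}

fn main() {
    let inputs = vec!["add sally to engineering", "add sally"];
    for input in inputs {
        // Only two cases to handle: a value, or an error.
        match parse_add(input) {
            Ok((name, department)) => println!("adding {} to {}", name, department),
            Err(msg) => println!("error: {}", msg),
        }
    }
}
```

There is no way for `parse_add` to hand back a value together with an error, or neither of them, which is exactly the set of combinations the Go signature leaves open.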
In Go, I used a similar pattern to the Action enum, in the form of a new type:
type Action string
var (
ActionQuit Action = "quit"
ActionDone Action = "done"
ActionFailure Action = "failure"
)
However, unlike in Rust, I can't send a message along with ActionFailure. So in this case ActionFailure, despite being returned, is never checked, since I always accompany it with an error. It would be good if I could enforce the requirement that every ActionFailure be accompanied by a message of some sort. In Rust, it's nice that enums can carry data like this, and moreover, the compiler reports an error if a match statement does not handle every enum variant. Compare that to Go, where leaving a value of the type out of a switch statement will not produce any kind of warning from the compiler.
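A small sketch of both properties, using the same Action enum as the exercise. The `handle` function is an assumption made for the example; the real program matches directly in main:

```rust
enum Action {
    Failure(String), // the message travels inside the variant
    Done,
    Quit,
}

// Hypothetical helper: returns the text to show the user (if any)
// and whether the command loop should keep going.
fn handle(action: Action) -> (Option<String>, bool) {
    match action {
        Action::Failure(msg) => (Some(msg), true),
        Action::Done => (None, true),
        Action::Quit => (None, false),
        // Deleting any arm above, without adding a wildcard `_` arm,
        // makes the match non-exhaustive and the program stops compiling.
    }
}

fn main() {
    let actions = vec![
        Action::Done,
        Action::Failure(String::from("Unknown command foo")),
        Action::Quit,
    ];
    for action in actions {
        let (message, keep_going) = handle(action);
        if let Some(msg) = message {
            println!("{}", msg);
        }
        if !keep_going {
            break;
        }
    }
}
```

Because `Failure` cannot be constructed without a String, the "every failure has a message" requirement is enforced by the type itself.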
Moreover, while I haven't tested it, I assume that if I were to put this into a separate module in Rust, it wouldn't be possible for consumers of the module to invent their own enum entries. In Go, as long as the type is public, any programmer can create their own entries and use them to return meaningless actions. If I make the type private, then it can't be put in function signatures by external consumers of the package.
The code
That's the end of my observations for now. Regarding the exercise itself, I did not follow the instructions for this quite exactly. Among other tweaks, I added 'help' and 'quit' commands, while also assuming everything is entered in lower case.
Rust
As mentioned before, I am in the process of learning Rust and its myriad features. I have no doubt there will be better ways to write this. If I were to take this further, I would also implement handling of both upper and lower case for the commands, as well as splitting 'Organisation' into its own module to control its usage better.
use std::io;
use std::io::Write;
use std::collections::HashMap;
enum Action {
Failure(String),
Done,
Quit,
}
struct Organisation {
departments: HashMap<String, Vec<String>>,
}
impl Organisation {
fn new() -> Organisation {
return Organisation{
departments: HashMap::new(),
}
}
fn add(&mut self, department: String, name: String) {
self.departments.entry(department)
.or_insert_with(Vec::new)
.push(name);
}
fn print(&self) {
let mut keys: Vec<String> = self.departments.keys().cloned().collect();
keys.sort();
for key in keys {
println!("{}:", key);
self.print_department(key);
}
}
fn print_department(&self, name: String) {
let dep_op = self.departments.get(&name);
let mut dep = match dep_op {
None => {
println!("No department named '{}' found", name);
return
},
Some(x) => x.clone(),
};
dep.sort();
for name in dep {
println!("* {}", name);
}
}
}
fn main() {
let mut org = Organisation::new();
println!("Hi! Enter 'help' for information on available commands.");
loop {
let mut cmd = String::new();
print!("> ");
let _ = io::stdout().flush();
io::stdin().read_line(&mut cmd).expect("Did not enter a correct string");
let action = parse_command(&mut cmd, &mut org);
match action {
Action::Failure(s) => println!("{}", s),
Action::Quit => break,
Action::Done => continue,
}
}
}
fn parse_command(cmd: &mut String, org: &mut Organisation) -> Action {
let parts: Vec<&str> = cmd.split_whitespace().collect();
if parts.len() == 0 {
return Action::Failure(String::from("No commands found"));
}
match parts[0] {
"add" => return add_command(parts, org),
"help" => return help_command(),
"list" => return list_command(parts, org),
"quit" => return Action::Quit,
_ => return Action::Failure(format!("Unknown command {}", parts[0])),
}
}
fn help_command() -> Action {
let help_msg = r#"
add: add <person> to <department>
help: print this message
list: view all people and departments
quit: exit program
"#;
println!("{}", help_msg);
return Action::Done;
}
fn add_command(parts: Vec<&str>, org: &mut Organisation) -> Action {
let expected = String::from("Expect command of form: add <name> to <department>");
if parts.len() != 4 {
return Action::Failure(expected);
}
if parts[2] != "to" {
return Action::Failure(expected);
}
org.add(String::from(parts[3]), String::from(parts[1]));
return Action::Done;
}
fn list_command(parts: Vec<&str>, org: &mut Organisation) -> Action {
if parts.len() == 1 {
org.print();
return Action::Done;
}
if parts.len() == 2 {
org.print_department(String::from(parts[1]));
return Action::Done;
}
return Action::Failure(String::from("Unexpected usage. Expect 0 or 1 parameters for list"));
}
Go
Like with the Rust code, and as discussed before, there are enhancements we could make such as switching the Organisation struct into a separate package, and using interfaces to control access.
package main
import (
"bufio"
"fmt"
"os"
"sort"
"strings"
)
type Action string
var (
ActionQuit Action = "quit"
ActionDone Action = "done"
ActionFailure Action = "failure"
)
type Organisation struct {
departments map[string][]string
}
func (o *Organisation) Add(department string, name string) {
if o.departments[department] == nil {
o.departments[department] = make([]string, 0)
}
o.departments[department] = append(o.departments[department], name)
}
func (o *Organisation) Print() {
keys := make([]string, 0)
for k := range o.departments {
keys = append(keys, k)
}
sort.Strings(keys)
for _, k := range keys {
fmt.Println(k + ":")
o.PrintDepartment(k)
}
}
func (o *Organisation) PrintDepartment(department string) {
dep, ok := o.departments[department]
if !ok {
fmt.Printf("No department named '%s' found\n", department)
return
}
sort.Strings(dep)
for _, name := range dep {
fmt.Println("* " + name)
}
}
func NewOrganisation() Organisation {
return Organisation{departments: make(map[string][]string)}
}
func main() {
// org := NewOrganisation()
org := NewOrganisation()
reader := bufio.NewReader(os.Stdin)
fmt.Println("Hi! Enter 'help' for information on available commands.")
done := false
for !done {
fmt.Print("> ")
text, err := reader.ReadString('\n')
if err != nil {
panic(err)
}
text = strings.TrimSpace(text)
action, err := parseCommand(text, org)
if err != nil {
fmt.Println(err.Error())
continue
}
switch action {
case ActionQuit:
done = true
break
case ActionDone:
continue
}
}
}
func parseCommand(cmd string, org Organisation) (Action, error) {
parts := strings.Fields(cmd)
if len(parts) == 0 {
return ActionFailure, fmt.Errorf("No commands found")
}
switch parts[0] {
case "add":
return addCommand(parts, org)
case "help":
return helpCommand()
case "list":
return listCommand(parts, org)
case "quit":
return ActionQuit, nil
default:
return ActionFailure, fmt.Errorf("Unknown command %s", parts[0])
}
}
func helpCommand() (Action, error) {
msg := `
add: add <person> to <department>
help: print this message
list: view all people and departments
quit: exit program
`
fmt.Println(msg)
return ActionDone, nil
}
func addCommand(parts []string, org Organisation) (Action, error) {
errExp := fmt.Errorf("Expect command of form: add <name> to <department>")
if len(parts) != 4 {
return ActionFailure, errExp
}
if parts[2] != "to" {
return ActionFailure, errExp
}
org.Add(parts[3], parts[1])
return ActionDone, nil
}
func listCommand(parts []string, org Organisation) (Action, error) {
if len(parts) == 1 {
org.Print()
return ActionDone, nil
}
if len(parts) == 2 {
org.PrintDepartment(parts[1])
return ActionDone, nil
}
return ActionFailure, fmt.Errorf("Unexpected usage. Expect 0 or 1 parameters for list")
}
Top comments (4)
In Rust, the last expression of a function is automatically returned, so a lot of the returns in your code can be easily removed. Also, within an impl block you can use Self to refer to the type you're implementing stuff on. For example, in [...] you can remove the return and use Self to make it [...]. Similarly you can do [...]. match is also an expression, so your list_command fn can be [...].

Thanks for those tips. I was aware that you can return without using the keyword 'return', but old habits kicked in when I wrote this :)
Thinking about it, the explicitness of having the 'return' keyword is something I find nice, to quickly scan a function to easily identify the places it returns.

With regard to your comments about indexing operations panicking when out of bounds, this makes an interesting read. In that context indexing out of bounds would be considered a programmer error, so panicking is the appropriate action.
Also consider how painful it would be to use if it did return an Option - in most cases you would end up liberally sprinkling unwrap or expect all over the place. It would be especially weird where an indexed array is on the LHS of an assignment: [...]
That is not valid syntax, of course. container[index] is actually syntactic sugar for *container.index_mut(index), which could not work if index_mut returned an option.
For slices there is also get_unchecked, an unsafe method which does no bounds checking.

Besides the return and weird naming (should be snake_case rather than camelCase and PascalCase like Go), I would just use match against the slice rather than the first item, to prevent the user from specifying extra values and to avoid needing to check len(parts). This can remove those functions and do all the checks inside the match, and can probably save 10-30 lines as it would remove the need for Action. You can also just derive Default rather than writing a new function.
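The suggestion in that last comment could look roughly like the following. This is only one reading of it: the `run_command` name, its boolean return value and the exact messages are assumptions, and only the add, single-department list and quit commands are shown.

```rust
use std::collections::HashMap;

// Deriving Default gives `Organisation::default()` with an empty map,
// standing in for a hand-written `new`.
#[derive(Default)]
struct Organisation {
    departments: HashMap<String, Vec<String>>,
}

// Hypothetical reshaping of parse_command: matches on the whole slice of
// words and returns false when the command loop should stop.
fn run_command(cmd: &str, org: &mut Organisation) -> bool {
    let parts: Vec<&str> = cmd.split_whitespace().collect();
    match parts.as_slice() {
        // Only matches exactly four words in this shape, so no len() check.
        ["add", name, "to", department] => {
            org.departments
                .entry(department.to_string())
                .or_default()
                .push(name.to_string());
        }
        ["list", department] => match org.departments.get(*department) {
            Some(names) => {
                let mut sorted = names.clone();
                sorted.sort();
                for name in sorted {
                    println!("* {}", name);
                }
            }
            None => println!("No department named '{}' found", department),
        },
        ["quit"] => return false,
        [] => println!("No commands found"),
        _ => println!("Unknown or malformed command"),
    }
    true
}

fn main() {
    let mut org = Organisation::default();
    run_command("add sally to engineering", &mut org);
    run_command("add amir to engineering", &mut org);
    run_command("list engineering", &mut org);
}
```

Matching on the slice means that input with extra words, such as "add sally to engineering now", falls through to the catch-all arm without any explicit length check.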