Aha! Understanding lifetimes in Rust

#rust #lifetimes #programming

When I'm learning something complex, I often find myself waiting around for the Eureka effect (also known as the "Aha! moment"). Wikipedia defines the Eureka effect as the "common human experience of suddenly understanding a previously incomprehensible problem or concept". The effect has been recognised for thousands of years and was most famously popularised by the great Greek mathematician Archimedes, who discovered how to measure the volume of an object whilst sitting in his bath.

In this post I'd like to share my own "Aha!" moments whilst grappling with lifetimes in Rust. I hope this helps some of you in your journey to the summit of Rust mountain.

I will assume that the readers already know some basic Rust and have read the documentation on ownership, lifetimes and borrowing.

A simple scenario

First, I want to propose a simple, suitably contrived snippet that requires us to stipulate a lifetime:

// Our Person struct consists of a name and an optional reference to another Person.
#[derive(Debug, Eq, PartialEq)]
struct Person<'a> {
    name: &'static str,
    parent: Option<&'a Person<'a>>,
}

impl<'a> Person<'a> {
    fn new(name: &'static str, parent: Option<&'a Person<'a>>) -> Person<'a> {
        Person { name, parent }
    }

    // Rust lets us elide the lifetime of &self here.
    fn parents_name(&self) -> Option<&'static str> {
        self.parent.map(|p| p.name)
    }
}

And some code that uses it:

let jane = Person::new("Jane", None);
let tom = Person::new("Tom", Some(&jane));

assert_eq!(tom.parent.unwrap().name, "Jane");
assert_eq!(tom.parents_name(), Some("Jane"));
assert_eq!(jane.parent, None);
assert_eq!(jane.parents_name(), None);
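The comment on parents_name mentions elision, so here is a minimal sketch of what the compiler fills in for us. The method parents_name_explicit and the lifetime name 's are hypothetical, added purely for comparison; both methods should behave identically.

```rust
#[derive(Debug, Eq, PartialEq)]
struct Person<'a> {
    name: &'static str,
    parent: Option<&'a Person<'a>>,
}

impl<'a> Person<'a> {
    // Elided form, as in the snippet above.
    fn parents_name(&self) -> Option<&'static str> {
        self.parent.map(|p| p.name)
    }

    // Hypothetical twin with the lifetime of `&self` written out as 's.
    // The return type is 'static, so it does not depend on 's at all.
    fn parents_name_explicit<'s>(&'s self) -> Option<&'static str> {
        self.parent.map(|p| p.name)
    }
}

fn main() {
    let jane = Person { name: "Jane", parent: None };
    let tom = Person { name: "Tom", parent: Some(&jane) };

    // Both spellings give the same answer.
    assert_eq!(tom.parents_name(), tom.parents_name_explicit());
    assert_eq!(jane.parents_name(), jane.parents_name_explicit());
}
```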

It's just a contract!

If we omit the lifetime specifiers in the previous example, we run into problems:

$ cargo test
  Compiling lifetimes v0.1.0 (file:///home/andrew/dev/rust/lifetimes)
  src/lib.rs:4:20: 4:27 error: missing lifetime specifier [E0106]
  src/lib.rs:4     parent: Option<&Person>,

Why is this? Well, it's because Rust needs a way to ensure that a Person holding a borrow never outlives the Person it is borrowing. And therein lay my first "Aha!". When we specify a lifetime explicitly, we are simply entering a contract with the compiler guaranteeing that the given resource will be available for a certain scope. We are, in a sense, making a promise to the compiler. But we all know that promises can be broken, and so Rust still makes all of the necessary checks for us.

We are not "creating" lifetimes. Nor are we telling Rust to allow a reference to exist somewhere that it shouldn't. We are just telling the compiler to complain if the calling code breaks the lifetime rules we've dictated.
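To make the contract idea concrete, here is a small sketch. The describe function is a hypothetical helper of my own, and the commented-out block shows calling code that I would expect the compiler to reject (with a "does not live long enough" error) because it breaks the promise.

```rust
struct Person<'a> {
    name: &'static str,
    parent: Option<&'a Person<'a>>,
}

// Hypothetical helper, used only to give the example something to print.
fn describe(person: &Person<'_>) -> String {
    match person.parent {
        Some(parent) => format!("{} is the child of {}", person.name, parent.name),
        None => format!("{} has no recorded parent", person.name),
    }
}

fn main() {
    // Contract honoured: jane is declared first and lives at least as long as tom.
    let jane = Person { name: "Jane", parent: None };
    let tom = Person { name: "Tom", parent: Some(&jane) };
    println!("{}", describe(&tom));
    println!("{}", describe(&jane));

    // Contract broken: the inner jane is dropped at the end of the block
    // while `orphan` still holds a reference to her. Uncommenting these
    // lines should produce a compile error rather than a dangling pointer.
    //
    // let orphan;
    // {
    //     let jane = Person { name: "Jane", parent: None };
    //     orphan = Person { name: "Tom", parent: Some(&jane) };
    // }
    // println!("{}", describe(&orphan));
}
```

Note that the annotations on Person are identical in both halves of main. Only the calling code changes, and only the calling code gets blamed.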

What's in a name?

So exactly which scope is 'a? Is it referring to something in our Person type? No, 'a is defined by the calling code. Our type just dictates how long the value we are borrowing must live. Think of it like "for any lifetime 'a" rather than "the particular lifetime called 'a". The constructor demonstrates this pretty well. Look at the function signature:

fn new(name: &'static str, parent: Option<&'a Person<'a>>) -> Person<'a>

Here we are saying "new is a function that accepts a reference to a Person that must live for a lifetime we are going to call 'a, and it returns a new Person that contains a reference that must live for the same lifetime".

See that? We've stated at the syntactic level that the borrowed value (parent) will have the same lifetime as the thing that's borrowing it (the return value). In reality, the borrowed value may outlive its container. Our lifetime annotations just refer to the smallest lifetime for which the value must exist.
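Here is a rough sketch of the borrowed value outliving its container. It reuses the Person type and constructor from earlier; the nested block is just my way of giving tom a visibly shorter scope than jane.

```rust
struct Person<'a> {
    name: &'static str,
    parent: Option<&'a Person<'a>>,
}

impl<'a> Person<'a> {
    fn new(name: &'static str, parent: Option<&'a Person<'a>>) -> Person<'a> {
        Person { name, parent }
    }
}

fn main() {
    // jane lives until the end of main.
    let jane = Person::new("Jane", None);

    {
        // tom only lives inside this block. The caller (this code) decides
        // what 'a is: some lifetime that covers tom's use of the borrow.
        let tom = Person::new("Tom", Some(&jane));
        assert_eq!(tom.name, "Tom");
        assert_eq!(tom.parent.unwrap().name, "Jane");
    } // tom is dropped here and the borrow of jane ends.

    // jane has outlived the Person that was borrowing her.
    assert_eq!(jane.name, "Jane");
    assert!(jane.parent.is_none());
}
```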

But why?

Why can't Rust just deal with this for us under the hood? And why can't Rust just ensure that an object lasts for exactly as long as it's required by the program? For example, an object that escapes its scope in Go will be heap-allocated and dealt with by the garbage collector at a later time. Well, that's just it: such conveniences require a garbage collector, which would give our programs a performance hit.

Edit: As told by Reddit user andytoshi in this comment, Rust's ownership and borrowing systems do a lot more than simply "avoid garbage collection for performance reasons". They also, potentially more importantly, guarantee that any borrowed value you have access to will not change out from underneath you, even if you have a mutable reference to something. This gives our code a higher degree of reasonability and reduces overall complexity.

Rust's ownership and lifetime systems allow the compiler to validate the memory safety of our programs at compile-time. We don't have to delegate this task to a garbage collector at runtime, as in Ruby, Python, Go, etc., nor do we have to manually allocate and deallocate memory ourselves, as in C. In fact, it's probably worth mentioning at this point that all this mumbo-jumbo about lifetimes is only of concern at compile-time. We are only referring to the scopes that our borrows must exist in.

Rust's memory model is one of a few things that Rust refers to as a "zero-cost abstraction", which brings me to my next Aha!

Zero-cost to our programs, not to us!

When I first started reading about some of these zero-cost abstractions, I wondered how the term really applied here, as the learning curve involved certainly didn't seem like a "zero-cost". But then I realised that when Rustaceans talk about zero-cost, they are talking solely about the runtime performance of programs. Aha! In fact, in the case of lifetimes, Rust is actually removing the necessity of a garbage collector entirely by pushing additional rules onto the programmer. If your program compiles, it's almost certainly memory safe. These rules force us to think about the memory safety of the way we are programming upfront rather than in bug fixes months down the track. Yes, it's potentially more work, but it will undoubtedly save us headaches in the future!
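One way to convince yourself that lifetimes leave nothing behind at runtime is to look at sizes. This is only a sketch, and it assumes a typical target where a pointer is the same width as usize, but it suggests that a reference carrying a lifetime is still just a pointer.

```rust
use std::mem::size_of;

#[allow(dead_code)] // The struct is only measured here, never constructed.
struct Person<'a> {
    name: &'static str,
    parent: Option<&'a Person<'a>>,
}

fn main() {
    // A reference to a sized type is one pointer wide. The lifetime is
    // checked at compile time and takes up no space in the value.
    assert_eq!(size_of::<&Person>(), size_of::<usize>());

    // Option<&T> is guaranteed to be the same size as &T, because None
    // can be represented by the null pointer.
    assert_eq!(size_of::<Option<&Person>>(), size_of::<&Person>());

    // A string slice is a pointer plus a length, whatever its lifetime.
    assert_eq!(size_of::<&'static str>(), 2 * size_of::<usize>());

    println!("&Person is {} bytes", size_of::<&Person>());
}
```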

One last point

Often when I'm dealing with simple structs I find myself wondering why Rust can't elide here like it does in certain function declarations. For example, when we say:

struct Person {
    name: &str,
    parent: Option<&Person>,
}

Can't Rust just translate it to this under the hood?

struct Person<'a> {
    name: &'a str,
    parent: Option<&'a Person<'a>>,
}

This point has caused some discussion in the Rust community. Some core contributors even believe that modern Rust should elide in this scenario.

It's important to remember here that Rust often favours explicitness over succinctness. I think the example Yehuda Katz gives in his reply in the thread above is a good one. If we were able to omit the lifetime annotations then implementing code may end up looking something like this:

let people: Vec<Person> = …;

We have suddenly lost the fact that Person contains a borrowed value. In complex, real-life software, this could lead to confusing bugs. But with that said, I imagine that there will be some movement in this area in the coming versions of Rust.
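For what it's worth, Rust now has a placeholder lifetime, '_, that offers a middle ground: we can flag that a type borrows something without having to name the lifetime. As far as I know there is also an allow-by-default lint, elided_lifetimes_in_paths, that asks for it. The sketch below uses a hypothetical helper, names_with_parent, purely for illustration.

```rust
struct Person<'a> {
    name: &'static str,
    parent: Option<&'a Person<'a>>,
}

// Hypothetical helper: the '_ in Person<'_> tells the reader that a borrow
// is hiding in there, without forcing us to invent a name for it.
fn names_with_parent(people: &[Person<'_>]) -> Vec<&'static str> {
    people
        .iter()
        .filter(|p| p.parent.is_some())
        .map(|p| p.name)
        .collect()
}

fn main() {
    let jane = Person { name: "Jane", parent: None };
    let tom = Person { name: "Tom", parent: Some(&jane) };
    let ann = Person { name: "Ann", parent: Some(&jane) };

    // The annotation makes it visible that this Vec holds borrowed data.
    let people: Vec<Person<'_>> = vec![tom, ann];
    assert_eq!(names_with_parent(&people), vec!["Tom", "Ann"]);
}
```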

The End

That's all I have for the moment. I hope that something in here helped you on your journey. And if you've noticed that I've said something that's incorrect, please feel free to correct me. Thanks in advance!