Eugene Sirota

Posted on Jan 5 • Edited on Jan 7

Rust: Ownership/Borrowing and Memory Leak Prevention

#rust #ownership #borrowing #memory

Introduction

The Rust language is based on the following core ideas:

Memory safety, including preventing memory leaks without garbage collection.
Zero-cost abstractions.
A combination of imperative and functional programming paradigms.

This article focuses on how memory-leak prevention is implemented in Rust, specifically through a set of compile-time rules known as the ownership and borrowing rules. These rules lie at the core of the language. Without understanding them, programming in Rust is practically impossible.

Remark: Ownership/borrowing is not only about preventing memory leaks, but also about avoiding use-after-free bugs, data races, and other memory-safety issues. However, in this article, we focus only on the memory-leak aspect.

The problem

So, what is a memory leak, and what issues does it cause? A memory leak can be defined as memory consumption that grows over time while the consumed memory is no longer used for anything useful. For example, consider a program with a repetitive procedure that allocates a new piece of memory on every iteration and never deallocates it. If the program is short-lived, like a console utility that performs only a few such iterations, this might not be a problem at all. As soon as it finishes its work, the operating system reclaims all the memory the program consumed. But if the program is long-lived, like a server-side application that has to handle incoming requests indefinitely, then after some time it will consume all the available memory and crash, possibly destabilizing the whole system. For such a long-lived program, it’s vitally important to eliminate memory leaks.

Possible approaches to solving the problem

There are three possible ways to handle memory leaks:

1. Manual allocation and deallocation of memory.

Languages: C, C++.

Pros:

Full memory control.
Simple concept.

Cons:

Extremely error-prone.

It’s hard to write a big program without memory leaks using this approach. This is one reason why languages such as C and C++ are a less common choice for server-side development, while they remain popular in areas such as system programming.

2. Garbage collection: memory is allocated and deallocated automatically.

Languages: C#, Go, Java, JavaScript, Python.

Pros:

No need to worry (relatively) about memory management.

Cons:

The garbage collector pauses program execution during garbage collection, which can cause small non-deterministic freezes.
It can work inefficiently with a large number of small objects.
If your program relies heavily on memory, then you may find yourself in a situation where you need to fight the garbage collector.

This approach is popular in the development of server-side applications, and explains why languages such as C#, Go, Java, and others are de facto standards for server-side development.

3. Enforcement of special rules that guarantee memory-leak-free code if those rules are followed.

Languages: Rust.

Pros:

Absence of most memory leaks. Leaks remain possible in unsafe code and in a few edge cases in safe code, such as reference cycles built with Rc<T>.
No need for a garbage collector.

Cons:

Steep learning curve.
More complex and verbose code.

This approach is uniquely realized in the Rust language. It makes the Rust language suitable for writing programs in any domain, be it system, server-side, or client-side applications.

Memory regions

For efficient memory usage, memory is organized into regions, each with its own purpose. Memory leaks are not possible in every one of these regions, so let’s consider each of them.

1. Read-only static data. This kind of data is embedded into the binary code of a program or library. It lives for the entire lifetime of a program. It cannot cause memory leaks.

let s: &'static str = "Hello";

| *** binary *** |
|                | 
|--------------- |
| *** static *** |
|                |
|     Hello      |
|________________|

The string Hello is embedded into the binary and lives for the whole execution of the program. Rust marks such data with the 'static lifetime, hence the type of the variable s is &'static str: a reference to a string slice that lives in the static memory region.

2. Stack. A stack is a memory region where a function stores its arguments, local variables, and return address. It’s worth mentioning that, instead of placing it on the stack, the compiler can put some of this data in CPU registers, but for the sake of simplicity, we will assume that it’s not the case.

let s: &'static str = "Hello";

In this example, while the string Hello lives in the static data region, the variable s, namely the pointer to the Hello string slice, lives on the stack:

| *** stack ***  |  | *** static *** |
|                |  |                | 
| s.pointer ---- |--|---> Hello      |
|                |  |                |
|________________|  |________________|

Another example:

struct Point { x: f64, y: f64 }
let p = Point { x: 3.14, y: 2.72 };

| *** stack *** | 
|               | 
|   p.x:3.14    |
|   p.y:2.72    |
|_______________|

Here, the variable p is of type Point, a struct with two fields. The variable p is not a pointer, and the whole data resides on the stack.

Data on the stack is organized in a Last In, First Out (LIFO) manner. For example, when a function ends its execution, all data on the stack that was allocated for this function is freed automatically, and the top of the stack points to the data allocated for the previous function.

| *** stack ***  | 
|                | 
| * function A * |
| return addr    |
| arguments      |
| local vars     |
| calls B <----- |----|
|--------------- |    |
| * function B * |    |
| return addr ---|----|
| arguments      |
| local vars     |
| calls C <----- |----|
|--------------- |    |
| * function C * |    |
| return addr ---|----|
| arguments      |
| local vars     |
|________________|

In the diagram, we have stack memory allocation for three functions, A, B, and C, where function A calls function B and function B calls function C. When function C ends its execution, the stack will look like this:

| *** stack ***  | 
|                | 
| * function A * |
| return addr    |
| arguments      |
| local vars     |
| calls B <----- |----|
|--------------- |    |
| * function B * |    |
| return addr ---|----|
| arguments      |
| local vars     |
|________________|

The data allocation and deallocation on the stack happen in a deterministic and automatically controlled manner, and hence, memory leaks are not possible in this memory region.
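
The LIFO order described above can be observed with a minimal sketch. The function names a, b, c and the helper call_order are hypothetical, and the log vector exists only to record the order in which the functions are entered and left:

```rust
// Hypothetical functions mirroring the A -> B -> C call chain from the diagram.
fn c(log: &mut Vec<&'static str>) {
    log.push("enter C");
    log.push("leave C"); // C's stack frame is released first
}

fn b(log: &mut Vec<&'static str>) {
    log.push("enter B");
    c(log);
    log.push("leave B");
}

fn a(log: &mut Vec<&'static str>) {
    log.push("enter A");
    b(log);
    log.push("leave A"); // A's stack frame is released last
}

// Runs the call chain and returns the recorded order of events.
fn call_order() -> Vec<&'static str> {
    let mut log = Vec::new();
    a(&mut log);
    log
}

fn main() {
    for line in call_order() {
        println!("{line}");
    }
}
```

The function entered last is always the first one to leave, which is exactly the order in which stack frames are released.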

3. Heap.

If data which is stored on the stack does not cause memory leaks, then why don't we store all the data there? The stack has some limitations:

Local variables of function A are only visible to function A and not to function B. But what if both functions A and B need access to the same data?
Most language compilers, including Rust, require that the size of the data stored on the stack be known at compile time. But what if we need to get an array of an arbitrary size from a user input?
The size of the stack memory is relatively small, usually about 1–8MB. To store data of a larger size, the stack is not suitable.

The heap allows us to overcome these limitations. A heap is the region of memory that allows allocating a block of memory of dynamically determined size at run-time. A pointer to that block of memory can be passed to any function that needs it, allowing multiple functions to work with the same data.
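
As a rough sketch of these two properties, the example below builds a Vec<u64> whose length is only known at run-time and then lets two functions read the same heap data. The names make_buffer, sum, and largest are hypothetical, and the length is hard-coded here where a real program might read it from user input:

```rust
// Builds a heap-allocated buffer whose length is only known at run-time.
fn make_buffer(len: usize) -> Vec<u64> {
    (0..len as u64).collect()
}

// Both functions below read the same heap data through a borrowed slice.
fn sum(data: &[u64]) -> u64 {
    data.iter().sum()
}

fn largest(data: &[u64]) -> Option<u64> {
    data.iter().copied().max()
}

fn main() {
    // Assumption: in a real program this value might come from user input.
    let len = 5;
    let buffer = make_buffer(len);
    println!("sum = {}, largest = {:?}", sum(&buffer), largest(&buffer));
}
```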

The heap comes with its downsides:

Allocation of memory on the heap is slower.
Deallocation should be managed either manually (C, C++), by a garbage collector (C#, Java, Go), or via an ownership mechanism (Rust).

Thus, every memory region should be used in accordance with the purpose and nature of the data to achieve maximum efficiency of a program.

In Rust, values live on the stack by default:

struct Point { x: f64, y: f64 }
let p = Point { x: 3.14, y: 2.72 };

Here, the value Point { x: 3.14, y: 2.72 } is on the stack. If you want to put a value on the heap, you have to wrap it in a special smart pointer, Box<T>:

struct Point { x: f64, y: f64 }
let p: Box<Point> = Box::new(Point { x: 3.14, y: 2.72 });

| *** stack *** |  | *** heap ***  |
|               |  |               | 
| p.pointer ----|--|---> x:3.14    |
|               |  |     y:2.72    |
|_______________|  |_______________|

The variable p is a smart pointer that lives on the stack and points to the value that lives on the heap. The smart pointer is responsible for the deallocation of the memory that was allocated for the value on the heap. The deallocation happens automatically once the variable p goes out of scope. For example, if p is a local variable of a function, then at the end of the execution of the function, the value Point { x: 3.14, y: 2.72 } will be removed from the heap, and the variable p, the pointer, will be removed from the stack:

fn a() {
    let p: Box<Point> = Box::new(Point { x: 3.14, y: 2.72 });
    ...
} // <-- The smart pointer `p` is going out of scope here. 
  // It will free memory on the heap and will be removed itself 
  // from the stack.

In Rust, a scope can be more granular than the scope of a function. In essence, a pair of curly braces defines a scope:

{
    ...
    {
         let p: Box<Point> = Box::new(Point { x: 3.14, y: 2.72 });  
         ...
    } // <-- The smart pointer `p` is going out of scope here. 
      // It will free memory on the heap. 
      // The stack memory outside this scope will be intact, though. 
      // The stack memory will be reclaimed when the function returns.
    ...
}

Box<T> is not the only smart pointer that can put a value on the heap. For example, the type String also acts as a smart pointer and keeps the string content on the heap. String is actually more than just a smart pointer: it also has fields that store the string length and capacity:

let s = String::from("Hello");

| *** stack *** |   | *** heap ***  |
|               |   |               | 
| s.pointer ----|---|---> Hello     |
| s.len:5       |   |               |
| s.cap:5       |   |               |
|_______________|   |_______________|

In this example, variable s has its fields pointer, length, and capacity on the stack, while the content of the string Hello is on the heap. When the variable s goes out of scope, it frees the memory on the heap allocated for the string Hello.
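
The layout in the diagram can be checked with a small sketch. The helper len_and_capacity is hypothetical. The exact capacity chosen by the allocator and the in-memory size of String are implementation details rather than documented guarantees, so the example only prints them:

```rust
use std::mem::size_of;

// Returns (length in bytes, capacity in bytes) of a freshly built String.
fn len_and_capacity(text: &str) -> (usize, usize) {
    let s = String::from(text);
    (s.len(), s.capacity())
} // `s` goes out of scope here and frees its heap buffer

fn main() {
    let (len, cap) = len_and_capacity("Hello");
    println!("len = {len}, cap = {cap}");

    // The stack part of a String holds a pointer, a length, and a capacity.
    println!("size_of::<String>()    = {}", size_of::<String>());
    println!("3 * size_of::<usize>() = {}", 3 * size_of::<usize>());
}
```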

Ownership

In Rust, every value has an owner. The owner’s scope determines how long the value lives. When the owner goes out of scope, the value is automatically dropped:

struct Point { x: f64, y: f64 }

{
    let p = Point { x: 3.14, y: 2.72 };
} // Variable `p` owns Point { x: 3.14, y: 2.72 }, 
  // and it is no longer accessible here because `p` goes out of scope 
  // and the value is dropped.

If a value’s type implements the Drop trait, then its drop implementation is called automatically. In the previous example, the type Point doesn’t implement the Drop trait, so nothing special will happen. But if we use a smart pointer type like Box<T>:

struct Point { x: f64, y: f64 }
{
    let p: Box<Point> = Box::new(Point { x: 3.14, y: 2.72 });
}

Then, as soon as the variable p goes out of scope, the drop implementation of Box<T> is called, which deallocates the heap memory that was allocated for the value Point { x: 3.14, y: 2.72 }. Hence, it avoids a memory leak.
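
To make the moment of the drop visible, here is a minimal sketch with a hypothetical Tracker type whose Drop implementation increments a counter. Neither Tracker nor the helper function is part of the examples above; they exist only to observe when drop runs:

```rust
use std::cell::Cell;

// A hypothetical type that counts how many times `drop` has run.
struct Tracker<'a> {
    drops: &'a Cell<u32>,
}

impl Drop for Tracker<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Returns the counter value observed (inside the scope, after the scope).
fn drops_inside_and_after_scope() -> (u32, u32) {
    let drops = Cell::new(0);
    let inside;
    {
        let _t = Box::new(Tracker { drops: &drops });
        inside = drops.get(); // the owner `_t` is still alive here
    } // `_t` goes out of scope: Tracker::drop runs and the Box frees the heap memory
    (inside, drops.get())
}

fn main() {
    let (inside, after) = drops_inside_and_after_scope();
    println!("drops inside scope: {inside}, after scope: {after}");
}
```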

Move

Ownership can be moved to another variable. If ownership is moved, then the new owner controls the value’s lifetime.

let p = Box::new(Point { x: 3.14, y: 2.72 });
let p2 = p;
// `p` is not accessible after this. 
// `p2` is now responsible for the value lifetime.

Here, ownership is moved from variable p to variable p2. Now, p2 owns the value and it is responsible for the heap deallocation, but the variable p owns nothing and is no longer accessible.

Remark: If a type implements the Copy trait, then ownership is not moved. Instead, p2 receives a bitwise copy of the value, so both variables hold their own independent copies and remain accessible.
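
A short sketch of the Copy case follows. The derive attribute on Point and the helper copy_then_modify are additions made for illustration; the Point used elsewhere in this article does not implement Copy:

```rust
// `Point` opts into copy semantics here; this derive is added for illustration only.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Point {
    x: f64,
    y: f64,
}

// Copies a point, changes the copy, and returns both values.
fn copy_then_modify() -> (Point, Point) {
    let p = Point { x: 1.5, y: 2.5 };
    let mut p2 = p; // bitwise copy, not a move
    p2.x = 0.0; // changes only the copy
    (p, p2) // `p` is still accessible here
}

fn main() {
    let (p, p2) = copy_then_modify();
    println!("p  = ({}, {})", p.x, p.y);
    println!("p2 = ({}, {})", p2.x, p2.y);
}
```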

With a move, a value can appear in a new scope. Here, a move happened from the outer scope to the inner scope:

{
     let p = Box::new(Point { x: 3.14, y: 2.72 });
     {
        let p2 = p; // The value is now bound to the variable `p2` 
        // in the inner scope and will be dropped as soon as 
        // the inner scope ends.

     } // The value is dropped here. 
     ...
}

And here, a move happened from the inner scope to the outer scope:

{
     let p: Box<Point>; // Declared here, initialized later; `mut` is not needed.
     {
        let p2 = Box::new(Point { x: 3.14, y: 2.72 });
        p = p2;  // The value is now bound to the variable `p` 
        // from the outer scope and will be dropped 
        // as soon as the outer scope ends.
     } 
     ...
} // The value is dropped here.

In the previous examples, a move happened through variable reassignment. But reassignment is not the only way to move ownership. Below is a list of other ways ownership can be moved:

By passing a value to a function as an argument:

let p = Box::new(Point { x: 3.14, y: 2.72 }); // `p` is the owner.
take_value(p);
// `p` is no longer the owner and is not accessible.

fn take_value(v: Box<Point>){
    // Parameter `v` is the owner of the passed value now.
    ...
} // The value is dropped at the end of the function execution.

By returning a value from a function:

fn return_value() -> Box<Point> {
    let r = Box::new(Point { x: 3.14, y: 2.72 }); // `r` is the owner.
    r // By returning a value from the function, `r` is no longer the owner. 
    // The value will be returned to the caller and will not be dropped 
    // at the end of the function.
} 

let p = return_value(); // `p` is the owner now. 
// The value will be dropped as soon as `p` goes out of scope.

By assigning to a field of a struct:

struct Point { x: f64, y: f64 }
struct TwoPoints { point1: Box<Point>, point2: Box<Point> }

let p1 = Box::new(Point { x: 1.1, y: 1.1 });
let p2 = Box::new(Point { x: 2.2, y: 2.2 });
let tp = Box::new(TwoPoints { point1: p1, point2: p2 });
// `p1` and `p2` are not accessible anymore, 
// and their values are now under `tp`’s ownership.

By moving out of a struct field:

struct Point { x: f64, y: f64 }
struct TwoPoints { point1: Box<Point>, point2: Box<Point> }

let tp = TwoPoints {
    point1: Box::new(Point { x: 1.1, y: 1.1 }),
    point2: Box::new(Point { x: 2.2, y: 2.2 })
};

let p = tp.point1; // Partial move happens here. 
// `tp.point1` is no longer valid and cannot be used. 
// `p` now owns the value of `point1`. 
// `tp.point2` is still valid and owns its value.

By moving into a closure:

struct Point { x: f64, y: f64 }

let p = Box::new(Point { x: 1.1, y: 2.2 });
let c = move || { drop(p); }; // The closure owns the value now.

// `p` is no longer accessible here and is no longer the owner of the value.

c(); // The value will be dropped when the closure finishes executing.
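
The effect of several of these moves on the value’s lifetime can be observed with a small sketch. The Tracker type and the functions consume, pass_through, and drop_counts are hypothetical; the counter only records how many values have been dropped so far:

```rust
use std::cell::Cell;

// A hypothetical type that counts how many times `drop` has run.
struct Tracker<'a> {
    drops: &'a Cell<u32>,
}

impl Drop for Tracker<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Takes ownership; the value is dropped when the function returns.
fn consume(_t: Box<Tracker<'_>>) {}

// Takes ownership and hands it back to the caller; nothing is dropped.
fn pass_through(t: Box<Tracker<'_>>) -> Box<Tracker<'_>> {
    t
}

// Returns the drop counter after each kind of move.
fn drop_counts() -> (u32, u32, u32) {
    let drops = Cell::new(0);

    let t = Box::new(Tracker { drops: &drops });
    let t = pass_through(t); // moved in and moved back out
    let after_return = drops.get();

    consume(t); // moved in and dropped inside `consume`
    let after_consume = drops.get();

    let t2 = Box::new(Tracker { drops: &drops });
    let c = move || drop(t2); // moved into the closure
    c(); // dropped when the closure body runs
    (after_return, after_consume, drops.get())
}

fn main() {
    println!("{:?}", drop_counts());
}
```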

Borrowing

Ownership transfer is not always suitable. For example, suppose you want to pass a value into a function, but after the function call, you want the caller to continue using that value:

struct Point { x: f64, y: f64 }

fn take_value(v: Box<Point>) {
    println!("{}", v.x);
}

let p = Box::new(Point { x: 1.1, y: 2.2 });
take_value(p);
println!("{}", p.x); // Compilation error.

This will not compile because ownership of the value was moved into the function take_value. Variable p becomes inaccessible, and as soon as the function finishes its work, the value will be dropped. One way to fix this would be to return the passed value from the function:

fn take_and_return_value(v: Box<Point>) -> Box<Point> {
    println!("{}", v.x);
    v
}

let p = Box::new(Point { x: 1.1, y: 2.2 });
let p = take_and_return_value(p);
println!("{}", p.x);

Now, the code compiles, and the value can be used again after the function call. However, this approach has downsides:

It’s inconvenient to pass and return a value.
An ownership move can impact performance because a move performs a bitwise copy of the value’s stack representation. In the example above, performance is not affected because Box is just a pointer, but moving a big fixed-size array can be costly.

To overcome this issue, Rust has a borrowing mechanism. Borrowing gives access to a value without transferring ownership:

fn take_reference(v: &Box<Point>) {
    println!("{}", v.x); // The function has access to the value but
    // doesn’t own it and doesn’t drop it at the end of its execution.
}

let p: Box<Point> = Box::new(Point { x: 1.1, y: 2.2 });
let r: &Box<Point> = &p; // Borrowing. Creates a reference.
take_reference(r);
println!("{}", p.x); // `p` is still the owner of the value, 
// can be accessed, and will drop the value when it goes out of scope.

By borrowing &T, we create an immutable reference to T that can be used to access the value without controlling its lifetime. An immutable reference means that you can read the referenced data but can’t modify it. Rust also allows the creation of mutable references (mutable borrowing, in other words). But we will not consider mutable borrowing here.

Remark: Borrowing is similar to taking a pointer to a value in C or a reference in C++, but Rust imposes a set of constraints on references that guarantee memory safety at compile time. These constraints will be partially discussed later.

The function take_reference doesn’t need to drop the value or free its memory; it only needs access to the Point. So, instead of passing a reference to Box<Point>, we can pass a reference to just Point:

fn take_reference(v: &Point) {
    println!("{}", v.x);      
}

let p: Box<Point> = Box::new(Point { x: 1.1, y: 2.2 });
let r: &Point = &p;
take_reference(r);
println!("{}", p.x);

Remark: Here, the same notation, &p, is used to obtain a reference to Point, even though p has type Box<Point>. This works because Rust applies deref coercion, automatically converting &Box<T> into &T when a &T is expected. This coercion topic is outside the scope of this article.
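
As a small runnable sketch of this coercion, the functions below accept &Point and &str, and the callers pass references to a stack value, a Box<Point>, and a String. The function names x_plus_y and byte_len are hypothetical:

```rust
struct Point {
    x: f64,
    y: f64,
}

// Accepts any `&Point`, no matter where the Point is stored.
fn x_plus_y(p: &Point) -> f64 {
    p.x + p.y
}

// Accepts any `&str`.
fn byte_len(s: &str) -> usize {
    s.len()
}

fn main() {
    let on_stack = Point { x: 1.0, y: 2.0 };
    let on_heap: Box<Point> = Box::new(Point { x: 3.0, y: 4.0 });
    let owned: String = String::from("Hello");

    println!("{}", x_plus_y(&on_stack)); // a plain &Point
    println!("{}", x_plus_y(&on_heap)); // &Box<Point> is coerced to &Point
    println!("{}", byte_len(&owned)); // &String is coerced to &str

    // Nothing was moved, so all three owners are still usable here.
    println!("{} {} {}", on_stack.x, on_heap.y, owned);
}
```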

Shared ownership

Rust allows having one owner of a value and any number of immutable borrows:

let p: Box<Point> = Box::new(Point { x: 1.1, y: 2.2 });
let r1: &Point = &p; 
let r2: &Point = &p;
println!("{}", p.x);
println!("{}", r1.x);
println!("{}", r2.x);

However, Rust enforces that a borrow must not outlive its owner:

let r: &Point;
{
    let p: Box<Point> = Box::new(Point { x: 1.1, y: 2.2 });
    r = &p; // Compilation error
}
println!("{}", r.x);

This code will not compile because the owner p lives in a shorter scope than the borrower r. The value Point { x: 1.1, y: 2.2 } would be dropped before r is used in println!("{}", r.x). Without this restriction, r would become a dangling reference, and the program could read freed memory.
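
One possible way to make such code compile, shown here as a sketch, is to declare the owner in the outer scope so that it outlives the reference. The function name read_through_reference is hypothetical:

```rust
struct Point {
    x: f64,
    y: f64,
}

// The owner `p` now lives in the outer scope, so it outlives the borrow `r`.
fn read_through_reference() -> f64 {
    let p: Box<Point> = Box::new(Point { x: 1.0, y: 2.0 });
    let r: &Point;
    {
        r = &p; // fine: `p` is still alive wherever `r` is used
    }
    r.x + r.y
}

fn main() {
    println!("{}", read_through_reference());
}
```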

There are cases where access to the same value is required from multiple scopes with independent lifetimes, and no single owner can be identified. To address this, Rust has the Rc<T> type.

Rc<T> is a smart pointer similar to Box<T> in that it owns a value on the heap. But unlike Box<T>, Rc<T> allows multiple clones that all point to the same value on the heap, meaning the value can have multiple owners.

“Rc” stands for Reference Counted. All clones of a given Rc<T> share a single reference counter stored alongside the heap allocation. Cloning Rc<T> increments this counter, and dropping Rc<T> decrements it. When the counter reaches zero, the value is dropped, and the heap memory is freed.

The key distinction from Box<T> is that while Box<T> enforces exclusive ownership at compile time, Rc<T> manages shared ownership by tracking the value’s lifetime at runtime.

use std::rc::Rc;

{
    let p1: Rc<Point>;
    {
        let p2: Rc<Point> = Rc::new(Point { x: 1.1, y: 2.2 }); 
        // A new Rc is created, reference count == 1, 
        // heap memory is allocated for the value.

        p1 = Rc::clone(&p2); 
        // The Rc is cloned, reference count == 2.

    } // `p2` goes out of scope, reference count == 1, 
      // the value is still on the heap.

    println!("{}", p1.x); // `p2`, the original value owner 
    // is no longer available, but the value can still be accessed 
    // via another owner, `p1`.

} // `p1` goes out of scope, reference count == 0, 
  // the heap memory is freed here.

This example demonstrates the creation of two Rc clones, two owners of the value, and how the reference counter is incremented and decremented. Finally, when the value is dropped, memory is freed.
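
The counter values mentioned in the comments can be checked at run-time with Rc::strong_count. The sketch below repeats the same scenario; the helper observed_counts is hypothetical:

```rust
use std::rc::Rc;

struct Point {
    x: f64,
    y: f64,
}

// Returns the strong count after creation, after cloning,
// and after the original owner has gone out of scope.
fn observed_counts() -> (usize, usize, usize) {
    let p1: Rc<Point>;
    let after_new;
    let after_clone;
    {
        let p2 = Rc::new(Point { x: 1.1, y: 2.2 });
        after_new = Rc::strong_count(&p2);

        p1 = Rc::clone(&p2);
        after_clone = Rc::strong_count(&p2);
    } // `p2` is dropped here, the count is decremented

    // The value is still reachable through the remaining owner.
    println!("x = {}, y = {}", p1.x, p1.y);
    (after_new, after_clone, Rc::strong_count(&p1))
}

fn main() {
    println!("{:?}", observed_counts());
}
```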

The way owners and the value are laid out in memory can be depicted like this:

| *** stack ***  |       | *** heap ***   |
|                |       |  ____________  | 
| p1.pointer --- |---|   |  |x:1.1     |  |
|                |   |---|->|y:2.2     |  |
| p2.pointer --- |---|   |  |refcount:2|  |
|                |       |  |__________|  |
|________________|       |________________|

The example above is trivial and doesn’t demonstrate a real need for Rc<T> type usage. A more practical case where shared ownership is needed is a linked list data structure that allows two or more lists to share the same tail:

use std::rc::Rc;

enum List<T> { 
    Nil, 
    Cons(T, Rc<List<T>>) 
}

// `tail` owns a heap allocation containing the list nodes `2 -> 3 -> Nil`.
let tail = Rc::new(List::Cons(2, Rc::new(List::Cons(3, Rc::new(List::Nil))))); 

// `Rc::clone(&tail)` clones the Rc pointer, 
// increments the reference count, 
// and creates another owner of the same tail allocation.
let list1 = Rc::new(List::Cons(0, Rc::clone(&tail))); // 0 -> 2 -> 3 -> Nil
let list2 = Rc::new(List::Cons(1, Rc::clone(&tail))); // 1 -> 2 -> 3 -> Nil

// At this point, there are three owners of the tail allocation:
// - `tail` itself
// - the tail pointer stored inside `list1`
// - the tail pointer stored inside `list2`

Here, we create a tail first, and then the same tail is shared between both lists. The tail value exists as long as at least one owner exists.
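
A runnable version of the shared-tail idea might look like the sketch below. The functions sum and build are hypothetical helpers added to show that both lists really walk the same tail and that the tail has three owners:

```rust
use std::rc::Rc;

enum List<T> {
    Nil,
    Cons(T, Rc<List<T>>),
}

// Adds up all elements by walking the list recursively.
fn sum(list: &List<i32>) -> i32 {
    match list {
        List::Nil => 0,
        List::Cons(head, rest) => head + sum(rest),
    }
}

// Builds the shared tail `2 -> 3 -> Nil` and two lists on top of it.
fn build() -> (Rc<List<i32>>, Rc<List<i32>>, Rc<List<i32>>) {
    let tail = Rc::new(List::Cons(2, Rc::new(List::Cons(3, Rc::new(List::Nil)))));
    let list1 = Rc::new(List::Cons(0, Rc::clone(&tail))); // 0 -> 2 -> 3 -> Nil
    let list2 = Rc::new(List::Cons(1, Rc::clone(&tail))); // 1 -> 2 -> 3 -> Nil
    (tail, list1, list2)
}

fn main() {
    let (tail, list1, list2) = build();
    println!("owners of the tail: {}", Rc::strong_count(&tail));
    println!("sum(list1) = {}, sum(list2) = {}", sum(&list1), sum(&list2));

    drop(list1); // one owner of the tail goes away
    println!("owners of the tail: {}", Rc::strong_count(&tail));
}
```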

Remark: The Rc<T> type is not thread-safe. A thread-safe version is Arc<T>.
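
As a minimal sketch of the thread-safe variant, the example below shares one Point between several threads through Arc<T>. The function sum_across_threads and the choice of four threads are arbitrary and only serve as an illustration:

```rust
use std::sync::Arc;
use std::thread;

struct Point {
    x: f64,
    y: f64,
}

// Shares one Point between four threads and adds up their results.
fn sum_across_threads() -> f64 {
    let shared = Arc::new(Point { x: 1.0, y: 2.0 });

    let handles: Vec<_> = (0..4)
        .map(|_| {
            let p = Arc::clone(&shared); // each thread gets its own owner
            thread::spawn(move || p.x + p.y)
        })
        .collect();

    handles.into_iter().map(|h| h.join().unwrap()).sum()
}

fn main() {
    println!("{}", sum_across_threads());
}
```

Replacing Arc with Rc in this sketch is rejected by the compiler, because Rc<T> cannot be sent to another thread.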

Rc can cause a memory leak

If you build a structure where two nodes own each other, forming a reference cycle, their reference counts never reach zero. As a result, Rust never runs their destructors and never frees the heap allocations, which leads to a memory leak. The following example builds a parent/child structure where the parent points to the child and the child points back to the parent:

use std::cell::RefCell;
use std::rc::Rc;

struct Node {
    value: i32,
    parent: Option<Rc<RefCell<Node>>>,
    children: Vec<Rc<RefCell<Node>>>
}

{
    let p = Rc::new(RefCell::new(Node {
        value: 1,
        parent: None,
        children: Vec::new(),
    }));

    let c = Rc::new(RefCell::new(Node {
        value: 2,
        parent: Some(Rc::clone(&p)),
        children: Vec::new(),
    }));

    p.borrow_mut().children.push(Rc::clone(&c));

    // At this point, each node has two owners:
    // - the local variables, `p` and `c`.
    // - the Rc stored in the other node (`p.children` contains `c`, 
    // and `c.parent` contains `p`).
    // Therefore, the Rc reference count for both nodes becomes 2.

} // At the end of the scope, the local variables `p` and `c` are 
  // dropped, but each node is still referenced by the other node,  
  // keeping the reference count equal to 1. 
  // Therefore, neither allocation is freed.

Remark: The example uses the RefCell<T> type. The usage of this type is required to mutate a value wrapped by Rc<T> because Rc<T> does not provide mutation by itself.
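
The leak can be made observable with a sketch like the one below. It uses a simplified Node with an extra, hypothetical drops counter that is incremented in a Drop implementation; the helper functions new_node and drops_with_cycle are also assumptions made for illustration:

```rust
use std::cell::{Cell, RefCell};
use std::rc::Rc;

struct Node {
    #[allow(dead_code)] // `parent` is only stored, never read, in this sketch
    parent: Option<Rc<RefCell<Node>>>,
    children: Vec<Rc<RefCell<Node>>>,
    drops: Rc<Cell<u32>>, // hypothetical counter used only to observe drops
}

impl Drop for Node {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

fn new_node(parent: Option<Rc<RefCell<Node>>>, drops: &Rc<Cell<u32>>) -> Rc<RefCell<Node>> {
    Rc::new(RefCell::new(Node {
        parent,
        children: Vec::new(),
        drops: Rc::clone(drops),
    }))
}

// Returns how many nodes were dropped after `p` and `c` went out of scope.
fn drops_with_cycle() -> u32 {
    let drops = Rc::new(Cell::new(0));
    {
        let p = new_node(None, &drops);
        let c = new_node(Some(Rc::clone(&p)), &drops);
        p.borrow_mut().children.push(Rc::clone(&c));

        assert_eq!(Rc::strong_count(&p), 2);
        assert_eq!(Rc::strong_count(&c), 2);
    } // both counts fall to 1, never to 0
    drops.get()
}

fn main() {
    // Neither node's destructor has run: the two allocations are leaked.
    println!("nodes dropped: {}", drops_with_cycle());
}
```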

Resolving Rc memory leaks using Weak

Rust provides Weak<T>, a smart pointer that references an allocation managed by Rc<T> but does not own the value. This means that if all Rc<T> owners go out of scope, the existence of Weak<T> references does not prevent the value from being dropped. Conversely, dropping a Weak<T> has no effect on the value’s lifetime.

The role of Weak<T> is to allow conditional access to a value: it can be accessed only if at least one owning Rc<T> is still alive. To create a Weak<T>, you call Rc::downgrade() on an Rc<T>. To access the value, you must call the upgrade() method, which returns an Option<Rc<T>>. If at least one owner exists, the upgrade() method returns Some(Rc<T>); otherwise, it returns None, indicating that the value has already been dropped:

use std::rc::{Rc, Weak};

let weak: Weak<Point>;
{
    let rc: Rc<Point> = Rc::new(Point { x: 1.1, y: 2.3 });

    // Create a Weak<T> that does not own the value.
    weak = Rc::downgrade(&rc);

    // Temporarily upgrade Weak<T> to Rc<T> to access the value.
    if let Some(rc2) = weak.upgrade() {
        println!("{}", rc2.x);
    }
}

// All Rc<T> owners are out of scope. The value has been dropped.
if weak.upgrade().is_none() {
    println!("The value is already dropped.");
}

In the parent/child example, where the parent points to the child and the child points back to the parent, we can avoid the memory leak by replacing the strong reference (Rc<T>) from the child to the parent with a weak reference (Weak<T>):

use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Node {
    value: i32,
    parent: Option<Weak<RefCell<Node>>>, // Weak reference to the parent.
    children: Vec<Rc<RefCell<Node>>>,
}

let p = Rc::new(RefCell::new(Node {
    value: 1,
    parent: None,
    children: Vec::new(),
}));

let c = Rc::new(RefCell::new(Node {
    value: 2,
    parent: Some(Rc::downgrade(&p)), // A weak reference from child to parent.
    children: Vec::new(),
}));

// Create a strong reference (Rc<T>) from a parent to a child.
p.borrow_mut().children.push(Rc::clone(&c));

In this setup, the child node has two strong owners: the local variable c and the children vector of the parent node, so its reference count is 2. The parent node, however, has only one strong owner: the local variable p, because the child holds only a weak reference to the parent.

When c goes out of scope, the child’s reference count is reduced from 2 to 1. When p goes out of scope, the parent’s reference count is reduced from 1 to 0, so the parent node is dropped. Dropping the parent also drops its children vector, which removes the last strong reference to the child, reducing the child’s reference count from 1 to 0. As a result, both nodes are deallocated, and no memory leak occurs.
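
This sequence can be verified with a sketch that mirrors the setup above. The drops counter, the Drop implementation, and the helpers new_node and drops_without_cycle are hypothetical additions used only to observe the drops:

```rust
use std::cell::{Cell, RefCell};
use std::rc::{Rc, Weak};

struct Node {
    parent: Option<Weak<RefCell<Node>>>, // weak reference to the parent
    children: Vec<Rc<RefCell<Node>>>,
    drops: Rc<Cell<u32>>, // hypothetical counter used only to observe drops
}

impl Drop for Node {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

fn new_node(parent: Option<Weak<RefCell<Node>>>, drops: &Rc<Cell<u32>>) -> Rc<RefCell<Node>> {
    Rc::new(RefCell::new(Node {
        parent,
        children: Vec::new(),
        drops: Rc::clone(drops),
    }))
}

// Returns how many nodes were dropped after `p` and `c` went out of scope.
fn drops_without_cycle() -> u32 {
    let drops = Rc::new(Cell::new(0));
    {
        let p = new_node(None, &drops);
        let c = new_node(Some(Rc::downgrade(&p)), &drops);
        p.borrow_mut().children.push(Rc::clone(&c));

        assert_eq!(Rc::strong_count(&p), 1); // only the local variable `p`
        assert_eq!(Rc::weak_count(&p), 1); // the child's weak reference
        assert_eq!(Rc::strong_count(&c), 2); // `c` and `p.children`

        // The parent is still reachable from the child while `p` is alive.
        let parent_alive = c.borrow().parent.as_ref().and_then(|w| w.upgrade()).is_some();
        assert!(parent_alive);
    } // `c` is dropped first, then `p`, which in turn drops the child
    drops.get()
}

fn main() {
    println!("nodes dropped: {}", drops_without_cycle());
}
```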

To access the parent node’s fields from the child node, the following pattern can be used:

let c_ref = c.borrow();
if let Some(p_weak) = &c_ref.parent {
    if let Some(p_strong) = p_weak.upgrade() {
        let p_value = p_strong.borrow().value;
        println!("Parent value is {p_value:?}");
    }
}

Here, we first check whether the child node has a parent. We then attempt to upgrade the weak reference (Weak<T>) to a strong reference (Rc<T>). If the upgrade succeeds, the parent node is guaranteed to be alive for the duration of the returned Rc<T>, and its fields can be safely accessed. The calls to borrow() are required because the nodes are wrapped in RefCell<T>.

Conclusion

Rust approaches memory management in a very different way from most mainstream languages. Instead of relying on manual discipline or a garbage collector, it enforces a set of compile-time rules that make memory safety the default. This approach removes whole classes of bugs, including most memory leaks, without the cost of a garbage collector. To write correct Rust code, it is important to understand where memory lives and how ownership moves through a program. At first, these ideas can feel unusual and even rather confusing. With enough practice, however, they start to make sense and gradually become quite natural. In return, Rust provides guarantees that are difficult to achieve in many other languages.
