Ayoub Alouane for This is Learning

Posted on Jul 26, 2023

The Power Of RUST: Introduction and Deep Dive in Advanced Concepts: Ownership, References and Borrowing.

#rust #performance #compiling #programming

Rust is a language created by a Mozilla employee before 2010, and Mozilla later sponsored it. It is inspired by C/C++ but aims for more efficiency: it is compiled, strongly typed, safe and performant, all without a garbage collector.

A language empowering everyone to build reliable and efficient software.
-rust-lang.org

Rust is used by many big companies: Mozilla of course, but also Microsoft and Dropbox. It is used in many cases: building web servers, command-line interfaces, native desktop applications, web apps using WebAssembly, and also operating systems.

Having a safe language with good multithreading support is already great, and getting maximum performance on top makes it even better. Rust is here for that. It is a compiled systems programming language, so it is not interpreted like JavaScript, and its compiler is the key to having performant programs and safe memory. Beyond performance, Rust is also productive thanks to its tooling, which includes a package manager, testing and documentation.

If you are just starting with Rust, you may find it complex and hard to understand, but the good thing about Rust is that it makes sure you understand the concepts before you use them.

In this article we will try to cover the basic and advanced concepts that differentiate Rust from other languages in order to give an accessible introduction to all developers who are trying to start coding using Rust.

Basic Concepts

How Does Mutability Work?

A key feature of Rust is that variables are immutable by default: once we give a value to a variable, we can't change it anymore. For example:

fn main() {
    let a = 7;
    println!("a is: {a}");
    a = 1;
    println!("a is: {a}");
}

If we compile this code, it fails with the error “cannot assign twice to immutable variable a”, because we tried to assign a new value to an immutable variable. The compiler reports such errors at compile time, which helps keep the code safe. In other cases we do need a mutable variable. That is possible in Rust, but we have to say so explicitly by adding the keyword mut. For example:

fn main() {
    let mut a = 2;
    println!("a is: {a}");
    a = 9;
    println!("a is: {a}");
}

In this case the compiler will not show an error, because we specified that the variable is mutable, so we can change its value.
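As a further illustration, here is a minimal sketch of a typical use of mut: an accumulator that is updated inside a loop. The function name sum_to is hypothetical and only serves this example.

```rust
// Hypothetical helper: adds the integers from 1 to n using a mutable accumulator.
fn sum_to(n: u32) -> u32 {
    let mut total = 0; // `mut` lets us update `total` below
    for i in 1..=n {
        total += i;
    }
    total
}

fn main() {
    let limit = 4; // immutable: never reassigned
    println!("sum is: {}", sum_to(limit)); // prints "sum is: 10"
}
```

Only the accumulator needs mut here; limit never changes, so it stays immutable.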

We now understand how to define variables with the let keyword and how to make them mutable with mut. There is another keyword for declaring values, const, which allows us to declare constants.

You might say that let without mut already defines values that will not change, but there are some differences:

Constants are always immutable; they can never become mutable, and we can't use the keyword mut with them.
Constants must have their values known at compile time; they cannot hold values that are computed at runtime.
When we declare a constant, we must annotate the type of the value.

Example:

const RESULT: u32 = 57 * 24;

Notice that the constant has a specific naming style: uppercase letters, with underscores between words, is the Rust convention for naming constants. u32 is the type of the value that we assigned to RESULT.

We should also mention that constants are valid for the entire time the program runs, within the scope where they were defined.
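The differences between const and let can be sketched as follows. The constant SECONDS_PER_DAY and the function seconds_in_days are assumptions chosen for illustration; they are not part of the original examples.

```rust
// A constant can be declared in the global scope and needs a type annotation.
// Its value is computed at compile time.
const SECONDS_PER_DAY: u32 = 60 * 60 * 24;

// Hypothetical helper that uses the constant.
fn seconds_in_days(days: u32) -> u32 {
    days * SECONDS_PER_DAY
}

fn main() {
    // A `let` binding, in contrast, may hold a value computed at runtime.
    let days = 2;
    println!("{} seconds", seconds_in_days(days)); // prints "172800 seconds"
}
```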

What Exactly Is Shadowing?

This is another important concept. Before introducing it, let us look at an example:

Example 1	Example 2
let a = 4;	let a = 4;
let a = a * 2;	a = a * 2;

As we learned, we can't re-assign a value to a variable defined with only let, because it is immutable. So we would expect both examples to produce an error. That is the case for Example 2, but Example 1 is different: it is not re-assigning, it is declaring a new variable with the same name as the previous one. In Rust we say that the first variable is shadowed by the second one. In other words, the second variable overshadows the first, so if we print a, we get the first value multiplied by 2. Shadowing is also tied to scope, for example:

fn main() {
    let a = 4;
    println!("(1) a is: {a}");
    let a = a * 2;
    println!("(2) a is: {a}");
    {
        let a = a + 6;
        println!("(3) a is: {a}");
    }
    println!("(4) a is: {a}");
}
//result
Compiling playground v0.0.1 (/playground)
    Finished dev [unoptimized + debuginfo] target(s) in 0.57s
     Running `target/debug/playground`
Standard Output
(1) a is: 4
(2) a is: 8
(3) a is: 14
(4) a is: 8

As we can see, a was equal to 4, but after we re-declared it, the first variable was shadowed by the second, so a equals 8. Then, inside a new scope, we re-declare the variable again, so the third variable shadows the second. After that scope closes, we are back to the declaration that we had in the parent scope.

In this process we never mutated a variable. Each let creates a new variable that holds the transformed value, and every one of these variables remains immutable.

We can also change the type of the value in the transformation if we want, but as we said, we must use the keyword let in order to create a new variable with the same name.
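Here is a small sketch of shadowing that changes the type of a name. The helper trimmed_len is hypothetical and exists only for this illustration.

```rust
// Hypothetical helper: the name `value` first holds text, then a number.
fn trimmed_len(input: &str) -> usize {
    let value = input.trim(); // `value` is a &str
    let value = value.len();  // `value` is now a usize; the old binding is shadowed
    value
}

fn main() {
    let spaces = "   ";           // &str
    let spaces = spaces.len();    // usize, same name thanks to `let`
    println!("spaces: {spaces}"); // prints "spaces: 3"
    println!("length: {}", trimmed_len("  rust  ")); // prints "length: 4"

    // Without `let`, this would be a type error:
    // let mut text = "   ";
    // text = text.len(); // error: mismatched types
}
```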

Introduction to Data Types

Data types in Rust have a specific syntax. The first thing to know is that Rust is statically typed: the compiler must know the types of all variables at compile time. We can annotate a type ourselves, and in most cases Rust can also infer the type of a variable from the value that we assign to it.

There are two kinds of data types: scalar and compound.

Scalar

A scalar type represents a single value: integers, floating-point numbers, booleans and characters.

Integers

In Rust there are signed and unsigned integers. Signed integers can be negative (they carry a sign); unsigned integers are never negative, so they need no sign. Here is a table that lists the integer types:

Length	Signed	Unsigned
8-bit	i8	u8
16-bit	i16	u16
32-bit	i32	u32
64-bit	i64	u64
128-bit	i128	u128
arch	isize	usize
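To make the table more concrete, the sketch below prints the range of two 8-bit types and shows one way to detect overflow. The helper add_u8 is a hypothetical name; checked_add is the standard library method it relies on.

```rust
// Hypothetical helper: adds two u8 values and reports overflow as None.
fn add_u8(a: u8, b: u8) -> Option<u8> {
    a.checked_add(b)
}

fn main() {
    println!("i8 range: {} to {}", i8::MIN, i8::MAX); // prints "i8 range: -128 to 127"
    println!("u8 range: {} to {}", u8::MIN, u8::MAX); // prints "u8 range: 0 to 255"
    println!("{:?}", add_u8(200, 55)); // prints "Some(255)"
    println!("{:?}", add_u8(250, 10)); // prints "None", the sum does not fit in 8 bits
}
```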

Floating-Point

Floating-point types are numbers with decimal points. They are always signed; there are no unsigned floating-point types. There are two of them: f32 and f64 (the default).

fn main() {
    let a = 6.0; // f64

    let b: f32 = 2.9; // f32
}

Boolean

A Boolean in Rust has two possible values, true and false. To declare one we use the type bool.

fn main() {
    let condition: bool = true;
}

Character

The char type is Rust's primitive character type. It is not limited to alphabetic letters: it can hold any Unicode scalar value. In Rust we write char literals with single quotes and string literals with double quotes.

fn main() {
    let a: char = 'B';
}
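A char in Rust is not limited to ASCII letters. As a minimal sketch, the code below compares the in-memory size of a char with the number of bytes the same character takes once encoded as UTF-8. The helper utf8_len is a hypothetical name wrapping the standard len_utf8 method.

```rust
// Hypothetical helper: number of bytes a character needs when encoded as UTF-8.
fn utf8_len(c: char) -> usize {
    c.len_utf8()
}

fn main() {
    let letter: char = 'B';
    let euro: char = '€';
    // A `char` always occupies 4 bytes in memory.
    println!("{}", std::mem::size_of::<char>()); // prints "4"
    // Inside a UTF-8 string, the same characters take 1 and 3 bytes.
    println!("{} {}", utf8_len(letter), utf8_len(euro)); // prints "1 3"
}
```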

Compound Types

Compound types in Rust group multiple values into one type. Rust has two primitive compound types: tuples and arrays.

Array

In Rust an Array is a group of values that have the same type, and the group itself has a fixed length.

fn main() {
    let arr: [i32; 5] = [34, 51, 11, 42, 87];
    let result = arr[2]; // result = 11
}
// i32 is the type of the values
// 5 is the size of the array

Tuple

In Rust a tuple is like an array, but its values can have different types.

fn main() {
    let tup: (u64, f32, i8) = (902, 2.2, -1);
    let result = tup.1; //result = 2.2
}
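Beyond index access, arrays can be iterated and tuples can be destructured. The following sketch shows both; sum_array and swap_pair are hypothetical helpers introduced only for this example.

```rust
// Hypothetical helper: sums every element of a fixed-size array.
fn sum_array(values: [i32; 5]) -> i32 {
    values.iter().sum()
}

// Hypothetical helper: returns the tuple with its two fields swapped.
fn swap_pair(pair: (u64, f32)) -> (f32, u64) {
    let (first, second) = pair; // destructuring gives each field a name
    (second, first)
}

fn main() {
    let arr = [34, 51, 11, 42, 87];
    println!("len = {}, sum = {}", arr.len(), sum_array(arr)); // prints "len = 5, sum = 225"

    let (x, y) = swap_pair((902, 2.5));
    println!("{x} {y}"); // prints "2.5 902"
}
```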

How Do Functions Work in Rust?

Functions in Rust are defined with the keyword fn. The main function is the entry point of our program:

fn main() {
    hello_adservio(28); //here we pass the value 28 as an argument
}

//Declaring the type of each parameter in a function is mandatory
fn hello_adservio(a: u32) { //the function here has a parameter of type u32
    println!("The age is: {a}");
}

fn main() {
    let x : u64 = multiply_3(5);// the function has a return value of type u64

    println!("The value of x is: {x}");
}

// '-> u64' means that our function has a return type of u64
fn multiply_3(x: u64) -> u64 {
        // we can see here that we don't have a semicolon ';' in the end of our ligne
    x * 3 
}

//result: The value of x is: 15

In this example, x * 3 has no semicolon, so Rust treats it as an expression whose value is returned by the function. If we add a semicolon at the end of the line, it becomes a statement that returns nothing, and we get an error:

Compiling playground v0.0.1 (/playground)
error[E0308]: mismatched types
  --> src/main.rs:8:26
   |
8  | fn multiply_3(x: u64) -> u64 {
   |    ----------            ^^^ expected `u64`, found `()`
   |    |
   |    implicitly returns `()` as its body has no tail or `return` expression
9  |         // we can see here that we don't have a semicolon ';' in the end of our ligne
10 |     x * 3;
   |          - help: remove this semicolon to return this value

How OWNERSHIP Works?

What is Ownership

Managing memory is key. Rust checks a set of rules through its compiler, and this set of rules is what we call ownership. It's a new concept for many developers and it may appear complex, but once we have a deep understanding of how it works, we will be more productive and able to write safe and performant programs.

Stack and Heap

Before talking about the ownership rules, a reminder about the stack and the heap is useful. In a systems programming language like Rust we have close control over memory, and the stack and the heap are both parts of that memory.

The stack works on a last in, first out basis: it is the part of memory where we push and pop data, and this data must have a known, fixed size. The heap, on the other hand, is for data whose size is unknown at compile time. We don't push data onto the heap; we allocate memory for it. When we want to store data on the heap, the memory allocator searches for a place that can hold this data and returns a pointer to it. The pointer is the address of the data stored on the heap, and because its size is fixed, it can be stored on the stack. After our deep dive into ownership, we will have a better understanding of how memory works.

Variable Scope

All variables have a defined scope in which they are valid. Example:

let a = "Adservio";

The variable a holds a string literal. It is valid from the point where it is declared until the end of the block (the curly brackets) that contains it. The text of a string literal is fixed and known at compile time, so it is hardcoded into the program. To go deeper into ownership we need another example: the String type, which can be mutable, so we can change it at runtime.

let mut a = String::from("Adservio");

a.push_str(", Great Place!"); 

println!("{a}"); // this will print `Adservio, Great Place!`

The text of a String can grow at runtime, so its size is not known at compile time. To support that, the String allocates memory for its contents on the heap.

A reminder here: Rust doesn't have a garbage collector (GC). In languages that have one, the GC takes responsibility for freeing the memory of a value once it detects that the value will never be used again. In languages without a GC, it is usually up to us to free the memory when we know that the value is no longer needed. If we forget, the value keeps occupying memory for nothing, and that memory is wasted.

Rust is different: it has no GC, but it still manages this for us. Its approach is to free the memory automatically when the variable goes out of scope, by calling a function called drop.
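To observe when drop runs, here is a minimal sketch. The Noisy type and the drop_order function are hypothetical; they record the order in which values leave their scope by implementing the standard Drop trait.

```rust
use std::cell::RefCell;

// Hypothetical type used only to observe when a value is dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    // Rust calls this automatically when the value goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

// Returns the order in which events happened.
fn drop_order() -> Vec<&'static str> {
    let log: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
    {
        let _outer = Noisy { name: "outer", log: &log };
        {
            let _inner = Noisy { name: "inner", log: &log };
        } // `_inner` goes out of scope here and is dropped
        log.borrow_mut().push("between");
    } // `_outer` goes out of scope here and is dropped
    log.into_inner()
}

fn main() {
    println!("{:?}", drop_order()); // prints ["inner", "between", "outer"]
}
```

The inner value is released as soon as its block ends, before the code that follows it in the outer block runs.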

Deep Dive in How variables work

Starting with a simple exemple:

let a = 7;
let b = a;

Variables a and b are integers, which have a fixed size, so they are stored entirely on the stack. We create a variable a with the value 7, then copy this value and assign it to b, so we have two variables, each with its own copy of the data. Now let us look at another example that gives us a different case:

let a = String::from("Adservio");
let b = a;

A String is made of three parts: a pointer, a length and a capacity. For now we will focus on the pointer. These three parts are stored on the stack, while the text of the string is stored on the heap. When we assign a to b, we copy the pointer (along with the length and the capacity), but we don't copy the data on the heap that the pointer refers to.

So the two pointers now point to the same value on the heap. This raises a question: when the variable a goes out of scope, Rust would normally free the memory by calling drop, and we would be freeing a value that is still used by another pointer. And if both a and b go out of scope at the same time, Rust would try to free the same memory twice, which is called a “double free” error. So what is the solution? To keep memory safe, Rust invalidates the variable a: after it has been assigned to b, we no longer have access to a. Example:

fn main() {
    let a = String::from("Adservio");
    let b = a;
    println!("The value of a is: {a}");
}

Result:

Compiling playground v0.0.1 (/playground)
error[E0382]: borrow of moved value: `a`
 --> src/main.rs:4:35
  |
2 |     let a = String::from("Adservio");
  |         - move occurs because `a` has type `String`, which does not implement the `Copy` trait
3 |     let b = a;
  |             - value moved here
4 |     println!("The value of a is: {a}");
  |                                   ^ value borrowed here after move
  |
  = note: this error originates in the macro `$crate::format_args_nl` which comes from the expansion of the macro `println` (in Nightly builds, run with -Z macro-backtrace for more info)
help: consider cloning the value if the performance cost is acceptable
  |
3 |     let b = a.clone();
  |              ++++++++

The error generated by the Rust compiler suggests writing “let b = a.clone();”, which is a deep copy, instead of “let b = a;”, which is called a “move”: the variable a was moved into b.

Rust aims for good performance, so it never makes a deep copy by default, because of the runtime cost (for values stored on the heap).
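The difference between a move and a clone can be checked by looking at the address of the heap data. This is only a sketch: the two function names are hypothetical, while as_ptr is the standard method that returns the address of a String's buffer.

```rust
// Returns true when a move leaves the heap buffer where it was.
fn move_keeps_buffer() -> bool {
    let a = String::from("Adservio");
    let before = a.as_ptr(); // address of the heap data
    let b = a;               // move: only pointer, length and capacity are copied
    before == b.as_ptr()
}

// Returns true when a clone allocates a separate heap buffer with equal contents.
fn clone_makes_new_buffer() -> bool {
    let a = String::from("Adservio");
    let b = a.clone();       // deep copy: `a` stays valid
    a.as_ptr() != b.as_ptr() && a == b
}

fn main() {
    println!("{}", move_keeps_buffer());      // prints "true"
    println!("{}", clone_makes_new_buffer()); // prints "true"
}
```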

Types like integers are another case: they have a fixed size and are stored completely on the stack, so copying them is not expensive. This code compiles without errors:

fn main() {
    let a = 7;
    let b = a;
    println!("The value of a is: {a}");
    println!("The value of b is: {b}");
}
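The compiler errors shown earlier mention the Copy trait, which is what integers implement. As a hedged illustration, our own small types can opt into the same behaviour by deriving it; the Point struct and the shifted function below are hypothetical.

```rust
// Hypothetical type: deriving `Copy` makes assignment copy the value instead of moving it.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// Takes the point by value; the caller keeps its own copy.
fn shifted(p: Point) -> Point {
    Point { x: p.x + 1, y: p.y + 1 }
}

fn main() {
    let a = Point { x: 1, y: 2 };
    let b = a;          // copy, not move
    let c = shifted(a); // `a` is still usable afterwards
    println!("{a:?} {b:?} {c:?}");
}
```

This only works for types whose fields are all Copy themselves; a struct that contains a String cannot derive Copy.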

How Functions Work with Ownership

Passing a variable to a function works like assigning it to another variable; the same concept applies. Example:

fn main() {
    let a = String::from("Adservio");  
    print_var(a); 
    println!("{a}");            
}

fn print_var(value: String) { 
    println!("{value}");
}

Result:

Compiling playground v0.0.1 (/playground)
error[E0382]: borrow of moved value: `a`
 --> src/main.rs:4:14
  |
2 |     let a = String::from("Adservio");  
  |         - move occurs because `a` has type `String`, which does not implement the `Copy` trait
3 |     print_var(a); 
  |               - value moved here
4 |         println!("{a}");            
  |                    ^ value borrowed here after move
  |
note: consider changing this parameter type in function `print_var` to borrow instead if owning the value isn't necessary
 --> src/main.rs:7:21
  |
7 | fn print_var(value: String) { 
  |    ---------        ^^^^^^ this parameter takes ownership of the value
  |    |
  |    in this function
  = note: this error originates in the macro `$crate::format_args_nl` which comes from the expansion of the macro `println` (in Nightly builds, run with -Z macro-backtrace for more info)
help: consider cloning the value if the performance cost is acceptable
  |
3 |     print_var(a.clone()); 
  |                ++++++++

The code fails to compile, which is expected: the function took ownership of the value of a, and Rust automatically invalidated a. Here is another example, this time with integers:

fn main() {
    let a = 7;                      
    print_var(a); 
    println!("{a}");
}

fn print_var(value: i32) { 
    println!("{value}");
}

This code works normally and does not produce an error. As in the example of assigning integers, the value is copied, so the function does not take ownership of a.

References and Borrowing

To start this part, we need an idea of how return values work in Rust. The question is: can returning a value from a function transfer ownership? The answer is a big YES. Here is an example:

fn main() {
    let a = init_value();
    let b = String::from("Group");  
    let c = takes_and_gives_back(b);  
}
fn init_value() -> String {       
    let value = String::from("adservio"); 
    value                              
}
fn takes_and_gives_back(value: String) -> String { 
    value
}

Looking at the main function, we start by initializing a variable called a with the value returned by the function init_value(). The function moves the ownership of the returned value to the variable a.

Next we define a variable b that we initialize with the value “Group”, and a variable c that receives the value returned by takes_and_gives_back(b), which takes b as an argument. Here the function takes_and_gives_back() takes ownership of the value of b, then returns that value and transfers its ownership to c.
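Returning ownership can be combined with returning a result. The sketch below hands the String back together with its length in a tuple; length_and_back is a hypothetical function name.

```rust
// Hypothetical helper: takes ownership, then returns the String together with its length.
fn length_and_back(value: String) -> (String, usize) {
    let length = value.len();
    (value, length) // ownership of `value` moves back to the caller
}

fn main() {
    let a = String::from("adservio");
    let (a, len) = length_and_back(a); // shadow `a` with the returned String
    println!("The length of '{a}' is {len}."); // prints "The length of 'adservio' is 8."
}
```

It works, but passing the value in and out of every function quickly becomes tedious.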

So, is transferring ownership the only way to keep using a value? The answer is NO: Rust has a feature called references.

Let us take another example:

fn main() {
    let a = String::from("adservio");
    let b = calculate_length(a);
    println!("The length of '{a}' is {b}.");
}
fn calculate_length(value: String) -> usize {
    value.len()
}

This example clearly fails to compile: the ownership of the value of a is moved into the function calculate_length(), which returns a value that we assign to b. When we then try to print a, we get:

error[E0382]: borrow of moved value: `a`
 --> src/main.rs:6:31
  |
2 |     let a = String::from("adservio");
  |         - move occurs because `a` has type `String`, which does not implement the `Copy` trait
3 |
4 |     let b = calculate_length(a);
  |                              - value moved here
5 |
6 |     println!("The length of '{a}' is {b}.");
  |                               ^ value borrowed here after move
  |
note: consider changing this parameter type in function `calculate_length` to borrow instead if owning the value isn't necessary
 --> src/main.rs:9:28
  |
9 | fn calculate_length(value: String) -> usize {
  |    ----------------        ^^^^^^ this parameter takes ownership of the value
  |    |
  |    in this function
  = note: this error originates in the macro `$crate::format_args_nl` which comes from the expansion of the macro `println` (in Nightly builds, run with -Z macro-backtrace for more info)
help: consider cloning the value if the performance cost is acceptable
  |
4 |     let b = calculate_length(a.clone());
  |                               ++++++++

The note in the error, “consider changing this parameter type in function calculate_length to borrow instead if owning the value isn't necessary”, is already telling us what we should do.

The Magical Symbol &

Here is the correct version that will not throw an error:

fn main() {
    let a = String::from("adservio");

    let b = calculate_length(&a); //&...

    println!("The length of '{a}' is {b}.");
}

fn calculate_length(c: &String) -> usize {
    c.len()
}

We can see that the variable we pass has & just before it (&a), and that the type of the parameter in calculate_length() also has & before it: `&String`. The symbol & tells the Rust compiler that the function will only borrow the variable: we pass a reference to it instead of moving the ownership.

With & we create a reference to a. When we stop using the reference, the value is not dropped, because the reference doesn't own it.

In the function calculate_length, c does not take ownership of the value passed, because we added the & symbol to the type. When c goes out of scope, nothing happens to the value, because c holds only a reference to it, not the ownership. This is what we call borrowing: the idea is “to borrow”, not “to own”.

The next question: can we modify a value that we have a reference to (that we borrow)? With a plain & reference, the answer is NO:

fn main() {
    let a = String::from("adservio");

    add_string(&a);
}

fn add_string(b: &String) {
    b.push_str(" Group");
}

Here we try to append a string to the one that we passed to the function add_string(). This is the error:

error[E0596]: cannot borrow `*b` as mutable, as it is behind a `&` reference
 --> src/main.rs:8:5
  |
8 |     b.push_str(" Group");
  |     ^^^^^^^^^^^^^^^^^^^^ `b` is a `&` reference, so the data it refers to cannot be borrowed as mutable
  |
help: consider changing this to be a mutable reference
  |
7 | fn add_string(b: &mut String) {
  |                  ~~~~~~~~~~~

The error says: b is a & reference, so the data it refers to cannot be borrowed as mutable. The compiler is telling us that we cannot modify the data that b refers to, because the reference is immutable.

Mutable References

We can modify borrowed data by using a mutable reference:

fn main() {
    let mut a = String::from("adservio");
    add_string(&mut a);
}

fn add_string(b: &mut String) {
    b.push_str(" group");
}

We start by declaring a as mutable. When we pass it to the function, we tell the compiler that we pass a mutable reference by writing &mut a. In the function declaration we also write &mut, which tells the compiler that b is a reference to a mutable value, so the function is able to mutate the value that it is borrowing.
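Once the call returns, the mutable borrow is over and the owner sees the change. A minimal sketch, with append_suffix as a hypothetical variant of add_string:

```rust
// Hypothetical helper: mutates the borrowed String and returns its new length.
fn append_suffix(target: &mut String, suffix: &str) -> usize {
    target.push_str(suffix);
    target.len()
}

fn main() {
    let mut a = String::from("adservio");
    let len = append_suffix(&mut a, " group"); // the mutable borrow ends with the call
    println!("{a} ({len})"); // prints "adservio group (14)"
}
```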

There is a restriction, however: while a mutable reference to a value is in use, we cannot have any other reference to that value. For example:

let mut a = String::from("adservio");

let b = &mut a;
let c = &mut a;

println!("{b}, {c}");

This code will throw an error:

error[E0499]: cannot borrow `a` as mutable more than once at a time
 --> src/main.rs:6:13
  |
5 |     let b = &mut a;
  |             ------ first mutable borrow occurs here
6 |     let c = &mut a;
  |             ^^^^^^ second mutable borrow occurs here
7 |     
8 |     println!("{b}, {c}");
  |                - first borrow later used here

This raises a big WHY: why do we have this behaviour? Rust is preventing data races at compile time. Both references could be used to modify the value, so how would these modifications be synchronised, and how could the compiler keep track of them?

To prevent data races, Rust reports an error at compile time. We can still create several mutable references to the same value by using scopes, since each reference is only valid inside its own scope. Example:

   let mut a = String::from("adservio");

   {
       let b = &mut a;
   } 
   let b = &mut a;

This code compiles successfully: there is no data race, because the first mutable reference goes out of scope at the closing curly bracket, before the second one is created.
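Curly brackets are not always required. In current Rust a borrow ends at the last place the reference is used, so two mutable references can follow each other in the same block as long as they are never in use at the same time. A sketch, with a hypothetical function name:

```rust
// Two mutable borrows in the same scope, one after the other.
fn sequential_mut_borrows() -> String {
    let mut a = String::from("adservio");

    let b = &mut a;
    b.push_str(" great"); // last use of `b`

    let c = &mut a;       // allowed: `b` is never used again
    c.push_str(" place");

    a
}

fn main() {
    println!("{}", sequential_mut_borrows()); // prints "adservio great place"
}
```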

We cannot have a mutable reference to a value while we have an immutable one to it.

We can have multiple immutable references to a value, because that cannot produce data races: the users of the references cannot modify the value, so they do not affect each other.
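Both rules can be seen together in one short sketch. The function shared_then_mutable is hypothetical; the commented-out line shows the kind of use that the compiler would reject.

```rust
// Several shared references may coexist; a mutable one is allowed once they are done.
fn shared_then_mutable() -> (usize, String) {
    let mut a = String::from("adservio");

    let r1 = &a;
    let r2 = &a;
    let total = r1.len() + r2.len(); // last use of `r1` and `r2`

    let m = &mut a; // fine: no shared reference is used after this point
    m.push_str(" group");
    // println!("{r1}"); // would not compile: `a` is still borrowed as immutable

    (total, a)
}

fn main() {
    let (total, text) = shared_then_mutable();
    println!("{total} {text}"); // prints "16 adservio group"
}
```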

Conclusion

In this article we took a global view of why using Rust is great. We looked at the syntax and at some basic concepts like mutability, and we saw how Rust offers a high level of safety and performance. We also talked about ownership, references and borrowing, which are essential concepts in Rust. Once we understand them, we have much better control over our Rust programs.
