
Jeff Culverhouse

Originally published at rust.graystorm.com

(A Few) Advanced Variable Types in Rust

Programmer on laptop, presumably trying to figure out how to get access to his variable in several threads of his Rust program.
Keep one eye on your code at all times!

“I haven’t seen Evil Dead II yet”. Much is made of this simple statement in the movie adaptation of [High Fidelity](https://en.wikipedia.org/wiki/High_Fidelity_(film)). Does “yet” mean the person does, indeed, intend to see the film? Jack Black’s character is having real trouble with the concept: not only does he know that the speaker, John Cusack’s character, has seen Evil Dead II, but what idiot wouldn’t see it, “because it’s a brilliant film. It’s so funny, and violent, and the soundtrack kicks so much ass.” I love this exchange, but I’m a fan of the film anyway. It is not always clear to me how to handle advanced variable types in Rust, yet.

I think of these as wrappers that add abilities (and restrictions) to a variable. They give a variable superpowers, which matters because the Rust compiler is so strict about what you can and can’t do with variables.


Box<T>

PROVIDES:

Smart pointer that forces your variable’s value to be stored on the heap instead of the stack. The Box<> variable itself is just a pointer so its size is obvious and can, itself, be stored on the stack.

RESTRICTIONS:

USEFUL WHEN:

If the size of an item cannot be determined at compile time, the compiler will complain when the default is to store it on the stack (where a calculable size is necessary). Using Box<> forces the storage onto the heap, where a fixed size is not needed. For example, a recursive data structure, including enums, will not work on the stack because a concrete size cannot be calculated. Turning the recursive field into a Box<> means it stores a pointer which CAN be sized. The example in the docs being:

enum List<T> {
  Cons(T, Box<List<T>>),
  Nil,
}

Also useful if you have a very large T and want to transfer ownership of that variable without the value itself being copied each time; moving the Box<> copies only the pointer.
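To make the recursive case concrete, here is a minimal sketch that builds the List from the docs and walks it. The sum() helper is my own hypothetical addition for illustration, not something from the docs.

```rust
// The recursive list from the docs; Box gives the Cons variant a known size.
enum List<T> {
    Cons(T, Box<List<T>>),
    Nil,
}

use List::{Cons, Nil};

// Hypothetical helper: walk the list recursively and add up the values.
fn sum(list: &List<i32>) -> i32 {
    match list {
        Cons(value, rest) => value + sum(rest),
        Nil => 0,
    }
}

fn main() {
    // Each Cons holds a value plus a pointer to the rest of the list on the heap.
    let list = Cons(1, Box::new(Cons(2, Box::new(Cons(3, Box::new(Nil))))));
    println!("Sum of the list is : {}", sum(&list));
}
```

If you remove the Box<> from the Cons variant, the compiler should refuse the enum because it would have infinite size.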

NOTABLY PROVIDES:

just see the rust-lang.org docs

EXAMPLES/DISCUSSION:

https://doc.rust-lang.org/stable/rust-by-example/std/box.html

https://www.koderhq.com/tutorial/rust/smart-pointer/

https://manishearth.github.io/blog/2017/01/10/rust-tidbits-box-is-special/

Setting the value of a simple Box<> variable is easy enough and getting the value back looks very normal:

fn main() {
  let answer = Box::new(42);
  println!("The answer is : {}", answer);
}

Cell<T>

PROVIDES:

You can have multiple, shared references to the Cell<> (and thus, access to the value inside with .get()) and yet still mutate the value inside (with .set()). This is called interior mutability because the value inside can be changed but mut on the Cell<> itself is not needed. The inner value can only be set by calling a method on the Cell<>.

RESTRICTIONS:

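It is not possible to get a reference to what is inside the Cell<> through a shared reference, only a copy of the value (or you can swap the whole value out). Also, Cell<> does not implement Sync, so it cannot be shared between threads (though it can be moved to another thread if T is Send), which ensures safety.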

USEFUL WHEN:

Usually used for small values, such as counters or flags, where you need multiple shared references to the value AND be allowed to mutate it at the same time, in a guaranteed safe way.
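As a small sketch of the counter use case: the Tracked struct and its read() method below are hypothetical names of mine. The point is that read() takes &self, not &mut self, and can still bump the counter.

```rust
use std::cell::Cell;

// Hypothetical struct: a value plus a count of how many times it was read.
struct Tracked {
    value: i32,
    reads: Cell<u32>,
}

impl Tracked {
    // Takes &self (a shared reference), yet still updates the counter.
    fn read(&self) -> i32 {
        self.reads.set(self.reads.get() + 1);
        self.value
    }
}

fn main() {
    let tracked = Tracked { value: 42, reads: Cell::new(0) };
    // Two shared references exist at the same time, and both can mutate the counter.
    let first = &tracked;
    let second = &tracked;
    first.read();
    second.read();
    println!("Read {} times", tracked.reads.get());
}
```

This prints "Read 2 times", with no mut anywhere on tracked.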

NOTABLY PROVIDES:

.set() to set the value inside

.get() to get a copy of the value inside

.take() to move the value out of the Cell<> AND reset the value inside to its default (T must implement Default).

see the rust-lang.org docs
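Here is a quick sketch of what .take() leaves behind, using a String so it is clear the value is moved rather than copied. The take_twice() helper is a hypothetical name I made up for the demo.

```rust
use std::cell::Cell;

// Hypothetical helper: take the value out twice to show what is left behind.
fn take_twice(cell: &Cell<String>) -> (String, String) {
    let first = cell.take();  // moves the String out, leaves String::default()
    let second = cell.take(); // so this one is the empty default
    (first, second)
}

fn main() {
    let hero = Cell::new(String::from("Ash"));
    let (first, second) = take_twice(&hero);
    println!("First take : {:?}, second take : {:?}", first, second);
}
```

The first take gets "Ash" and the second gets an empty String, since that is the default for String.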

EXAMPLES/DISCUSSION:

https://hub.packtpub.com/shared-pointers-in-rust-challenges-solutions/

https://ricardomartins.cc/2016/06/08/interior-mutability

Setting the inner value of a Cell<> is only possible with a method call which is how it maintains safety:

use std::cell::Cell;
fn main() {
  let answer = Cell::new(0);
  answer.set(42);
  println!("The answer is : {}", answer.get());
}

RefCell<T>

PROVIDES:

RefCell<> is very similar to Cell<>, except that it lets you borrow references to the inner value and checks the borrowing rules at run-time instead of compile time! This means, unlike Cell<>, it is possible to write RefCell<> code which will panic!(). You borrow() the inner value for read-only access or borrow_mut() in order to change it.

RESTRICTIONS:

borrow() will panic if a borrow_mut() is in place, and borrow_mut() will panic if either type is in place.

USEFUL WHEN:

NOTABLY PROVIDES:

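.borrow() to get a shared, read-only reference (a Ref<T> guard) to the value inside

.borrow_mut() to get a mutable reference (a RefMut<T> guard), through which you can set the value

.try_borrow() and .try_borrow_mut() return a Result<> with an Err instead of a panic!() when the borrow is not allowed.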

see the rust-lang.org docs
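A minimal sketch of the non-panicking route with try_borrow_mut(). The try_set() helper is hypothetical, just a wrapper I wrote to turn the Result<> into a bool.

```rust
use std::cell::RefCell;

// Hypothetical helper: try to set the value, reporting failure instead of panicking.
fn try_set(cell: &RefCell<i32>, new_value: i32) -> bool {
    match cell.try_borrow_mut() {
        Ok(mut guard) => {
            *guard = new_value;
            true
        }
        Err(_) => false,
    }
}

fn main() {
    let answer = RefCell::new(0);
    {
        let reader = answer.borrow();
        // A shared borrow is live, so the mutable borrow is refused (no panic).
        println!("Set while borrowed? {}", try_set(&answer, 42));
        println!("Still : {}", *reader);
    }
    // The reader is gone now, so the same call succeeds.
    println!("Set after release? {}", try_set(&answer, 42));
    println!("The answer is : {}", answer.borrow());
}
```

The first attempt reports false and leaves the 0 in place; the second reports true and the answer becomes 42.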

EXAMPLES/DISCUSSION:

https://ricardomartins.cc/2016/06/08/interior-mutability (again)

You must successfully borrow_mut() the RefCell<> in order to set the value (by dereferencing) and then simply borrow() it to retrieve the value:

use std::cell::RefCell;
fn main() {
  let answer = RefCell::new(0);
  *answer.borrow_mut() = 42;
  println!("The answer is : {}", answer.borrow());
}

Whereas something as simple as this compiles but panics at run-time, because break_things is still holding a mutable borrow when the second borrow_mut() is called. Imagine how much more obscure this code could be. Remember: any number of read-only references, or exactly 1 read-write reference and nothing else. For RefCell<>, this is enforced at run-time:

use std::cell::RefCell;
fn main() {
  let answer = RefCell::new(0);
  let break_things = answer.borrow_mut();
  println!("The initial value is : {}", *break_things); 
  *answer.borrow_mut() = 42;
  println!("The answer is : {}", answer.borrow());
}
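One way to repair the panicking example, as a sketch: keep the first borrow inside its own scope so it is released before the next borrow_mut(). The read_then_set() helper is a hypothetical name of mine.

```rust
use std::cell::RefCell;

// Hypothetical helper: look at the old value, then overwrite it.
fn read_then_set(cell: &RefCell<i32>, new_value: i32) -> i32 {
    let initial = {
        let first_look = cell.borrow_mut();
        *first_look
    }; // first_look is dropped here, releasing the mutable borrow
    *cell.borrow_mut() = new_value;
    initial
}

fn main() {
    let answer = RefCell::new(0);
    let initial = read_then_set(&answer, 42);
    println!("The initial value is : {}", initial);
    println!("The answer is : {}", answer.borrow());
}
```

Calling drop() on the first guard explicitly would work just as well as the extra braces.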

Rc<T>

PROVIDES:

Adds the feature of run-time reference counting to your variable, but this is the simple, lower-cost version – it is not thread safe.

RESTRICTIONS:

Right from the docs “you cannot generally obtain a mutable reference to something inside an Rc. If you need mutability, put a Cell or RefCell inside the Rc”. So while there is a get_mut() method (it only returns Some when no other Rc<> or Weak<> pointers to the value exist), it’s easier to just use a Cell<> inside.
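Following the advice from the docs, here is a small sketch of a Cell<> inside an Rc<>, so that every handle can change the shared value. The bump() helper is a hypothetical name for the demo.

```rust
use std::cell::Cell;
use std::rc::Rc;

// Hypothetical helper: add one through any handle to the shared counter.
fn bump(counter: &Rc<Cell<i32>>) {
    counter.set(counter.get() + 1);
}

fn main() {
    let counter = Rc::new(Cell::new(40));
    let other_handle = Rc::clone(&counter);
    // Both handles point at the same Cell, so both bumps land on one value.
    bump(&counter);
    bump(&other_handle);
    println!("The answer is : {}", counter.get());
}
```

This prints 42, since both bumps hit the same inner value.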

USEFUL WHEN:

You need run-time reference counting of a variable so it hangs around until the last reference of it is gone.

NOTABLY PROVIDES:

.clone() – get a new copy of the pointer to the same value, upping the reference count by 1.

see the rust-lang.org docs

EXAMPLES/DISCUSSION:

https://blog.sentry.io/2018/04/05/you-cant-rust-that#refcounts-are-not-dirty

Note that in the example below, my_answer is still pointing to valid memory even when correct_answer is dropped, because the Rc<> had an internal count of “2” and drops it to “1”, leaving the storage of “42” still valid.

use std::rc::Rc;
fn main() {
  let correct_answer = Rc::new(42);
  let my_answer = Rc::clone(&correct_answer);
  println!("The correct answer is : {}", correct_answer);
  drop(correct_answer);

  println!("And you got : {}", my_answer);
}
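If you want to watch the internal count change, Rc::strong_count() reports it. A small sketch of the same flow as above; the watch_counts() function is just a hypothetical wrapper so the three readings can be returned together.

```rust
use std::rc::Rc;

// Returns the count after creation, after a clone, and after a drop.
fn watch_counts() -> (usize, usize, usize) {
    let correct_answer = Rc::new(42);
    let at_start = Rc::strong_count(&correct_answer);
    let my_answer = Rc::clone(&correct_answer);
    let after_clone = Rc::strong_count(&correct_answer);
    drop(correct_answer);
    // my_answer is still valid, so we can ask it for the count.
    let after_drop = Rc::strong_count(&my_answer);
    (at_start, after_clone, after_drop)
}

fn main() {
    let (at_start, after_clone, after_drop) = watch_counts();
    println!("Counts : {} -> {} -> {}", at_start, after_clone, after_drop);
}
```

The counts go 1, then 2, then back to 1.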

Arc<T>

PROVIDES:

Arc<> is an atomic reference counter, very similar to Rc<> above but thread-safe.

RESTRICTIONS:

More expensive than Rc<>. Also note, for the Arc<T> to be sent to or shared with another thread, the T you store must have the Send and Sync traits. So an Arc<RefCell<T>> will not work across threads because RefCell<> is not Sync.

USEFUL WHEN:

Same as Rc<>: you need run-time reference counting of a variable so it hangs around until the last reference to it is gone, but safe across threads as long as the inner <T> is.

NOTABLY PROVIDES:

see the rust-lang.org docs

EXAMPLES/DISCUSSION:

https://medium.com/@DylanKerler1/how-arc-works-in-rust-b06192acd0a6

Same idea as with Rc<>, we just show it working across multiple threads (and then sleep for just 10ms to give those threads time to finish; a sleep is a shortcut here, not a guarantee).

use std::sync::Arc;
use std::thread;
use std::time::Duration;
fn main() {
  let answer = Arc::new(42);
  for threadno in 0..5 {
    let answer = Arc::clone(&answer);
    thread::spawn(move || {
      println!("Thread {}, answer is: {}", threadno + 1, answer);
    });
  }
  let ten_ms = Duration::from_millis(10);
  thread::sleep(ten_ms);
}
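A variation on the example above, as a sketch: keep the JoinHandle that thread::spawn() returns and join() each one, so the program waits for every thread rather than hoping the sleep was long enough. The run_threads() function is a hypothetical name of mine.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical helper: spawn `count` threads and collect what each one saw.
fn run_threads(count: usize) -> Vec<i32> {
    let answer = Arc::new(42);
    let handles: Vec<_> = (0..count)
        .map(|threadno| {
            let answer = Arc::clone(&answer);
            thread::spawn(move || {
                println!("Thread {}, answer is: {}", threadno + 1, answer);
                *answer
            })
        })
        .collect();
    // join() blocks until that thread is done and hands back its return value.
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    let seen = run_threads(5);
    println!("All {} threads finished", seen.len());
}
```

The order of the "Thread N" lines can still change from run to run, but the final line only prints after all of them.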

Mutex<T>

PROVIDES:

Mutual exclusion lock protecting shared data, even across threads.

RESTRICTIONS:

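Any thread which panics while holding the lock will “poison” the Mutex<>; after that, lock() returns an Err in every thread (the data can still be recovered from that error with into_inner()). The T stored must allow Send but Sync is not necessary.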
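To see poisoning in action, here is a sketch that panics on purpose while holding the lock and then looks at what the other side gets. The poison_and_recover() name is hypothetical; into_inner() on the error is the standard library's way to reach the data anyway.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical helper: poison the Mutex on purpose, then read the value anyway.
fn poison_and_recover() -> (bool, i32) {
    let answer = Arc::new(Mutex::new(42));
    let clone = Arc::clone(&answer);
    let result = thread::spawn(move || {
        let _guard = clone.lock().unwrap();
        panic!("panicking while holding the lock");
    })
    .join();
    // join() reports the panic as an Err instead of taking this thread down too.
    assert!(result.is_err());

    let was_poisoned = answer.is_poisoned();
    let value = match answer.lock() {
        Ok(guard) => *guard,
        // The error still carries the guard, so the data is reachable.
        Err(err) => *err.into_inner(),
    };
    (was_poisoned, value)
}

fn main() {
    let (was_poisoned, value) = poison_and_recover();
    println!("Poisoned? {}, value is still : {}", was_poisoned, value);
}
```

Whether reading a poisoned value is wise depends on what the panicking thread was in the middle of doing, so treat this as a demo and not a recommendation.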

USEFUL WHEN:

working on it!

NOTABLY PROVIDES:

see the rust-lang.org docs

EXAMPLES/DISCUSSION:

https://doc.rust-lang.org/book/ch16-03-shared-state.html

use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;
fn main() {
  let answer = Arc::new(Mutex::new(42));
  for thread_no in 0..5 {
    let changer = Arc::clone(&answer);
    thread::spawn(move || {
      let mut changer = changer.lock().unwrap();
      println!("Setting answer to thread_no: {}",
        thread_no + 1,
      );
      *changer = thread_no + 1;
    });
  }
  let ten_ms = Duration::from_millis(10);
  thread::sleep(ten_ms);
  if answer.is_poisoned() {
    println!("Mutex was poisoned :(");
  }
  else {
    println!("Mutex survived :)");
    let final_answer = answer.lock().unwrap();
    println!("Ended with answer: {}", final_answer);
  }
}
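A sketch of a case where the Mutex<> visibly earns its keep: several threads adding to one counter, with join() so the final total is read only after they all finish. The count_with_threads() function and its parameters are hypothetical names for the demo.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical helper: `threads` threads each add 1 to the counter `per_thread` times.
fn count_with_threads(threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // The guard unlocks the Mutex when it goes out of scope.
                    let mut guard = counter.lock().unwrap();
                    *guard += 1;
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("Ended with count: {}", count_with_threads(5, 1000));
}
```

Because every increment happens under the lock, the total comes out to 5000 every time.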

RwLock<T>

PROVIDES:

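Similar to RefCell<>, but thread safe. borrow() is read(), borrow_mut() is write(). Unlike the RefCell<> methods, they don’t panic when the lock is already taken by another thread; they block until they do get the lock, and return a Result<> that is only an Err if the lock was poisoned.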
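For the cases where blocking is not what you want, the standard library also has try_read() and try_write(), which return right away. A small single-threaded sketch; the write_while_reading() function is a hypothetical name of mine.

```rust
use std::sync::RwLock;

// Hypothetical helper: attempt a write while a read lock is held, then again after.
fn write_while_reading() -> (bool, i32) {
    let answer = RwLock::new(42);
    let reader = answer.read().unwrap();
    // A read lock is held, so a non-blocking write attempt is refused.
    let refused = answer.try_write().is_err();
    drop(reader);
    // With the reader gone, the write lock is available again.
    *answer.write().unwrap() = 7;
    let value = *answer.read().unwrap();
    (refused, value)
}

fn main() {
    let (refused, value) = write_while_reading();
    println!("Write refused while reading? {}, value now : {}", refused, value);
}
```

Calling the blocking write() at the point where try_write() is used here would hang forever, since the only reader is the same thread.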

RESTRICTIONS:

Any thread which panics while holding a write lock will “poison” the RwLock<>, and later read() and write() calls return an Err in every thread. A panic! while holding a read lock does not poison the RwLock<>. The T stored must allow both Send and Sync.

USEFUL WHEN:

working on it!

NOTABLY PROVIDES:

see the rust-lang.org docs

EXAMPLES/DISCUSSION:

Slightly fancier example, that shows getting both read() and write() locks on the value. If nothing panics, we should see the answer at the end.

use std::sync::{Arc, RwLock};
use std::thread;
use std::time::Duration;
fn main() {
  let answer = Arc::new(RwLock::new(42));
  for thread_no in 0..5 {
    if thread_no % 2 == 1 {
      let changer = Arc::clone(&answer);
      thread::spawn(move || {
        let mut changer = changer.write().unwrap();
        println!("Setting answer to thread_no: {}",
          thread_no + 1,
        );
        *changer = thread_no + 1;
      });
    }
    else {
      let reader = Arc::clone(&answer);
      thread::spawn(move || {
        let reader = reader.read().unwrap();
        println!( "Checking answer in thread_no: {}, value is {}",
          thread_no + 1,
          *reader
        );
      });
    }
  }
  let ten_ms = Duration::from_millis(10);
  thread::sleep(ten_ms);
  if answer.is_poisoned() {
    println!("RwLock was poisoned :(");
  }
  else {
    println!("RwLock survived :)");
    let final_answer = answer.read().unwrap();
    println!("Ended with answer: {}", final_answer);
  }
}

Output from one run (the thread order can vary):

Checking answer in thread_no: 1, value is 42
Checking answer in thread_no: 3, value is 42
Setting answer to thread_no: 2
Checking answer in thread_no: 5, value is 2
Setting answer to thread_no: 4
RwLock survived :)
Ended with answer: 4

Summary

There are more, plus many custom types, some of which I’ve even used, like the once_cell crate. I started using that for the web app I was (am?) working on and wrote a little about it. Also, as you saw in the last two examples, you can combine types when you need multiple functionalities. I have included these examples in a GitHub repo, pointers.

I’ll probably hear about or (much more slowly) learn about mistakes I’ve made in wording here or come up with much better examples and excuses for using these various types, so I’ll try to update this post as I do. I see using this myself as a reference until I am really familiar with each of these types. Obviously, any mistakes here are mine alone as I learn Rust and not from any of the links or sources I listed!

Also, lots of help from 3 YouTubers I’ve been watching: the best examples can be seen as they write code and explain why they need something inside an Rc<> or in a Mutex<>. Check out their streams and watch over their shoulder as they code!!

The post (A Few) Advanced Variable Types in Rust appeared first on Learning Rust.
