One of the neat Rust features I'd like to talk about is block expressions. This subtle feature does not get the attention it deserves because everyone is focused on more prominent language features. Block expressions help confine temporary variables, keep scopes clean, and bring several other advantages. I’d like to go over a few examples.
First of all, a little intro.
let a = func1();
let b = func2(a);
let c = func3(a, b);
func4(&c);
func5(&c, 10);
Here the code could be divided into two parts. The first declares several variables that are ultimately used to produce `c`, and then `c` is used later in the code. This pattern is not entirely artificial; it can be found in many relatively long functions.
In Rust, blocks delimited by `{...}` are expressions and evaluate to a value. There is a way to rewrite this code using block expressions.
let c = {
    let a = func1();
    let b = func2(a);
    func3(a, b)
};
func4(&c);
func5(&c, 10);
This code has some subtle differences from the first example. Blocks limit the scope of variables: `a` and `b` are internal to the block, so they are not visible in the outer scope, and whatever they still own is dropped at the block's closing brace. As simple as that. Using block expressions is a matter of code style that can be applied to suitable code.
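To make the scoping rule concrete, here is a minimal sketch that compiles on its own. The bodies of `func1`, `func2` and `func3` are made-up stand-ins (the article never defines them), and `compute_c` is a hypothetical wrapper added only so the snippet has something to call.

```rust
// Hypothetical stand-ins for func1, func2 and func3 from the text.
fn func1() -> u32 {
    2
}

fn func2(a: u32) -> u32 {
    a * 10
}

fn func3(a: u32, b: u32) -> u32 {
    a + b
}

// Hypothetical wrapper so the block has a function to live in.
fn compute_c() -> u32 {
    // The block is an expression: its value is the final expression,
    // written without a trailing semicolon.
    let c = {
        let a = func1();
        let b = func2(a);
        func3(a, b)
    };
    // `a` and `b` are out of scope here. Uncommenting the next line
    // fails to compile because `a` cannot be found in this scope.
    // println!("{}", a);
    c
}

fn main() {
    println!("c = {}", compute_c());
}
```

Only `c` survives the block, which is exactly the property the rest of the article builds on.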
When applied, blocks bring some advantages that are not immediately obvious. Let's take a look.
Refactoring
Block expressions lay good groundwork for future refactoring. When a block expression is used, its internal variables are guaranteed not to be used anywhere else in the outer function. This makes the code in the block ready to be turned into a standalone function.
// Make b
let a = func1();
let b = func2(a); // Well, let's imagine it is a lot of code to get here.
// Use b
func3(b);
Yay, let’s move that to a function.
fn compute_b() -> u32 {
    let a = func1();
    func2(a)
}
let b = compute_b();
func3(b);
...
// 100 lines below:
println!("Btw, important to know, a={}", a); // Compilation error, uff!
Probably easy to fix, but it makes refactoring unpleasant; it does not click satisfyingly. Was it really important to use that `a` far below? Maybe yes, but often it does not matter and this code is a result of having scope hygiene as an afterthought. Block expressions help us limit the scope to just the right amount.
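Here is a small sketch of the happy path, assuming a made-up word-counting task (the function names `report_with_block`, `count_words` and `report_with_function` are hypothetical). The block body moves into a function unchanged, and the only outer variable the block read, `input`, becomes the parameter.

```rust
// Version 1: the intermediate value lives inside a block expression.
fn report_with_block(input: &str) -> String {
    let word_count = {
        let trimmed = input.trim();
        trimmed.split_whitespace().count()
    };
    format!("{} words", word_count)
}

// Version 2: the block body moved verbatim into its own function.
// The outer variable the block used (`input`) turned into a parameter.
fn count_words(input: &str) -> usize {
    let trimmed = input.trim();
    trimmed.split_whitespace().count()
}

fn report_with_function(input: &str) -> String {
    let word_count = count_words(input);
    format!("{} words", word_count)
}

fn main() {
    let text = "  block expressions are neat  ";
    println!("{}", report_with_block(text));
    println!("{}", report_with_function(text));
}
```

Because `trimmed` never escaped the block, nothing further down in the function can break when the code moves out.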
No boilerplate variables in the top scope
Let's up the game and see a Tokio example.
let a = String::from("Hello World");
let a_clone = a.clone(); // I feel pain each time seeing this.
let u = tokio::spawn(async move { a_clone.to_uppercase() });
let l = tokio::spawn(async move { a.to_lowercase() });
println!("upper={:?}, lower={:?}", u.await, l.await);
The `a_clone` variable is ugly, but we need it. The two async blocks each need to own their own copy of the `String` (using `Arc` does not remove the need for a second handle, it only makes the clone cheaper), so `a_clone` is moved into the first async block and the original `a` ends up in the second one. Let’s attempt a block expression style:
let a = String::from("Hello World");
let u = {
    let a = a.clone();
    tokio::spawn(async move { a.to_uppercase() })
};
let l = tokio::spawn(async move { a.to_lowercase() });
println!("upper={:?}, lower={:?}", u.await, l.await);
This does not look simpler than what we had before at first glance, but this code has a few benefits. `a` can remain `a` and does not need a new name. The outer scope remains clean, so you can easily distinguish top-level variables by the indentation of their `let`, and boilerplate variables stay hidden at the second level of indentation.
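The same shape works when the data sits behind an `Arc`. The sketch below swaps Tokio tasks for `std::thread` so that it runs without external crates; that substitution and the helper name `upper_and_lower` are my assumptions, not part of the original example.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical helper: runs both conversions on separate threads.
fn upper_and_lower(text: &str) -> (String, String) {
    let a = Arc::new(String::from(text));

    let u = {
        // Shadow `a` with a cheap handle clone; only the clone is
        // moved into the first thread.
        let a = Arc::clone(&a);
        thread::spawn(move || a.to_uppercase())
    };
    // The original handle moves into the second thread.
    let l = thread::spawn(move || a.to_lowercase());

    (u.join().unwrap(), l.join().unwrap())
}

fn main() {
    let (upper, lower) = upper_and_lower("Hello World");
    println!("upper={:?}, lower={:?}", upper, lower);
}
```

Even with `Arc` a second handle is still required; the block just keeps it out of the top scope and lets it reuse the name `a`.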
No unnecessary mut variables in the top scope
Here is another example: `PathBuf`. `PathBuf::push` only works on mutable instances.
let mut sub_dir = dir.ok_or_else(|| format_err!("Cannot get dir"))?;
sub_dir.push("sub");
`sub_dir` remains `mut` for the rest of the scope and we don't like that in Rust, do we?
let sub_dir = {
    let mut d = dir.ok_or_else(|| format_err!("Cannot get dir"))?;
    d.push("sub");
    d
};
The mutability of the variable is confined inside the initialization block.
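A self-contained version of this idea might look like the sketch below. It uses only the standard library, so the `format_err!` macro is replaced with a plain `String` error, and `sub_dir_of` is a hypothetical function name. Note that `?` inside the block still returns from the enclosing function, not just from the block.

```rust
use std::path::{Path, PathBuf};

// Hypothetical helper: appends "sub" to an optional directory.
fn sub_dir_of(dir: Option<&Path>) -> Result<PathBuf, String> {
    let sub_dir = {
        // `?` exits `sub_dir_of` itself when `dir` is `None`.
        let mut d = dir
            .ok_or_else(|| String::from("Cannot get dir"))?
            .to_path_buf();
        d.push("sub");
        d
    };
    // `sub_dir` is immutable from here on. Uncommenting the next line
    // fails to compile because `sub_dir` is not declared `mut`.
    // sub_dir.push("more");
    Ok(sub_dir)
}

fn main() {
    println!("{:?}", sub_dir_of(Some(Path::new("base"))));
    println!("{:?}", sub_dir_of(None));
}
```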
Fewer bugs
Now let’s use some Tokio channels.
use std::error::Error;

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    let (tx, mut rx) = tokio::sync::mpsc::channel(3);
    for sender in 0..2 {
        let tx = tx.clone();
        tokio::spawn(async move {
            for i in 0..5 {
                tx.send(i).await.unwrap();
                println!("Sent {} from sender {}", i, sender);
            }
        });
    }
    while let Some(x) = rx.recv().await {
        println!("Received {}", x);
    }
    Ok(())
}
Let’s check the output.
Sent 0 from sender 0
Sent 0 from sender 1
Sent 1 from sender 0
Received 0
Received 0
Received 1
Sent 2 from sender 0
Sent 3 from sender 0
Sent 1 from sender 1
Received 2
Received 3
Received 1
Sent 2 from sender 1
Sent 3 from sender 1
Sent 4 from sender 0
Received 2
Received 4
Received 3
Sent 4 from sender 1
Received 4
Looks correct… Nope, I’ve tricked you here. The text output is correct but the program does not exit! Can you spot the issue?
`tx` is cloned in the loop, so each spawned task has its own channel `Sender`. The problem is that the original `tx` stays alive until the end of the `main` function, but `rx.recv()` only returns `None`, which ends the loop, once every `Sender` has been dropped.
Indeed, `drop` fixes the issue and the program terminates successfully.
drop(tx);
while let Some(x) = rx.recv().await {
Yuck, this is like calling `free()` in C: forget it and it leaks. In my Rust. The Earl of Lemongrab screams “Unacceptable!”.
Since this article is about block expressions (a.k.a. “a hammer”), every problem is a nail. Let’s try. Thankfully block expressions are about things not leaking in scope further than needed.
#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    let mut rx = {
        let (tx, rx) = tokio::sync::mpsc::channel(3);
        for sender in 0..2 {
            let tx = tx.clone();
            tokio::spawn(async move {
                for i in 0..5 {
                    tx.send(i).await.unwrap();
                    println!("Sent {} from sender {}", i, sender);
                }
            });
        }
        rx
    };
    while let Some(x) = rx.recv().await {
        println!("Received {}", x);
    }
    Ok(())
}
It works and terminates! It does not need a `drop` call. Wait, but we have been promised that refactoring is easy with block expressions; let’s try that.
use std::error::Error;
use tokio::sync::mpsc::Receiver;

async fn spawn_senders() -> Receiver<u32> {
    let (tx, rx) = tokio::sync::mpsc::channel(3);
    for sender in 0..2 {
        let tx = tx.clone();
        tokio::spawn(async move {
            for i in 0..5 {
                tx.send(i).await.unwrap();
                println!("Sent {} from sender {}", i, sender);
            }
        });
    }
    rx
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    let mut rx = spawn_senders().await;
    while let Some(x) = rx.recv().await {
        println!("Received {}", x);
    }
    Ok(())
}
Yes, it was. The block content is unchanged. We prepared for potential refactoring ahead of time and avoided a leak.
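If you want to poke at this behaviour without pulling in Tokio, the standard library channel follows the same rule: the receiver only reports the end of the stream once every sender is gone. Below is a rough sketch with `std::sync::mpsc` and threads. It is an analogue I am assuming for illustration, not the article's Tokio code, and `collect_from_senders` is a hypothetical name.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical helper: spawns two senders and gathers everything sent.
fn collect_from_senders() -> Vec<u32> {
    let rx = {
        let (tx, rx) = mpsc::channel();
        for _sender in 0..2 {
            let tx = tx.clone();
            thread::spawn(move || {
                for i in 0..5u32 {
                    tx.send(i).unwrap();
                }
            });
        }
        // Only `rx` leaves the block. The original `tx` is dropped at
        // the closing brace, so just the two clones keep the channel open.
        rx
    };

    // `iter()` ends once all senders are dropped and the queue is empty.
    let mut received: Vec<u32> = rx.iter().collect();
    // Arrival order depends on thread scheduling, so sort before returning.
    received.sort_unstable();
    received
}

fn main() {
    println!("{:?}", collect_from_senders());
}
```

Keeping the original `tx` alive in the outer scope would make `rx.iter()` block forever, just like the Tokio version.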
Performance
How long objects live can impact performance. I will show the most prominent example: lock guards.
Let’s say we need to process data from two `RwLock`s.
use std::error::Error;
use std::sync::Arc;
use std::time::{Duration, Instant};
use tokio::sync::RwLock;

async fn slowly_process(a: i32, b: i32) -> i32 {
    tokio::time::sleep(Duration::from_millis(1000)).await;
    a + b
}

async fn process_data_from_two_locks(a: Arc<RwLock<i32>>, b: Arc<RwLock<i32>>) -> i32 {
    let a = a.read().await;
    let b = b.read().await;
    slowly_process(*a, *b).await
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    let a = Arc::new(RwLock::new(1));
    let b = Arc::new(RwLock::new(2));
    let writer = {
        let a = a.clone();
        tokio::spawn(async move {
            let start = Instant::now();
            tokio::time::sleep(Duration::from_millis(100)).await;
            // A bit late to be first to the lock...
            let mut a = a.write().await;
            *a = 10;
            println!(
                "Writing took 100ms! Wait... It took: {:?}",
                Instant::now() - start
            );
        })
    };
    let result = process_data_from_two_locks(a, b).await;
    println!("Result: {}", result);
    writer.await.unwrap();
    Ok(())
}
We run our program and it prints
Result: 3
Writing took 100ms! Wait... It took: 1.000505585s
Our writer was delayed: processing got to the lock first, and apparently the locks were held for a full second. The issue is that `slowly_process()` runs with both read locks held. The read guards release their locks only when they are dropped, and that happens at the end of the function, when the guards go out of scope.
async fn process_data_from_two_locks(a: Arc<RwLock<i32>>, b: Arc<RwLock<i32>>) -> i32 {
    let a = a.read().await;
    let b = b.read().await;
    // Both read guards are still alive across this await.
    slowly_process(*a, *b).await
}
This is a relatively well-known pitfall with scope-guarded locks, whether it is `defer` in Go or `std::lock_guard` in C++. If a scope is used to lock and unlock the data, that scope must be minimal.
I am not going to say “Let’s fix it with Rust block expressions”. Instead I will say “If we had used block expressions from the beginning, this would not have happened”. Or simply “I told you so”.
async fn process_data_from_two_locks(a: Arc<RwLock<i32>>, b: Arc<RwLock<i32>>) -> i32 {
    let a = { *a.read().await };
    let b = { *b.read().await };
    slowly_process(a, b).await
}
As a result:
Writing took 100ms! Wait... It took: 101.377406ms
This example takes a shortcut. It was smooth because `i32` is a `Copy` type. Read locks in general only let you borrow the data inside while you hold the lock. To release the lock earlier, you need to copy or clone the data you need out of the block. For example:
let field = { a.read().await.field.clone() };
The trade-off is yours to consider.
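The timing-based demo above needs a runtime and a stopwatch, so here is a deterministic sketch of the same effect. It uses the blocking `std::sync::RwLock` and `try_write` instead of Tokio's async lock (my substitution, to stay dependency-free), and both function names are hypothetical. Each function reports whether a writer could get in at the point where the slow work would start.

```rust
use std::sync::RwLock;

// The read guard lives until the end of the function, so a writer
// is still locked out when the "slow work" would begin.
fn writer_can_enter_with_long_guard(lock: &RwLock<i32>) -> bool {
    let guard = lock.read().unwrap();
    let _value = *guard;
    let writer_can_enter = lock.try_write().is_ok();
    writer_can_enter
}

// The read guard is a temporary inside the block and is released
// before the next statement runs, so the writer gets through.
fn writer_can_enter_with_block(lock: &RwLock<i32>) -> bool {
    let _value = { *lock.read().unwrap() };
    let writer_can_enter = lock.try_write().is_ok();
    writer_can_enter
}

fn main() {
    let lock = RwLock::new(1);
    println!("long guard: {}", writer_can_enter_with_long_guard(&lock));
    println!("block:      {}", writer_can_enter_with_block(&lock));
}
```

The first function reports `false` and the second `true`, the same difference that showed up as one second versus roughly 100 ms in the Tokio run.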
Conclusion
I’ve shown the benefits that a Rust feature as shy as block expressions can bring to your code. It should help you keep your scope clean and can positively impact your programs at runtime.
It goes without saying that every tool must be used sparingly. The cost of block expressions is the depth of indentation, and if overused they can make your programs unreadable. Let’s apply our best judgment.
I hope this was helpful. This is my first shot at writing articles on dev.to. I hope to keep this up.
Thanks,
Igor.
Unless otherwise noted, the code on this site is made available to you under the Apache 2.0 license. Copyright 2021 Google LLC.
Top comments (3)
Nice discussion! I would point out that you can also stop a variable from being `mut` by shadowing its binding with an immutable one once the mutation is done.
I agree that using a block seems more elegant, but like using `drop()`, this avoids a level of indentation.
Will `let a = *a.read().await;` do?
I'd say so, yes. A block expression probably helps more if you need to do a few operations on the unlocked value while it is still borrowed, or if you would like to combine multiple locks but hold them no longer than necessary.