Back on Rust after a week, it’s time to introduce yet another type. We’re still talking about ownership and its properties, but now we’re going to add another piece of the puzzle: we’ll work with strings and collections of strings. It’ll be a bit harder than before, to be honest, since we’ll work with iteration for the first time.
Iteration is a big part of the job, in JavaScript as well, because so much code relies on it. But today we will see how it deals with ownership in Rust, so it won’t be that easy: we’ll come back to it from a different perspective soon. This time we’re going to meet the slice type, a primitive of its own.
🍰 The Slice Type
If you want to follow the official definition, a slice in Rust is a dynamically-sized view into a contiguous sequence. Although this makes sense in English, translating it to a programming convention can be hard. I think you can imagine it as an array of elements in a sequence, as I would have defined it in JavaScript.
let vec = vec![1, 2, 3];
let int_slice = &vec[..];
First, I don’t want to confuse you, but I started with the official definition rather than the book examples, because I think slices must be introduced properly, even before seeing them work with ownership. We have a vector of three elements called vec and a reference to its elements called int_slice, which is a slice because it borrows a range of the vector with [..].
Then, &vec[..] is a reference to the whole collection, since the range [..] includes all the elements in it, but with larger collections we’ll often take slices of just a part of the sequence. Let’s think about the Latin alphabet: if we only want a slice of it, we can define a reference to a limited range of letters.
let letters = &alphabet[2..6];
If alphabet is a collection of letters from a to z, the slice letters will be a reference to the letters from c to f: the range [2..6] counts indexes from 0, includes the starting index 2, and excludes the ending index 6. A slice will always be a sequence of contiguous elements, so we can’t have a slice made of b, f, z from the alphabet, for instance.
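To check which letters that range really covers, here is a minimal sketch. It assumes the alphabet is stored as a Vec<char>, and the alphabet() helper is a name I made up for the example.

```rust
// Hypothetical helper: builds the lowercase Latin alphabet as a Vec<char>.
fn alphabet() -> Vec<char> {
    ('a'..='z').collect()
}

fn main() {
    let alphabet = alphabet();

    // Indexes 2, 3, 4 and 5 are included; index 6 is the first one left out.
    let letters = &alphabet[2..6];

    assert_eq!(letters, &['c', 'd', 'e', 'f']);
    assert_eq!(letters.len(), 4);
    assert_eq!(alphabet[6], 'g');

    println!("{:?}", letters);
}
```

The slice has four elements, which is simply the end index minus the start index.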
Why do slices matter in Rust? I don’t know yet, but I can imagine they work more or less like in JavaScript, where we have a .slice() method and its variants to operate on arrays. One difference is already visible, though: a Rust slice borrows the original elements instead of copying them into a new array. That said, I think we can go on with the ownership implications, which will now be easier to understand.
We talked about letters, but what about words? Slices can be segments of words, or even words separated by spaces in a sentence. In the first book example, we have a function that looks for the first word of a sentence, using a space as a separator between words.
fn first_word(s: &String) -> usize {
    let bytes = s.as_bytes();
    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return i;
        }
    }
    s.len()
}
OK, I said it would be harder. In the example above, we’re treating the string reference s as an array of bytes: that’s what .as_bytes() does. The for loop right after introduces lots of concepts we’ll only meet later, so let me just say that .iter().enumerate() pairs each byte with its index, and b' ' is a byte literal that stands for a space as a single byte.
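To see what that loop is actually walking over, here is a small sketch of my own. The space_indexes helper is hypothetical (it isn’t in the book): it uses the same loop shape, but collects the index of every space instead of returning at the first one.

```rust
// Hypothetical helper: returns the index of every space byte in the string.
fn space_indexes(s: &str) -> Vec<usize> {
    let mut found = Vec::new();
    // enumerate() yields (index, &byte) pairs; `&item` copies the byte out.
    for (i, &item) in s.as_bytes().iter().enumerate() {
        if item == b' ' {
            found.push(i);
        }
    }
    found
}

fn main() {
    // A byte literal is just a u8: the space is the number 32.
    assert_eq!(b' ', 32u8);

    assert_eq!(space_indexes("hello world"), vec![5]);
    assert_eq!(space_indexes("hello big world"), vec![5, 9]);

    println!("{:?}", space_indexes("hello big world"));
}
```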
Summing up, the function returns either the index of the first space or, if there is no space, the string length. Of course, we haven’t achieved our goal yet, since we need to go further to get the first word of a sentence. For now we only have an index into an array of bytes.
fn main() {
    let mut s = String::from("hello world");
    let word = first_word(&s);
    s.clear();
}
Executing first_word, we’ll get 5, that is the index of the space between the words hello and world. Remember that indexes start from 0. This code compiles, since the borrow of s ends when first_word returns and s keeps its writing permission, but .clear() empties the string: word is still 5, yet there’s no hello world anymore for that index to refer to.
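Here is the same example as a runnable sketch, with a few assertions I added to show the state of s and word after the call to .clear().

```rust
fn first_word(s: &String) -> usize {
    let bytes = s.as_bytes();
    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return i;
        }
    }
    s.len()
}

fn main() {
    let mut s = String::from("hello world");
    let word = first_word(&s);
    assert_eq!(word, 5);

    s.clear(); // s is still a valid String, but now it's empty

    assert!(s.is_empty());
    assert_eq!(word, 5); // the index survived, the text it described didn't

    // The checked accessor shows the index no longer fits the string.
    assert_eq!(s.get(..word), None);
}
```

Writing &s[..word] at the end would compile too, but it would panic at run time, because index 5 is out of bounds for an empty string.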
fn second_word(s: &String) -> (usize, usize) {
We could add a second_word function with this signature, which means we’ll get a tuple of two indexes, the start and the end of the second word, instead of one. But again we won’t have the desired result, since loose indexes aren’t tied to the string they came from: once the string is cleared or changed, they’re stale, and the compiler won’t warn us.
String Slices
That’s where string slices come in handy. Forget the bytes, we don’t need to convert the characters anymore: we can bring back the range syntax we discovered earlier, applying it to strings instead of single letters. I think the next lines of code will make more sense now, since we introduced the slice types at the beginning.
let s = String::from("hello world");
let hello: &str = &s[0..5];
let world: &str = &s[6..11];
let s2: &String = &s;
Thanks to the slice type, s can be partially referenced in hello and world without losing its value, and s2 can reference the whole sentence hello world at the same time: they’re all immutable references, and Rust allows any number of them to coexist. Unlike the loose indexes we got from the functions above, these slices stay tied to s, so the compiler won’t let s change while they’re in use. Here hello equals hello, world equals world, and s and s2 both equal hello world.
Take a breath, because slices are so-called “fat” pointers. This means that they not only point to where their data starts, but they also carry a length, as if they were collections: hello, for example, is the value of the variable hello, and it has a length of 5 as well. Five, because the range [0..5] covers the five bytes at indexes 0 to 4; the space at index 5 is left out.
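The “pointer plus length” idea can be checked directly. This is a sketch of mine, not the book’s code, and slice_parts is a hypothetical helper that just exposes the two halves of a string slice.

```rust
use std::mem::size_of;

// Hypothetical helper: returns the two parts of a string slice,
// the address of its first byte and its length in bytes.
fn slice_parts(slice: &str) -> (*const u8, usize) {
    (slice.as_ptr(), slice.len())
}

fn main() {
    let s = String::from("hello world");
    let hello: &str = &s[0..5];
    let world: &str = &s[6..11];

    // hello starts where s starts; world starts 6 bytes later.
    assert_eq!(slice_parts(hello), (s.as_ptr(), 5));
    assert_eq!(slice_parts(world), (s.as_ptr().wrapping_add(6), 5));

    // A &str is two machine words wide (pointer + length),
    // while a plain &String is a single pointer.
    assert_eq!(size_of::<&str>(), 2 * size_of::<usize>());
    assert_eq!(size_of::<&String>(), size_of::<usize>());
}
```

Both slices point into the same buffer owned by s; nothing is copied.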
Range Syntax
JavaScript has spread syntax, but it doesn’t have ranges. Python has a range() built-in function that works like Rust’s, but with a slightly different syntax. Rust writes a range with two dots, and a range in square brackets selects a slice. A bare [..] takes the whole collection, since omitting both numbers means from the beginning to the end.
let s = String::from("hello");
let slice = &s[0..2];
let slice = &s[..2];
These two have the same result: writing [0..2] or [..2] is the same. We can also have the opposite, because we can omit the ending number if we want to implicitly get a range that goes from a given index to the end. Or we can have the whole collection, as I said, by omitting both.
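All the range shortcuts side by side, as a quick sketch. The first_n and from_n helpers are names I invented to wrap the two shorthand forms.

```rust
// Hypothetical helpers wrapping the two shorthand forms.
fn first_n(s: &str, n: usize) -> &str {
    &s[..n] // same as &s[0..n]
}

fn from_n(s: &str, n: usize) -> &str {
    &s[n..] // same as &s[n..s.len()]
}

fn main() {
    let s = String::from("hello");
    let len = s.len();

    assert_eq!(&s[0..2], &s[..2]);   // omitted start means 0
    assert_eq!(&s[3..len], &s[3..]); // omitted end means the length
    assert_eq!(&s[0..len], &s[..]);  // omitting both takes everything
    assert_eq!(&s[0..=1], &s[..2]);  // ..= includes the end index

    assert_eq!(first_n(&s, 2), "he");
    assert_eq!(from_n(&s, 3), "lo");
}
```

One caveat I’m sure about: string ranges count bytes, so slicing in the middle of a multi-byte character panics at run time.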
Rewriting first_word With String Slices
Coming back to ownership, we’re talking about slices because they keep the result tied to the original value of a given collection. Remember? When executing first_word and then clearing the string, we ended up with the index 5 of an empty string. With slices, the compiler makes sure the string is still there while we use the result.
fn first_word(s: &String) -> &str {
    let bytes = s.as_bytes();
    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return &s[0..i];
        }
    }
    &s[..]
}
This time the function doesn’t return 5 like the first version did: it returns the string slice hello. The range &s[0..i] stops right before the space, since i is the index of the space and the end of a range is excluded, so there’s no trailing space in the result. If the sentence has no spaces at all, &s[..] returns the whole string as a single word. The original variable keeps its value, so we can also execute a second function on it.
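To make sure about what this version returns, here is the function again with a main of my own that checks both branches.

```rust
fn first_word(s: &String) -> &str {
    let bytes = s.as_bytes();
    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return &s[0..i]; // everything before the first space
        }
    }
    &s[..] // no space found: the whole string is one word
}

fn main() {
    let sentence = String::from("hello world");
    let single = String::from("hello");

    // A string slice comes back, not an index, and it has no trailing space.
    assert_eq!(first_word(&sentence), "hello");
    assert_eq!(first_word(&sentence).len(), 5);

    // Without a space, the whole string is returned.
    assert_eq!(first_word(&single), "hello");

    // The original string is untouched.
    assert_eq!(sentence, "hello world");
}
```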
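fn main() {
    let mut s = String::from("hello world");
    let word = first_word(&s); // immutable borrow of s starts here
    s.clear(); // error: needs a mutable borrow while word is still in use
    println!("the first word is: {}", word); // the immutable borrow is used here
}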
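Wait, because the function above won’t compile. I said we kept the reading permission on s, along with its hello world value, but s has lost the writing permission for as long as word is in use: .clear() needs a mutable borrow, word is an immutable borrow that println! still uses afterwards, and Rust doesn’t allow both at once. We are still able to send &s to a second_word function, but we can’t clear the original variable until we’re done with word.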
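For comparison, here are two ways to make the compiler happy, as a sketch under my own assumptions: either finish using word before clearing, or copy the word into an owned String first. The take_first_word helper is hypothetical and only exists to show the second option.

```rust
fn first_word(s: &String) -> &str {
    let bytes = s.as_bytes();
    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return &s[0..i];
        }
    }
    &s[..]
}

// Hypothetical helper: copies the first word out, then empties the string.
fn take_first_word(s: &mut String) -> String {
    let word = first_word(s).to_string(); // owned copy, so the borrow ends here
    s.clear();
    word
}

fn main() {
    // Option 1: the last use of `word` comes before the mutation.
    let mut s = String::from("hello world");
    let word = first_word(&s);
    println!("the first word is: {}", word);
    s.clear(); // accepted: `word` is never used again
    assert!(s.is_empty());

    // Option 2: keep an owned copy that doesn't borrow from the string.
    let mut other = String::from("hello world");
    let owned = take_first_word(&mut other);
    assert_eq!(owned, "hello");
    assert!(other.is_empty());
}
```

The second option costs an allocation, which is the price for not depending on the original string anymore.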
String Literals Are Slices
OMG, I think I’m going crazy. In Rust, string literals are slices too: a literal is stored in the program’s binary, and the variable holds an immutable reference to that data, with the same permissions as any other slice. I don’t know if this makes things easier to understand, but for sure we must know it. Let’s consider the same string.
let s = "hello world";
Above, s has a type of &str, so it’s definitely an immutable reference to string data. We can think of it as a slice &s[..] of the whole given string, and that’s also why string literals are immutable. This can be a bit confusing at first, so I don’t know if it’s worth looking into further, but it’s logically correct. Let’s move on.
String Slices as Parameters
Slices, slices everywhere. I must admit that I have a headache. We should now consider the differences between &String and &str: if you remember, we talked about boxes some weeks ago, and the differences between boxes like String and literals like str. The same applies to their references.
fn first_word(s: &str) -> &str {
Changing the function signature like this means passing a string slice instead of a reference to a whole String. But… I think you’re going to tell me to f*ck off, because in practice they can be used the same way: Rust automatically converts a &String into a &str when a function asks for one. Keep in mind what we have already seen.
fn main() {
    let my_string = String::from("hello world");
    let word = first_word(&my_string[..]);
    let word = first_word(&my_string);
    let my_string_literal = "hello world";
    let word = first_word(&my_string_literal[..]);
    let word = first_word(my_string_literal);
}
Above, we have several ways of doing the same thing. &my_string[..] is explicitly a slice of the whole String, while &my_string is a reference to the String that Rust converts into the same slice through deref coercion. So you can use either the first or the second one to get the same result. On the other hand, my_string_literal already has a type of &str, and you can pass it either sliced or as it is.
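Since the snippet above only shows the calls, here is a self-contained sketch with the &str signature and the same loop body as before, plus assertions of mine to confirm that all four calls give the same answer.

```rust
fn first_word(s: &str) -> &str {
    let bytes = s.as_bytes();
    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return &s[0..i];
        }
    }
    &s[..]
}

fn main() {
    let my_string = String::from("hello world");
    let my_string_literal = "hello world";

    // Explicit slice of a String.
    assert_eq!(first_word(&my_string[..]), "hello");
    // &String is coerced to &str automatically (deref coercion).
    assert_eq!(first_word(&my_string), "hello");
    // A literal is already a &str, sliced or not.
    assert_eq!(first_word(&my_string_literal[..]), "hello");
    assert_eq!(first_word(my_string_literal), "hello");
}
```

Taking &str is the more flexible choice: a function that asks for &String can’t accept a literal, while one that asks for &str accepts both.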
Other Slices
This chapter of The Rust Programming Language Book ends where we started. I wanted to talk about slices, starting from collections or arrays of elements: I used the Latin alphabet as an example, and its letters as elements of it. We’ll probably work with slices of collections more often than with slices of strings.
let a = [1, 2, 3, 4, 5];
let slice = &a[1..3];
assert_eq!(slice, &[2, 3]);
This code is important, because the slice is a reference to two elements of the array, 2 and 3, which sit at indexes 1 and 2: the range [1..3] includes index 1 and excludes index 3. At first I thought there was an error, and that the book had omitted 4 in the last assertion.
I love it when I find bugs, and I hate it when people report mine to me. This time, though, there’s nothing to report: the book is right, because the last index is excluded from the range. It’s the same reason hello, taken with [0..5], stops at index 4.
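The quickest way to settle a doubt like this is to let the assertions decide. A small sketch follows; slice_between is a hypothetical helper I added so the range can be tried with different limits.

```rust
// Hypothetical helper: borrows the elements from `start` up to, but not
// including, `end`.
fn slice_between(a: &[i32], start: usize, end: usize) -> &[i32] {
    &a[start..end]
}

fn main() {
    let a = [1, 2, 3, 4, 5];

    // The book's assertion holds: two elements, at indexes 1 and 2.
    let slice = &a[1..3];
    assert_eq!(slice, &[2, 3]);
    assert_eq!(slice.len(), 2);

    // To get 2, 3 and 4 we need a later end, or an inclusive range.
    assert_eq!(slice_between(&a, 1, 4), &[2, 3, 4]);
    assert_eq!(&a[1..=3], &[2, 3, 4]);
}
```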