Gaurav Burande

Why does slicing in Rust fail on anything that's not English?

The thing is, slicing in Rust happens by byte index, not char index!

So when you do:

let full = String::from("hello world");
let part = &full[0..5]; // slice from index 0 to 5 (not inclusive)
println!("{}", part);   // prints "hello"

Every character here takes exactly 1 byte, because ASCII characters (which cover plain English text) are encoded as a single byte in UTF-8.

'A'  => 01000001  (1 byte)
'z'  => 01111010  (1 byte)

So when you slice bytes 0 through 4 (the range 0..5), you get "hello"!

Things change once the string contains non-ASCII characters, such as Hindi characters or emojis. Let's take an example:

let s = String::from("नमस्ते");
let slice = &s[0..2]; // ⚠️ this will panic! Byte 2 is in the middle of 'न'

Why do you think this panicked?
Rust strings are UTF-8 encoded, meaning that non-ASCII characters (accented letters, Hindi characters, emojis and so on) take more than 1 byte each.

Character   UTF-8 encoding        Bytes
A           0x41                  1
ñ           0xC3 0xB1             2
न           0xE0 0xA4 0xA8        3
😊          0xF0 0x9F 0x98 0x8A   4

So yeah, Hindi characters like न use 3 bytes!
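
To see the byte counts for yourself, here is a small sketch. The helper name `byte_and_char_len` is made up for this example; it only wraps the standard `len()`, `chars().count()` and `len_utf8()` methods.

```rust
/// Returns (number of bytes, number of chars) for a string.
/// Hypothetical helper, named just for this illustration.
fn byte_and_char_len(s: &str) -> (usize, usize) {
    // len() counts UTF-8 bytes; chars() yields Unicode scalar values.
    (s.len(), s.chars().count())
}

fn main() {
    let (bytes, chars) = byte_and_char_len("hello");
    println!("hello: {} bytes, {} chars", bytes, chars); // 5 bytes, 5 chars

    let (bytes, chars) = byte_and_char_len("नमस्ते");
    println!("नमस्ते: {} bytes, {} chars", bytes, chars); // 18 bytes, 6 chars

    // Each char can report how many bytes it occupies in UTF-8.
    for c in "नमस्ते".chars() {
        println!("{:?} takes {} byte(s)", c, c.len_utf8());
    }
}
```

Note that a `char` is a Unicode scalar value, not necessarily what a reader would call one letter: "नमस्ते" has 6 chars because the vowel sign and the joining mark count separately.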

So when you slice "नमस्ते" with the byte range 0..2, you're cutting through the middle of a character, and the result would be invalid UTF-8.
So:

  • &s[0..3] → valid (न)

  • &s[0..2] → ❌ invalid, slicing inside the byte-sequence of न

Rust guarantees that every &str is valid UTF-8, so instead of handing you a broken string, the program panics at runtime!
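
If you want to check a range before slicing, the standard library has `str::is_char_boundary`. A minimal sketch, assuming a made-up helper called `is_safe_range`:

```rust
/// Returns true when `start..end` can be sliced out of `s` without panicking.
/// Hypothetical helper, named just for this illustration.
fn is_safe_range(s: &str, start: usize, end: usize) -> bool {
    // is_char_boundary is false for offsets inside a character
    // and for offsets past the end of the string.
    start <= end && s.is_char_boundary(start) && s.is_char_boundary(end)
}

fn main() {
    let s = "नमस्ते";
    println!("{}", is_safe_range(s, 0, 3)); // true: 'न' ends at byte 3
    println!("{}", is_safe_range(s, 0, 2)); // false: byte 2 is inside 'न'
}
```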

Then how do you slice properly?
Rust gives you two options:

1. Work with chars instead of byte indices:

let s = String::from("नमस्ते");
let first_char = s.chars().nth(0).unwrap(); // ✅ gives 'न'
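
If you need a whole substring rather than a single char, one possible approach is to turn char positions into byte offsets with `char_indices()` and slice on those. This is only a sketch; `slice_by_chars` is a name invented for this example, not something from the standard library.

```rust
/// Returns the substring covering chars `start..end` (char positions, not bytes).
/// Hypothetical helper, named just for this illustration.
fn slice_by_chars(s: &str, start: usize, end: usize) -> Option<&str> {
    // Byte offset where each char begins, plus the end of the string.
    let offsets: Vec<usize> = s
        .char_indices()
        .map(|(byte_pos, _)| byte_pos)
        .chain(std::iter::once(s.len()))
        .collect();

    let from = *offsets.get(start)?;
    let to = *offsets.get(end)?;
    // Both offsets are char boundaries; get() returns None if from > to.
    s.get(from..to)
}

fn main() {
    let s = "नमस्ते";
    println!("{:?}", slice_by_chars(s, 0, 2)); // Some("नम")
    println!("{:?}", slice_by_chars(s, 0, 7)); // None, there are only 6 chars
}
```

This walks the string every time, so it costs O(n) per call, which is the price of counting chars in a UTF-8 string.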

2. Use .get() with range:

let slice = s.get(0..3); // returns Option<&str>

So you can safely do:

if let Some(valid_slice) = s.get(0..3) {
    println!("{}", valid_slice);
}

This way, Rust won't panic if the range is invalid; it just gives you None, the empty variant of Option.
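
To see both outcomes side by side, here is a small sketch that handles the None case explicitly. The function name `describe` is invented for this example.

```rust
use std::ops::Range;

/// Tries a byte range with get() and reports what happened.
/// Hypothetical helper, named just for this illustration.
fn describe(s: &str, range: Range<usize>) -> String {
    match s.get(range.clone()) {
        Some(part) => format!("{:?} -> {}", range, part),
        // Reached when the range splits a character or runs past the end.
        None => format!("{:?} -> not a valid slice", range),
    }
}

fn main() {
    let s = "नमस्ते";
    println!("{}", describe(s, 0..3)); // 0..3 -> न
    println!("{}", describe(s, 0..2)); // 0..2 -> not a valid slice
}
```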

✅ Always slice only if you know you’re at valid UTF-8 character boundaries.