DEV Community

Stefanos Kouroupis
Displaying all printable* UTF-8 characters using Rust

Since I got my one-year badge, I decided to celebrate by listing my achievements and writing one of the most useless applications I've ever written.

achievements

  • More than 20,000 views... pretty neat! Nearly 60 per day
  • More than 2,500 followers... again, nice! That's nearly 7 per day
  • 2.5 years at the same job.

application

This amazing application, as the title states, prints all UTF-8 characters that can be printed. The asterisk in the title is there because I limited the output to characters that are encoded in at most 3 bytes.

ENJOY

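Our main function has 3 loops, one per encoding length. The ranges are hex byte values, written as lead byte | continuation byte(s):

  • one for the single-byte chars: 0000 - 007F
  • one for the two-byte chars: 00C0 - 00DF | 0080 - 00BF
  • one for the three-byte chars: 00E0 - 00EF | 0080 - 00BF | 0080 - 00BF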
use std::num::ParseIntError;
use std::str;

fn main() {
    let mut char_index = 0;
    // Single-byte range 0x00..=0x7F; `..=` makes the upper bound inclusive.
    let one_byte = vec![0, 127];

    for i in one_byte[0]..=one_byte[1] {
        let mut first = format!("{:X}", i);
        first = make_even(first);
        char_index = output(first, char_index);
    }

    // Lead bytes 0xC0..=0xDF, continuation bytes 0x80..=0xBF (both inclusive).
    let two_bytes = vec![192, 223, 128, 191];

    for i in two_bytes[0]..=two_bytes[1] {
        for j in two_bytes[2]..=two_bytes[3] {
            let mut first = format!("{:X}", i);
            let mut second = format!("{:X}", j);

            first = make_even(first);
            second = make_even(second);

            char_index = output(first + &second, char_index);
        }
    }

    // Lead bytes 0xE0..=0xEF, then two continuation bytes 0x80..=0xBF (all inclusive).
    let three_bytes = vec![224, 239, 128, 191, 128, 191];

    for i in three_bytes[0]..=three_bytes[1] {
        for j in three_bytes[2]..=three_bytes[3] {
            for k in three_bytes[4]..=three_bytes[5] {
                let mut first = format!("{:X}", i);
                let mut second = format!("{:X}", j);
                let mut third = format!("{:X}", k);

                first = make_even(first);
                second = make_even(second);
                third = make_even(third);

                char_index = output(first + &second + &third, char_index);
            }
        }
    }
}
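
The byte ranges used by the three loops follow from how UTF-8 lays out lead and continuation bytes. As a rough illustration (not part of the original program), the sketch below uses the standard library's `char::encode_utf8` to show one character from each range; `utf8_bytes` is a hypothetical helper name introduced only for this example.

```rust
/// Hypothetical helper (not in the original post): returns the UTF-8
/// bytes that encode a single character.
fn utf8_bytes(c: char) -> Vec<u8> {
    let mut buf = [0u8; 4]; // a char never needs more than 4 bytes
    c.encode_utf8(&mut buf).as_bytes().to_vec()
}

fn main() {
    // One sample character per encoding length covered by the loops.
    for c in ['A', 'é', '€'] {
        let bytes = utf8_bytes(c);
        let hex: Vec<String> = bytes.iter().map(|b| format!("{:02X}", b)).collect();
        println!("{} -> {} byte(s): {}", c, bytes.len(), hex.join(" "));
    }
}
```

Here `A` takes one byte (41), `é` takes two (C3 A9), and `€` takes three (E2 82 AC), each falling inside the ranges listed for the corresponding loop.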

A hex string needs an even number of characters (two per byte), so odd-length strings get a leading zero.

pub fn make_even(mut s: String) -> String {
    // Prepend a zero so every byte is written as two hex digits.
    if s.len() % 2 == 1 {
        s.insert(0, '0');
    }
    s
}
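
As a possible alternative to padding after the fact, the format string itself can ask for two zero-padded hex digits. This is only a sketch of that option, not what the post's code does; `byte_to_hex` is a hypothetical name introduced for the example.

```rust
/// Hypothetical alternative to `make_even`: format a byte as exactly
/// two uppercase hex digits, zero-padded on the left.
fn byte_to_hex(b: u8) -> String {
    format!("{:02X}", b)
}

fn main() {
    println!("{}", byte_to_hex(10)); // 0A
    println!("{}", byte_to_hex(192)); // C0
}
```

With this approach the string always has an even length, so a separate padding step would not be needed.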

I got this function from here. It converts a hex string into a vector of u8 bytes, reading two hex digits per byte.

pub fn decode_hex(s: &str) -> Result<Vec<u8>, ParseIntError> {
    (0..s.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&s[i..i + 2], 16))
        .collect()
}
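
To make the conversion concrete, here is a small usage sketch. It repeats `decode_hex` from the post so that it compiles on its own; the sample inputs are chosen for illustration only.

```rust
use std::num::ParseIntError;
use std::str;

// Same function as in the post, repeated so the sketch is self-contained.
pub fn decode_hex(s: &str) -> Result<Vec<u8>, ParseIntError> {
    (0..s.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&s[i..i + 2], 16))
        .collect()
}

fn main() {
    // "E282AC" is the three-byte UTF-8 encoding of the euro sign.
    let bytes = decode_hex("E282AC").expect("valid hex");
    println!("{:?}", bytes); // [226, 130, 172]
    println!("{:?}", str::from_utf8(&bytes)); // Ok("€")

    // Characters that are not hex digits produce an Err instead of bytes.
    println!("{}", decode_hex("ZZ").is_err()); // true
}
```

Note that the function slices two characters at a time, so an odd-length input would panic on the last slice. That is why the hex strings are padded to an even length before decoding.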

Nested matches for the win. The output function checks:

  • if the hex string is valid
  • if the bytes form valid UTF-8
  • if the character has a printable representation (by looking at the length of its debug-formatted output)
pub fn output(hex: String, mut i: i32) -> i32 {
    match &decode_hex(&hex) {
        Ok(dh) => match str::from_utf8(dh) {
            Ok(v) => {
                if format!("{:?}", v).len() < 7 {
                    if i % 10 == 0 {
                        println!("{:?} {:?} \t", hex, v);
                    } else {
                        print!("{:?} {:?} \t", hex, v);
                    }
                    i += 1;
                }
            }
            _ => {}
        },
        _ => {}
    }

    return i;
}
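
The length check relies on how Rust's `{:?}` formatting treats strings: it wraps them in quotes and replaces non-printable characters with escapes such as `\u{7f}`, which makes the result longer. The sketch below illustrates that effect on a few hand-picked inputs; `debug_len` is a hypothetical helper, not part of the original program.

```rust
/// Hypothetical helper: byte length of a string's `{:?}` representation,
/// which includes the two surrounding quotes and any escape sequences.
fn debug_len(s: &str) -> usize {
    format!("{:?}", s).len()
}

fn main() {
    // Printable characters stay as they are; control characters are escaped.
    for s in ["A", "€", "\u{7f}"] {
        println!("{:?} -> {}", s, debug_len(s));
    }
}
```

A printable three-byte character such as `€` gives a length of 5 (three bytes plus two quotes), while the escaped DEL character gives 8, so it fails the `< 7` test and is skipped.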
