Oluwanifemi Latunde

Posted on Sep 27, 2022

UTF-8 Validation

It's day 7 of the #I4G10DaysOfCodeChallenge. Today's task was to determine whether a given array of integers represents a valid UTF-8 encoding.

You can find more details about the challenge here

A character in UTF-8 can be from 1 to 4 bytes long, subject to the following rules:

For a 1-byte character, the first bit is a 0, followed by its Unicode code.
For an n-byte character, the first n bits are all ones and bit n+1 is 0, followed by n-1 bytes whose two most significant bits are 10.

     Number of Bytes   |        UTF-8 Octet Sequence
                       |              (binary)
   --------------------+-----------------------------------------
            1          |   0xxxxxxx
            2          |   110xxxxx 10xxxxxx
            3          |   1110xxxx 10xxxxxx 10xxxxxx
            4          |   11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
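
To see the patterns in the table on real characters, here is a small sketch that prints the bits of Python's own UTF-8 encoding. The helper name `utf8_bits` is made up for this illustration and is not part of the challenge.

```python
def utf8_bits(ch: str) -> list[str]:
    """Return each byte of ch's UTF-8 encoding as an 8-character bit string."""
    return [format(b, "08b") for b in ch.encode("utf-8")]


# One sample character for each sequence length in the table.
for ch in ["A", "é", "€", "😀"]:
    print(ch, utf8_bits(ch))

# A  ['01000001']
# é  ['11000011', '10101001']
# €  ['11100010', '10000010', '10101100']
# 😀 ['11110000', '10011111', '10011000', '10000000']
```

In each case the first byte announces the length with its run of leading ones, and every following byte starts with 10.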

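Approach:

Start with count = 0, the number of continuation bytes still expected.
For each integer x in the data array:
If count is 0, x must start a new character:
If x >> 5 is binary 110, set count to 1. (x >> 5 is the same as integer division of x by 32.)
Else if x >> 4 is binary 1110, set count to 2. (x >> 4 is the same as integer division of x by 16.)
Else if x >> 3 is binary 11110, set count to 3. (x >> 3 is the same as integer division of x by 8.)
Else if x >> 7 is not 0, return false, because x is not a valid leading byte. (x >> 7 is the same as integer division of x by 128.)
Otherwise x must be a continuation byte: if x >> 6 is not binary 10, return false; otherwise decrease count by 1.
After the loop, return true if count is 0 and false otherwise.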
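
The steps above can be sketched in Python roughly as follows. This is my reading of the approach, not necessarily the exact code that was submitted. The name `valid_utf8` and the `& 0xFF` mask are assumptions: the mask reflects the usual problem statement that only the lowest 8 bits of each integer carry data. The check covers only the structural rules listed in this post, so it does not reject things like overlong encodings.

```python
from typing import List


def valid_utf8(data: List[int]) -> bool:
    """Check the leading-byte / continuation-byte structure described above."""
    remaining = 0  # continuation bytes still expected for the current character
    for x in data:
        byte = x & 0xFF  # assumption: only the lowest 8 bits are meaningful
        if remaining == 0:
            # This byte must start a new character.
            if byte >> 5 == 0b110:
                remaining = 1
            elif byte >> 4 == 0b1110:
                remaining = 2
            elif byte >> 3 == 0b11110:
                remaining = 3
            elif byte >> 7 != 0:
                # Neither 0xxxxxxx nor a valid multi-byte leading pattern.
                return False
        else:
            # This byte must be a continuation byte of the form 10xxxxxx.
            if byte >> 6 != 0b10:
                return False
            remaining -= 1
    # Valid only if no character was left unfinished.
    return remaining == 0


print(valid_utf8([197, 130, 1]))  # True: a 2-byte character, then a 1-byte one
print(valid_utf8([235, 140, 4]))  # False: the third byte is not a continuation
```

The loop touches each integer once and keeps a single counter, so it runs in linear time with constant extra memory.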

Result:
Runtime: 234 ms, faster than 56.39% of Python3 online submissions for UTF-8 Validation.

Memory Usage: 14.1 MB, less than 97.22% of Python3 online submissions for UTF-8 Validation.
