I am close to the end of a 10-day coding challenge organized by Ingressive for Good, and in this post I'll share my thoughts on Day 7 of the challenge.
Let me briefly describe how the challenge works before I share my experience with you. From September 21 through October 30, 2022, the challenge is to solve one common algorithm problem every day for ten days.
Today's problem is "UTF-8 Validation". It is problem number 393 on LeetCode.
Here is the question:
Given an integer array data representing the data, return whether it is a valid UTF-8 encoding (i.e. it translates to a sequence of valid UTF-8 encoded characters).
A character in UTF-8 can be from 1 to 4 bytes long, subject to the following rules:
For a 1-byte character, the first bit is a 0, followed by its Unicode code.
For an n-byte character, the first n bits are all ones and the (n + 1)th bit is 0, followed by n - 1 bytes whose most significant 2 bits are 10.
This is how the UTF-8 encoding would work:
Number of Bytes | UTF-8 Octet Sequence (binary)
1 | 0xxxxxxx
2 | 110xxxxx 10xxxxxx
3 | 1110xxxx 10xxxxxx 10xxxxxx
4 | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
x denotes a bit in the binary form of a byte that may be either 0 or 1.
Note: The input is an array of integers. Only the least significant 8 bits of each integer are used to store the data. This means each integer represents only 1 byte of data.
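To make the table and the note concrete, here is a minimal sketch of how a single input value could be classified. The helper name `countLeadingOnes` is hypothetical (it is not part of the problem statement); it keeps only the least significant 8 bits and counts how many 1 bits appear before the first 0.

```javascript
// Hypothetical helper (name chosen for illustration only):
// counts the leading 1 bits of one byte.
function countLeadingOnes(value) {
  const byte = value & 0xff; // keep only the least significant 8 bits
  let count = 0;
  let mask = 0x80; // binary 10000000, the most significant bit of a byte
  while (mask !== 0 && (byte & mask) !== 0) {
    count += 1;
    mask >>= 1; // move on to the next bit to the right
  }
  return count;
}

console.log(countLeadingOnes(0b01100001)); // 0: a 1-byte character
console.log(countLeadingOnes(0b10000010)); // 1: a continuation byte (10xxxxxx)
console.log(countLeadingOnes(0b11000101)); // 2: first byte of a 2-byte character
console.log(countLeadingOnes(0x1e2)); // 3: only the low 8 bits (11100010) are used
```

Under the rules above, a count of 0 marks a 1-byte character, 1 marks a continuation byte, 2 to 4 mark the first byte of a multi-byte character, and anything higher is invalid.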
I'll now discuss my experience solving the problem. I created a counter variable and initialized it to zero. I then checked each byte in the array against the UTF-8 bit patterns, and increased the counter whenever a byte did not match the pattern being compared.
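The exact code is not reproduced in this post, so the following is only a sketch of one common way to apply the counter idea, not necessarily the submitted solution. The names `validUtf8` and `remaining` are assumptions made for illustration. Here the counter tracks how many continuation bytes are still expected after a leading byte.

```javascript
// A sketch under stated assumptions, not the author's exact submission.
function validUtf8(data) {
  let remaining = 0; // continuation bytes still expected

  for (const value of data) {
    const byte = value & 0xff; // only the least significant 8 bits count

    if (remaining === 0) {
      // Expecting the first byte of a character: match it against the table.
      if ((byte & 0x80) === 0x00) {
        remaining = 0; // 0xxxxxxx: 1-byte character
      } else if ((byte & 0xe0) === 0xc0) {
        remaining = 1; // 110xxxxx: 2-byte character
      } else if ((byte & 0xf0) === 0xe0) {
        remaining = 2; // 1110xxxx: 3-byte character
      } else if ((byte & 0xf8) === 0xf0) {
        remaining = 3; // 11110xxx: 4-byte character
      } else {
        return false; // 10xxxxxx or 11111xxx cannot start a character
      }
    } else {
      // Expecting a continuation byte of the form 10xxxxxx.
      if ((byte & 0xc0) !== 0x80) {
        return false;
      }
      remaining -= 1;
    }
  }

  // Valid only if no character was left unfinished.
  return remaining === 0;
}

console.log(validUtf8([197, 130, 1])); // true
console.log(validUtf8([235, 140, 4])); // false
```

In the first call, 197 (11000101) opens a 2-byte character, 130 (10000010) completes it, and 1 is a 1-byte character. In the second, 235 (11101011) opens a 3-byte character, but 4 (00000100) is not a continuation byte. This sketch checks only the structural rules listed in the problem; it makes a single pass over the array and uses constant extra memory.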
I solved this task in JavaScript. My solution took 111 milliseconds to execute and consumed about 44 MB of memory.