Hey! It's day 7 of the 10-day coding challenge with I4G. Today's task was to write code that validates UTF-8 encoded data.
Thought process:
Understanding of the problem: The task is to validate UTF-8 encoded data. A UTF-8 character has the following features:
- A valid UTF-8 character can be 1 to 4 bytes long.
- For a 1-byte character, the first bit is a 0, followed by its Unicode code.
- For an n-byte character (n from 2 to 4), the first n bits of the first byte are all ones and the (n+1)th bit is 0; it is followed by n-1 bytes whose two most significant bits are 10.
- The input is an array of integers containing the data, and we have to return whether the data in the array represents a valid UTF-8 encoding. The important thing to note here is that the array doesn't contain data for just a single character. As the first example shows, the array can contain data for several characters, and the encoding it represents is valid only if every one of them is a valid UTF-8 character.
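To make the multi-character point concrete, here is a small Python sketch that prints each integer as 8 bits. The sample array [197, 130, 1] is assumed to be the first example from the problem statement, and the helper name `to_bits` is made up for this illustration.

```python
def to_bits(value: int) -> str:
    """Return the lowest 8 bits of value as a binary string."""
    return format(value & 0xFF, "08b")


data = [197, 130, 1]  # assumed sample input, not taken from the post itself
bits = [to_bits(b) for b in data]
print(bits)  # ['11000101', '10000010', '00000001']
```

Read against the features above, the first two integers form one 2-byte character (110xxxxx followed by 10xxxxxx) and the third is a 1-byte character (0xxxxxxx), so the array holds two valid characters.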
Solution: To achieve this task, I used bit manipulation. A right shift is performed on each integer to check whether its leading bits match the pattern for a 1-byte, 2-byte, 3-byte or 4-byte character.
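As a rough illustration of the right-shift check, the Python sketch below classifies a single integer by its leading bits. The function names `leading_byte_length` and `is_continuation` are hypothetical, and masking with 0xFF assumes that only the lowest 8 bits of each integer carry data, as the problem statement describes.

```python
def leading_byte_length(byte: int) -> int:
    """Return the character length (1 to 4) announced by a leading byte, or 0 if none."""
    byte &= 0xFF  # assumption: only the lowest 8 bits are data
    if byte >> 7 == 0b0:      # 0xxxxxxx
        return 1
    if byte >> 5 == 0b110:    # 110xxxxx
        return 2
    if byte >> 4 == 0b1110:   # 1110xxxx
        return 3
    if byte >> 3 == 0b11110:  # 11110xxx
        return 4
    return 0  # 10xxxxxx (continuation byte) or 11111xxx


def is_continuation(byte: int) -> bool:
    """Return True if the two most significant bits are 10."""
    return (byte & 0xFF) >> 6 == 0b10
```

Shifting right by 8 minus the prefix length leaves only the prefix bits, so each comparison is a single equality test instead of a mask plus a compare.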
Algorithm:
- Declare two integer variables: count and i.
- Set count = 0.
- Using a for loop, iterate through the array of integers, starting at i = 0 and continuing while i < data.length.
- On each integer, perform a bitwise right shift and compare the result with the UTF-8 bit patterns described above.
- Return true if every integer matches the expected pattern, or false otherwise.
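The steps above could be sketched in Python roughly as follows. This is an illustration rather than the submitted solution: the function name `valid_utf8` is made up, and it assumes `count` tracks how many continuation bytes are still expected for the current character, which the steps do not spell out.

```python
from typing import List


def valid_utf8(data: List[int]) -> bool:
    """Check that data follows the UTF-8 byte patterns listed in the post."""
    count = 0  # assumption: continuation bytes still owed to the current character
    for i in range(len(data)):
        byte = data[i] & 0xFF  # only the lowest 8 bits are treated as data
        if count == 0:
            # Expecting the first byte of a new character.
            if byte >> 7 == 0b0:
                continue  # 1-byte character, nothing more to read
            elif byte >> 5 == 0b110:
                count = 1
            elif byte >> 4 == 0b1110:
                count = 2
            elif byte >> 3 == 0b11110:
                count = 3
            else:
                return False  # stray continuation byte or 11111xxx
        else:
            # Expecting a continuation byte of the form 10xxxxxx.
            if byte >> 6 != 0b10:
                return False
            count -= 1
    # A character cut off at the end of the array leaves count above zero.
    return count == 0


print(valid_utf8([197, 130, 1]))  # True
print(valid_utf8([235, 140, 4]))  # False
```

This checks only the structural bit patterns described in the post; a strict UTF-8 decoder would also reject overlong encodings and surrogate code points, which this sketch does not attempt.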
Check out the code here: https://leetcode.com/problems/utf-8-validation/submissions/