Mithil Poojary

The bit-wise similarity between the character cases

The ASCII value of 'a' in binary is 1100001.
The ASCII value of 'A' in binary is 1000001.

Notice the similarity?

The 6th least significant bit (the one worth 32) is set for lowercase letters and cleared for uppercase letters. All the other bits are the same.
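Here is a quick way to check this for the whole alphabet rather than just 'a' and 'A'. It is a small Python sketch; the language and the names CASE_BIT and all_differ_by_case_bit are my own choices for illustration, not something the exercise prescribed.

```python
import string

# The bit worth 32, i.e. the 6th bit counting from the right.
CASE_BIT = 0b0100000

# Print each letter pair in 7-bit binary, plus the XOR of the two codes.
for upper, lower in zip(string.ascii_uppercase, string.ascii_lowercase):
    diff = ord(upper) ^ ord(lower)
    print(upper, format(ord(upper), "07b"), lower, format(ord(lower), "07b"), diff)

# True if every upper/lower pair differs in exactly that one bit.
all_differ_by_case_bit = all(
    ord(u) ^ ord(l) == CASE_BIT
    for u, l in zip(string.ascii_uppercase, string.ascii_lowercase)
)
print(all_differ_by_case_bit)
```

XORing two numbers leaves a 1 only where their bits differ, so a result of 32 for every pair means that single bit is the only difference.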

This can be fun to play around with. For example, you can convert a letter to lowercase by simply doing:

c | 32

And you can toggle the case (that is uppercase to lowercase and vice versa) by:

c ^ 32

Combining the above two actions, you can convert a letter to uppercase:

(c | 32) ^ 32
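The three expressions can be wrapped into small functions to try them out. This is a minimal Python sketch that treats a character as its integer code; the function names, and the is_ascii_letter and swap_case helpers, are hypothetical names I made up for the example.

```python
CASE_BIT = 32  # 0b0100000, the bit that separates 'A' from 'a'

def to_lower(c: int) -> int:
    """Set the case bit. Only meaningful when c is an ASCII letter."""
    return c | CASE_BIT

def toggle_case(c: int) -> int:
    """Flip the case bit. Only meaningful when c is an ASCII letter."""
    return c ^ CASE_BIT

def to_upper(c: int) -> int:
    """Set the bit, then flip it off again; same result as c & ~CASE_BIT."""
    return (c | CASE_BIT) ^ CASE_BIT

def is_ascii_letter(c: int) -> bool:
    return ord("A") <= c <= ord("Z") or ord("a") <= c <= ord("z")

def swap_case(data: bytes) -> bytes:
    # Hypothetical helper: toggles letters and leaves everything else alone.
    return bytes(toggle_case(c) if is_ascii_letter(c) else c for c in data)

print(chr(to_lower(ord("Q"))))      # q
print(chr(to_upper(ord("q"))))      # Q
print(swap_case(b"Hello, World!"))  # b'hELLO, wORLD!'
```

One caveat: the tricks only make sense for letters. Applied blindly, c | 32 turns '@' (64) into a backtick (96), which is why swap_case checks for a letter first.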

Another point to observe is that the ASCII values of any uppercase letter and its corresponding lowercase letter differ by exactly 32.

Oh, and by the way, man ascii provides a nice table of ASCII values.

The investigation phase

While doing a crypto exercise, I had a string that had been encrypted by XORing each character of the original string with the same (unknown) character to produce the corresponding character of the encrypted string.
The task was to decrypt it and recover the original string. XOR is its own inverse: XORing a second time with the same number undoes the first XOR (a character is just a number).

(a ^ b) ^ b = a ^ (b ^ b) = a ^ 0 = a
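The identity is easy to check on a whole string. The sketch below is in Python and makes a few assumptions of my own: the helper name xor_with_key, the sample message, and the key 'x' are all just for illustration.

```python
def xor_with_key(data: bytes, key: int) -> bytes:
    """XOR every byte of data with the same single-byte key."""
    return bytes(b ^ key for b in data)

plaintext = b"Attack at dawn"   # made-up sample message
key = ord("x")                  # made-up key

ciphertext = xor_with_key(plaintext, key)
recovered = xor_with_key(ciphertext, key)  # same operation, same key

print(recovered == plaintext)  # True
```

Encryption and decryption are the same function here, which is exactly what (a ^ b) ^ b = a says.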

There are only 128 ASCII character values, so a simple brute force works: try each value as the key, look at the output, and pick the result that reads sensibly (assuming the original string made sense in the first place).
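A brute force along these lines might look like the following Python sketch. The "keep only fully printable results" filter is a crude heuristic I added to shorten the list you have to read, not part of the original exercise, and the names xor_with_key and candidates, the sample text, and the key are hypothetical.

```python
import string

# Byte values that count as "printable" for the filter below.
PRINTABLE = set(string.printable.encode("ascii"))

def xor_with_key(data: bytes, key: int) -> bytes:
    """XOR every byte of data with the same single-byte key."""
    return bytes(b ^ key for b in data)

def candidates(ciphertext: bytes):
    """Yield (key, text) for each 7-bit key whose output is fully printable."""
    for key in range(128):
        guess = xor_with_key(ciphertext, key)
        if all(b in PRINTABLE for b in guess):
            yield key, guess.decode("ascii")

# Made-up example: encrypt a sample string, then try every key.
ciphertext = xor_with_key(b"HelloWorld", ord("x"))
for key, text in candidates(ciphertext):
    print(key, repr(chr(key)), text)
```

Several keys usually survive the filter, so the last step is still a human one: scan the output and pick the line that reads like real text.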

When I tried this out with an arbitrary key character 'x', I noticed that the decryption made sense twice: once with 'x' and once with 'X'. With 'X', though, the case of every letter had been toggled. That is because 'x' and 'X' differ only in the bit worth 32, so decrypting with 'X' leaves every character XORed with 32.
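To see the effect in isolation, here is a small Python sketch that encrypts with 'x' and then decrypts with both 'x' and 'X'. The sample string and the helper name xor_with_key are my own; the string deliberately has no spaces, for the reason given after the code.

```python
def xor_with_key(data: bytes, key: int) -> bytes:
    """XOR every byte of data with the same single-byte key."""
    return bytes(b ^ key for b in data)

plaintext = b"HelloWorld"  # made-up sample, letters only
ciphertext = xor_with_key(plaintext, ord("x"))

right_key = xor_with_key(ciphertext, ord("x"))
other_case = xor_with_key(ciphertext, ord("X"))

# 'x' ^ 'X' == 32, so the second result is the plaintext with bit 32 flipped.
print(ord("x") ^ ord("X"))  # 32
print(right_key)            # b'HelloWorld'
print(other_case)           # b'hELLOwORLD'
```

Note that the extra XOR with 32 hits every character, not just letters. A space is 32, so it would turn into a NUL byte (0), and a sentence decrypted with the wrong-case key would show case-toggled words with the spaces gone.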

This led to some further investigation, which is how I arrived at the conclusions above.
