Gealber Morales

Posted on • Originally published at gealber.com

# Challenge RE #7

## Introduction

From my previous posts you may have noticed that I've been poisoning myself with small doses of assembly language and C. It's the best combination for a fast effect 😁; soon I will be immune to it. To be honest, I've been enjoying the challenges because all of them are accessible, in the same sense that David Hilbert described:

> A mathematical problem should be difficult in order to entice us, yet not completely inaccessible, lest it mock at our efforts. It should be to us a guide post on the mazy paths to hidden truths, and ultimately a reminder of our pleasure in the successful solution.

Challenges should be enjoyable in general, without frustrating you. I wanted to point that out because the person who made these challenges, Dennis Yurichev, did a great job with them.

Enough talk. Let's look at the 7th challenge. The assembly code to understand is the following:

``````
<f>:
   0:  movzx  edx,BYTE PTR [rdi]
   3:  mov    rax,rdi
   6:  mov    rcx,rdi
   9:  test   dl,dl
   b:  je     29
   d:  nop    DWORD PTR [rax]
  10:  lea    esi,[rdx-0x41]
  13:  cmp    sil,0x19
  17:  ja     1e
  19:  add    edx,0x20
  1c:  mov    BYTE PTR [rcx],dl
  1e:  add    rcx,0x1
  22:  movzx  edx,BYTE PTR [rcx]
  25:  test   dl,dl
  27:  jne    10
  29:  repz ret
``````

## Analysis

The first four instructions give the impression that `rdi` holds a pointer to a string. The function loads one byte from it, zero-extended into `edx`, tests it against zero, and jumps straight to the end if it is zero. That looks a lot like reading a character and checking for the terminating `'\0'`. The pointer is also copied into `rax` and `rcx`. Let's keep describing the program before we settle on a complete signature for `f`.

Next, we can see the following instructions:

``````
lea esi,[rdx-0x41]
cmp sil,0x19
ja 1e
``````

Something interesting here is the `lea esi,[rdx-0x41]` instruction, which computes the character minus 0x41 without modifying `edx`. Why 0x41? The magic here is that 0x41, or 65, is the ASCII code for the character `'A'`. After the subtraction, `'A'` through `'Z'` map to 0 through 0x19 (25). Any other byte lands outside that range once the result is treated as an unsigned byte, because anything below `'A'` wraps around to a large value. So when we combine `cmp sil,0x19` with `ja` (jump if above, an unsigned comparison), what we are checking is whether the character is NOT between `'A'` and `'Z'`. That covers lowercase letters, but also digits, punctuation and every other non-uppercase byte. If that's the case, we jump to address 1e.

At address 1e we have the following instructions:
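To make the range check concrete, here is a minimal C sketch of the same test. The helper name `is_ascii_upper` is my own, it does not appear in the challenge, and the code assumes an ASCII character set.

```c
#include <stdbool.h>

/* Hypothetical helper mirroring `lea esi,[rdx-0x41]` / `cmp sil,0x19` / `ja`. */
bool is_ascii_upper(unsigned char c)
{
    /* Subtracting 'A' (0x41) maps 'A'..'Z' onto 0..25. Bytes below 'A'
       wrap around to large unsigned values, so a single unsigned
       comparison checks both ends of the range. */
    unsigned char shifted = (unsigned char)(c - 0x41);
    return shifted <= 0x19;
}
```

For instance, `'@'` (0x40) becomes 0xFF after the subtraction and `'['` (0x5B) becomes 0x1A, so both fall outside the 0..0x19 range and are rejected.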

``````
add rcx,0x1
movzx edx,BYTE PTR [rcx]
test dl,dl
jne 10
``````

These advance to the next character in the string and continue the loop as long as that character is not `'\0'`.

The last instructions to analyze are the following:

``````
add edx,0x20
mov BYTE PTR [rcx],dl
``````

Remember that we only reach this point when the previous check did not jump, so the character must be an uppercase letter. Adding 0x20 to an uppercase letter gives its corresponding lowercase letter, and `mov BYTE PTR [rcx],dl` writes the result back into the string.

For example:

``````
'A' has ASCII code 65 (0x41); adding 0x20 gives 97 (0x61), which is indeed the ASCII code for 'a'
``````
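The same arithmetic as a small C sketch. The helper name `upper_to_lower` is hypothetical, and it assumes the caller has already verified that the character is in `'A'`..`'Z'`.

```c
/* Hypothetical helper: only valid for c in 'A'..'Z' (ASCII). */
char upper_to_lower(char c)
{
    /* Each ASCII lowercase letter sits exactly 0x20 above its uppercase
       counterpart. Bit 5 (0x20) is clear in every uppercase letter, so
       adding 0x20 and OR-ing with 0x20 give the same result here. */
    return (char)(c + 0x20);
}
```

So `upper_to_lower('A')` gives `'a'` and `upper_to_lower('Z')` gives `'z'`.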

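With this we already know what the code does: it lowercases the provided string in place. The code in C would be something like this:

``````
void f(char *str)
{
    if (*str == '\0')
        return;

    while (*str != '\0') {
        // Unsigned comparison, like `ja`: skip anything outside 'A'..'Z'.
        // Without the cast, characters below 'A' would give a negative
        // difference and be treated as uppercase letters by mistake.
        if ((unsigned char)(*str - 0x41) > 0x19) {
            str++;
            continue;
        }

        // Lowercase the character, which is an uppercase Latin letter here.
        *str += 0x20;
        str++;
    }
}
``````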

Which can be expressed more compactly as:

``````
void f(char *str)
{
    for ( ; *str != '\0'; str++) {
        // The cast makes the comparison unsigned, matching `ja`.
        if ((unsigned char)(*str - 0x41) <= 0x19) {
            *str += 0x20;
        }
    }
}
``````
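One detail in the listing deserves a note: `mov rax,rdi` saves the original pointer in `rax` and nothing overwrites it afterwards. Under the System V x86-64 calling convention `rax` carries the return value, so the function may actually return its argument. The sketch below is one possible reading under that assumption. The name `lower_ascii` is mine, and the cast to `unsigned char` stands in for the unsigned `ja` comparison.

```c
/* A sketch only: `lower_ascii` is a hypothetical name, and returning the
   pointer is an assumption based on `mov rax,rdi` in the listing. */
char *lower_ascii(char *str)
{
    for (char *p = str; *p != '\0'; p++) {
        /* Unsigned range check: true only for 'A'..'Z'. */
        if ((unsigned char)(*p - 0x41) <= 0x19) {
            *p += 0x20;
        }
    }
    return str; /* the original pointer, as kept in rax */
}
```

Calling it on a writable buffer holding `"Hello, World 123!"` leaves `"hello, world 123!"` in the buffer and returns the same pointer. Digits, spaces and punctuation stay untouched.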

## Conclusion

That's it!! The code performs a basic ASCII lowercasing of a string, in place.