Category: Forensics
Difficulty: Easy
Flag: picoCTF{r3st0r1ng_th3_by73s_939a65f5}
My First Impression
So I came across this challenge called "Corrupted File" and honestly the description was pretty straightforward — "This file seems broken... or is it? Maybe a couple of bytes could make all the difference."
The hint mentioned JPEG and tools like xxd or hexdump. That was enough for me to get started. I downloaded the file and immediately noticed it had no extension, which is pretty common in CTF challenges. They want you to figure out what the file actually is.
Step 1 — Downloading the File
First things first, I grabbed the file using wget:
wget https://challenge-files.picoctf.net/c_amiable_citadel/8646393bf40c0026e51065e57963b604edf0a9a73371e01d1af2865c050d3e68/file -O corrupted_file
The download went fine. Got an 8.56KB file saved as corrupted_file. Nothing special yet.
Step 2 — Peeking at the Header
This is where things got interesting. The hint literally said to check the file header, so I used xxd to look at the first 16 bytes:
xxd -l 16 corrupted_file
Output:
00000000: 5c78 ffe0 0010 4a46 4946 0001 0100 0001 \x....JFIF......
Okay so I immediately spotted something wrong. Let me break this down simply:
- Every JPEG file in existence starts with the magic bytes FF D8
- But this file started with 5C 78, which translates to \x in ASCII
- Everything after those first two bytes looked perfectly normal; you can even see JFIF sitting right there, which is a dead giveaway that this is supposed to be a JPEG
So basically someone (or something) replaced the first two bytes of the file with garbage. That's the entire corruption — just two bytes out of place.
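If you'd rather do the same header check in Python, here is a minimal sketch. The function name check_jpeg_header is just an illustrative choice, and the sample bytes are the 16 header bytes from the xxd output above.

```python
JPEG_SOI = b"\xff\xd8"  # SOI (Start of Image) marker that opens every JPEG


def check_jpeg_header(data, length=16):
    """Return (hex dump of the first `length` bytes, True if they start with SOI)."""
    head = data[:length]
    # bytes.hex(" ") puts a space between bytes (Python 3.8+)
    return head.hex(" "), head.startswith(JPEG_SOI)


# The corrupted header exactly as xxd printed it
corrupted_header = bytes.fromhex("5c78 ffe0 0010 4a46 4946 0001 0100 0001")
hex_dump, looks_like_jpeg = check_jpeg_header(corrupted_header)
print(hex_dump, looks_like_jpeg)
```

For a real file you would pass in the bytes read with open('corrupted_file', 'rb').read(16) instead of the hard-coded sample.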
A Quick Explanation of Magic Bytes
If you're new to this concept, here's a simple way to think about it:
Most binary file formats have a kind of "secret handshake" stored at the very beginning of the file. Tools like file, and many applications, read these first few bytes to figure out what type of file they're dealing with, regardless of the file extension.
For JPEG files specifically, the first two bytes are always FF D8. This is called the SOI (Start of Image) marker. If those bytes are wrong, your image viewer will just say "I don't know what this is" and refuse to open it.
Step 3 — Fixing the Corruption
Now that I knew exactly what needed fixing, the repair was simple. I used dd combined with printf to overwrite just those two broken bytes:
printf '\xff\xd8' | dd of=corrupted_file bs=1 seek=0 count=2 conv=notrunc
Let me explain what each part does:
- printf '\xff\xd8': creates the correct JPEG magic bytes
- dd of=corrupted_file: writes to our file
- bs=1: work one byte at a time
- seek=0: start at the very beginning of the file
- count=2: copy only 2 blocks, which is 2 bytes at bs=1
- conv=notrunc: this is important. It tells dd not to truncate the output file, so the rest of the file stays and only those specific bytes get overwritten
Output confirmed:
2+0 records in
2+0 records out
2 bytes copied, 5.1539e-05 s, 38.8 kB/s
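The same in-place patch can be sketched in Python, in case dd or a printf that understands \x escapes isn't available. This is not what I ran for the challenge, and patch_bytes is a hypothetical helper name. Opening the file in r+b mode plays the role of conv=notrunc here, because it allows writing without truncating.

```python
def patch_bytes(path, offset, new_bytes):
    """Overwrite len(new_bytes) bytes at `offset`, leaving the rest of the file intact."""
    # "r+b" opens an existing file for reading and writing WITHOUT truncating it
    with open(path, "r+b") as f:
        f.seek(offset)      # same idea as dd's seek=0 when offset is 0
        f.write(new_bytes)  # only these bytes are replaced
```

Calling patch_bytes('corrupted_file', 0, b'\xff\xd8') should have the same effect as the dd command above. It is worth working on a copy of the file, since the edit happens in place.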
Step 4 — Verifying the Fix
Before doing anything else I wanted to make sure the fix actually worked. So I ran:
file corrupted_file && xxd -l 16 corrupted_file
Output:
corrupted_file: JPEG image data, JFIF standard 1.01, aspect ratio,
density 1x1, segment length 16, baseline, precision 8, 800x500, components 3
00000000: ffd8 ffe0 0010 4a46 4946 0001 0100 0001 ......JFIF......
The first two bytes are now FF D8 — exactly what they should be. The file command is now correctly identifying it as a JPEG image with dimensions 800x500. The repair worked perfectly.
Step 5 — Looking for Hidden Stuff
At this point the file looked fine, but in CTF challenges you never just assume. I did a few extra checks to make sure there wasn't anything hidden inside:
Checking for readable strings:
strings corrupted_file | grep -E "flag|picoCTF|CTF"
No output. The flag wasn't hidden as plain text.
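A rough Python equivalent of that strings and grep check, assuming the flag would follow the usual picoCTF{...} format. The helper name find_flags is made up for this sketch.

```python
import re

# Assumed flag format: picoCTF{...} with at least one character inside the braces
FLAG_PATTERN = re.compile(rb"picoCTF\{[^}]+\}")


def find_flags(data):
    """Return every picoCTF{...} candidate found in the raw bytes, as text."""
    return [m.decode("ascii", errors="replace") for m in FLAG_PATTERN.findall(data)]
```

On this file, find_flags(open('corrupted_file', 'rb').read()) would be expected to return an empty list, matching the empty grep output.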
Checking the end of the file:
tail -c 16 corrupted_file | xxd
Output:
00000000: 8a00 28a2 8a00 28a2 8a00 28a2 8a00 ffd9 ..(...(...(.....
The file ends with FF D9, which is the JPEG EOI (End of Image) marker. Clean ending, nothing appended after it.
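Here is a small sketch of the same end-of-file check in Python. It is only a heuristic: it looks for the last FF D9 in the data and returns whatever follows, so appended data that itself contains FF D9 would be under-reported. The function name trailing_after_eoi is hypothetical.

```python
JPEG_EOI = b"\xff\xd9"  # EOI (End of Image) marker


def trailing_after_eoi(data):
    """Return the bytes after the last EOI marker (empty if the file ends cleanly)."""
    end = data.rfind(JPEG_EOI)
    if end == -1:
        raise ValueError("no EOI marker found; probably not a complete JPEG")
    return data[end + len(JPEG_EOI):]
```

An empty result means nothing sits after the final marker, which is what the tail output above shows for this file.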
Checking for embedded files:
binwalk corrupted_file
Output:
DECIMAL HEXADECIMAL DESCRIPTION
--------------------------------------------------------------------------------
0 0x0 JPEG image data, JFIF standard 1.01
Just a single JPEG. No hidden zips, no embedded files, nothing sneaky.
Checking metadata:
exiftool corrupted_file
This showed normal image metadata — dimensions, color space, encoding. Nothing hidden there either.
Step 6 — Getting the Flag
With the file fully repaired and verified, I opened it as an image using Python and OCR:
cp corrupted_file flag.jpg && python3 -c "
from PIL import Image
import pytesseract
img = Image.open('flag.jpg')
text = pytesseract.image_to_string(img)
print(text)
"
And there it was, printed right on the image:
picoCTF{r3st0r1ng_th3_by73s_939a65f5}
What I Learned From This
This challenge was a great introduction to file forensics and the concept of magic bytes. Here are the main takeaways:
1. Never trust the file extension
Always check the actual bytes of a file. Extensions can be changed or missing entirely. The content of the file tells the real story.
2. xxd is your best friend
Learning to read hex dumps is a genuinely useful skill. Once you know the magic bytes of common formats (JPEG, PNG, PDF, ZIP etc.), you can diagnose corruption very quickly.
3. Surgical edits with dd
You don't need a fancy hex editor to fix a file. The dd command lets you overwrite specific bytes at specific offsets without touching anything else. Very powerful tool.
4. Always do extra checks
Even when a fix seems obvious, always verify it worked and check for anything else that might be hiding in the file. binwalk, strings, and exiftool are quick to run and can save you a lot of time.
Quick Reference — Common Magic Bytes
Since this challenge is all about file signatures, here's a handy reference:
| File Type | Magic Bytes (Hex) |
|---|---|
| JPEG | FF D8 FF |
| PNG | 89 50 4E 47 |
| PDF | 25 50 44 46 |
| ZIP | 50 4B 03 04 |
| GIF | 47 49 46 38 |
| ELF (Linux binary) | 7F 45 4C 46 |
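The table can be turned into a tiny lookup, sketched below. It covers only the six signatures listed here, and identify is an illustrative name, so treat it as a toy version of what the file command does rather than a replacement for it.

```python
# Signatures copied from the table above
SIGNATURES = {
    "JPEG": bytes.fromhex("FF D8 FF"),
    "PNG": bytes.fromhex("89 50 4E 47"),
    "PDF": bytes.fromhex("25 50 44 46"),
    "ZIP": bytes.fromhex("50 4B 03 04"),
    "GIF": bytes.fromhex("47 49 46 38"),
    "ELF": bytes.fromhex("7F 45 4C 46"),
}


def identify(data):
    """Return the name of the first signature that matches the start of `data`."""
    for name, magic in SIGNATURES.items():
        if data.startswith(magic):
            return name
    return "unknown"
```

Feeding it the original corrupted header (5C 78 FF E0 ...) gives "unknown", while the repaired header (FF D8 FF E0 ...) gives "JPEG".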
Tools Used
- wget: downloading the file
- xxd: hex inspection
- dd + printf: binary patching
- file: file type identification
- strings: plaintext extraction
- binwalk: embedded file detection
- exiftool: metadata analysis
- pytesseract: OCR to read the flag from the image
This was a fun and clean challenge. Two broken bytes, one quick fix, flag retrieved. If you're just getting into forensics CTF challenges, this is honestly a perfect starting point.