Vedant Kulkarni

Posted on Jun 6

CTF Writeup: Corrupted File — picoCTF

#cybersecurity #beginners #productivity #tutorial

Category: Forensics
Difficulty: Easy
Flag: picoCTF{r3st0r1ng_th3_by73s_939a65f5}

My First Impression

So I came across this challenge called "Corrupted File" and honestly the description was pretty straightforward — "This file seems broken... or is it? Maybe a couple of bytes could make all the difference."

The hint mentioned JPEG and tools like xxd or hexdump. That was enough for me to get started. I downloaded the file and immediately noticed it had no extension, which is pretty common in CTF challenges. They want you to figure out what the file actually is.

Step 1 — Downloading the File

First things first, I grabbed the file using wget:

wget https://challenge-files.picoctf.net/c_amiable_citadel/8646393bf40c0026e51065e57963b604edf0a9a73371e01d1af2865c050d3e68/file -O corrupted_file

The download went fine: an 8.56 KB file saved as corrupted_file. Nothing special yet.

Step 2 — Peeking at the Header

This is where things got interesting. The hint literally said to check the file header, so I used xxd to look at the first 16 bytes:

xxd -l 16 corrupted_file

Output:

00000000: 5c78 ffe0 0010 4a46 4946 0001 0100 0001  \x....JFIF......

Okay, so I immediately spotted something wrong. Let me break this down simply:

Every JPEG file starts with the magic bytes FF D8.
This file instead starts with 5C 78, which is the ASCII text \x.
Everything after those first two bytes looks perfectly normal. You can even see JFIF sitting right there, which is a dead giveaway that this is supposed to be a JPEG.

So basically someone (or something) replaced the first two bytes of the file with the literal characters \x. That's the entire corruption: just two wrong bytes.
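The same header check can be sketched in Python. This is only an illustration and was not part of my original solve; the function name looks_like_jpeg is made up for this example, and the sample bytes are copied from the xxd output above.

```python
JPEG_SOI = b"\xff\xd8"  # JPEG Start of Image marker

def looks_like_jpeg(header: bytes) -> bool:
    """Return True if the given bytes begin with the JPEG SOI marker."""
    return header[:2] == JPEG_SOI

# The first 16 bytes of the challenge file, as shown by xxd
corrupted = bytes.fromhex("5c78ffe000104a464946000101000001")

print(looks_like_jpeg(corrupted))  # False
print(corrupted[6:10])             # b'JFIF'
```

The JFIF identifier sitting at offset 6 while the first two bytes fail the check is what points at a damaged header rather than a different file type.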

A Quick Explanation of Magic Bytes

If you're new to this concept, here's a simple way to think about it:

Most file formats have a kind of "secret handshake" stored at the very beginning of the file. Tools like file, and many applications, read these first few bytes to figure out what type of file they're dealing with, regardless of the file extension.

For JPEG files specifically, the first two bytes are always FF D8. This is called the SOI (Start of Image) marker. If those bytes are wrong, your image viewer will just say "I don't know what this is" and refuse to open it.

Step 3 — Fixing the Corruption

Now that I knew exactly what needed fixing, the repair was simple. I used dd combined with printf to overwrite just those two broken bytes:

printf '\xff\xd8' | dd of=corrupted_file bs=1 seek=0 count=2 conv=notrunc

Let me explain what each part does:

printf '\xff\xd8' — creates the correct JPEG magic bytes
dd of=corrupted_file — writes to our file
bs=1 — work one byte at a time
seek=0 — start at the very beginning of the file
count=2 — only write 2 bytes
conv=notrunc — this one is important: it tells dd not to truncate the output file. Without it, dd would cut the file down to just the two bytes written.
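If dd isn't available, the same in-place patch can be sketched in Python. This is an alternative to the command I actually ran, not what I used; patch_bytes is a hypothetical helper name, and the demo works on a throwaway temp file rather than the real challenge file.

```python
import os
import tempfile

def patch_bytes(path: str, offset: int, data: bytes) -> None:
    """Overwrite len(data) bytes at offset, leaving the rest of the file intact."""
    # "r+b" opens for reading and writing without truncating (like conv=notrunc)
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(data)

# Demo on a temporary file that mimics the corrupted header
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes.fromhex("5c78ffe000104a464946"))
    demo_path = tmp.name

patch_bytes(demo_path, 0, b"\xff\xd8")

with open(demo_path, "rb") as f:
    patched = f.read()
os.remove(demo_path)

print(patched.hex())  # ffd8ffe000104a464946
```

The key detail is the "r+b" mode. Opening with "wb" would empty the file first, which is the same mistake as leaving out conv=notrunc.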

Output confirmed:

2+0 records in
2+0 records out
2 bytes copied, 5.1539e-05 s, 38.8 kB/s

Step 4 — Verifying the Fix

Before doing anything else I wanted to make sure the fix actually worked. So I ran:

file corrupted_file && xxd -l 16 corrupted_file

Output:

corrupted_file: JPEG image data, JFIF standard 1.01, aspect ratio, density 1x1, segment length 16, baseline, precision 8, 800x500, components 3

00000000: ffd8 ffe0 0010 4a46 4946 0001 0100 0001  ......JFIF......

The first two bytes are now FF D8 — exactly what they should be. The file command is now correctly identifying it as a JPEG image with dimensions 800x500. The repair worked perfectly.

Step 5 — Looking for Hidden Stuff

At this point the file looked fine, but in CTF challenges you never just assume. I did a few extra checks to make sure there wasn't anything hidden inside:

Checking for readable strings:

strings corrupted_file | grep -E "flag|picoCTF|CTF"

No output. The flag wasn't hidden as plain text.
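For reference, here is roughly what that strings plus grep pipeline does, sketched in Python. The helper name find_printable_strings and the sample data (including the fake flag) are made up for illustration, and this only approximates the behaviour of strings.

```python
import re
from typing import List

def find_printable_strings(data: bytes, min_len: int = 4) -> List[bytes]:
    """Return runs of printable ASCII at least min_len bytes long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return re.findall(pattern, data)

# Made-up sample: binary noise with two readable runs in it
sample = b"\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01picoCTF{example_only}\x00\x8a"

runs = find_printable_strings(sample)
hits = [s for s in runs if re.search(rb"flag|CTF", s, re.IGNORECASE)]

print(runs)  # [b'JFIF', b'picoCTF{example_only}']
print(hits)  # [b'picoCTF{example_only}']
```

On the real challenge file the filtered list would come back empty, matching the empty grep output above.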

Checking the end of the file:

tail -c 16 corrupted_file | xxd

Output:

00000000: 8a00 28a2 8a00 28a2 8a00 28a2 8a00 ffd9  ..(...(...(.....

The file ends with FF D9, which is the JPEG EOI (End of Image) marker. Clean ending, nothing appended after it.
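The trailing-data check can also be sketched in Python. This is a simplified illustration: trailing_bytes_after_eoi is a hypothetical helper, the clean sample is the 16-byte tail shown above, and the appended ZIP signature is invented to show what a dirty ending would look like. It looks for the last FF D9 in the data, so it would be fooled if appended data happened to contain those two bytes itself.

```python
JPEG_EOI = b"\xff\xd9"  # JPEG End of Image marker

def trailing_bytes_after_eoi(data: bytes) -> bytes:
    """Return whatever follows the last EOI marker (empty if the ending is clean)."""
    end = data.rfind(JPEG_EOI)
    if end == -1:
        raise ValueError("no EOI marker found")
    return data[end + len(JPEG_EOI):]

# Last 16 bytes of the repaired file, as shown by xxd
clean_tail = bytes.fromhex("8a0028a28a0028a28a0028a28a00ffd9")

print(trailing_bytes_after_eoi(clean_tail))                  # b''
print(trailing_bytes_after_eoi(clean_tail + b"PK\x03\x04"))  # b'PK\x03\x04'
```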

Checking for embedded files:

binwalk corrupted_file

Output:

DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
0             0x0             JPEG image data, JFIF standard 1.01

Just a single JPEG. No hidden zips, no embedded files, nothing sneaky.

Checking metadata:

exiftool corrupted_file

This showed normal image metadata — dimensions, color space, encoding. Nothing hidden there either.

Step 6 — Getting the Flag

With the file fully repaired and verified, I read the text off the image using Python and OCR (this needs the Pillow and pytesseract packages, plus the Tesseract binary itself):

cp corrupted_file flag.jpg && python3 -c "
from PIL import Image
import pytesseract
img = Image.open('flag.jpg')
text = pytesseract.image_to_string(img)
print(text)
"

And there it was, printed right on the image:

picoCTF{r3st0r1ng_th3_by73s_939a65f5}

What I Learned From This

This challenge was a great introduction to file forensics and the concept of magic bytes. Here are the main takeaways:

1. Never trust the file extension
Always check the actual bytes of a file. Extensions can be changed or missing entirely. The content of the file tells the real story.

2. xxd is your best friend
Learning to read hex dumps is a genuinely useful skill. Once you know the magic bytes of common formats (JPEG, PNG, PDF, ZIP etc.), you can diagnose corruption very quickly.

3. Surgical edits with dd
You don't need a fancy hex editor to fix a file. With conv=notrunc, the dd command lets you overwrite specific bytes at specific offsets without touching anything else.

4. Always do extra checks
Even when a fix seems obvious, always verify it worked and check for anything else that might be hiding in the file. binwalk, strings, and exiftool are quick to run and can save you a lot of time.

Quick Reference — Common Magic Bytes

Since this challenge is all about file signatures, here's a handy reference:

File Type	Magic Bytes (Hex)
JPEG	`FF D8 FF`
PNG	`89 50 4E 47`
PDF	`25 50 44 46`
ZIP	`50 4B 03 04`
GIF	`47 49 46 38`
ELF (Linux binary)	`7F 45 4C 46`
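As a small sketch, the table above can be turned into a lookup in Python. The SIGNATURES dict and the identify function are names I made up for this example, and a real tool like file knows far more formats and checks more than a simple prefix.

```python
from typing import Optional

# Signatures copied from the reference table above
SIGNATURES = {
    "JPEG": bytes.fromhex("FF D8 FF"),
    "PNG": bytes.fromhex("89 50 4E 47"),
    "PDF": bytes.fromhex("25 50 44 46"),
    "ZIP": bytes.fromhex("50 4B 03 04"),
    "GIF": bytes.fromhex("47 49 46 38"),
    "ELF": bytes.fromhex("7F 45 4C 46"),
}

def identify(header: bytes) -> Optional[str]:
    """Return the first file type whose signature matches the header, else None."""
    for name, magic in SIGNATURES.items():
        if header.startswith(magic):
            return name
    return None

print(identify(bytes.fromhex("ffd8ffe000104a46")))  # JPEG
print(identify(bytes.fromhex("5c78ffe000104a46")))  # None
```

The second call uses the corrupted header from this challenge, which is why it matches nothing.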

Tools Used

wget — downloading the file
xxd — hex inspection
dd + printf — binary patching
file — file type identification
strings — plaintext extraction
binwalk — embedded file detection
exiftool — metadata analysis
pytesseract — OCR to read the flag from the image

This was a fun and clean challenge. Two broken bytes, one quick fix, flag retrieved. If you're just getting into forensics CTF challenges, this is honestly a perfect starting point.
