Vedant Kulkarni

CTF Writeup: Corrupted File — picoCTF

Category: Forensics
Difficulty: Easy
Flag: picoCTF{r3st0r1ng_th3_by73s_939a65f5}


My First Impression

So I came across this challenge called "Corrupted File" and honestly the description was pretty straightforward — "This file seems broken... or is it? Maybe a couple of bytes could make all the difference."

The hint mentioned JPEG and tools like xxd or hexdump. That was enough for me to get started. I downloaded the file and immediately noticed it had no extension, which is pretty common in CTF challenges. They want you to figure out what the file actually is.


Step 1 — Downloading the File

First things first, I grabbed the file using wget:

wget https://challenge-files.picoctf.net/c_amiable_citadel/8646393bf40c0026e51065e57963b604edf0a9a73371e01d1af2865c050d3e68/file -O corrupted_file

The download went fine. Got an 8.56KB file saved as corrupted_file. Nothing special yet.


Step 2 — Peeking at the Header

This is where things got interesting. The hint literally said to check the file header, so I used xxd to look at the first 16 bytes:

xxd -l 16 corrupted_file

Output:

00000000: 5c78 ffe0 0010 4a46 4946 0001 0100 0001  \x....JFIF......

Okay so I immediately spotted something wrong. Let me break this down simply:

  • Every JPEG file in existence starts with the magic bytes FF D8
  • But this file started with 5C 78 which translates to \x in ASCII
  • Everything after those first two bytes looked perfectly normal — you can even see JFIF sitting right there which is a dead giveaway that this is supposed to be a JPEG

So basically someone (or something) replaced the first two bytes of the file with garbage. That's the entire corruption — just two bytes out of place.
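To make the diagnosis concrete, here is a minimal Python sketch that decodes the 16 bytes from the hex dump above using the JFIF APP0 layout. The helper name `describe_header` and the dictionary keys are made up for illustration; only the byte values come from the xxd output.

```python
import struct

# The 16 bytes shown in the xxd output above.
HEADER = bytes.fromhex("5c78ffe000104a464946000101000001")

def describe_header(data: bytes) -> dict:
    """Decode the first 16 bytes of a JFIF-style JPEG (illustrative helper)."""
    # Big-endian: 2-byte SOI, 2-byte APP0 marker, 16-bit segment length.
    soi, app0, length = struct.unpack(">2s2sH", data[:6])
    return {
        "soi_ok": soi == b"\xff\xd8",        # Start of Image marker
        "app0_ok": app0 == b"\xff\xe0",      # APP0 marker follows SOI in JFIF files
        "segment_length": length,
        "identifier": data[6:11],            # expected to be b"JFIF\x00"
        "version": f"{data[11]}.{data[12]:02d}",
        "density_units": data[13],           # 0 means the densities are an aspect ratio
        "x_density": int.from_bytes(data[14:16], "big"),
    }

info = describe_header(HEADER)
for name, value in info.items():
    print(f"{name}: {value}")
```

Only `soi_ok` comes back false; every other field decodes to a sensible JFIF value, which supports the reading that just the first two bytes were damaged.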


A Quick Explanation of Magic Bytes

If you're new to this concept, here's a simple way to think about it:

Most file formats have a kind of "secret handshake" stored at the very beginning of the file. Tools like file, and many applications, read these first few bytes to figure out what type of file they're dealing with, regardless of what the file extension says.

For JPEG files specifically, the first two bytes are always FF D8. This is called the SOI (Start of Image) marker. If those bytes are wrong, your image viewer will just say "I don't know what this is" and refuse to open it.


Step 3 — Fixing the Corruption

Now that I knew exactly what needed fixing, the repair was simple. I used dd combined with printf to overwrite just those two broken bytes:

printf '\xff\xd8' | dd of=corrupted_file bs=1 seek=0 count=2 conv=notrunc

Let me explain what each part does:

  • printf '\xff\xd8' — creates the correct JPEG magic bytes
  • dd of=corrupted_file — writes to our file
  • bs=1 — work one byte at a time
  • seek=0 — start at the very beginning of the file
  • count=2 — only write 2 bytes
  • conv=notrunc — this is important: it tells dd not to truncate the output file, so only those two bytes are overwritten and the rest of the file stays intact

Output confirmed:

2+0 records in
2+0 records out
2 bytes copied, 5.1539e-05 s, 38.8 kB/s
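
For readers without dd at hand, the same two-byte patch can be sketched in Python. This is an illustrative equivalent, not the command used in this writeup; `patch_soi` is a hypothetical helper name, and the demo runs on a temporary file seeded with the header bytes from Step 2 rather than on the real challenge file.

```python
import os
import tempfile

def patch_soi(path: str) -> None:
    """Overwrite the first two bytes of `path` with the JPEG SOI marker.

    Mode "r+b" edits the file in place without truncating it, which is
    the role conv=notrunc plays for dd.
    """
    with open(path, "r+b") as f:
        f.seek(0)
        f.write(b"\xff\xd8")

# Demo on a throwaway file seeded with the corrupted header from Step 2.
corrupted = bytes.fromhex("5c78ffe000104a464946000101000001")
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(corrupted)
    demo_path = tmp.name

patch_soi(demo_path)
with open(demo_path, "rb") as f:
    patched = f.read()
os.remove(demo_path)

print(patched[:2].hex(), len(patched))
```

Opening the file with "wb" instead would wipe it before writing, which is the Python counterpart of forgetting conv=notrunc.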

Step 4 — Verifying the Fix

Before doing anything else I wanted to make sure the fix actually worked. So I ran:

file corrupted_file && xxd -l 16 corrupted_file

Output:

corrupted_file: JPEG image data, JFIF standard 1.01, aspect ratio, density 1x1, segment length 16, baseline, precision 8, 800x500, components 3

00000000: ffd8 ffe0 0010 4a46 4946 0001 0100 0001  ......JFIF......

The first two bytes are now FF D8 — exactly what they should be. The file command is now correctly identifying it as a JPEG image with dimensions 800x500. The repair worked perfectly.
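
For the curious, the 800x500 that file reports lives in a Start of Frame (SOF) segment inside the JPEG. Below is a rough sketch of how a tool could locate it. It is a simplified segment walker and not how file is actually implemented; `jpeg_dimensions` is a made-up name, and `SAMPLE` is a synthetic header stub built for this example, not the challenge image.

```python
from typing import Optional, Tuple

# Start of Frame markers; C4, C8 and CC are excluded because they are not frame headers.
SOF_MARKERS = {0xC0, 0xC1, 0xC2, 0xC3, 0xC5, 0xC6, 0xC7,
               0xC9, 0xCA, 0xCB, 0xCD, 0xCE, 0xCF}

def jpeg_dimensions(data: bytes) -> Optional[Tuple[int, int]]:
    """Return (width, height) from the first SOF segment, or None if not found."""
    if data[:2] != b"\xff\xd8":              # a JPEG must begin with SOI
        return None
    pos = 2
    while pos + 9 <= len(data):
        if data[pos] != 0xFF:                # not at a marker, give up
            return None
        marker = data[pos + 1]
        length = int.from_bytes(data[pos + 2:pos + 4], "big")
        if marker in SOF_MARKERS:
            # Segment layout: length(2) precision(1) height(2) width(2) ...
            height = int.from_bytes(data[pos + 5:pos + 7], "big")
            width = int.from_bytes(data[pos + 7:pos + 9], "big")
            return width, height
        pos += 2 + length                    # jump over marker and payload
    return None

# Synthetic stub: SOI, a 16-byte JFIF APP0 segment, then an SOF0 for 800x500.
SAMPLE = bytes.fromhex(
    "ffd8"
    "ffe000104a46494600010100000100010000"
    "ffc0001108" "01f4" "0320" "03" "012200021101031101"
)

print(jpeg_dimensions(SAMPLE))                 # (800, 500)
print(jpeg_dimensions(b"\\x" + SAMPLE[2:]))    # None, SOI is missing
```

The second call mimics the corrupted file: with 5C 78 in place of FF D8 the walker refuses to go any further.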


Step 5 — Looking for Hidden Stuff

At this point the file looked fine, but in CTF challenges you never just assume. I did a few extra checks to make sure there wasn't anything hidden inside:

Checking for readable strings:

strings corrupted_file | grep -E "flag|picoCTF|CTF"

No output. The flag wasn't hidden as plain text.
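
One caveat: the grep pattern above is case-sensitive. A rough Python approximation of the strings plus grep pipeline that ignores case is sketched below. It only treats bytes 0x20 to 0x7E as printable, so it will not match the real strings tool byte for byte; `find_strings` is a hypothetical helper, and the sample bytes (including `picoCTF{example}`) are invented for the demo.

```python
import re

def find_strings(data: bytes, keywords=(b"flag", b"picoctf", b"ctf"), min_len: int = 4):
    """Yield printable ASCII runs of at least min_len bytes containing a keyword.

    Keywords must be lowercase; each run is lowercased before comparison.
    """
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        run = match.group()
        if any(keyword in run.lower() for keyword in keywords):
            yield run.decode("ascii")

# Invented sample: binary noise with two printable runs, one of them interesting.
sample = b"\xff\xd8\x00junk\x01picoCTF{example}\x00\x02ab"
print(list(find_strings(sample)))
```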

Checking the end of the file:

tail -c 16 corrupted_file | xxd

Output:

00000000: 8a00 28a2 8a00 28a2 8a00 28a2 8a00 ffd9  ..(...(...(.....

File ends with FF D9 which is the JPEG EOI (End of Image) marker. Clean ending, nothing appended after it.
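
The same check can be written as a small Python sketch that reports whatever follows the last FF D9 in a file. The name `trailing_after_eoi` is made up for this example, and the demo input is the 16-byte tail from the xxd output above.

```python
def trailing_after_eoi(data: bytes) -> bytes:
    """Return the bytes that follow the last FF D9 marker (empty if none)."""
    end = data.rfind(b"\xff\xd9")
    if end == -1:
        raise ValueError("no EOI marker found")
    return data[end + 2:]

# The last 16 bytes of the file, copied from the xxd output above.
tail = bytes.fromhex("8a0028a28a0028a28a0028a28a00ffd9")

print(trailing_after_eoi(tail))                  # b'' means nothing was appended
print(trailing_after_eoi(tail + b"PK\x03\x04"))  # appended data would show up here
```

This is only a heuristic: if the appended data happens to contain FF D9 itself (a second JPEG, for instance), searching from the end will under-report, so it complements binwalk rather than replacing it.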

Checking for embedded files:

binwalk corrupted_file

Output:

DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
0             0x0             JPEG image data, JFIF standard 1.01

Just a single JPEG. No hidden zips, no embedded files, nothing sneaky.

Checking metadata:

exiftool corrupted_file

This showed normal image metadata — dimensions, color space, encoding. Nothing hidden there either.


Step 6 — Getting the Flag

With the file fully repaired and verified, I opened it as an image using Python and OCR:

cp corrupted_file flag.jpg && python3 -c "
from PIL import Image
import pytesseract
img = Image.open('flag.jpg')
text = pytesseract.image_to_string(img)
print(text)
"

And there it was, printed right on the image:

picoCTF{r3st0r1ng_th3_by73s_939a65f5}
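
OCR output often carries stray whitespace or extra lines around the text of interest, so a small regular expression can pull out just the flag. This is a sketch that assumes the flag follows the usual picoCTF{...} shape; `extract_flag` is a hypothetical helper, and the surrounding OCR text in the demo is invented around the flag shown above.

```python
import re
from typing import Optional

FLAG_RE = re.compile(r"picoCTF\{[^}\s]+\}")

def extract_flag(ocr_text: str) -> Optional[str]:
    """Return the first picoCTF{...} token in the OCR text, or None."""
    match = FLAG_RE.search(ocr_text)
    return match.group() if match else None

# Invented OCR output wrapped around the flag from this writeup.
noisy = "\n  Here is your flag:\n\n picoCTF{r3st0r1ng_th3_by73s_939a65f5} \n\x0c"
print(extract_flag(noisy))
```

OCR can confuse look-alike characters such as 0 and O or 1 and l, so it is worth comparing the extracted string against the image by eye before submitting.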

What I Learned From This

This challenge was a great introduction to file forensics and the concept of magic bytes. Here are the main takeaways:

1. Never trust the file extension
Always check the actual bytes of a file. Extensions can be changed or missing entirely. The content of the file tells the real story.

2. xxd is your best friend
Learning to read hex dumps is a genuinely useful skill. Once you know the magic bytes of common formats (JPEG, PNG, PDF, ZIP etc.), you can diagnose corruption very quickly.

3. Surgical edits with dd
You don't need a fancy hex editor to fix a file. The dd command lets you overwrite specific bytes at specific offsets without touching anything else, as long as you pass conv=notrunc. Very powerful tool.

4. Always do extra checks
Even when a fix seems obvious, always verify it worked and check for anything else that might be hiding in the file. binwalk, strings, and exiftool are quick to run and can save you a lot of time.


Quick Reference — Common Magic Bytes

Since this challenge is all about file signatures, here's a handy reference:

File Type            Magic Bytes (Hex)
JPEG                 FF D8 FF
PNG                  89 50 4E 47
PDF                  25 50 44 46
ZIP                  50 4B 03 04
GIF                  47 49 46 38
ELF (Linux binary)   7F 45 4C 46
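
These signatures are enough to drive a tiny scanner, loosely in the spirit of the binwalk check from Step 5. The sketch below is a naive byte search and not how binwalk works internally; `scan_signatures` is a made-up name, and the sample input is synthetic.

```python
# Signatures copied from the reference table above.
SIGNATURES = {
    "JPEG": bytes.fromhex("ffd8ff"),
    "PNG": bytes.fromhex("89504e47"),
    "PDF": bytes.fromhex("25504446"),
    "ZIP": bytes.fromhex("504b0304"),
    "GIF": bytes.fromhex("47494638"),
    "ELF": bytes.fromhex("7f454c46"),
}

def scan_signatures(data: bytes):
    """Return sorted (offset, name) pairs for every signature found in data."""
    hits = []
    for name, magic in SIGNATURES.items():
        start = data.find(magic)
        while start != -1:
            hits.append((start, name))
            start = data.find(magic, start + 1)
    return sorted(hits)

# Synthetic input: a JPEG-style header, some padding, then a ZIP signature.
sample = bytes.fromhex("ffd8ffe000104a464946") + b"\x00" * 6 + bytes.fromhex("504b0304")
for offset, name in scan_signatures(sample):
    print(f"{offset:#06x}  {name}")
```

A hit at offset 0 tells you what the file claims to be, while hits at later offsets hint at embedded data. Short signatures can also match by coincidence, so treat any hit as a lead to verify rather than proof.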

Tools Used

  • wget — downloading the file
  • xxd — hex inspection
  • dd + printf — binary patching
  • file — file type identification
  • strings — plaintext extraction
  • binwalk — embedded file detection
  • exiftool — metadata analysis
  • pytesseract — OCR to read the flag from the image

This was a fun and clean challenge. Two broken bytes, one quick fix, flag retrieved. If you're just getting into forensics CTF challenges, this is honestly a perfect starting point.
