When I first started picoCTF forensics challenges, I had a folder full of installed tools and no idea which one to open first. Every challenge felt like staring at a locked box with twenty keys on the table. The problem wasn't a lack of tools — it was not knowing the decision process behind picking the right one.
This page is what I wish had existed when I started. Not a list of tools with feature descriptions, but a map of when to reach for each one and — just as importantly — when to put it down and try something else.
Step Zero: Identify What You're Dealing With
Before touching any specialized tool, run these two commands on every unknown file:
$ file challenge.bin
challenge.bin: Zip archive data, at least v2.0 to extract
$ xxd challenge.bin | head -5
00000000: 504b 0304 1400 0000 0800 ... PK..........
file reads magic bytes and tells you the actual format regardless of the extension. xxd shows the raw hex so you can spot a corrupted header immediately. I've lost count of how many times a file named data.png turned out to be a ZIP or a disk image — the magic bytes 50 4B 03 04 (PK) are a dead giveaway for ZIP regardless of what the filename says.
If file says "data" or gives something unexpected, that's your first clue. See the Corrupted File writeup for a real example of this — the challenge handed me a PNG with a broken magic byte, and file was what made that obvious.
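The same magic-byte check can be scripted when you have a pile of files to triage. This is a minimal sketch, assuming a POSIX shell with head, od, and tr available; the function names magic_hex and guess_type are my own for illustration, and the signature list covers only a handful of common formats.

```shell
# Print the first 8 bytes of a file as lowercase hex with no separators.
magic_hex() {
    head -c 8 "$1" | od -An -tx1 | tr -d ' \n'
}

# Map a few well-known signatures to a format name (far from exhaustive).
guess_type() {
    case "$(magic_hex "$1")" in
        89504e470d0a1a0a) echo "PNG" ;;
        504b0304*)        echo "ZIP" ;;
        ffd8ff*)          echo "JPEG" ;;
        52494646*)        echo "RIFF (WAV/AVI/WebP)" ;;
        25504446*)        echo "PDF" ;;
        *)                echo "unknown" ;;
    esac
}
```

A PNG whose first byte has been tampered with falls through to "unknown", which is the scripted equivalent of file reporting plain "data".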
By File Type: Which Tool to Reach For
Disk Images (.img, .dd, raw)
Disk image challenges are a category where picking the wrong tool first wastes a lot of time. Here's the order I follow now:
- fdisk: read the partition table first. It tells you how many partitions exist and their offsets.
- dd: carve out individual partitions by byte offset for closer inspection.
- mount: only after you know the partition layout. Mounting blindly often fails; fdisk tells you the offset you need for the -o flag.
The rabbit hole I fell into early on: jumping straight to mount without checking the partition table. On a multi-partition image, mount without an offset looks for a filesystem at byte 0, which usually fails, and even once you get one partition mounted, the flag may be sitting in another.
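The whole fdisk, dd, mount chain hinges on one multiplication: start sector times sector size. Here is a rough sketch of that arithmetic. The 512-byte sector size, the start sector 2048, the sector count 204800, and the names disk.img and part1.img are all assumptions for illustration; read the real values from your own fdisk -l output.

```shell
# Assumed value: confirm the sector size in the `fdisk -l` output.
SECTOR_SIZE=512

# Byte offset of a partition that starts at the given sector.
part_offset() {
    echo $(( $1 * SECTOR_SIZE ))
}

# Carve COUNT sectors starting at sector START out of IMAGE into OUT.
# Usage: carve_partition IMAGE START COUNT OUT
carve_partition() {
    dd if="$1" of="$4" bs="$SECTOR_SIZE" skip="$2" count="$3" 2>/dev/null
}

# Typical use (mounting needs root, so it is left commented out):
#   fdisk -l disk.img
#   carve_partition disk.img 2048 204800 part1.img
#   sudo mount -o ro,loop,offset="$(part_offset 2048)" disk.img /mnt/part1
```

Setting bs to the sector size lets dd take skip and count directly in sectors, so the numbers from fdisk can be pasted in without conversion.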
Audio Files (.wav, .mp3, .flac)
Audio forensics challenges almost always hide data in one of three places: the spectrogram, the waveform LSBs, or metadata. Your first move should always be the spectrogram.
- Audacity — open the file and switch to spectrogram view immediately. If there's a visual message hidden in the frequency domain, you'll see it in seconds. This is the tool I open first for any audio challenge.
- SoX — when I need to script audio analysis or batch-process files. Also useful for speed/pitch manipulation when a challenge hints that audio has been distorted.
- FFmpeg — for video files or when a challenge mixes audio and video. Also my go-to when a file won't open in Audacity due to codec issues — FFmpeg can transcode it first.
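When I want the spectrogram without opening a GUI, the SoX and FFmpeg steps above can be wrapped in two small helpers. This is only a sketch: the function names spectrogram_png and to_wav are hypothetical, and it assumes a SoX build with PNG support plus an ffmpeg binary on the PATH.

```shell
# Render a spectrogram PNG from an audio file.
# Usage: spectrogram_png INPUT.wav OUTPUT.png
spectrogram_png() (
    [ -f "$1" ] || { echo "no such file: $1" >&2; exit 1; }
    command -v sox >/dev/null 2>&1 || { echo "sox not installed" >&2; exit 127; }
    # -n discards the audio output; the spectrogram effect writes the PNG.
    sox "$1" -n spectrogram -o "$2"
)

# Transcode to WAV first when a tool cannot read the original codec.
# Usage: to_wav INPUT OUTPUT.wav
to_wav() (
    [ -f "$1" ] || { echo "no such file: $1" >&2; exit 1; }
    command -v ffmpeg >/dev/null 2>&1 || { echo "ffmpeg not installed" >&2; exit 127; }
    # -y overwrites an existing output file; -loglevel error keeps the output quiet.
    ffmpeg -loglevel error -y -i "$1" "$2"
)
```

The static PNG is good for a quick look; Audacity is still the better choice once you need to zoom and scrub interactively.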
Image Files (.png, .jpg, .bmp)
Image steganography is one of the most common forensics categories. The approach depends on whether the file is structurally intact or corrupted.
- pngcheck — run this first on any PNG. It validates chunk integrity and will immediately flag if something is wrong with the file structure. A challenge with a "broken" PNG almost always has an intentionally modified chunk.
- steghide — for JPEG/BMP files that might have data embedded with a passphrase. If the challenge gives you a password hint, steghide is usually involved.
- binwalk — when the image looks clean but is suspiciously large. Scans for embedded files and compressed data appended after the image end.
One pattern I've noticed: if a PNG passes pngcheck cleanly but still feels suspicious, look at the IDAT chunk data and palette entries. Some challenges inject data there that doesn't break the structure.
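To make the "data appended after the image end" case concrete, here is a small sketch that walks the PNG chunk list and reports how many bytes follow IEND. The function name png_trailing_bytes is hypothetical, and the walker deliberately ignores CRCs (that is pngcheck's job); it only assumes the standard chunk layout of a 4-byte big-endian length, a 4-byte type, the data, and a 4-byte CRC.

```shell
# Count the bytes that follow the IEND chunk of a PNG.
# A clean PNG ends right after IEND's CRC, so the result should be 0.
png_trailing_bytes() (
    f=$1
    size=$(wc -c < "$f" | tr -d ' ')
    pos=8                                   # skip the 8-byte PNG signature
    while [ $((pos + 12)) -le "$size" ]; do
        # Read the 4-byte big-endian chunk length as four decimal bytes.
        set -- $(od -An -tu1 -j "$pos" -N 4 "$f")
        len=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
        # Read the 4-character chunk type that follows the length.
        type=$(od -An -c -j $((pos + 4)) -N 4 "$f" | tr -d ' \n')
        # Advance past length + type + data + CRC.
        pos=$(( pos + 12 + len ))
        if [ "$type" = "IEND" ]; then
            echo $(( size - pos ))
            exit 0
        fi
    done
    echo "no IEND chunk found" >&2
    exit 1
)
```

A nonzero result is a good reason to run binwalk, or to carve the tail out with dd and run file on it.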
Documents and Archives
- pdfdumper: PDFs are containers. pdfdumper extracts embedded objects, JavaScript, and hidden streams that you'd never see just opening the file normally.
- zip2john: for password-protected ZIPs. The key thing here is identifying the encryption type first: ZipCrypto hashes fall quickly to zip2john + hashcat/john, while AES-256 uses a deliberately slow key derivation, so a dictionary attack only pays off against a weak password and you usually need to find the actual one. I wrote about this distinction in detail in the zip2john article.
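Checking the encryption type does not need a cracking tool at all, because the ZIP local file header records it. The sketch below reads the general-purpose flags at offset 6 (bit 0 set means encrypted) and the compression method at offset 8 (99 marks WinZip AES). The function name zip_encryption is hypothetical, and it assumes the file starts with a local file header and inspects only the first entry.

```shell
# Classify the first entry of a ZIP by reading its local file header.
# Prints one of: none, ZipCrypto, AES
zip_encryption() (
    # Bytes 6-7: general-purpose flags; bytes 8-9: compression method.
    set -- $(od -An -tu1 -j 6 -N 4 "$1")
    [ $# -eq 4 ] || { echo "could not read header" >&2; exit 1; }
    flags=$(( $1 | ($2 << 8) ))      # little-endian 16-bit value
    method=$(( $3 | ($4 << 8) ))
    if [ $(( flags & 1 )) -eq 0 ]; then
        echo "none"
    elif [ "$method" -eq 99 ]; then
        echo "AES"
    else
        echo "ZipCrypto"
    fi
)
```

If this prints ZipCrypto, zip2john plus a wordlist is a reasonable next step; if it prints AES, I would look for the password elsewhere in the challenge first.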
QR Codes and Barcodes
- zbarimg — the fastest way to decode QR/barcodes from the command line. The Scan Surprise challenge in picoCTF is a straightforward example — see the writeup for how it plays out in practice.
My First-Pass Workflow
When I get a new forensics challenge, this is the sequence I actually follow:
# 1. What is this file?
file challenge.*
xxd challenge.* | head -30
# 2. Anything embedded?
binwalk challenge.*
strings challenge.* | grep -i flag
# Example output that changes my approach:
$ strings mystery.dat | grep -i flag
picoCTF{hidden_in_plain_sight_3a9f2}
# Done in 10 seconds. Sometimes it's that simple.
# 3. Branch based on file type
# → disk image: fdisk → dd → mount
# → audio: Audacity spectrogram → sox/ffmpeg
# → image: pngcheck → steghide
# → archive: zip2john (check encryption type first)
# → PDF: pdfdumper
# → QR/barcode: zbarimg
The strings pass in step 2 sounds too simple to mention, but I've found flags in plaintext embedded in binary files more than once. Never skip it before reaching for a specialized tool.
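A slightly stricter version of that pass matches the flag format itself instead of the word "flag". This is a sketch, assuming a grep that supports -a, -o, and -E (GNU and BSD grep both do); the function name find_flags is my own.

```shell
# Print every picoCTF{...} token found in a file.
# -a treats binary data as text, -o prints only the matching part,
# and LC_ALL=C avoids locale errors on bytes that are not valid UTF-8.
find_flags() {
    LC_ALL=C grep -aoE 'picoCTF\{[^}]+\}' "$1"
}
```

This only catches flags stored as contiguous single-byte text. A flag stored as UTF-16 will slip past it, which is where strings -e l (16-bit little-endian) earns its keep.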
Common Rabbit Holes in Forensics CTF
Things I've learned to check before going deep on a tool:
- Wrong file type assumption: the extension lies. Always check magic bytes with file and xxd.
- Multiple layers: extracting one file from a ZIP and stopping. There's often another layer inside.
- Mounting without reading partition offsets: mount fails or mounts the wrong partition when you skip fdisk.
- AES-256 ZIP + zip2john: an AES-encrypted ZIP is a poor target for brute force. If zip2john emits a $zip2$* hash, the archive uses WinZip AES with a slow key derivation, so unless the password is weak or hinted at, look for the actual password elsewhere in the challenge.
- Spectrogram at wrong scale: if Audacity's spectrogram looks like noise, zoom in on the 1–4 kHz frequency range. Flags are sometimes hidden in a narrow band that's invisible at default zoom.
Tool Reference Index
| Tool | File Type | When to Use |
|---|---|---|
| fdisk | Disk image | Read partition table before anything else |
| dd | Disk image | Carve partitions by byte offset |
| mount | Disk image | Browse filesystem after fdisk gives you the offset |
| Audacity | Audio | Spectrogram analysis — open first for any audio file |
| SoX | Audio | Scripted analysis, speed/pitch manipulation |
| FFmpeg | Audio/Video | Video files, codec issues, format conversion |
| pngcheck | PNG | Validate chunk integrity on any PNG challenge |
| steghide | JPEG/BMP | Extract passphrase-protected embedded data |
| binwalk | Any binary | Detect and extract embedded files; watch for false positives in PNG IDAT chunks |
| zbarimg | QR/Barcode | Fastest CLI decoder for QR and barcode images |
| pdfdumper | PDF | Extract hidden streams, embedded objects, JavaScript |
| zip2john | ZIP archive | Password hash extraction (practical mainly for ZipCrypto; check encryption type first) |
Further Reading
Here are related writeups from alsavaudomila.com that show these tools in action on real picoCTF problems:
If you want to see how disk forensics tools chain together in practice, the DISKO 1 writeup walks through a full fdisk → dd → mount workflow on an actual picoCTF challenge.
For a real example of corrupted file identification using magic bytes and pngcheck, the Corrupted File writeup covers exactly how a broken PNG header gets diagnosed and fixed.
The Scan Surprise writeup is the most straightforward QR code challenge in picoCTF — a good first read if you've never used zbarimg before.
For network forensics (Wireshark/tshark), which isn't covered above, the Ph4nt0m 1ntrud3r writeup goes deep on packet filtering and Base64 fragment extraction across TCP streams.