1. Executive Summary
| Field | Detail |
|---|---|
| Challenge Name | Autorev 1 |
| Platform | picoCTF |
| Category | Reverse Engineering |
| Difficulty | Beginner-Intermediate |
| Key Technique | Automated Binary Analysis via Opcode Pattern Matching |
| Flag | picoCTF{4u7o_r3v_g0_brrr_78c345aa} |
Overview: This challenge tests a fundamental reverse engineering skill: automating analysis at scale. The remote service sends 20 unique ELF binaries, one at a time, as hex-encoded byte streams. For each binary, you have 1 second to extract a hardcoded "secret" integer and send it back. That time limit rules out manual analysis, and it teaches a critical lesson: a good reverse engineer is not just someone who can read assembly, but someone who can script their way through it.
2. Reconnaissance & Enumeration
2.1 — Initial Connection: Understanding the Protocol
Before writing any exploit, the very first step is always to understand what you're interacting with. The challenge description gives us a simple netcat command to connect to the service.
nc mysterious-sea.picoctf.net 57369
Why netcat? nc (netcat) is the most basic tool for raw TCP communication. It's the right first choice here because the challenge description explicitly tells us to use it, and we have no reason to believe any higher-level protocol (like HTTP) is involved. We just need to see the raw data the server sends.
2.2 — Analyzing the Server's Handshake
Upon connecting, the server reveals its rules:
Welcome! I think I'm pretty good at reverse enginnering.
There's NO WAY anyone's better than me.
Wanna try? I have 20 binaries I'm going to send you and you
have 1 second EACH to get the secret in each one. Good luck >:)
This immediately tells us three critical things:
| Observation | Implication |
|---|---|
| 20 binaries | We need a loop, not a one-shot script. |
| 1 second EACH | Manual analysis (e.g., opening in Ghidra) is impossible. We must automate. |
| "get the secret" | Each binary has a single hardcoded value we need to find. |
Following the banner, the server dumps a massive hex string — this is the raw bytes of an ELF (Executable and Linkable Format) binary, the standard executable format for Linux. It then prompts:
What's the secret?:
2.3 — Static Analysis: Finding the Secret in the Hex
This is the most critical analytical step. Instead of trying to run the binary (which would be slow and complex to automate), we need to find the secret directly in the raw bytes. To do this, we need to understand the binary's logic.
The Concept: What Does the Binary Do?
Every one of these binaries is compiled from a simple C template that looks conceptually like this:
#include <stdio.h>

int main() {
    // The "secret" is a hardcoded 32-bit integer
    unsigned int secret = 0xb174cbbb; // <-- This value changes each time
    unsigned int guess = 0;

    printf("What's the secret?\n");
    scanf("%u", &guess);

    if (guess == secret) {
        printf("Correct!\n");
    } else {
        printf("Nice try :(\n");
    }
    return 0;
}
The Concept: How Does C Become Assembly?
When the C compiler (gcc) compiles unsigned int secret = 0xb174cbbb;, it translates it into an x86_64 assembly instruction that stores that value onto the function's stack frame. The specific instruction is:
mov DWORD PTR [rbp-0x4], 0xb174cbbb
Let's break this instruction down piece by piece, as this is the heart of the entire challenge:
| Assembly Component | Meaning |
|---|---|
| mov | "Move": copy a value into a destination. |
| DWORD PTR | The destination is a 4-byte (32-bit) memory location. |
| [rbp-0x4] | The destination address: 4 bytes below the base pointer (rbp), which is where local variable secret lives on the stack. |
| 0xb174cbbb | The immediate (hardcoded) value: our secret! |
The Concept: How Does Assembly Become Bytes (Opcodes)?
The CPU doesn't read text like mov DWORD PTR [rbp-0x4], .... It reads machine code: raw bytes made up of an opcode followed by the bytes that encode its operands. The x86_64 encoding for mov DWORD PTR [rbp-0x4], <imm32> is:
c7 45 fc [4 bytes of the value in Little-Endian]
Let's decode this:
| Byte(s) | Meaning |
|---|---|
| c7 | Opcode for MOV r/m32, imm32 (move a 32-bit immediate value into a memory location). |
| 45 | ModR/M byte: indicates the addressing mode is [rbp + disp8]. |
| fc | The signed 8-bit displacement: 0xfc is -4 in two's complement, giving us [rbp-0x4]. |
| Next 4 bytes | The 32-bit immediate value, stored in Little-Endian byte order. |
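To make the table concrete, here is a minimal sketch that splits the ModR/M byte into its bit fields and reads the displacement as a signed value. The helper names `decode_modrm` and `signed8` are made up for illustration; the field layout (2 bits mod, 3 bits reg, 3 bits r/m) is the standard x86 one.

```python
def decode_modrm(byte: int) -> tuple[int, int, int]:
    """Split a ModR/M byte into its (mod, reg, rm) bit fields."""
    mod = (byte >> 6) & 0b11    # 0b01 means an 8-bit displacement follows
    reg = (byte >> 3) & 0b111   # for opcode c7, this is the /0 opcode extension
    rm = byte & 0b111           # 0b101 selects rbp as the base register here
    return mod, reg, rm


def signed8(byte: int) -> int:
    """Interpret an unsigned byte (0..255) as a two's-complement int8."""
    return byte - 0x100 if byte & 0x80 else byte


mod, reg, rm = decode_modrm(0x45)
disp = signed8(0xFC)
print(mod, reg, rm, disp)  # 1 0 5 -4
```

With mod = 1 and rm = 5 the operand is [rbp + disp8], and the displacement of -4 gives [rbp-0x4], matching the table.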
What is Little-Endian? x86 processors store multi-byte integers with the least significant byte first. So the value 0xb174cbbb is stored in memory as bb cb 74 b1. This is a common "gotcha" in reverse engineering: if you read the bytes left-to-right, you get the wrong number. You must reverse the byte order.
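A short sketch of the difference, using only the Python standard library. It reads the same four bytes both ways so the effect of byte order is visible; the variable names are illustrative.

```python
import struct

raw = bytes.fromhex("bbcb74b1")            # bytes as they appear in the binary

le_value = int.from_bytes(raw, "little")   # correct on x86: least significant byte first
be_value = int.from_bytes(raw, "big")      # wrong here: reads the bytes left to right

# struct.unpack with "<I" is the equivalent little-endian unsigned 32-bit read.
assert struct.unpack("<I", raw)[0] == le_value

print(hex(le_value), le_value)  # 0xb174cbbb 2977221563
print(hex(be_value))            # 0xbbcb74b1
```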
Verification in the First Binary
Let's find this pattern in the first binary's hex stream. Searching for c745fc:
...c745fcbbcb74b1...
         ^^^^^^^^
         bb cb 74 b1 (Little-Endian bytes)
Converting bb cb 74 b1 from Little-Endian:
Reverse the bytes: b1 74 cb bb
Hex value: 0xb174cbbb
Decimal: 2977221563
The server helpfully displayed 2977221563 as the expected answer for the first binary — our analysis is confirmed!
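The same lookup can be sketched offline. One caveat with searching the hex text directly is that a match can start on an odd hex digit and straddle two bytes. The sketch below decodes to bytes first so the search is byte-aligned. It assumes the dump is one even-length hex string; `find_secret` and the sample fragments are hypothetical, not taken from the server.

```python
import struct
from typing import Optional

PATTERN = bytes.fromhex("c745fc")  # mov DWORD PTR [rbp-0x4], imm32


def find_secret(hex_dump: str) -> Optional[int]:
    """Return the imm32 after the first byte-aligned c7 45 fc, or None."""
    blob = bytes.fromhex(hex_dump)
    idx = blob.find(PATTERN)
    if idx == -1 or idx + len(PATTERN) + 4 > len(blob):
        return None  # pattern missing, or fewer than 4 bytes follow it
    start = idx + len(PATTERN)
    return struct.unpack("<I", blob[start:start + 4])[0]


# Hypothetical fragment: push rbp; mov rbp, rsp; the mov of interest; leave; ret.
sample = "554889e5" + "c745fcbbcb74b1" + "c9c3"
print(find_secret(sample))  # 2977221563

# The text "c745fc" appears here only at an odd digit offset, so the
# byte-aligned search correctly reports no match.
print(find_secret("0c745fcbbcb74b10"))  # None
```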
2.4 — The Road Not Taken: Why Not Use a Disassembler?
You might be thinking: "Why not save the binary to a file, run objdump -d or load it into Ghidra?"
This is a perfectly valid instinct for a single binary. However, here's why it fails for this challenge:
| Approach | Problem |
|---|---|
| Ghidra/IDA | These are GUI tools. Opening, analyzing, and reading the output of 20 binaries in 20 seconds is humanly impossible. |
| objdump -d | This could be automated, but it adds unnecessary complexity. You'd need to: (1) decode the hex to a file, (2) run objdump, (3) parse the text output with regex. This is slower and more fragile than just searching the raw bytes. |
| Running the binary | You'd need to decode, save, chmod +x, and then somehow brute-force or debug it. Far too slow for a 1-second window. |
The optimal path — and the lesson of this challenge — is that pattern matching on raw bytes is the fastest, most elegant solution. We don't need a full disassembler; we just need a 3-byte search key (c745fc).
3. Initial Foothold (Exploitation)
3.1 — The Automation Strategy
With our analysis complete, the exploitation plan is straightforward:
┌──────────────────────────────────────────────┐
│ AUTOMATION LOOP (x20) │
│ │
│ 1. READ data from socket until prompt │
│ "What's the secret?:" │
│ │
│ 2. SEARCH the received data for the regex │
│ pattern: c745fc([0-9a-f]{8}) │
│ │
│ 3. EXTRACT the 8 hex chars (4 bytes) │
│ │
│ 4. CONVERT from Little-Endian hex to │
│ unsigned 32-bit decimal integer │
│ │
│ 5. SEND the decimal integer + newline │
│ back to the server │
│ │
└──────────────────────────────────────────────┘
│
▼
After 20 rounds, READ the flag
3.2 — The Exploit Script (Fully Annotated)
import socket  # For raw TCP connections (like netcat, but programmable)
import re      # For Regular Expressions (pattern matching in text)
import struct  # For converting between Python values and C structs (byte packing)


def solve():
    """
    Connects to the Autorev 1 challenge server, automatically extracts
    the hardcoded secret from 20 binaries, and retrieves the flag.
    """
    # --- STEP 1: Establish Connection ---
    # socket.create_connection() is like running `nc host port`.
    # It returns a socket object we can read from and write to.
    host = 'mysterious-sea.picoctf.net'
    port = 57369
    s = socket.create_connection((host, port))
    print(f"[*] Connected to {host}:{port}")

    # --- Helper Function: Read Until a Specific Suffix ---
    # The server sends data in chunks. We need to keep reading
    # until we see the specific prompt "What's the secret?:".
    # This ensures we've captured the ENTIRE binary hex stream
    # before we try to parse it.
    def recv_until(sock, suffix):
        """Reads from the socket byte-by-byte until `suffix` is found."""
        result = b''  # Use bytes, not string, for raw socket data
        while not result.endswith(suffix):
            chunk = sock.recv(1)  # Read one byte at a time (simple & reliable)
            if not chunk:
                # If recv returns empty bytes, the connection was closed.
                raise ConnectionError("Server closed the connection prematurely.")
            result += chunk
        return result.decode()  # Convert bytes to a Python string for regex

    try:
        for i in range(20):
            # --- STEP 2: Receive Data Until the Prompt ---
            # We read everything the server sends until we see the question.
            # This blob of text contains the hex-encoded ELF binary.
            data = recv_until(s, b"What's the secret?:")

            # --- STEP 3: Extract the Secret Using Regex ---
            # Pattern breakdown:
            # c745fc   - The 3 opcode bytes for `mov DWORD PTR [rbp-0x4], ...`
            # (        - Start of a "capture group" (the part we want to extract)
            # [0-9a-f] - Any single hexadecimal digit (lowercase)
            # {8}      - Exactly 8 of them (representing 4 bytes = 32 bits)
            # )        - End of capture group
            match = re.search(r'c745fc([0-9a-f]{8})', data)
            if match:
                hex_val = match.group(1)  # e.g., "bbcb74b1"

                # --- STEP 4: Convert Little-Endian Hex to Decimal ---
                # bytes.fromhex("bbcb74b1") -> b'\xbb\xcb\x74\xb1'
                # struct.unpack('<I', ...) unpacks as:
                #   '<' = Little-Endian byte order
                #   'I' = Unsigned 32-bit Integer
                # Result is a tuple, so we take the first element with [0].
                raw_bytes = bytes.fromhex(hex_val)
                secret = struct.unpack('<I', raw_bytes)[0]

                # --- STEP 5: Send the Answer ---
                answer = f'{secret}\n'
                s.sendall(answer.encode())
                print(f"[+] Round {i+1}/20: Found secret = {secret} | Sent!")
            else:
                # If the pattern isn't found, something unexpected happened.
                # Print the data for debugging.
                print(f"[!] Round {i+1}: Pattern 'c745fc' NOT FOUND!")
                print(f"    Raw data (last 200 chars): ...{data[-200:]}")
                break  # Stop, because continuing would be pointless.

        # --- STEP 6: Receive the Flag ---
        # After all 20 rounds, the server should send a congratulations
        # message containing the flag. We read generously.
        final_output = s.recv(4096).decode()
        print(f"\n{'='*50}")
        print(f"[*] SERVER RESPONSE:")
        print(final_output)
        print(f"{'='*50}")
    except Exception as e:
        print(f"[!] Error: {e}")
    finally:
        # Always close the socket cleanly.
        s.close()
        print("[*] Connection closed.")


# --- Entry Point ---
if __name__ == '__main__':
    solve()
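Reading one byte per recv() call is simple, but it costs one system call per byte, which may matter under a 1-second limit. A single recv(4096) at the end can also return before the whole flag message has arrived. Below is a sketch of a buffered alternative. The `BufferedReader` class is hypothetical and not part of the original solution, and `recv_all` assumes the server closes the connection after sending the flag (otherwise a socket timeout would be needed). A local socket pair stands in for the server so the sketch runs without network access.

```python
import socket


class BufferedReader:
    """Wraps a socket and keeps any bytes that arrive after the delimiter."""

    def __init__(self, sock: socket.socket, chunk_size: int = 4096) -> None:
        self.sock = sock
        self.chunk_size = chunk_size
        self.buffer = b""

    def recv_until(self, delimiter: bytes) -> bytes:
        """Return everything up to and including `delimiter`."""
        while True:
            idx = self.buffer.find(delimiter)
            if idx != -1:
                end = idx + len(delimiter)
                data, self.buffer = self.buffer[:end], self.buffer[end:]
                return data
            chunk = self.sock.recv(self.chunk_size)
            if not chunk:
                raise ConnectionError("connection closed before delimiter")
            self.buffer += chunk

    def recv_all(self) -> bytes:
        """Read until the peer closes the connection."""
        while True:
            chunk = self.sock.recv(self.chunk_size)
            if not chunk:
                data, self.buffer = self.buffer, b""
                return data
            self.buffer += chunk


# Local demonstration: `server` plays the role of the remote service.
server, client = socket.socketpair()
server.sendall(b"banner\nc745fcbbcb74b1\nWhat's the secret?:")
server.sendall(b"Correct!\nflag goes here\n")
server.close()

reader = BufferedReader(client)
first = reader.recv_until(b"What's the secret?:")
rest = reader.recv_all()
client.close()
print(first)
print(rest)
```

Keeping the leftover bytes in `self.buffer` matters because a large read can pull in data that belongs to the next round.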
3.3 — Execution Output
[*] Connected to mysterious-sea.picoctf.net:57369
[+] Round 1/20: Found secret = 3406797415 | Sent!
[+] Round 2/20: Found secret = 4233232755 | Sent!
[+] Round 3/20: Found secret = 2312384779 | Sent!
[+] Round 4/20: Found secret = 2250124380 | Sent!
[+] Round 5/20: Found secret = 826731480 | Sent!
[+] Round 6/20: Found secret = 1642351210 | Sent!
[+] Round 7/20: Found secret = 3796822336 | Sent!
[+] Round 8/20: Found secret = 2986324885 | Sent!
[+] Round 9/20: Found secret = 2400094377 | Sent!
[+] Round 10/20: Found secret = 1885613138 | Sent!
[+] Round 11/20: Found secret = 1143096664 | Sent!
[+] Round 12/20: Found secret = 1757652120 | Sent!
[+] Round 13/20: Found secret = 3300608193 | Sent!
[+] Round 14/20: Found secret = 3072272321 | Sent!
[+] Round 15/20: Found secret = 1763129777 | Sent!
[+] Round 16/20: Found secret = 128687777 | Sent!
[+] Round 17/20: Found secret = 957976931 | Sent!
[+] Round 18/20: Found secret = 2093904350 | Sent!
[+] Round 19/20: Found secret = 2320907858 | Sent!
[+] Round 20/20: Found secret = 790478185 | Sent!
==================================================
[*] SERVER RESPONSE:
Correct!
Woah, how'd you do that??
Here's your flag: picoCTF{4u7o_r3v_g0_brrr_78c345aa}
==================================================
[*] Connection closed.
4. Privilege Escalation
Not applicable. This was a pure reverse engineering and scripting challenge. There was no shell to obtain and no system to escalate privileges on. The "escalation" here was conceptual: escalating from understanding one binary to automating the analysis of twenty.
5. Lessons Learned & Mitigation
5.1 — Key Takeaways for CTF Players
| # | Lesson | Detail |
|---|---|---|
| 1 | Automation is a core RE skill. | The flag literally says it: 4u7o_r3v_g0_brrr. Real-world malware analysis often involves triaging hundreds of samples. The ability to write scripts that extract Indicators of Compromise (IOCs) from binaries programmatically is invaluable. |
| 2 | Learn your opcodes. | You don't need to memorize every x86 instruction, but recognizing common patterns like c7 45 (stack variable assignment) or 48 8d (LEA) will dramatically speed up your analysis, even when using tools like Ghidra. |
| 3 | Little-Endian is not optional knowledge. | Almost every binary exploitation and reverse engineering challenge on x86 requires you to correctly handle Little-Endian byte order. Misunderstanding this is the #1 source of "off-by-everything" errors. Python's struct module is your best friend. |
| 4 | Start simple, then optimize. | We connected with nc first to understand the problem before writing a single line of Python. Resist the urge to start coding before you know what you're coding for. |
5.2 — Blue Team / Mitigation Perspective
While this is a CTF game, the underlying concepts map to real-world scenarios:
| Vulnerability | Real-World Parallel | Mitigation |
|---|---|---|
| Hardcoded secrets in binaries | API keys, passwords, or encryption keys compiled directly into client-side applications. Attackers routinely use tools like strings, binwalk, or custom scripts to extract them. | Never hardcode secrets. Use environment variables, secure vaults (e.g., HashiCorp Vault), or runtime configuration. If a secret must be in a binary, use obfuscation (though this only slows attackers, it doesn't stop them). |
| Predictable binary structure | Using the same code template across many builds makes automated analysis trivial (as we just demonstrated). This is relevant to malware families that share a common builder. | Introduce variability. If generating challenge binaries, randomize variable offsets (different stack layouts via compiler flags like -fstack-protector), use different registers, or add junk code to break simple pattern matching. |
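As a rough illustration of why offset randomization alone only slows an analyst down, here is a sketch of a slightly more general matcher that accepts any 8-bit displacement from rbp. It is a heuristic under stated assumptions: the secret is still stored with a c7 45 mov, and the sample bytes (with the variable at [rbp-0x14]) are invented for the example. Such a pattern can also produce false positives on unrelated data.

```python
import re
import struct

# c7 45 <disp8> <imm32>: any 8-bit displacement from rbp, not just -4.
# re.DOTALL lets "." match every byte value, including 0x0a.
MOV_RBP_IMM32 = re.compile(rb"\xc7\x45(.)(.{4})", re.DOTALL)


def stack_immediates(blob: bytes) -> list[tuple[int, int]]:
    """Return (displacement, value) pairs for each candidate match."""
    results = []
    for m in MOV_RBP_IMM32.finditer(blob):
        disp = struct.unpack("b", m.group(1))[0]    # signed 8-bit displacement
        value = struct.unpack("<I", m.group(2))[0]  # little-endian uint32
        results.append((disp, value))
    return results


# Hypothetical variant where the secret lives at [rbp-0x14] instead of [rbp-0x4].
blob = bytes.fromhex("554889e5" "c745ecbbcb74b1" "c9c3")
print(stack_immediates(blob))  # [(-20, 2977221563)]
```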