Have you ever stumbled upon code that works almost like a miracle, entirely because of how the compiler lays out data in memory?
Recently, while porting a C++ decryption routine for PSP executable files (specifically EBOOT.BIN), I found a design pattern that left me with mixed emotions: equal parts "fascination" and "paranoia".
Today I want to document how a PSP decryption routine relies on contiguous structure layout to treat multiple adjacent fields as a single cryptographic workspace, the challenges of porting this design to Rust, and how I validated the implementation using an integration test with a real EBOOT.BIN file from Lego Batman.
First case: When The Structure Layout Becomes Part of the Algorithm
Let's analyze the original C++ structure I used to map the encrypted executable layout of the PRXType1 format:
struct PRXType1
{
    explicit PRXType1(const u8 *prx)
    {
        memcpy(tag, prx+0xD0, sizeof(tag));
        memcpy(sha1, prx+0xD4, sizeof(sha1));
        memcpy(unused, prx+0xE8, sizeof(unused));
        memcpy(kirkBlock, prx+0x110, 0x40);
        memcpy(kirkBlock+0x40, prx+0x80, sizeof(kirkBlock)-0x40);
        memcpy(prxHeader, prx, sizeof(prxHeader));
    }

    void decrypt(int key)
    {
        // LOOK AT THIS NOW
        kirk7(sha1+0xC, sha1+0xC, 0xA0, key);
    }

    u8 tag[4];          // 4 bytes
    u8 sha1[0x14];      // 20 bytes (0x14)
    u8 unused[0x28];    // 40 bytes
    u8 kirkBlock[0x90]; // 144 bytes
    u8 prxHeader[0x80]; // 128 bytes
};
static_assert(sizeof(PRXType1) == 0x150, "inconsistent size of PRX Type 1");
Now you may ask: where is the trick?
Pay attention to the decrypt function. It calls the PSP's cryptographic engine routine (kirk7), passing sha1 + 0xC as both the destination and the source.
- The sha1 array is EXACTLY 0x14 (20 bytes) long.
- If we move 0xC (12 bytes) FORWARD, we only have 8 bytes left in that specific array.
- HOWEVER, the third parameter says to process 0xA0 (which is 160 BYTES).
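Now that we know that, why didn't the program crash? Because the fields sit one after another in a single contiguous structure object, and since every field is a u8 array there is no padding between them (the static_assert on the total size confirms it). Although the pointer originates inside the sha1 field, the cryptographic routine processes 160 consecutive bytes. In practice, this means that the operation spans the last 8 bytes of sha1, all 40 bytes of unused, and the first 112 bytes of kirkBlock. The implementation therefore treats several adjacent fields as a single contiguous cryptographic workspace. Strictly speaking, walking a pointer past the end of the sha1 array is undefined behavior in standard C++; it works here only because the compiler lays the fields out back to back.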
Because the decryption routine depends on a very specific memory layout, changing the declaration order of the fields would alter the 160-byte region processed by kirk7. Such a change would likely corrupt the decrypted data and break the algorithm entirely.
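To make that arithmetic concrete, here is a small Rust sketch that computes which fields a window of bytes touches. The offsets follow from the declaration order and sizes in the struct above; the table and function names (`FIELDS`, `overlap`, `window_coverage`) are hypothetical and exist only for this illustration.

```rust
use std::ops::Range;

/// (name, byte range) of each field, derived from the C++ declaration order.
/// This table is illustrative and not part of the original code.
static FIELDS: [(&str, Range<usize>); 5] = [
    ("tag", 0x00..0x04),
    ("sha1", 0x04..0x18),
    ("unused", 0x18..0x40),
    ("kirkBlock", 0x40..0xD0),
    ("prxHeader", 0xD0..0x150),
];

/// Number of bytes two half-open ranges have in common.
fn overlap(a: &Range<usize>, b: &Range<usize>) -> usize {
    let start = a.start.max(b.start);
    let end = a.end.min(b.end);
    end.saturating_sub(start)
}

/// For a window of `len` bytes beginning at `start`, list how many bytes of
/// each field it covers. Fields the window does not touch are skipped.
fn window_coverage(start: usize, len: usize) -> Vec<(&'static str, usize)> {
    let window = start..start + len;
    FIELDS
        .iter()
        .map(|(name, range)| (*name, overlap(range, &window)))
        .filter(|(_, bytes)| *bytes > 0)
        .collect()
}
```

Calling `window_coverage(0x04 + 0x0C, 0xA0)` reports 8 bytes of sha1, 40 bytes of unused, and 112 bytes of kirkBlock, which add up to the 160 bytes passed to kirk7. Reordering the entries in the table changes that result, which is the fragility described above.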
What Does kirk7 Actually Do?
void kirk7(u8* outbuff, const u8* inbuff, size_t size, int keyId)
{
    AES_ctx aesKey;
    u8* key = kirk_4_7_get_key(keyId);
    AES_set_key(&aesKey, key, 128);
    AES_cbc_decrypt(&aesKey, inbuff, outbuff, size);
}
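kirk7 looks up a 128-bit AES key by its ID and decrypts the input buffer in CBC mode, writing the result to the output buffer. Here both pointers are sha1 + 0xC and the requested size is 0xA0 (160 bytes), so the decryption runs in place over ten 16-byte blocks covering the remaining bytes of sha1, all of unused, and part of kirkBlock.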
In other words, the algorithm is not really interested in the sha1 field itself. It is operating on a 160-byte cryptographic workspace whose starting point happens to lie inside sha1.
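The C++ call passes the same address as input and output. In Rust that naturally becomes a single `&mut [u8]`, because a shared and a mutable reference to the same bytes cannot coexist. The sketch below shows the general shape of CBC decryption done in place. It is not the project's kirk7: the block cipher is a caller-supplied closure (`decrypt_block`, a hypothetical stand-in for AES-128 block decryption), and the IV is a parameter because the value kirk7 uses is not shown in the code above.

```rust
const BLOCK: usize = 16;

/// CBC decryption in place over `buf`.
/// `decrypt_block` stands in for the real block cipher (an assumption of this
/// sketch); no actual AES is implemented here.
fn cbc_decrypt_in_place<F>(
    buf: &mut [u8],
    iv: [u8; BLOCK],
    decrypt_block: F,
) -> Result<(), &'static str>
where
    F: Fn(&mut [u8; BLOCK]),
{
    if buf.len() % BLOCK != 0 {
        return Err("buffer length must be a multiple of 16 bytes");
    }
    let mut prev = iv;
    for chunk in buf.chunks_exact_mut(BLOCK) {
        // Keep a copy of the ciphertext block: it is the XOR mask for the
        // next block, and `chunk` is about to be overwritten.
        let mut block = [0u8; BLOCK];
        block.copy_from_slice(chunk);
        let ciphertext = block;

        decrypt_block(&mut block);

        // plaintext = D(ciphertext) XOR previous ciphertext (or IV)
        for (out, (b, p)) in chunk.iter_mut().zip(block.iter().zip(prev.iter())) {
            *out = b ^ p;
        }
        prev = ciphertext;
    }
    Ok(())
}
```

The length check matters for the PSP case: 0xA0 is exactly ten blocks, so the window that starts inside sha1 ends on a block boundary inside kirkBlock.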
The Rust Philosophy: Explicit Safety Without Sacrificing Performance
When I decided to port this to Rust, I quickly realized that the original implementation relies on pointer arithmetic spanning multiple adjacent fields. While Rust can express the same behavior through raw pointers and unsafe code, I wanted a solution that made the memory region explicit and remained fully safe. Instead of relying on a pointer that implicitly reaches across several fields, I represented the entire workspace as a single byte array and passed the exact range required by the decryption routine.
Concretely, I used a single flat byte array ([u8; 0x150]) and exposed read-only views (slices) at fixed offsets.
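One way to convince yourself that fixed offsets into a flat array match the C++ layout is to mirror the struct in Rust and check the offsets at compile time, much like the original static_assert. The sketch below assumes Rust 1.77 or newer (for `offset_of!`); the type `PrxType1Layout` and the constants `DECRYPT_START` and `DECRYPT_END` are hypothetical names used only for this check.

```rust
use std::mem::{offset_of, size_of};

/// Hypothetical mirror of the C++ struct, used only to verify offsets.
/// `#[repr(C)]` keeps the declared field order, and because every field is a
/// byte array (alignment 1) there is no padding between fields.
#[repr(C)]
#[allow(dead_code)]
struct PrxType1Layout {
    tag: [u8; 4],
    sha1: [u8; 0x14],
    unused: [u8; 0x28],
    kirk_block: [u8; 0x90],
    prx_header: [u8; 0x80],
}

// Compile-time checks, playing the role of the C++ static_assert.
const _: () = assert!(size_of::<PrxType1Layout>() == 0x150);
const _: () = assert!(offset_of!(PrxType1Layout, tag) == 0x00);
const _: () = assert!(offset_of!(PrxType1Layout, sha1) == 0x04);
const _: () = assert!(offset_of!(PrxType1Layout, unused) == 0x18);
const _: () = assert!(offset_of!(PrxType1Layout, kirk_block) == 0x40);
const _: () = assert!(offset_of!(PrxType1Layout, prx_header) == 0xD0);

/// Start of the decryption window: `sha1 + 0xC` in the C++ code.
const DECRYPT_START: usize = offset_of!(PrxType1Layout, sha1) + 0x0C;
/// End of the window (exclusive): start plus the 0xA0 bytes passed to kirk7.
const DECRYPT_END: usize = DECRYPT_START + 0xA0;
```

If any of these assertions were wrong, the crate would fail to compile, so the offsets cannot silently drift away from the C++ layout.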
Here is my implementation:
// Assumed imports (not shown in the original snippet): `Sha1` and `Digest` from the
// RustCrypto `sha1` crate. `kirk7` and `KirkError` are defined elsewhere in the project.
use sha1::{Digest, Sha1};

pub struct PrxType1 {
    pub data: [u8; 0x150],
}

impl PrxType1 {
    /// Constructs a new PrxType1 from raw file bytes.
    /// Panics if `prx` is shorter than 0x150 bytes.
    pub fn new(prx: &[u8]) -> Self {
        let mut data = [0u8; 0x150];
        // Reconstruct the layout respecting the original C++ offsets:
        data[0..4].copy_from_slice(&prx[0xD0..0xD4]); // tag
        data[4..0x18].copy_from_slice(&prx[0xD4..0xE8]); // sha1 (20 bytes)
        data[0x18..0x40].copy_from_slice(&prx[0xE8..0x110]); // unused (40 bytes)
        data[0x40..0x80].copy_from_slice(&prx[0x110..0x150]); // kirkBlock part 1 (64 bytes)
        data[0x80..0xD0].copy_from_slice(&prx[0x80..0xD0]); // kirkBlock part 2 (80 bytes)
        data[0xD0..0x150].copy_from_slice(&prx[0..0x80]); // prxHeader (128 bytes)
        Self { data }
    }

    pub fn decrypt(&mut self, key_id: i32) -> Result<(), KirkError> {
        // In C++ the call was: kirk7(sha1+0xC, sha1+0xC, 0xA0, key);
        // Our 'sha1' starts at offset 4.
        // 4 + 12 (0xC) = 16 (0x10).
        // If we want to decrypt 160 bytes (0xA0): 16 + 160 = 176 (0xB0).
        // Rust allows us to express the exact memory range explicitly and safely:
        kirk7(&mut self.data[0x10..0xB0], key_id)?;
        Ok(())
    }

    // --- Safe Views (Read-only Slices) ---
    pub fn tag(&self) -> &[u8] { &self.data[0..4] }
    pub fn sha1(&self) -> &[u8] { &self.data[4..0x18] }
    pub fn unused(&self) -> &[u8] { &self.data[0x18..0x40] }
    pub fn kirk_block(&self) -> &[u8] { &self.data[0x40..0xD0] }
    pub fn prx_header(&self) -> &[u8] { &self.data[0xD0..0x150] }

    /// Verifies integrity by recalculating the header hash.
    /// (The original decrypt project does this elsewhere, but I prefer doing it here.)
    pub fn is_valid(&self, xorbuf: &[u8]) -> bool {
        let mut hasher = Sha1::new();
        hasher.update(&xorbuf[0..0x14]);
        hasher.update(self.unused());
        hasher.update(self.kirk_block());
        hasher.update(self.prx_header());
        let hash_calculated = hasher.finalize();
        hash_calculated[..] == self.sha1()[..]
    }
}
Why do I think this solution is great?
We keep the performance of the C++ version because there are no dynamic allocations and no array cloning. By passing the range &mut self.data[0x10..0xB0] to kirk7, Rust guarantees that the cryptographic function can only touch those 160 bytes. And if an offset ever ran past the end of the 0x150-byte array, the bounds check would make the program panic rather than silently corrupt other memory.
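If a panic is not acceptable, the same range can be requested with `slice::get_mut`, which returns `None` instead of panicking when the range does not fit. The sketch below wraps that in a `Result`. `WindowError` and `workspace` are hypothetical names; the project's own `KirkError` is not shown in this article, so a stand-in type is used.

```rust
/// Hypothetical error type for this sketch (a stand-in for the project's
/// own error type, which is not shown in the article).
#[derive(Debug, PartialEq)]
enum WindowError {
    OutOfBounds { start: usize, end: usize, len: usize },
}

/// Borrow `data[start..end]` mutably, returning an error instead of panicking
/// when the range is reversed or extends past the end of `data`.
fn workspace(data: &mut [u8], start: usize, end: usize) -> Result<&mut [u8], WindowError> {
    let len = data.len();
    data.get_mut(start..end)
        .ok_or(WindowError::OutOfBounds { start, end, len })
}
```

With a 0x150-byte array, `workspace(&mut data, 0x10, 0xB0)` yields the 160-byte window, while a range such as 0x100..0x1A0 comes back as an error. Note that neither approach catches an offset that is wrong but still inside the array; that kind of mistake only shows up when the decrypted data fails validation.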
The Moment of Truth
To make sure that this implementation was correct and the offset math was right, I wrote an integration test using the actual EBOOT.BIN file from the Lego Batman game for the PSP.
This test loads the specified file, extracts the tag dynamically, looks up the corresponding hardware key through a key service (located in keys_service.rs), generates the XOR buffer, runs the decryption, and validates via SHA-1 that the resulting bytes match the game's original structure.
#[cfg(test)]
mod tests {
    use super::*;
    use std::fs::File;
    use std::io::Read;

    #[test]
    fn test_lego_batman_type1_valido() {
        // 1. Load the real PSP binary
        let ruta_eboot = "/home/snake/Downloads/lego_batman_game/PSP_GAME/SYSDIR/EBOOT.BIN";
        let mut file = File::open(ruta_eboot).expect("Could not open EBOOT!");
        let mut eboot_data = Vec::new();
        file.read_to_end(&mut eboot_data).unwrap();

        // 2. Map data to our flat structure
        let mut type1 = PrxType1::new(&eboot_data);

        // 3. Extract the crypto tag (should evaluate to 0xC0CB167C for this file)
        let tag_bytes: [u8; 4] = type1.tag().try_into().expect("Tag doesn't have 4 bytes");
        let tag = u32::from_le_bytes(tag_bytes);

        // 4. Fetch the keys for this specific game
        let key_eboot = keys_service::get_tag_info(tag)
            .expect("This game's Tag is missing from the database!");
        let key_id = key_eboot.code as i32;
        let mut xorbuf = [0u8; 144];
        match &key_eboot.key {
            KeyType::U8(key_array) => { xorbuf.copy_from_slice(*key_array); }
            KeyType::U32(key_array) => {
                // Serialize each 32-bit word as 4 little-endian bytes
                for (i, &word) in key_array.iter().enumerate() {
                    let start = i * 4;
                    let end = start + 4;
                    xorbuf[start..end].copy_from_slice(&word.to_le_bytes());
                }
            }
        }

        // 5. Decrypt using our safe Rust memory range!
        type1.decrypt(key_id).expect("AES engine failed...");

        // 6. Final validation: does the calculated hash match the SHA-1 stored in the header?
        let es_valido = type1.is_valid(&xorbuf);

        // If this passes, our contiguous memory emulation was an absolute success
        assert!(es_valido, "SHA-1 hash mismatch... Decryption failed!");
    }
}
Result: TEST PASSED (OK)
We managed to replicate the exact behavior of the original PSP implementation without inheriting its reliance on pointer arithmetic that runs past the end of a field.
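One practical caveat: the test depends on a game file at an absolute path, so it can only pass on a machine that has that file in that place. A possible refinement, sketched below, reads the path from an environment variable and lets the test skip itself when the file is unavailable. The variable name `PSP_EBOOT_PATH` and both helper functions are assumptions made for this sketch, not part of the project.

```rust
use std::env;
use std::fs;
use std::path::Path;

/// Environment variable holding the path to EBOOT.BIN.
/// The name is an assumption made for this sketch.
const EBOOT_ENV_VAR: &str = "PSP_EBOOT_PATH";

/// Read the fixture at `path`, or return `None` if it cannot be read.
fn read_fixture(path: &Path) -> Option<Vec<u8>> {
    fs::read(path).ok()
}

/// Look up the fixture path in the environment and load the file.
/// Returns `None` when the variable is unset or the file is unreadable,
/// so a test can skip itself instead of failing on other machines.
fn load_eboot_fixture() -> Option<Vec<u8>> {
    let path = env::var_os(EBOOT_ENV_VAR)?;
    read_fixture(Path::new(&path))
}
```

Inside the test, `let Some(eboot_data) = load_eboot_fixture() else { return; };` would replace the hard-coded path, and the rest of the steps stay the same. The trade-off is that a skipped test looks like a passing one, so it is worth printing a message when the fixture is missing.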
Conclusion
I am writing this article so that I don't forget this architectural headache as the project continues to grow.
Mastering memory isn't just about programming retro consoles. It's about understanding how data layout, performance, and correctness interact in every system we build.