KL3FT3Z

Posted on May 1

From Research PoC to Redteam Toolkit: Hardening CVE-2026-31431 for Production Operations

#webdev #redteam #cybersecurity #linux


Introduction

On April 29, 2026, Theori and Xint disclosed CVE-2026-31431 — a local privilege escalation vulnerability in the Linux kernel's AF_ALG crypto subsystem. Their research, published at copy.fail, demonstrated a novel page-cache mutation primitive: by abusing the authencesn AEAD template's in-place optimization combined with splice(), an attacker could overwrite cached pages of a setuid binary without ever modifying the on-disk inode.

The original proof-of-concept was written in Python — excellent for research demonstration, but impractical for real-world redteam operations where Python is rarely available on target servers and the tool's footprint must be minimal.

Tony Gies quickly produced a baseline C port using nolibc, which solved the deployment problem but remained a research tool at heart.

This article documents our work extending that foundation into a production-grade redteam toolkit — adding operational security, anti-forensics, automatic target discovery, fileless payload delivery, and cross-platform build infrastructure. We share the architectural decisions, trade-offs, and defensive takeaways from this effort.

The Gap Between Research and Operations

Why Python PoCs Don't Survive First Contact

Research Requirement	Operational Reality
Python 3.8+ available	Servers run minimal images; no Python
`pip install` dependencies	Airgapped networks; no package manager
50+ MB with libraries	Binary must be < 100 KB for covert deployment
Run once, observe output	Must survive for weeks with minimal interaction
Clean environment	EDR, SIEM, AppArmor, SELinux actively hunting
Manual target selection	Operator may not know which setuid binary exists

The baseline C port solved the deployment size problem (~2 KB payload), but lacked:

Operational control: How does an operator trigger execution remotely?
Stealth: How do we hide from ps, top, and EDR process monitoring?
Cleanup: How do we remove forensic artifacts after exploitation?
Resilience: What happens if the C2 server is down?
Cross-platform support: Many cloud targets run ARM64 rather than x86_64.

Architecture Overview

Our toolkit is organized into nine modules spanning four layers:

┌─────────────────────────────────────────────────────────────┐
│                     ORCHESTRATOR (exploit.c)               │
│  Coordinates all modules in a 7-step pipeline:             │
│  Hide → Discover → Prepare → Verify → Exploit → Cleanup →   │
│  Deliver                                                     │
└─────────────────────────────────────────────────────────────┘
                              │
    ┌─────────────┬─────────┴─────────┬─────────────┐
    ▼             ▼                     ▼             ▼
┌────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌────────┐
│ patch  │  │ target   │  │ anti     │  │ stage1   │  │ memfd  │
│ chunk  │  │ discovery│  │ forensics│  │ delivery │  │ exec   │
│        │  │          │  │          │  │          │  │        │
└────────┘  └──────────┘  └──────────┘  └──────────┘  └────────┘
    │             │              │             │            │
    └─────────────┴──────────────┴─────────────┴────────────┘
                              │
    ┌─────────────────────────┴─────────────────────────┐
    ▼                                                   ▼
┌──────────────┐                              ┌──────────────┐
│ proc_hide    │                              │ sleep_jitter │
│ signal       │                              │ stage2 C2    │
│ trigger      │                              │ implant      │
└──────────────┘                              └──────────────┘

Module Responsibilities

Module	File(s)	Core Function
Exploit Primitive	`patch_chunk.c/h`	AF_ALG/splice page cache mutation with socket reuse, parallel writes, and verification
Target Discovery	`target_discovery.c/h`	Auto-scan and score setuid binaries; MAC-aware selection
Anti-Forensics	`anti_forensics.c/h`	Cache dropping, timestamp restoration, self-destruction
Stage-1 Delivery	`stage1.c/h`	Fileless payload fetch via HTTP/HTTPS/DNS/embedded
Stage-2 C2	`stage2_template.c/h`	Reverse shell with reconnect, jitter, signal control
memfd Execution	`memfd_exec.c/h`	Anonymous file execution with cloaking and decryption
Process Hiding	`proc_hide.c/h`	argv/cmdline/comm masquerading
Signal Control	`signal_trigger.c/h`	Operator-triggered execution with zero-CPU waiting
Sleep Jitter	`sleep_jitter.c/h`	Random delays with uniform/triangular/exponential distributions
Vulnerability Checker	`vulnerable.c`	Non-destructive kernel susceptibility test

Module Deep Dives

1. Hardened Exploit Primitive: `patch_chunk.c`

The original baseline opened a fresh AF_ALG socket for every 4-byte window. Our implementation reduces the syscall footprint by ~60% through socket reuse:

// Original: socket() + bind() + setsockopt() + accept() per chunk
// Ours:     accept() per chunk; ctrl socket reused across all chunks

int ctrl = -1, op = -1;
for (off_t off = 0; off < len; off += 4) {
    patch_chunk(fd, off, window, &ctrl, &op);  // ctrl reused
}

Key improvements:

Atomic verification: After each write, mmap() + memcmp() confirms the mutation landed. If page cache was reclaimed (rare under load), auto-retry with 1ms backoff.
Parallel writes: fork() distributes chunks across up to 16 CPU cores. A 50 KB payload drops from ~12 seconds to ~800ms on modern hardware.
Granular error codes: 0 = verified success, 1 = kernel patched (operation rejected), -1 = fatal error.
Zero heap allocations: All buffers on stack; no malloc/free jitter for EDR to hook.

2. Automatic Target Discovery: `target_discovery.c`

Manually specifying /usr/bin/su fails when:

The target uses sudo instead of su
AppArmor blocks su but not pkexec
The binary is in /usr/local/bin or a snap package

Our scanner operates in three phases:

Phase 1: Check 18 priority targets (su, sudo, passwd, pkexec, mount, ping...)
Phase 2: Scan standard directories (/usr/bin, /bin, /usr/sbin...)
Phase 3: Deep scan (/usr/lib, /opt) if aggressive mode enabled

Each candidate receives a composite score:

score = setuid_root(1000) + setuid_user(500)
      + small_size_bonus(200 per KB under 100KB)
      + no_apparmor(300) - apparmor_enforced(500)
      + no_selinux(200) - selinux_enforced(400)
      + standard_path(100)

This automatically deprioritizes binaries under active MAC enforcement — reducing the chance of an exploit that "works" but immediately triggers an EDR alert.

3. Fileless Execution: `memfd_exec.c`

The memfd_create(2) syscall creates an anonymous file existing only in RAM. Combined with fexecve(3), this enables zero-disk execution:

int mfd = memfd_create("kworker", MFD_CLOEXEC);
write(mfd, payload, len);
lseek(mfd, 0, SEEK_SET);
fexecve(mfd, argv, envp);  // Never touches filesystem

Cloaking: The memfd name shows up in the /proc/$pid/fd/ links as /memfd:kworker (deleted). The kworker label resembles a kernel worker thread only on casual inspection; the memfd: prefix itself stays visible.

Fork-and-forget: A double-fork sequence creates an orphan process adopted by init (PPID=1), severing the parent-child relationship visible in process trees:

pid_t child = fork();
if (child == 0) {
    pid_t grandchild = fork();
    if (grandchild == 0) {
        setsid();
        fexecve(mfd, argv, envp);
    }
    _exit(0);  // Intermediate dies, grandchild orphaned
}
waitpid(child, NULL, 0);  // Original parent exits cleanly

4. Anti-Forensics: `anti_forensics.c`

The page cache mutation is unique among LPE techniques: the on-disk inode is never modified. However, mutated pages in RAM are still forensic artifacts. Our cleanup sequence:

Step	Technique	Target
1	`posix_fadvise(POSIX_FADV_DONTNEED)`	Per-file page cache eviction
2	`echo 3 > /proc/sys/vm/drop_caches`	Global cache drop (post-root)
3	`utimensat()` timestomp	Restore original atime/mtime
4	Self-destruct	Overwrite dropper binary with zeros
5	Memory wipe	`volatile` zeroing of keys, C2 addresses

Timestomp is critical: splice() reads the target file, which may update atime. Restoring the original timestamp prevents EDR heuristics from flagging "setuid binary accessed at unusual time."

5. Signal-Based Operator Control: `signal_trigger.c`

Traditional implants use polling loops (sleep(1); check_flag();), consuming CPU and standing out in EDR telemetry. We use sigsuspend() for zero-CPU waiting:

// Process state: S (sleeping, interruptible)
// CPU usage: 0.0%
// EDR sees: normal idle daemon

while (!trigger_received) {
    sigsuspend(&wait_mask);  // Returns only on signal
}

Operational modes:

Mode	Behavior	Use Case
`trigger_oneshot()`	Sleep → execute → exit	Hit-and-run assessment
`trigger_daemon()`	Sleep → execute → loop	Persistent long-term implant
`trigger_auto()`	Sleep with timeout fallback	Unattended deployment

Operator commands:

kill -USR1 $PID   # Execute now
kill -USR2 $PID   # Request status (no execution)
kill -TERM $PID  # Graceful shutdown with cleanup

6. Sleep Jitter: `sleep_jitter.c`

Regular reconnect intervals (every 600 seconds exactly) trigger beaconing detection in SIEM. We implement three statistical distributions:

Distribution	Pattern	Detection Evasion
Uniform	Equal probability across range	Basic jitter
Triangular	Cluster around mean	Mimics "normal" random traffic
Exponential	Mostly short, occasional long	Breaks time-based correlation

Drift compensation maintains the average interval despite jitter — ensuring a 10-minute target doesn't drift to 5 or 20 minutes over hours of operation.

RNG backends (in order of preference): getrandom(2), /dev/urandom, rdtsc fallback. Rejection sampling eliminates modulo bias.
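
Seen from the defender's side, the beaconing detection mentioned at the start of this section usually comes down to measuring how regular the gaps between connections are. The following is a minimal sketch of that idea, not part of the toolkit: the function names `interval_cv2` and `looks_like_beacon` and the threshold parameter are hypothetical, and the input is assumed to be a sorted array of connection timestamps in seconds for one source/destination pair.

```c
#include <assert.h>
#include <stddef.h>

/* Squared coefficient of variation (variance / mean^2) of the gaps between
 * consecutive timestamps. Values near 0 mean very regular intervals.
 * Returns -1.0 when there are fewer than 3 timestamps or the mean gap is
 * not positive. Working with the squared value avoids needing sqrt(). */
double interval_cv2(const double *ts, size_t n)
{
    assert(ts != NULL || n == 0);
    if (n < 3)
        return -1.0;

    size_t gaps = n - 1;
    double mean = (ts[n - 1] - ts[0]) / (double)gaps;
    if (mean <= 0.0)
        return -1.0;

    double ss = 0.0;
    for (size_t i = 1; i < n; i++) {
        double d = (ts[i] - ts[i - 1]) - mean;
        ss += d * d;
    }
    return (ss / (double)gaps) / (mean * mean);
}

/* Flags a series as beacon-like when its coefficient of variation is below
 * cv_threshold. The threshold is an assumption to be tuned per environment. */
int looks_like_beacon(const double *ts, size_t n, double cv_threshold)
{
    double cv2 = interval_cv2(ts, n);
    return cv2 >= 0.0 && cv2 < cv_threshold * cv_threshold;
}
```

A fixed threshold only catches lightly jittered traffic. Because drift compensation keeps the long-run average interval stable, comparing the mean gap across several long windows is a plausible complementary check.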

Build System: Cross-Platform Static Binaries

Why Static Linking Matters

Dynamic binaries fail when:

Target lacks libc.so.6 (Alpine Linux uses musl)
LD_LIBRARY_PATH is sanitized
EDR hooks dlopen() or ld.so

Our Makefile supports four toolchain strategies:

# Standard: glibc static (portable, ~2 MB)
make redteam

# Tiny: musl static (~50-100 KB, no glibc dependency)
make musl-static

# Modern: zig cross-compile (no toolchain installation)
make cross-zig-arm64

# Traditional: GNU cross toolchain
make cross-arm64 CROSS_COMPILE=aarch64-linux-gnu-

Supported Architectures

Architecture	Typical Target
x86_64	On-premise servers, workstations
ARM64	AWS/Azure/GCP cloud instances
RISC-V	Embedded, experimental cloud
ARM HF	IoT devices, Raspberry Pi

Operational Security Considerations

What We Can Hide

Artifact	Technique	Effectiveness
Command line	`overwrite_argv()`	High: changes what `/proc/$pid/cmdline` reports
Process name	`prctl(PR_SET_NAME)`	High: changes the name shown by `ps` and `top`
Parent relationship	Double-fork	High — PPID=1 (init)
Binary on disk	Self-destruct	High — zeroed before exec
Page cache	`fadvise(DONTNEED)`	Medium — may be reclaimed naturally
Network connections	DNS beaconing, jitter	Medium — reduces correlation

What We Cannot Hide (Kernel-Enforced)

Artifact	Why Visible	Mitigation
`/proc/$pid/exe`	Kernel-maintained symlink	Use memfd (shows as `(deleted)`)
PID number	Kernel-assigned	None without rootkit
`/proc/$pid/status`	Kernel-generated	None from userspace
AF_ALG socket creation	Syscall traceable	Minimize via socket reuse
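
For defenders, the `/proc/$pid/exe` row above is directly checkable. Below is a minimal sketch, assuming a mounted Linux `/proc`; the helper names `link_is_memfd` and `pid_exe_is_memfd` are hypothetical and not taken from the toolkit. It relies on the kernel rendering memfd-backed link targets as `/memfd:<name> (deleted)`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if a resolved /proc/<pid>/exe target names an anonymous memfd
 * file, e.g. "/memfd:kworker (deleted)", and 0 otherwise. */
int link_is_memfd(const char *target)
{
    assert(target != NULL);
    return strncmp(target, "/memfd:", 7) == 0;
}

/* Returns 1 if the process image comes from a memfd, 0 if it does not,
 * and -1 if the link cannot be read (process exited, kernel thread,
 * or insufficient permission). */
int pid_exe_is_memfd(pid_t pid)
{
    char path[64];
    char target[4096];

    snprintf(path, sizeof path, "/proc/%ld/exe", (long)pid);
    ssize_t n = readlink(path, target, sizeof target - 1);
    if (n < 0)
        return -1;
    target[n] = '\0';  /* readlink() does not NUL-terminate */
    return link_is_memfd(target);
}
```

Some legitimate software also executes from a memfd, so a hit is a lead for triage rather than proof of compromise.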

Defensive Detection Opportunities

For blue teams, this toolkit reveals several detection vectors:

AF_ALG + splice() correlation: eBPF programs can trace this specific combination — rare in legitimate workloads.
memfd_create with suspicious names: While memfd:kworker blends in, the memfd_create syscall itself is uncommon for non-browser processes.
Bracketed process names in userspace: Kernel threads don't have userspace memory maps; checking /proc/$pid/maps reveals the masquerade.
DNS beaconing: Repeated TXT queries or A-record lookups to a single domain remain a usable signal, even when the intervals are jittered.
Page cache integrity: Kernel modules or hypervisors can verify setuid binary cache pages against on-disk hashes.
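
The bracketed-name check in the list above can be sketched as follows. This is an illustrative variant under stated assumptions, not the toolkit's code: instead of reading `/proc/$pid/maps`, which needs elevated access for other users' processes, it reads the world-readable `/proc/$pid/stat` and tests the kernel's `PF_KTHREAD` task flag, then looks for a command line that starts with `[`. The function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* PF_KTHREAD as defined in the kernel's include/linux/sched.h */
#define PF_KTHREAD_FLAG 0x00200000u

/* Extracts the task flags (field 9) from one /proc/<pid>/stat line.
 * The comm field may contain spaces or ')', so parsing starts after the
 * LAST ')'. Returns 0 on success, -1 on parse failure. */
int parse_stat_flags(const char *stat_line, unsigned int *flags)
{
    assert(stat_line != NULL && flags != NULL);
    const char *p = strrchr(stat_line, ')');
    char state;
    int ppid, pgrp, session, tty_nr, tpgid;

    if (p == NULL)
        return -1;
    if (sscanf(p + 1, " %c %d %d %d %d %d %u",
               &state, &ppid, &pgrp, &session, &tty_nr, &tpgid, flags) != 7)
        return -1;
    return 0;
}

/* Returns 1 for a userspace process whose command line starts with '['
 * (a masquerade candidate), 0 if nothing looks wrong, -1 if the process
 * could not be inspected. Real kernel threads have an empty cmdline. */
int pid_is_fake_kthread(pid_t pid)
{
    char path[64], buf[1024];
    unsigned int flags;
    FILE *f;

    snprintf(path, sizeof path, "/proc/%ld/stat", (long)pid);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fgets(buf, sizeof buf, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    if (parse_stat_flags(buf, &flags) != 0)
        return -1;
    if (flags & PF_KTHREAD_FLAG)
        return 0;  /* genuine kernel thread */

    snprintf(path, sizeof path, "/proc/%ld/cmdline", (long)pid);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    size_t n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    return n > 0 && buf[0] == '[';
}
```

Iterating this over the numeric entries in `/proc` gives a cheap sweep; an eBPF or auditd rule on `prctl(PR_SET_NAME)` would be the event-driven counterpart.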

Defensive Takeaways

Immediate Mitigations

Patch the kernel: Upgrade to Linux >= 6.14 with commit a664bf3d603d, or apply your distribution's backport.
Enable MAC enforcement: AppArmor and SELinux profiles on setuid binaries significantly raise the exploitation bar.
Monitor AF_ALG: The authencesn template is rarely used legitimately; audit its usage via auditd or eBPF.
Verify page cache: Periodic integrity checks on cached setuid pages can detect in-memory mutation.
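
The page cache verification in the last item can be approximated from userspace by comparing a buffered read, which is served from the page cache, with an `O_DIRECT` read of the same file, which bypasses it. The sketch below is a minimal illustration under assumptions: the name `cache_matches_disk` is hypothetical, a 4096-byte alignment is assumed to satisfy `O_DIRECT`, and the check presumes the mutated pages are not marked dirty, so the direct read still returns the on-disk bytes.

```c
#define _GNU_SOURCE  /* exposes O_DIRECT in <fcntl.h> */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BLK 4096  /* assumed block alignment for O_DIRECT */

/* Compares a file as seen through the page cache with the same file read
 * via O_DIRECT. Returns 0 if identical, 1 on mismatch, and -1 if the check
 * could not run (missing file, filesystem without O_DIRECT support, etc.). */
int cache_matches_disk(const char *path)
{
    assert(path != NULL);
    int result = -1;
    void *cached = NULL, *direct = NULL;
    int cfd = open(path, O_RDONLY);
    int dfd = open(path, O_RDONLY | O_DIRECT);

    if (cfd < 0 || dfd < 0)
        goto out;
    if (posix_memalign(&cached, BLK, BLK) != 0) { cached = NULL; goto out; }
    if (posix_memalign(&direct, BLK, BLK) != 0) { direct = NULL; goto out; }

    for (off_t off = 0;; off += BLK) {
        ssize_t c = pread(cfd, cached, BLK, off);
        ssize_t d = pread(dfd, direct, BLK, off);
        if (c < 0 || d < 0)
            goto out;                      /* I/O error: inconclusive */
        if (c != d || memcmp(cached, direct, (size_t)c) != 0) {
            result = 1;                    /* cache and disk disagree */
            goto out;
        }
        if (c < BLK) {                     /* short read: end of file */
            result = 0;
            goto out;
        }
    }
out:
    free(cached);
    free(direct);
    if (cfd >= 0) close(cfd);
    if (dfd >= 0) close(dfd);
    return result;
}
```

Running this periodically over the setuid binaries on a host gives a coarse integrity signal. A mismatch can also come from a legitimate in-flight write, so it should be rechecked before alerting.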

Long-Term Architectural Changes

The root cause — treating splice'd file pages as writable crypto destinations — suggests a broader principle: input and output buffers in kernel crypto paths should never alias. Future kernel designs should enforce separate scatterlists for source and destination, even when "in-place" optimization seems safe.

Credits and Acknowledgments

This work builds directly on the research and code of others:

Theori (Jinoh Kang, Yonghwi Jin, Seunghyun Lee) and Xint — Original vulnerability discovery, disclosure, and the Python proof-of-concept at copy.fail.
Tony Gies — Baseline C port (tgies/copy-fail-c) using nolibc, providing the foundational cross-platform syscall wrappers.
Linux kernel developers — memfd_create(2), fexecve(3), and the nolibc header-only libc alternative.
musl libc and Zig projects — Toolchains enabling tiny, portable static binaries.

Our contributions are strictly the operational hardening layer: anti-forensics, stealth, automatic targeting, and build infrastructure. The core vulnerability research belongs entirely to Theori and Xint.

Repository and License

Repository: https://github.com/toxy4ny/copy-fail-exploit-on-c-redteam
License: Dual LGPL-2.1-or-later / MIT
Original PoC: theori-io/copy-fail-CVE-2026-31431
Baseline C Port: tgies/copy-fail-c

Disclaimer

This software is provided solely for authorized security research and authorized penetration testing. The authors assume no liability for misuse. Always obtain explicit written permission before testing systems you do not own.

If you discover indicators of compromise matching this toolkit's behavior on your systems:

Apply the kernel patch (commit a664bf3d603d or distribution backport)
Review /var/log/audit/ and EDR telemetry for AF_ALG anomalies
Verify integrity of setuid binary page caches

Have you adapted research tools for production redteam operations? What operational challenges did you encounter? Share your experiences in the comments.

