ChienPang Lee

Copying fail

Copy Fail (CVE-2026-31431) has started a fierce fire that has gone rampant in the Linux woods. Are we exploitable? Almost certainly, because pretty much every actively maintained enterprise distribution is affected. While folks are anxiously looking for a way to put it out, Dirty Frag and Copy Fail 2: Electric Boogaloo have joined the game, pouring oil on the flames.

The correct way to address these is to upgrade your kernel to a version that has the fixes merged. That is, however, easier said than done. Unless you're dealing with your own laptop on a relatively modern distro version, it's more complicated than "apt update && apt install && reboot". You have a service running on a certain vendor's Linux, and the vendor needs weeks, likely months, to get you a release. You have your own company policies and concerns around scheduling a widespread kernel rollout that would likely incur service interruptions. The reasons for procrastination go on and on, while the risk stays high and so do the client inquiries.

If the ideal solution is not going to happen soon, what mitigations can we apply now? Be warned! The commonly recommended kernel module suppression may not work, and even where it does, there is no guarantee that your services and OS operations keep working. Here I'm proposing an addition to the short-term mitigation actions: monitor whether your system has been exploited through privilege escalation.

Take Copy Fail for example. It involves the Linux kernel's internal crypto API, so features that rely on the kernel's cryptographic subsystem may be impacted if the algif_aead module is disabled or restricted as a mitigation. Commonly cited features are kTLS (Kernel TLS), IPsec (Internet Protocol Security), disk encryption (dm-crypt/LUKS), user-space crypto offloading, and zero-copy networking through functions like splice() and sendmsg().
One of the concrete recommendations to mitigate this CVE is to unload the algif_aead module and prevent it from being loaded again.

echo "install algif_aead /bin/false" > /etc/modprobe.d/disable-algif.conf
rmmod algif_aead 2>/dev/null || true

The fake module installation works on my Ubuntu 24.04 LTS (kernel 6.18-7) but not on another Linux platform running CentOS 9 (kernel 6.12.74-1).
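
Before trusting the rule, it helps to confirm that it is actually present in the files modprobe reads. The following is a minimal sketch, not part of the original mitigation advice: `install_overrides` is a hypothetical helper name, and it only searches the files you pass in for `install <module> ...` lines. Keep in mind that such a rule can only intercept load requests that go through modprobe, so it has no effect on code that is compiled into the kernel.

```bash
#!/usr/bin/bash
# Hypothetical helper: list "install <module> ..." override lines found in
# the given modprobe configuration files. Returns non-zero if none match.
install_overrides() {
    local module=$1 pattern
    shift
    # Without files grep would wait on stdin, so bail out early.
    [ "$#" -gt 0 ] || return 1
    # modprobe treats "_" and "-" in module names as equivalent, so match both.
    pattern=$(printf '%s' "$module" | sed 's/[_-]/[_-]/g')
    # -H prints the file name, -s hides errors for unreadable files.
    grep -HsE "^[[:space:]]*install[[:space:]]+${pattern}[[:space:]]" "$@"
}
```

On a live system this could be called as `install_overrides algif_aead /etc/modprobe.d/*.conf`. Running `modprobe -n -v algif_aead` is another way to see what modprobe would do without loading anything.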

Here is a working example with a happy ending on my laptop.

chien-pang@pop-os:~/Downloads$ cat /etc/modprobe.d/disable-copyfail.conf
install algif_aead /bin/true
chien-pang@pop-os:~/Downloads$ lsmod | egrep -i "algif|aead"
algif_hash             16384  1
algif_skcipher         12288  1
af_alg                 32768  6 algif_hash,algif_skcipher
chien-pang@pop-os:~/Downloads$ lsmod | grep algif_aead
chien-pang@pop-os:~/Downloads$ modprobe algif_aead ; echo $?
0
chien-pang@pop-os:~/Downloads$ lsmod | grep algif_aead
chien-pang@pop-os:~/Downloads$ ./copyFail30.py
Traceback (most recent call last):
  File "/home/chien-pang/Downloads/./copyFail30.py", line 36, in <module>
    while i<len(e):c(f,i,e[i:i+4]);i+=4
                   ^^^^^^^^^^^^^^^
  File "/home/chien-pang/Downloads/./copyFail30.py", line 30, in c
    a=s.socket(38,5,0);a.bind(("aead","authencesn(hmac(sha256),cbc(aes))"));h=279;v=a.setsockopt;v(h,1,d('0800010000000010'+\
'0'*64));v(h,5,None,4);u,_=a.accept();o=t+4;i=d('00');u.sendmsg([b"A"*4+c],[(h,3,i*4),(h,2,b'\x10'+i*19),(h,4,b'\x08'+i*3),]\
,32768);r,w=g.pipe();splice(f, w, o, offset_src=0);splice(r, u.fileno(), o)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory
chien-pang@pop-os:~/Downloads$ id
uid=1000(chien-pang) gid=1000(chien-pang) groups=1000(chien-pang),4(adm),27(sudo),107(lpadmin)
chien-pang@pop-os:~/Downloads$ ls /root
ls: cannot open directory '/root': Permission denied

By contrast, the same mitigation doesn't work on the other system. Blacklisting the init call on the kernel command line (via the GRUB menu) did not help either.

[cc2 ~]$ cat /proc/cmdline
BOOT_IMAGE=/Part2/bzImage root=UUID=571ee1af-1421-479d-845d-ea6b4f97292f ro net.ifnames=0 acpi=force intel_iommu=on amd_iommu=on iommu=pt console=ttyS0 console=tty0 initcall_blacklist=algif_aead_init
[cc2 ~]# su - nova
Last login: Tue May 12 12:15:09 CST 2026 on pts/2
[nova@cc2 ~]$ id
uid=116(nova) gid=124(nova) groups=124(nova),123(libvirt),64055(qemu)
[nova@cc2 ~]$ ls /root ; echo $?
ls: cannot open directory '/root': Permission denied
2
[nova@cc2 ~]$ lsmod | egrep -i "algif|aead"
[nova@cc2 ~]$ /tmp/copyFail30.py
[cc2 /var/lib/nova]# id
uid=0(root) gid=124(nova) groups=124(nova),123(libvirt),64055(qemu)
[cc2 /var/lib/nova]# 
[cc2 /var/lib/nova]# ls /root ; echo $?
0

The exploit works like a charm. This is also what makes "Copy Fail" notorious: the exploit script works as is, without further dependencies or complex preparation. Note the empty output of lsmod | egrep -i "algif|aead". The subsystem is available even though no such module was ever loaded. It turns out the kernel has it compiled in as built-in.

[cc2 /var/lib/nova]# grep  af_alg /proc/kallsyms | tail
ffffffff98d52d20 r __ksymtab_af_alg_release_parent
ffffffff98d52d2c r __ksymtab_af_alg_sendmsg
ffffffff98d52d38 r __ksymtab_af_alg_unregister_type
ffffffff98d52d44 r __ksymtab_af_alg_wait_for_data
ffffffff98d52d50 r __ksymtab_af_alg_wmem_wakeup
ffffffff99f33ff0 t __pfx_af_alg_init
ffffffff99f34000 t af_alg_init
ffffffff9a185a10 d __initcall__kmod_af_alg__884_1325_af_alg_init6
ffffffff9a35b420 t __pfx_af_alg_exit
ffffffff9a35b430 t af_alg_exit

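The AF_ALG symbols are present in kernel memory without a module tag such as [af_alg], and there is an __initcall__ entry for af_alg_init. That makes it highly likely that the kernel was compiled with AF_ALG built in (upstream, the option is CONFIG_CRYPTO_USER_API=y) rather than as a loadable module.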
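
If the kernel's build configuration is available, it can confirm this directly. Below is a minimal sketch under a few assumptions: `kconfig_state` is a hypothetical helper, the config file location varies by distribution (often /boot/config-<release>, or /proc/config.gz when the kernel exposes it), and the symbols I would look up are the upstream ones as far as I know, CONFIG_CRYPTO_USER_API for af_alg and CONFIG_CRYPTO_USER_API_AEAD for algif_aead.

```bash
#!/usr/bin/bash
# Hypothetical helper: report how a kernel option is set in a plain-text
# kernel config file. Prints "built-in", "module", or "not set".
kconfig_state() {
    local config_file=$1 symbol=$2 value
    # Take the value of the first "SYMBOL=value" line, if there is one.
    value=$(sed -n "s/^${symbol}=\(.*\)\$/\1/p" "$config_file" | head -n 1)
    case "$value" in
        y) echo "built-in" ;;   # compiled into the kernel image, modprobe rules do not apply
        m) echo "module" ;;     # loadable, so an install override can block it
        *) echo "not set" ;;
    esac
}
```

For example, `kconfig_state "/boot/config-$(uname -r)" CONFIG_CRYPTO_USER_API_AEAD` printing "built-in" would explain why neither lsmod nor the modprobe rule has any effect on that system.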

Either way, you still have to worry about breaking existing features or functionality even if blacklisting the module works. And then what about "Dirty Frag" and "Copy Fail 2"?

  • Remediation takes a long time
  • Rolling out a new kernel is a painful operation
  • Mitigation recommendations may not work
  • Mitigation recommendations could break features

Given the challenges, what else can we do to reduce risk? My other piece of advice is to reduce the attack surface, but that's another story because it usually involves modifying existing configurations or account setups.
What I find helpful is a scanning tool (a script) that tells me whether my system has obviously been compromised. Combining it with an event/alert notification system immediately boosts our confidence level while remediation is pending.

#!/usr/bin/bash

Fmt() {
    printf "%-8s %-8s %-8s %-8s %-16s %-12s %s\n" \
           "${1:-PID}" "${2:-PPID}" "${3:-UID}" "${4:-EUID}" "${5:-CapEff}" "${6:-ParentUID}" "${7:-CMD}"
}

FmtHeader() {
    echo "------------------------------------------------------------------------------------------"
    Fmt
}
FmtContent() {
    Fmt "$pid" "$ppid" "$uid" "$euid" "$capeff" "$parent_uid" "$cmd"
}

FmtHeader
for pid in /proc/[0-9]*; do
    pid=${pid#/proc/}

    status="/proc/$pid/status"
    cmdline="/proc/$pid/cmdline"

    [[ -r "$status" ]] || continue

    uid=$(awk '/^Uid:/ {print $2}' "$status")
    euid=$(awk '/^Uid:/ {print $3}' "$status")
    capeff=$(awk '/^CapEff:/ {print $2}' "$status")
    ppid=$(awk '/^PPid:/ {print $2}' "$status")

    # read parent UID safely
    parent_uid="NA"
    if [[ -r "/proc/$ppid/status" ]]; then
        parent_uid=$(awk '/^Uid:/ {print $2}' "/proc/$ppid/status")
    fi

    # command line, truncated to 80 characters
    if [[ -r "$cmdline" ]]; then
        cmd=$(tr '\0' ' ' < "$cmdline")
        cmd=${cmd:0:80}
    else
        cmd="?"
    fi

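    # Only look at processes holding the full effective capability set.
    # 000001ffffffffff is the full mask on this kernel; other kernels may differ.
    if [[ "$capeff" != "000001ffffffffff" || "$parent_uid" == "NA" ]] ; then
        continue
    else
        parent_cmdline="/proc/$ppid/cmdline"
        pcmd=$(tr '\0' ' ' < "$parent_cmdline")
        pcmd=${pcmd:0:80}
    fi

    if [[ "$uid" == "0" && "$parent_uid" != "0" ]]; then
        # root process whose parent runs as a non-root user
        FmtContent
        echo "[!] ALERT: process ($pid[$(id -nu "$uid")]) spawned from non-root parent ($ppid[$(id -nu "$parent_uid")] $pcmd $(ps -o etime "$ppid" | tail -1 | xargs))"
    elif [[ "$parent_uid" != "0" ]]; then
        # non-root process with full capabilities under a non-root parent
        FmtContent
        echo "[!] WARN: capabilities detected in non-root lineage (PID $pid[$(id -nu "$uid")])"
    elif [[ "$uid" != "$parent_uid" && "$uid" == "0" ]]; then
        # Only reached when the parent UID is 0, so this condition cannot match as written.
        FmtContent
        echo "[!] INFO: UID escalation detected (PID $pid[$(id -nu "$uid")])"
    fi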
done


Save it as /tmp/copying-fail-detect.sh; it attempts to detect "Copy Fail" in action. It is a simple detector that looks at all running processes holding the full effective capability set and flags those whose user IDs differ from their parents'. On the same system, the execution looks like:

[cc2 ~]# bash /tmp/copying-fail-detect.sh
------------------------------------------------------------------------------------------
PID      PPID     UID      EUID     CapEff           ParentUID    CMD
2988938  2988937  0        0        000001ffffffffff 116
[!] ALERT: process (2988938[root]) spawned from non-root parent (2988937[nova] python3 /tmp/copyFail30.py  36:16)
...(truncated)...

The alert gives us the vital message in good detail. There is a process with PID 2988938 running with root privileges. It was spawned by its parent, 2988937, whose user is nova. The parent's command line is "python3 /tmp/copyFail30.py", and it has been running for 36:16 (the elapsed time reported by ps).
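
The script matches CapEff against the literal 000001ffffffffff. That is the full capability set on the kernel shown here, but the width of the full mask depends on the kernel version, so on other systems it may be more robust to test individual capability bits. A minimal sketch, assuming the capability numbers from linux/capability.h (CAP_SETUID is 7, CAP_SYS_ADMIN is 21); `has_cap` is a hypothetical helper, not part of the script above:

```bash
#!/usr/bin/bash
# Hypothetical helper: succeed if capability number $2 is set in the
# hexadecimal mask $1, as found on the CapEff line of /proc/<pid>/status.
has_cap() {
    local mask=$1 bit=$2
    # Shift the requested bit down to position 0 and test it.
    [ $(( (0x$mask >> bit) & 1 )) -eq 1 ]
}

# Capability numbers assumed from linux/capability.h.
CAP_SETUID=7
CAP_SYS_ADMIN=21
```

In the detector, the equality check could then become something like `has_cap "$capeff" "$CAP_SYS_ADMIN"`, at the cost of flagging more processes than the exact-match test does.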

The detection has the shortcoming of only checking the system's process state at the moment it runs. On the flip side, it's a short and simple script. What else would you expect? Running it as a scheduled job, wired into your notification or monitoring system, will serve you well when you're asked about your risk management and confidence level regarding privilege escalation.
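
One way to wire the detector into a scheduler is a small wrapper that forwards only the ALERT lines to whatever notifier you already have. This is a sketch under assumptions: `notify_on_alert` is a hypothetical helper, the detector is assumed to print its findings in the format shown above, and the notifier is any command that accepts the message as its last argument.

```bash
#!/usr/bin/bash
# Hypothetical wrapper: run the detector script ($1) and call the notifier
# command (remaining arguments) once per ALERT line.
# Returns 1 when there is nothing to report.
notify_on_alert() {
    local detector=$1 alerts line
    shift
    # Keep only ALERT lines; "|| true" keeps an empty result from being an error.
    alerts=$(bash "$detector" 2>/dev/null | grep -F '[!] ALERT:' || true)
    if [ -z "$alerts" ]; then
        return 1
    fi
    printf '%s\n' "$alerts" | while IFS= read -r line; do
        "$@" "$line"
    done
}
```

For example, `notify_on_alert /tmp/copying-fail-detect.sh logger -t copyfail-detect -p auth.alert` would send each alert to syslog, and a cron entry or systemd timer could run that call every few minutes. The tag and the interval are placeholders to adapt.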

Oftentimes, I find the results of running copying-fail-detect.sh hilarious, in that the very first wave of attackers is usually not actual hackers from the other end of the earth but internal employees who found the exploit program and gave it a shot on internal systems. Some of them, me included, work in IT security, doing risk assessments and evaluating how much harm the CVE can do. Whatever conclusion they reached, they were either too tired (rushing back home) or too excited (presenting to the team) and never came back to wipe their ass clean. As soon as the audit takes place, the red records have to be justified, with embarrassment.

That'd better be my first thing in the morning tomorrow.
