XZ Backdoor: “That was a close one”

#devops #cybersecurity #appsec

A nefarious or compromised maintainer inserted malicious behavior in a library named liblzma, part of the xz compression tools and libraries, resulting in a backdoor in SSH. This is an advanced software supply chain attack as the library was intentionally modified for the backdoor, with obfuscation and stealth techniques for hiding the attack payload from reviewers.

It was discovered and disclosed on March 29, 2024, and incident handling is ongoing. It was contained quickly, however, because it appears to affect only pre-release versions of a limited set of environments (DEB and RPM packages, for the x86_64 architecture, built with GCC). Even so, the CVE was given a CVSS base score of 10, which is reserved for the most critical cybersecurity vulnerabilities. Had it reached stable distributions, the impact would have been overwhelming.

The technical details of the backdoor have been analyzed in depth elsewhere. This post focuses on the timeline of the attack, how it was detected, how the incident has been handled to date, and what lessons can be drawn from it.

How the XZ backdoor was injected

Note: The git repository is hosted at git.tukaani.org. There was also a GitHub-hosted repository (currently blocked) where the GitHub account behind the attack posted changes that were later integrated into the Git repository.

One portion of the backdoor appears only in the distributed tarballs for versions 5.6.0 and 5.6.1, not in the git repositories, and relies on a single line in the build-to-host.m4 macro file used by autoconf.

The other portion was in two supposed test files:

bad-3-corrupt_lzma2.xz
good-large_compressed.lzma

These were committed by the GitHub account “Jia Tan” (JiaT75) in the xz repository on 23 Feb. The change looked innocuous: it added test files (supposedly .lzma and .xz compressed blocks). Interestingly enough, the test files were not used by the tests!

The line in the .m4 file injects an obfuscated script (included in the tarball) that is executed at the end of configure if certain conditions match. It modifies the Makefile for the liblzma library to contain code that extracts data from the .xz file which, after deobfuscation, yields a script that is invoked at the end of configure.

That script decides whether to modify the build process to inject code: only when building with GCC and the GNU linker, as part of a Debian or RPM package build, and only for x86_64 Linux. When the conditions match, the injected code intercepts execution by replacing two ifunc resolvers so that certain calls are redirected. This causes the symbol tables to be parsed in memory (this takes time, which led to the detection, as explained later).
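The gating just described is a conjunction of build-environment conditions. As a rough illustration only, it can be modeled as a single predicate. This is a simplified sketch, not the attacker's script: the BuildEnv class, its field names, and the string values are hypothetical, chosen to mirror the conditions listed above.

```python
from dataclasses import dataclass


@dataclass
class BuildEnv:
    """Hypothetical description of a build environment (illustrative only)."""
    compiler: str   # e.g. "gcc" or "clang"
    linker: str     # e.g. "gnu-ld" or "lld"
    arch: str       # e.g. "x86_64" or "aarch64"
    os_name: str    # e.g. "linux"
    packaging: str  # e.g. "deb", "rpm", or "none"


def in_reported_scope(env: BuildEnv) -> bool:
    """Return True only when every reported condition holds at once."""
    return (
        env.compiler == "gcc"
        and env.linker == "gnu-ld"
        and env.arch == "x86_64"
        and env.os_name == "linux"
        and env.packaging in {"deb", "rpm"}
    )
```

Because every condition must hold, changing any single property (a different compiler, a different architecture, a build outside DEB or RPM packaging) takes a build out of the reported scope, which is one reason the backdoor was harder to notice on developer machines.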

Then things get interesting. The backdoor installs an audit hook into the dynamic linker and waits for the RSA_public_decrypt function symbol to be resolved. The symbol is then redirected to an entry point in the backdoor code, which in turn calls back into libcrypto, presumably to perform normal authentication. The payload activates only if the running program has the process name /usr/sbin/sshd. SSH servers were clearly the target.

Traditionally, SSH servers like OpenSSH were not linked with liblzma, but sshd is often patched to support systemd-notify so other services can start when sshd is running. liblzma is then loaded indirectly through libsystemd, closing the circle.
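Whether a given sshd binary pulls in liblzma can be checked from the output of ldd, which lists transitive shared-library dependencies. The following is a minimal sketch, assuming a Linux system where ldd is available and sshd lives at /usr/sbin/sshd. The function names are illustrative, and ldd should only be run on binaries you trust.

```python
import shutil
import subprocess
from typing import Optional


def links_liblzma(ldd_output: str) -> bool:
    """Return True if any line of `ldd` output names a liblzma shared object."""
    return any(
        line.strip().startswith("liblzma.so")
        for line in ldd_output.splitlines()
    )


def sshd_links_liblzma(sshd_path: str = "/usr/sbin/sshd") -> Optional[bool]:
    """Run ldd on sshd; return None when ldd is missing or fails."""
    if shutil.which("ldd") is None:
        return None
    result = subprocess.run(
        ["ldd", sshd_path], capture_output=True, text=True, check=False
    )
    if result.returncode != 0:
        return None
    return links_liblzma(result.stdout)
```

A True result only means liblzma is loaded into sshd. It says nothing about whether the installed liblzma is one of the backdoored builds.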

The backdoor is not yet fully analyzed, but it appears to allow remote command execution (RCE) with the privileges of the sshd daemon, in a pre-authentication context. Data from the remote certificate, when matched by the backdoor, is decrypted with ChaCha20 and, if decryption succeeds, passed to system(). So this is essentially a gated RCE, much worse than a mere public key bypass.

The later 5.6.1 tarball showed additional efforts to hide the traces, adding further obfuscation of symbol names and attempting to fix the errors observed. It also introduced an extension mechanism that scanned additional test files for certain signatures, so that their content could be added to the backdoor.

This fairly sophisticated attack could have gone unnoticed until it reached stable Linux distributions. Fortunately, some people like to check why abnormal things happen.
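Since one portion of the backdoor existed only in the release tarballs and not in git, one possible check is to compare an extracted tarball against a checkout of the corresponding tag. The sketch below assumes both trees are already on disk. The function names are illustrative, and the comparison is a plain SHA-256 diff, not a method used by any of the analyses mentioned in this post.

```python
import hashlib
from pathlib import Path


def file_digests(root: Path) -> dict:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    digests = {}
    for path in sorted(root.rglob("*")):
        relative = path.relative_to(root)
        # Skip directories and git's own metadata.
        if path.is_file() and ".git" not in relative.parts:
            digests[relative.as_posix()] = hashlib.sha256(
                path.read_bytes()
            ).hexdigest()
    return digests


def compare_trees(tarball_dir: Path, git_dir: Path) -> dict:
    """Report files present only in the tarball and files whose content differs."""
    tar_files = file_digests(tarball_dir)
    git_files = file_digests(git_dir)
    only_in_tarball = sorted(set(tar_files) - set(git_files))
    changed = sorted(
        name
        for name in set(tar_files) & set(git_files)
        if tar_files[name] != git_files[name]
    )
    return {"only_in_tarball": only_in_tarball, "changed": changed}
```

Release tarballs built with autotools legitimately contain generated files (such as configure) that are absent from git, so the report is a list of candidates for human review, not a verdict.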

The Discovery of the XZ Backdoor Attack

Injected malicious behavior is often unearthed by chance or accident. A good example was a deprecation warning (“Who cares about warnings?”) that led to the discovery of the event-stream attack in Nov 2018. Another is the user who warned Codecov in April 2021 that their bash uploader script did not pass the checksum.

Anomalies and odd symptoms with SSH logins (logins taking a lot of CPU and increased elapsed time, valgrind errors) aroused the curiosity of Andres Freund, a vigilant PostgreSQL developer but not a security analyst (as he stated). After some investigation with OpenSSH on Debian sid, he concluded that the response time problem was caused by a library, liblzma, part of the xz-utils compression package. The reason: “the upstream xz repository and the xz tarballs have been backdoored”.

On Mar 29 2024, Andres posted the first analysis to the Openwall oss-security mailing list: “backdoor in upstream xz/liblzma leading to ssh server compromise”.

The fact: XZ Utils 5.6.0 and 5.6.1 tarballs contain a backdoor. These tarballs were created and signed by the aforementioned Jia Tan account.
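Given that the affected releases are exactly 5.6.0 and 5.6.1, a first triage step is to look for those version strings in whatever reports the installed version, for example the text printed by xz --version or by the package manager. A minimal sketch, with illustrative function and constant names:

```python
import re

# Versions named in the disclosure as containing the backdoor.
AFFECTED_VERSIONS = {"5.6.0", "5.6.1"}


def find_affected(version_output: str) -> list:
    """Return the affected version strings that appear in the given text."""
    found = re.findall(r"\b(\d+\.\d+\.\d+)\b", version_output)
    return sorted({version for version in found if version in AFFECTED_VERSIONS})
```

For example, find_affected("xz (XZ Utils) 5.6.1\nliblzma 5.6.1") returns ["5.6.1"], while the same call on output reporting 5.4.6 returns an empty list. A version match alone does not prove the backdoor is present, since the malicious code was only injected under specific build conditions, but it is a reasonable trigger for downgrading.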

He posted on Mastodon later that day, acknowledging that the discovery was accidental and required a lot of coincidences. The comments from other users are worth reading.

GitHub user thesamesam (aka Sam James) published a nice Gist FAQ on the xz-utils backdoor that summarizes the attack and links to more in-depth analyses of the attack payload.

These analyses were technically juicy and helped us better understand the injection, which was highly elaborate:

xz/liblzma: Bash-stage Obfuscation Explained – Nice analysis of the deobfuscation performed by the injection script, in four “stages”.
Filippo Valsorda's Bluesky thread – Analysis of the backdoor itself in RSA_public_decrypt, which shows its nature: a gated RCE, not an auth bypass.
XZ Backdoor Analysis by @smx-smx (WIP)
xz backdoor documentation wiki

This poster from Thomas Roccia shows part of the activity of JiaT75 on the GitHub repository and how the injection script inserts the binary backdoor.

How the incident was handled

Disclosure by Andres Freund was cautious because, in his own words:

“Given the apparent upstream involvement, I have not reported an upstream bug. As I initially thought it was a debian specific issue, I sent a more preliminary report to security@...ian.org. Subsequently, I reported the issue to distros@. CISA was notified by a distribution.”

Red Hat assigned this issue CVE-2024-3094. Then the word circulated like wildfire.

Lasse Collin, the other maintainer for XZ, added a new commit on Sat 30 Mar titled “CMake: Fix sabotaged Landlock sandbox check”. He promptly disclosed the issue in the XZ Utils backdoor advisory.

CVE references: CVE-2024-3094 | MITRE | NVD | Ubuntu.

On Mar 29th, CISA released an alert recommending a downgrade to an uncompromised version such as 5.4.6.

GitHub repositories under the Tukaani org were disabled. The GitHub accounts JiaT75 and Larhzu were suspended.

Vendors reacted quickly with YARA rules and with detections in tools such as those from Sysdig and PAN.

We are now in the Eradication and Recovery phase. Other projects by JiaT75 (e.g. libarchive, oss-fuzz) are under review.

Who is behind the attack?

Either the GitHub JiaT75 account was compromised (GitHub recently mandated 2FA) or the user behind the account turned malicious. Evidence points to a possible APT, given the sophistication of the attack.

Discussion on Hacker News sheds more light.