Math.random, friend or foe?

#discuss #javascript #security

Math.random is flawed, as bold a claim as that may seem. It does what it sets out to do: it generates a number between 0 and 1 that’s random… Right? Wrong!

Math.random doesn’t actually produce a truly random number. The number it creates is deterministic: it comes out of an algorithm, so anyone who learns the generator’s internal state can predict it. Now maybe for a quick dice roller or game feature, this won’t matter, but for security, this matters a lot.

Computers, too smart for their own good

So, why is this even a problem in the first place?

For one, we can’t generate truly random numbers, not easily at least. Computers are good at math, but bad at randomness: without some kind of outside influence, every step a program takes follows from the one before it, so it’s hard to get anything truly random. The problem with any kind of “outside influence” is that most of that data would need to come in from some other source, which we’d have to trust. If we can’t trust it, it could be manipulated to force a desired result, for example by replaying a random number that was already used.

To get around this issue and avoid needing outside data as much as possible, a lot of systems implement pseudorandom number generators. These are fine in a lot of use cases, but their output is not truly random, and if a generator isn’t built securely, it can be reverse engineered.
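To make the “not random” part concrete, here is a minimal sketch of a toy pseudorandom generator. It is a linear congruential generator, chosen only because it is tiny. It is not what Math.random uses, and the name makeLcg and the constants are assumptions picked for illustration. Two generators started from the same seed hand back exactly the same “random” numbers.

```javascript
// Toy linear congruential generator (LCG), for illustration only.
// This is NOT the algorithm behind Math.random; makeLcg is a hypothetical helper.
function makeLcg(seed) {
  let state = seed >>> 0; // keep the state as an unsigned 32-bit integer
  return function next() {
    // state = (state * 1664525 + 1013904223) mod 2^32
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 2 ** 32; // scale to a float in [0, 1), like Math.random
  };
}

const first = makeLcg(42);
const second = makeLcg(42);

const runA = [first(), first(), first()];
const runB = [second(), second(), second()];

// Same seed, same sequence: nothing here is left to chance.
console.log(runA);
console.log(runB);
```

Math.random gives you no way to set the seed yourself, but the principle is the same: the output is entirely decided by the state the generator starts from.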

Which brings us to our second problem. Even though a pseudorandom number may be indistinguishable from a random one on the surface, it is different. A lot of services now secure your account behind 6-digit codes. A lot of password managers give you the option to generate a random password. You likely don’t want people using your program to easily work out where data is stored, and you might need random keys or salts for encryption or secure hashing. All of these could cause problems should a malicious actor know you’re using Math.random, as all it takes is a little bit of math to work out the values it will produce.
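As a rough sketch of the 6-digit code case, here is one way such a code could be generated without Math.random, using crypto.getRandomValues instead. The helper names randomIntBelow and sixDigitCode are made up for this example, and the fallback to Node’s webcrypto export is an assumption about where the code runs.

```javascript
// In browsers and recent Node versions, `crypto` is a global.
// The fallback below is an assumption for older Node (CommonJS) environments.
const cryptoObj = globalThis.crypto ?? require('node:crypto').webcrypto;

// Hypothetical helper: a uniformly distributed integer in [0, max).
// Values that would make some results slightly more likely are thrown away
// and redrawn (rejection sampling), instead of just taking `x % max`.
function randomIntBelow(max) {
  const limit = Math.floor(2 ** 32 / max) * max; // largest multiple of max that fits in 32 bits
  const buf = new Uint32Array(1);
  let x;
  do {
    cryptoObj.getRandomValues(buf);
    x = buf[0];
  } while (x >= limit);
  return x % max;
}

// Hypothetical helper: a 6-digit code such as "042917", leading zeros kept.
function sixDigitCode() {
  return String(randomIntBelow(1_000_000)).padStart(6, '0');
}

console.log(sixDigitCode());
```

The redraw loop is there because 2^32 is not a multiple of 1,000,000, so a plain modulo would favor the lower codes very slightly.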

The Problems with Math.random, and what is Cryptographic Security?

The problem with Math.random then, in a lot of these applications, is just that it’s not cryptographically secure, which may sound like a smart jumble of words, but is kind of meaningless without explanation.

The difference between a “cryptographically secure pseudorandom number generator” and one that is not comes down to two things:

One, it cannot easily be cracked with cryptanalysis, that is, by using statistics on its past output to figure out future values.

Two, even when its state is compromised, an attacker cannot backtrack to recover the values it produced earlier. If the generator also keeps mixing in fresh entropy, knowing the state should not be enough to predict the values it will produce in the future either.

For a long time, Math.random was actually failing at both of those. It used a low-quality method of generating numbers, so it produced collisions, that is, recurring values, much more often than it does now (we’re speaking of after multiple millions of invocations, but that’s enough to be cracked easily). However, despite the changes made since, it can still be predicted by anyone who knows the initial or a later state of the generator.
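To show what “predicted by knowing the state” can look like, here is a small sketch with a toy generator whose output is simply its state. This is a stand-in, not the algorithm any browser uses for Math.random, and all of the names and constants are assumptions for illustration. Once one value leaks, the next value can be computed, and so can the previous one.

```javascript
// Toy generator: state = (state * A + C) mod 2^32, and the output IS the state.
// Illustration only; this is not how Math.random is implemented.
const A = 1664525;
const C = 1013904223;

function stepForward(state) {
  return (Math.imul(state, A) + C) >>> 0;
}

// Modular inverse of an odd number modulo 2^32 (Newton's iteration).
function inverseMod2to32(a) {
  let inv = a; // already correct in the lowest 3 bits for any odd a
  for (let i = 0; i < 5; i++) {
    inv = Math.imul(inv, 2 - Math.imul(a, inv)); // each round doubles the correct bits
  }
  return inv >>> 0;
}
const A_INV = inverseMod2to32(A);

// Undo one step: state = A_INV * (next - C) mod 2^32.
function stepBackward(state) {
  return Math.imul(A_INV, state - C) >>> 0;
}

// The "victim" generates five values from a seed the attacker never sees.
const outputs = [];
let state = 123456789;
for (let i = 0; i < 5; i++) {
  state = stepForward(state);
  outputs.push(state);
}

// The attacker only learns the third value...
const leaked = outputs[2];
const predictedNext = stepForward(leaked);   // matches outputs[3]
const recoveredPrev = stepBackward(leaked);  // matches outputs[1]

console.log(predictedNext === outputs[3], recoveredPrev === outputs[1]);
```

A cryptographically secure generator is designed so that neither of those last two steps is practical.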

Crypto: getRandomValues, our solution

So, how do we create a pseudorandom number generator that can’t so easily be cracked by a malicious actor who knows a little math? There’s a lot that goes into it, but one of the easiest concepts to understand is “entropy”, which is roughly a measure of how unpredictable something is. As the generator creates new values, its state shifts over time in ways that are hard to predict, which makes earlier values infeasible to work out even with access to the state, and makes future values hard to predict as long as fresh entropy keeps being mixed in. Usually, this involves layering generators: one pseudorandom number generator is seeded with values from another unpredictable source, so that it starts from good entropy.

Crypto: getRandomValues, being called through an API, handles most of this for us. Its seed is supplied by the “user agent” so that entropy can be added quickly. Since this is built into most browsers, the “user agent” in this case is the browser or runtime that lets the user interact, rather than the person typing away at the keyboard. For the sake of speed, the generator itself is not a true random number generator. Instead, the documentation suggests that user agents use their own pseudorandom generator, seeded with outside values, to provide the best entropy they can.
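For reference, calling it looks roughly like this. The sketch fills a typed array with random bytes, turns them into a hex string, and wraps the call in a small Math.random-style function. The names token and secureRandom are made up for this example, and the Node fallback on the first line is an assumption about the environment.

```javascript
// In browsers and recent Node versions, `crypto` is a global.
// The fallback below is an assumption for older Node (CommonJS) environments.
const cryptoObj = globalThis.crypto ?? require('node:crypto').webcrypto;

// getRandomValues fills the typed array you pass in, in place.
const bytes = new Uint8Array(16);
cryptoObj.getRandomValues(bytes);

// Hex-encode the 16 bytes, e.g. to use as a random token.
const token = Array.from(bytes, (b) => b.toString(16).padStart(2, '0')).join('');
console.log(token);

// Hypothetical stand-in for Math.random: a float in [0, 1) built from 32 random bits.
function secureRandom() {
  const buf = new Uint32Array(1);
  cryptoObj.getRandomValues(buf);
  return buf[0] / 2 ** 32;
}
console.log(secureRandom());
```

Note that getRandomValues only accepts integer typed arrays and refuses requests for more than 65,536 bytes at a time, so large amounts of random data have to be requested in chunks.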

Friend, or Foe?

In most situations relating to security, you really need to watch out for the PSEUDO part of pseudorandom number generators. In other situations? It’s true, it may really not matter. As some have said in my research, it may even be better to avoid the cryptographic option there if you don’t know what you’re doing, as you may be overcomplicating your code for very little gain. But the second you are handling someone’s data, or need to encrypt something, just remember that Math.random could prove to be your enemy. For that very reason, getRandomValues is your friend.

This topic has a lot of depth, and even as the writer I feel as if I understand very little of it. I understand enough to get the point of the various warnings about Math.random I see everywhere, and I hope this article helped you get there as well, rather than just having to take them on faith. Still, there’s a lot more to it. To help you understand this topic further, below are the sources that I used to write this article, and that made the most sense to me. Nearly all of them link to further sources, so I believe you’ll find these to be good places to start.

Doom’s RNG - An example of a pseudorandom number generator in DOOM, and unpacking how it works. I found this really digestible, and it got me interested in the topic. Trust me, it’s a good video, good enough that I was actually interested in the word salad of “cryptographically secure pseudorandom number generator”.
There’s Math.random(), and then there’s Math.random() - This is an article detailing some of the problems with earlier iterations of Math.random. I found it from this Stack Overflow post.
How computers generate RANDOMNESS from math - This video helped me understand a lot of the weird technical terms used in the above source, and how the computer handles it internally.
MDN: Crypto: getRandomValues() method - Yes, this one is obvious, but the most interesting tidbit is how the “user agent” seeds the entropy that will be generated. I think it’s integral to understanding entropy.
Wikipedia: Cryptographically secure pseudorandom number generator - This helps specify the differences between any old pseudorandom number generator and one that is cryptographically secure.