loading...

Explain the challenge of generating random numbers like I'm five

peter profile image Peter Kim Frank ・1 min read
  • Why is it so hard for computers to generate random numbers?
  • What's all this business about pseudo-random vs. truly-random?

Please, ELI5

Discussion

markdown guide
 

In order to explain that we have to understand what computers are good at, and what they're bad at.

Computers are good at following instructions. Computers are bad at making things up.

Computers need clear, unambiguous instructions, and "make up a random number" is very ambiguous. Which number? How should I find it? How do I know if it's random? etc.

What we can do is give a computer a set of instructions to do some complicated math procedures and arrive at a seemingly random number (take this really large number, multiply it by another large number, divide it by this third number, etc, etc). Butthe problem with giving a computer a set of instructions is that anyone who follows the same instructions will get the same number, which is why we say it's only pseudo-random.

In order to get a truly random number we can use an outside input. You remember part of the instructions was to multiply some large number? What if we chose that number not by telling the computer directly, but by using a number from a source that changes often and chaotically? For example, we could look at the humidity level in Croatia, the position of the DOW, even the color of a lava lamp at a given moment.

By adding in a random chaotic input we can ensure that even if someone else follows the exact same set of instructions as you, they will always get a different number.

 

"Computers are good at following instructions. Computers are bad at making things up."
This!💯

 
 

Because we try very, very hard that the hardware does absolutely nothing randomly. You ask 1 its 1, you ask 0 its 0 (and you want that !). You cannot ask "0 or 1, I don't care".

So there is no source of randomness in your computer, something that would fluctuate without you knowing why. So you can try to make one and hide it. For example, if you tried to implant a random function from scratch, you could write a list of number in the wrong order [7, 6, 0, 9, 4, 3, 2, 5, 1, 8] and loop through it. It would seem random enough… until people ask 20 times and notice the pattern.

This very simplistic, but in essence that's the problem about randomness: hide the pattern. Patterns can get very, very sophisticated mathematicaly (like Nobel prize sophisticated). But the name of the game is always the same: as long as "you" (or a computer…) cannot guess the pattern, it's random. If you guess, even partially, then you can make prediction = it's not random anymore.

 

I didn't see the other answers diving into the second question, so here we go:

Pseudorandom numbers are good enough for the common usage, mostly statistics and sampling (e.g. Monte Carlo methods), as there's not really an issue if an attacker is able to reverse the stream of random data and "derandomize". It is not good enough for sensitive purposes (such as cryptography) because of that.

In computer security, we always assume an attacker has more resources and motivation than us, for these special purposes you either need to have a truly random source of data (several examples were given) or a cryptographically secure pseudorandom number generator.

 

My favorite way to make random number generator is by John Van Neumann. In the early days of computers, he needed a very simple and quick way to produce random numbers. His solution is called the "middle-square" method.

The middle-square method is pretty simple.

  1. A human is needed for the first step.
    • Choose a large number of any length.
    • For now let 's use a 5-digit number, like: 86,147
  2. Now the computer multiplies that number by itself.
    • It gets: 86,147 x 86,147 = 7,421,305,609
  3. Finally, the computer finds the "middle-number".
    • Our number (7,421,305,609) is 10-digits-long and we want the middle.
    • Now we count out the digits from left to right.
    • Count out the digits 3 to 8 and keep those number.
    • We get 21,305.
  • The good news is "You" just made a random number.
  • The bad news is that these numbers will repeat after a while. How long? It all depends on the number "You" choose.

Could you think of different ways to get random numbers?
I bet there are a bunch of ways! lol

 

Of course there are (MUCH) better ways -- the Mersenne Twister, en.wikipedia.org/wiki/Mersenne_Twi... , by Makoto Matsumoto ja and Takuji Nishimura (西村 拓士), is non-pareil to the best of my knowledge, and it's what Python standard library module random uses since many years ago.

 

Alex,
Could you translate the technical jargon on the the Mersenne Twister wikipage to language that a 5 or even a 10 could understand? ;))
Isn't that in the spirit of the title? #explainlikeimfive

 

It is not difficult for computers to generate random numbers -- they just require a non-deterministic data source. Thermal, auditory, network, keyboards, mice, etc, all contain non-deterministic data (from the perspective of the computer, at least).

Pseudo-random numbers are deterministically reproducible.

This means that if I know the algorithm and seed state I can reproduce the same pseudo-random number sequence whenever I like -- this is a very useful property.

A pseudo-random number sequence can be produced by an algorithm.

An infinite pseudo-random number sequence contains a finite amount of information -- so we can store and communicate these infinite sequences.

Truly-random numbers are not deterministically reproducible.

A truly-random number sequence cannot be produced by an algorithm.

An infinite random number sequence contains an infinite amount of information -- we cannot store or communicate these infinite sequences.