DEV Community

Peter Kim Frank
Peter Kim Frank

Posted on

Explain the challenge of generating random numbers like I'm five

  • Why is it so hard for computers to generate random numbers?
  • What's all this business about pseudo-random vs. truly-random?

Please, ELI5

Top comments (13)

Collapse
 
yechielk profile image
Yechiel Kalmenson • Edited

In order to explain that we have to understand what computers are good at, and what they're bad at.

Computers are good at following instructions. Computers are bad at making things up.

Computers need clear, unambiguous instructions, and "make up a random number" is very ambiguous. Which number? How should I find it? How do I know if it's random? etc.

What we can do is give a computer a set of instructions to do some complicated math procedures and arrive at a seemingly random number (take this really large number, multiply it by another large number, divide it by this third number, etc, etc). Butthe problem with giving a computer a set of instructions is that anyone who follows the same instructions will get the same number, which is why we say it's only pseudo-random.

In order to get a truly random number we can use an outside input. You remember part of the instructions was to multiply some large number? What if we chose that number not by telling the computer directly, but by using a number from a source that changes often and chaotically? For example, we could look at the humidity level in Croatia, the position of the DOW, even the color of a lava lamp at a given moment.

By adding in a random chaotic input we can ensure that even if someone else follows the exact same set of instructions as you, they will always get a different number.

Collapse
 
etienneburdet profile image
Etienne Burdet

"Computers are good at following instructions. Computers are bad at making things up."
This!💯

Collapse
 
peter profile image
Peter Kim Frank

This is brilliant. Thanks so much, Yechiel!

Collapse
 
yechielk profile image
Yechiel Kalmenson

Thanks!

Collapse
 
etienneburdet profile image
Etienne Burdet

Because we try very, very hard that the hardware does absolutely nothing randomly. You ask 1 its 1, you ask 0 its 0 (and you want that !). You cannot ask "0 or 1, I don't care".

So there is no source of randomness in your computer, something that would fluctuate without you knowing why. So you can try to make one and hide it. For example, if you tried to implant a random function from scratch, you could write a list of number in the wrong order [7, 6, 0, 9, 4, 3, 2, 5, 1, 8] and loop through it. It would seem random enough… until people ask 20 times and notice the pattern.

This very simplistic, but in essence that's the problem about randomness: hide the pattern. Patterns can get very, very sophisticated mathematicaly (like Nobel prize sophisticated). But the name of the game is always the same: as long as "you" (or a computer…) cannot guess the pattern, it's random. If you guess, even partially, then you can make prediction = it's not random anymore.

Collapse
 
rsa profile image
Ranieri Althoff

I didn't see the other answers diving into the second question, so here we go:

Pseudorandom numbers are good enough for the common usage, mostly statistics and sampling (e.g. Monte Carlo methods), as there's not really an issue if an attacker is able to reverse the stream of random data and "derandomize". It is not good enough for sensitive purposes (such as cryptography) because of that.

In computer security, we always assume an attacker has more resources and motivation than us, for these special purposes you either need to have a truly random source of data (several examples were given) or a cryptographically secure pseudorandom number generator.

Collapse
 
mccurcio profile image
Matt Curcio • Edited

My favorite way to make random number generator is by John Van Neumann. In the early days of computers, he needed a very simple and quick way to produce random numbers. His solution is called the "middle-square" method.

The middle-square method is pretty simple.

  1. A human is needed for the first step.
    • Choose a large number of any length.
    • For now let 's use a 5-digit number, like: 86,147
  2. Now the computer multiplies that number by itself.
    • It gets: 86,147 x 86,147 = 7,421,305,609
  3. Finally, the computer finds the "middle-number".
    • Our number (7,421,305,609) is 10-digits-long and we want the middle.
    • Now we count out the digits from left to right.
    • Count out the digits 3 to 8 and keep those number.
    • We get 21,305.
  • The good news is "You" just made a random number.
  • The bad news is that these numbers will repeat after a while. How long? It all depends on the number "You" choose.

Could you think of different ways to get random numbers?
I bet there are a bunch of ways! lol

Collapse
 
alexmartelli profile image
Alex Martelli

Of course there are (MUCH) better ways -- the Mersenne Twister, en.wikipedia.org/wiki/Mersenne_Twi... , by Makoto Matsumoto ja and Takuji Nishimura (西村 拓士), is non-pareil to the best of my knowledge, and it's what Python standard library module random uses since many years ago.

Collapse
 
mccurcio profile image
Matt Curcio • Edited

Alex,
Could you translate the technical jargon on the the Mersenne Twister wikipage to language that a 5 or even a 10 could understand? ;))
Isn't that in the spirit of the title? #explainlikeimfive

Thread Thread
 
alexmartelli profile image
Alex Martelli

Alas, teaching kids has never been my forte, which is why I posted the above note as a comment, not as a response. I don't even know how many adults, say, practicing software engineers, can be taught all the crucial details of the Twister -- say, enough to code it "from scratch", without reference to existing code, pseudo-code, or formulas; my guess would be "a small minority". Which is what makes software reuse so important: if, to use the Twister, you needed to grasp its every crucial detail, almost all applications would be using absolutely horrible RNGs -- much as was the case before the late '90s, when the Twister took over:-)

Thread Thread
 
mccurcio profile image
Matt Curcio

Hi Alex,
I think you're right if we couldn't stand on the shoulders of other greats, I don't think mankind would be that far along at all. ;)) One I have come across quite often is that many doctors don't know the chemistry that occurs in the body. I was trained as Biochemist and see this a lot. To me doctors seem to learn rules and procedures.

But getting back to teaching. I have spent the last several years teaching 1st through 6th grade in the Northeast. Most days were pretty good but you're right it can be tough. But I think it is one of those things like (backpacking) that everyone should try in life once.

Thread Thread
 
alexmartelli profile image
Alex Martelli

Congrats on your choice to teach in primary school -- reminds me of what Ludwig Wittgenstein, one of my idols, chose to do with his life for a few years right after WW1.

For me, the once-in-a-lifetime try at teaching young kids was with my two children -- I was told "by everybody" that I did everything wrong... interacting with them "as if" what they lacked was only an accumulated lifetime of knowledge, NOT brainpower, smarts, attention span, etc, because that's how I recalled my own childhood and my parents interacting with myself (and how I LIKED interaction with adults back when I was a child, though hardly any adults except my parents gave me that much respect).

I know that's completely contrary to every current theory in pedagogy, but... check out a recent tweet from my now-grown-up son, twitter.com/LucioMM1/status/130557... -- "I have been blessed by a completely deranged (according to normies) father who actually came close to optimizing his role with me. I feel it will be very hard to do the same with my two children". So it seems that while I was indeed "going at it in a 100% wrong way" ("completely deranged according to normies", as my son Lucio put it:-), in retrospect, the results weren't all that bad.

(My grown-up daughter is nowhere as active on social media, being too busy using her PhD in Telecom Engineering to build an awesome career -- my own degree is more generic, Electronic Engineering [we didn't have such fine-grained specialization in my time!-)], and only a humble MSEE, not a PhD, but overall her career has been, so far, eerily parallel to the early part of my own... I suspect this may indirectly suggest that she, also, retrospectively appreciates how I interacted with her in her childhood).

If either of my kids had asked me about the Twister when they were five (unlikely, as it didn't exist back then, but, let's assume!-), I would have explained it takes a lot of background to explain it well, were they interested enough to spend several hours? They might have said no, not enough -- or they might have taken me up on it, in which latter case I'd explain that my own grasp of the subject was broad, but not fully deep enough, so we'd be doing some research together (each of them LOVED that in the not-so-rare occasions in which such things happened -- NEVER bluffing by faking a knowledge I didn't have, but rather offering to research an issue together, is maybe the most important life lesson I ever learned from my late dad... BTW, he was a doctor, and would probably agree with your scathing judgment of how most of his colleagues were trained, though it would definitely not apply to him personally). If they took me up on it, I'd start with prime numbers, then binary representation of integers (if they didn't already know either subject -- at 5, they might have!), and continue from there.

Of course, an approach that can work for teaching one or two kids might likely break down entirely if tried on a whole classroom!, but then, the question was about explaining to ONE 5-year-old, presumably quite a bright one if they ask about such esoteric issues, NOT to a classroom of them!-)

Collapse
 
pentacular profile image
pentacular

It is not difficult for computers to generate random numbers -- they just require a non-deterministic data source. Thermal, auditory, network, keyboards, mice, etc, all contain non-deterministic data (from the perspective of the computer, at least).

Pseudo-random numbers are deterministically reproducible.

This means that if I know the algorithm and seed state I can reproduce the same pseudo-random number sequence whenever I like -- this is a very useful property.

A pseudo-random number sequence can be produced by an algorithm.

An infinite pseudo-random number sequence contains a finite amount of information -- so we can store and communicate these infinite sequences.

Truly-random numbers are not deterministically reproducible.

A truly-random number sequence cannot be produced by an algorithm.

An infinite random number sequence contains an infinite amount of information -- we cannot store or communicate these infinite sequences.