DEV Community

bennettandrewm

Seeding, Reproducibility, and other Random Thoughts on the Random Module

Turtle_stack

Why the Random Module

When studying data science and machine learning, Python's random module is vital. Whether you're developing code, experimenting with data visualizations, or just a navel-gazing data nerd, it's critical to use and understand.

But what is it? How do you use it? And why does the number 42 always come up? This article dives into some random bits (no pun intended) worth knowing about Python's random number generation (RNG).

Random or... not so Random

Let's define informally what we mean by random. In Python, the random number generator creates pseudo-random numbers, meaning they come from an algorithm (the Mersenne Twister). If you don't supply a starting value, Python seeds the algorithm from a randomness source provided by the operating system, falling back to the system time if none is available. Everything after that is math. It's deterministic, so not perfectly random. But as Larry David would say, they're "pretty, pretty, pretty good."
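
One way to see that determinism for yourself is to save the generator's internal state and replay it. Here's a minimal sketch using the module's getstate() and setstate() functions (the variable names are just for illustration):

```python
import random

# Save the generator's current internal state.
state = random.getstate()

# Draw three numbers starting from that state.
first_run = [random.random() for _ in range(3)]

# Rewind the generator to the saved state and draw again.
random.setstate(state)
second_run = [random.random() for _ in range(3)]

# Same state in, same "random" numbers out.
print(first_run == second_run)  # True
```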

Seeding and Reproducibility

Import

When using the random module, remember to import it first... duh

import random

random.seed

This function makes your random numbers reproducible. What does that mean? Without it, each run of your program starts the generator from a different place, so you get different numbers every time. Sometimes though, you want the SAME random numbers each time (reproducibility). If you're running the same code over and over for debugging/development/whatever, you want to verify that you're getting the CORRECT result, say, 42.

This is where .seed comes in. You're planting a seed, so to speak, so that the sequence of random numbers you generate is NOT unique to that particular run.

The code is simple:

random.seed(42)

Now we will get reproducibility in our data. Let's move on to generating actual data.
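
To check that the seed is doing its job, here's a small sketch that seeds the generator twice with the same value and compares the results. The seed 42, the range 1 to 100, and the variable names are arbitrary choices for illustration.

```python
import random

random.seed(42)
first = [random.randint(1, 100) for _ in range(5)]

# Re-seed with the same value to restart the sequence from the beginning.
random.seed(42)
second = [random.randint(1, 100) for _ in range(5)]

print(first == second)  # True
```

The actual numbers aren't shown because they can vary between Python versions. What matters is that the two lists match.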

Generating Data

Let's give common examples of code to get a number or a sequence of numbers or elements.

Generating a Number (Ints or Floats)

random.randint(a, b)

Returns a random integer between a and b, including both endpoints. If I send the arguments (4, 9), it might return 9:

>>> random.randint(4,9)
9
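
If you want to convince yourself that the bounds hold, a rough sketch like this draws a batch of integers and checks that none fall outside 4 and 9. The batch size of 1000 is an arbitrary assumption.

```python
import random

# Draw many integers between 4 and 9.
rolls = [random.randint(4, 9) for _ in range(1000)]

# Every draw stays inside the inclusive bounds.
print(min(rolls) >= 4, max(rolls) <= 9)  # True True

# The distinct values that showed up, all between 4 and 9.
print(sorted(set(rolls)))
```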

random.randrange(start, stop, step)

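It returns a randomly selected integer from range(start, stop, step), so stop itself is never returned. With the arguments (2, 12, 5), the only possible results are 2 and 7.

>>> random.randrange(2, 12, 5)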
7
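
A quick way to see which values randrange can produce is to list the matching range. This is only a sketch, and the 100 draws are an arbitrary choice.

```python
import random

# randrange(2, 12, 5) picks from the same values as range(2, 12, 5).
candidates = list(range(2, 12, 5))
print(candidates)  # [2, 7]

# Every draw is one of those candidates; 12 is never returned.
draws = [random.randrange(2, 12, 5) for _ in range(100)]
print(set(draws) <= set(candidates))  # True
```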

random.random()

This generates a random float in the range 0.0 <= x < 1.0, so 1.0 itself is never returned.

>>> random.random()
0.11133106816568039
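
Because the result always falls between 0 and 1, one common use is to treat it as a probability. Here's a small sketch, where the 30% threshold and the variable names are made up for illustration:

```python
import random

x = random.random()
print(0.0 <= x < 1.0)  # True

# Comparing against a threshold gives an event that is True
# roughly 30% of the time over many draws.
happened = random.random() < 0.3
print(happened)  # True or False, depending on the draw
```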

random.uniform(a, b)

This generates a random float between the numbers you send it.

>>> random.uniform(3, 6)
5.224651499279499

Note this caveat from the library documentation: "The end-point value b may or may not be included in the range depending on floating-point rounding in the equation a + (b-a) * random()."
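
To make that equation concrete, the sketch below computes a + (b-a) * random() by hand and compares it with uniform() after re-seeding. It assumes both calls consume exactly one random() draw, which matches the documented equation. The seed and variable names are arbitrary.

```python
import math
import random

a, b = 3, 6

random.seed(42)
from_uniform = random.uniform(a, b)

# Restart the sequence so random() produces the same underlying draw.
random.seed(42)
by_hand = a + (b - a) * random.random()

print(a <= from_uniform <= b)               # True
print(math.isclose(from_uniform, by_hand))  # True
```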

random.choice(seq)

This does exactly what it sounds like: you get a random element from a non-empty sequence that you provide. It's an illusion of course. Life isn't random, but predetermined by time. A sequence could be a list, a tuple, a string, anything you can index. Let's see an example:

>>> # tummy_prob is a sequence of seven numbers representing the
>>> # probability I will have tummy trouble on a given day of the week
>>> tummy_prob = [0.24, 0.35, .01, .05, .81, 0.36, .06]
>>> random.choice(tummy_prob)
0.35

Yikes! I'm staying home today...
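
Lists aren't the only option. As a rough sketch, here is choice() on a tuple and a string, plus what happens with an empty sequence. The example data and variable names are invented for illustration.

```python
import random

weekdays = ("Mon", "Tue", "Wed", "Thu", "Fri")  # a tuple
letters = "abc"                                  # a string is a sequence too

day = random.choice(weekdays)
letter = random.choice(letters)
print(day in weekdays, letter in letters)  # True True

# An empty sequence has nothing to choose from, so choice() raises IndexError.
empty_failed = False
try:
    random.choice([])
except IndexError:
    empty_failed = True
print(empty_failed)  # True
```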

Working with Many Elements

random.shuffle(x)

It randomly shuffles a list that you send it, in place. You send it x, and afterwards x holds the same elements in a different order; the function itself returns None. Let's try a safer example with a deck of 5 cards.

>>> cards = [3, 5, 8, 7, 9]
>>> random.shuffle(cards)
>>> cards
[3, 7, 8, 5, 9]
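
One thing that trips people up is that shuffle() changes the list you pass in and returns None. Here's a sketch of that behavior, along with one way to get a shuffled copy when you want to keep the original order around (the variable names are just for illustration):

```python
import random

cards = [3, 5, 8, 7, 9]
original = cards.copy()  # keep a copy, because shuffle() changes the list itself

result = random.shuffle(cards)
print(result)                             # None
print(sorted(cards) == sorted(original))  # True: same cards, possibly a new order

# To get a shuffled copy and leave the source alone, sample the whole list.
shuffled_copy = random.sample(original, k=len(original))
print(original)  # [3, 5, 8, 7, 9]
```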

random.sample(population, k)

Returns a list of k elements drawn from k distinct positions in the population. Used for random sampling without replacement. You send it a population, and it returns a new list, leaving the population unchanged.

>>> people_heights = [5.25, 6.0, 6.2, 5.75, 5.5, 5.9]
>>> random.sample(people_heights, 2)
[6.0, 5.75]
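
"Without replacement" has two practical consequences, sketched below: no position is picked twice, and you can't ask for more elements than the population holds. The oversized request of 10 is an arbitrary value chosen to trigger the error.

```python
import random

people_heights = [5.25, 6.0, 6.2, 5.75, 5.5, 5.9]

pair = random.sample(people_heights, 2)
# The heights are all different, so the two picks can never match.
print(len(pair), pair[0] != pair[1])  # 2 True

# Asking for more elements than the population holds raises ValueError.
too_many_failed = False
try:
    random.sample(people_heights, 10)
except ValueError:
    too_many_failed = True
print(too_many_failed)  # True
```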

random.choices(population)

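This returns a list of k elements (k defaults to 1) chosen from a population with replacement, so the same element can appear more than once. The population is a single sequence, and an optional weights argument makes some elements more likely than others.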

>>> people_heights = [5.25, 6.0, 6.2, 5.75, 5.5, 5.9]
>>> random.choices(people_heights)
[6.2]

Notice how it returned a list, even for a single element. Pass the keyword argument k to ask for more than one.
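
Here's a short sketch of the k and weights keyword arguments. The weights are deliberately extreme (only the last height has a non-zero weight) so the outcome is easy to predict. They're an assumption for illustration, not a realistic weighting.

```python
import random

people_heights = [5.25, 6.0, 6.2, 5.75, 5.5, 5.9]

# k sets how many elements come back; repeats are possible.
five = random.choices(people_heights, k=5)
print(len(five))  # 5

# weights makes some elements more likely than others.
# Only the last height has a non-zero weight here, so it is always chosen.
always_last = random.choices(people_heights, weights=[0, 0, 0, 0, 0, 1], k=3)
print(always_last)  # [5.9, 5.9, 5.9]
```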

Other Random Notes

If you've made it to the end, you're obviously a dedicated, patient reader who is ever so curious about the...

Number 42

It's from The Hitchhiker's Guide to the Galaxy by Douglas Adams. In the book, the computer Deep Thought, when asked for the "Answer to the Ultimate Question of Life, the Universe, and Everything," responds with 42.

I hope this helps. That's all for now.

SOURCES

Python Library https://docs.python.org/3/library/random.html

Turtle Image https://dev-to-uploads.s3.amazonaws.com/uploads/articles/lhbmlz3pitjm1hv1dv50.jpg
