Babak

Posted on Aug 21, 2019

An algorithm for picking random numbers in a range without repetition

#algorithms #python #javascript #shuffle

Introduction

Picking random numbers in a range without repetition is a common task in many scenarios from cryptography to games. There are mathematical means for achieving this, for example, pseudo-random number algorithms like linear congruential generators (LCGs). These methods have their pros and cons. Some of the pros include that they can be fast and operate in constant O(1) space complexity. Common cons include that they may not generate all permutations possible and can be tricky to setup.

This article explores the problem of picking random numbers, covering all possible permutations, with O(M) space and time complexity, where M is the desired number of generated numbers, given by 0 <= M <= N and where N is the length of the range. Let's explore how we might do this:

A naïve approach might go like this:

Pick an index at random.
Check to see if we've picked that number before in a set S
If we have picked the number before, go to the first step
Otherwise, use the picked number and record it so we don't pick it again.

The problem with this approach is step 3--having to pick another number. This will slow down our algorithm, with more repetitions at steps 3 as the number of item we pick reaches the length of the array.

One solution to this is to just apply the Fisher-Yates algorithms for just the number of item we want, then return shuffled slice. The problem here is that if we want to pick M numbers within the range 0 to N without mutating the original array, we have to make a copy of the entire array and apply Fisher-Yates for just M items.

The Algorithm

Something we can do is use a hash map to track swipes that would otherwise occur in-place in Fisher-Yates. This idea is similar to using a Sparse Matrix, recording only the changes made to our array. Furthermore, we will keep that information around only for as long as we need. Let's see how this might work.

Let's suppose we have an array with 5 integers.

1 2 3 4 *5

Starting at the end, lets choose a number from 1 to 5. Let suppose we pick 3. At this point we would create a table like so

MAPPINGS
3 => 5

Then we would return 3

Next, we move to 4, choose from 1 to 4. Let's say 2. Record the mapping and return 2.

1 2 3 *4 5

MAPPINGS
3 => 5
2 => 4

return values: 3, 2

Now we are at 3. Let's suppose now we pick 3. We see 3 maps to 5. So we return 5. We delete 3, because we are done with it.

1 2 *3 4 5

MAPPINGS
2 => 4

return values: 3, 2, 5

Ok, this one will be tricky. For index 2 we'll pick 1. We could write 1 => 2, but 2 already appears as a mapping, so we will resolve that and set 1 => 4. Then we return 1. Here is how our state now appears:

1 *2 3 4 5

MAPPINGS
1 => 4

return values: 3, 2, 5, 1

Finally, for index 1, we resolve to 4. Our array looks like this:

3 2 5 1 4

The Code

Let's how the code for this might look like. Here it is in plain JavaScript


// Helpers
function randInt( num ) {
  return Math.floor( Math.random() * ( num + 1 ) )
}
function getMapOrElse( map, key, fallback ){ 
  return map.has( key ) ? map.get( key ) : fallback
}
let getMapKeyOrValue = ( map, key ) => getMapOrElse( map, key, key )

// Virtual Shuffle

export function* virtualShuffle( maxLength, quantity, randRange = randInt  ) {
  // set up hash map to record swaps
  let swapMap = new Map()

  // bind swapMap to helper function
  let swapMapValueOrKey = getMapKeyOrValue.bind( null , swapMap )

  // boundaries for quantity
  if (quantity > maxLength) quantity = maxLength
  if (quantity < 0) quantity = 0

  // define start and end points
  let start = maxLength - 1
  let end = start - quantity

  for( let pointerIndex = start; pointerIndex >= end ; pointerIndex--) {
    // select a random index up to the current pointer
    let randomIndex = randRange( pointerIndex )

    // yield the current randomIndex unless we have a
    // swap recorded. If so, use the recorded swap 
    yield swapMapValueOrKey( randomIndex )

    // set the swap map at the selected random index
    // unless there is a recorded swap to use instead
    swapMap.set( randomIndex, swapMapValueOrKey( pointerIndex ) )

    // we will no longer need the recorded swap at the
    // current pointer
    if( swapMap.get( pointerIndex )) {
      swapMap.delete( pointerIndex )
    } 

  }
}

For good measure, here is a Python version as well. The Python language is well suited for expressing algorithms, often reading like pseudo-code.

import random

# Helpers
def dict_value_or_key( dict, key ): 
  return dict[ key ] if key in dict else key

# Virtual Shuffle
def virtual_shuffle( max_length, quantity, seed = None ):
  # hash map to keeo track of swapped indexes
  swap_map = {}

  # optional seeding of random generator
  if( seed != None ): random.seed( seed )

  # quantity is bound to range
  if quantity > max_length: quantity = max_length
  if quantity < 0: quantity = 0

  # define range start/end points
  start = max_length - 1
  end = start - quantity

  # main algorithm
  for pointer_index in range( start, end, -1 ):
    random_index = random.randint( 0, pointer_index )

    # yield random_index unless swap_map has a swap value
    # in the that case, use the recorded swap value
    yield dict_value_or_key( swap_map, random_index )

    # set swap_map for the randomly selected index to
    # the current pointer_index unless swap_map has
    # a record swap, in that case, use the swap value
    swap_map[ random_index ] = dict_value_or_key( swap_map, pointer_index )

    # we will no longer need recorded swaps recorded
    # for the current index
    if pointer_index in swap_map:
      del swap_map[ pointer_index ]

Like Fisher-Yates, we pick a random number up to pointer_index where pointer_index starts at maximum value in our range and decrements down by one each iteration. For each cycle we do three key things:

yield either the random index we choose or the swap value if we find a mapping for the random index in our dictionary
record a mapping for the randomly selected index to the current pointer
delete a stored mappings for the current pointer, since all future iterations will be less than this value

A helper that really simplifies the code is dict_value_or_key. This is a function that looks up a key in a hash table / dictionary, and returns the value if the key is found; otherwise, it returns the key.

This method is capable of selecting the desired amount of numbers from all permutations (N!).

Conclusion

By using a hash map, we are able to simulate the Fisher-Yates algorithm, picking M elements in a range 0 to N with O(M) space and time complexity.

Top comments (6)

dllmusic • May 29 '20

This works...

//Array to get non repetitive random elements from
var colors = ["Blue", "Red", "Green" , "Yellow", "Orange", "Gray"];
var colorsindx = []; //Indexes (to shuffle)
var randomcolors = []; //Shuffled Array

//init indexes (or push colors.length if adding elements to colors)
for (i=0; i < colors.length; i++) colorsindx.push(i);

//get unique random element and add it to shuffled Array
for (i=0;i < colors.length; i++) {
var rr = Math.floor(Math.random() * (colors.length - i));
var pickedcolor = colors[colorsindx[rr]]; //unique random element
colorsindx.splice(rr,1); //remove picked item index
randomcolors.push(pickedcolor); //push to Shuffled Array
}

Jonathan • Apr 4 '20

paste.ofcode.org/6ZRiMsnxwnbK6UnAq...

this somehow doesnt work, ignore the range for now

Babak • Apr 4 '20

Hey! One could just wrap the existing function to create the effect. max_length is the difference between high and low. Then add the low value to the output.

Jonathan • Apr 5 '20

I was rather looking for a function that would return x random unique samples from an arbitrary range than a shuffle function and stumbled on your post.
I really like the method but sadly couldnt get it quite to work.
I was hoping to find something that would use pure arithmetical operations to pick a unique sample from rand range.
There has to be a way.

Jonathan • Apr 4 '20

What if on 5 we roll 3 -> return 3.
Then on 4 we roll 3 -> return 5, delete map.
Then on 3 we roll 3 -> return?, no more map, index to swap with is also 3.

Babak • Apr 4 '20

Test these assumptions :-)