Skip to content
loading...
re: Hi James, thanks for your article. Can you explain why you need to check if the generated random index is in the output's indexes in your Fisher Y...
 

Good question Amin!

Imagine the following code:

$input = [1, 3, 4, 6, 2, 6];

print_r(
  fisherYates($input) // Example output: [1, 4, 3, 2, 6, 6]
);

There is a chance that we get a notice which starts off like this PHP Notice: Undefined offset: 6.... This becomes understandable when we go through what is happening on each line of the fisherYates implementation.

Let's look at our example code from above in action assuming we didn't have the array_key_exists conditional with a focus on the first loop iteration:

function fisherYates([1, 3, 4, 6, 2, 6]) {
  $output = [1, 3, 4, 6, 2, 6];

  for($index = 5; 5 >= 0; $index--) { 
    $randomIndex = rand(0, 6); // Let's say 6 comes back

    $tmp = $output[5]; 
    $output[5] = $output[6]; // Notice triggered here!
    $output[6] = $tmp;
  }

  return $output;
}

Clearly in this example 6 is not a valid offset in the array since arrays are indexed from 0 and thus the array in this example can only go up to offset 5. We still need the $index + 1 to be passed to the rand() function call though to increase the chances and possibilities for a swap, especially at lower indexes.

To resolve this we add the check for array_key_exists for the minority of cases where the rand() call returns an invalid offset and move to allow iteration to continue without a notice being triggered.

From my testing this case with the notice appearing happens around once in every 4 fisherYates calls if the array_key_exists conditional isn't part of the implementation but ofcourse it's hard to accurately averange since it depends on the output of the rand() call which is inherintly random. Either way with the implementation I wrote it will never happen anyway but hopefully you understand why it is there now.

Hopefully that makes sense and was helpful to you!

 

I see, indeed that makes totally sense, thanks for the clarification.

I still don't understand why we need to increase the index to increase our chances but there must be something I'm missing in the Fisher Yates' algorithm.

Don't worry, that's a simple one too!

Imagine the following:

$index = 1;
$randomIndex = rand(0, $index);

In this case only index 0 or 1 would be valid to come back but if we do $index + 1 instead then the possibile indexes for swapping are 0, 1 and 2.

This way we have more possible indexes that can be used to swap with the current item and more than that it also helps us to make the output a little more random during the shuffle since more indexes can be made available for the swap during each iteration.

Anytime, glad to of been able to clear that up for you 😊. Thanks for your comments and for stopping by to read the article! 📚

code of conduct - report abuse