I ran across a practical example of algorithm analysis when I wrote a method that generates unique random values inside a provided range. Figured I would share it...
The code in this article was included in my https://github.com/ZacharyPatten/Towel project. Please check it out if you want to see more code like it. :)
Note: all code examples are in C#.
Introduction
When you want to generate unique random numbers, there are two approaches you can take: 1) you can track the remaining possible values of generation or 2) you can track the values that have already been randomly selected. For variables/parameters, let count
be the number of values to generate, minValue
be the inclusive lower bound of generation, and maxValue
be the exclusive upper bound of generation.
Algorithm #1/A
After a bit of googling, option #1 seems to be more common. You often see people throw all the possible values into an array/list and remove a randomly generated index. Here is an implementation of that algorithm:
public static IEnumerable<int> NextUniqueA(this Random random, int count, int minValue, int maxValue)
{
// Algorithm A: Θ(range + count)
int pool = maxValue - minValue;
int[] array = new int[pool];
for (int i = 0, j = minValue; j < maxValue; i++, j++) // Θ(range)
array[i] = j;
for (int i = 0; i < count; i++) // Θ(count)
{
int rollIndex = random.Next(0, pool);
int roll = array[rollIndex];
array[rollIndex] = array[--pool];
yield return roll;
}
}
Algorithm #2/B
Now, here is what option #2 can look like. For this implementation, I use a linked list to store all the values that have already been generated in sorted order:
public static IEnumerable<int> NextUniqueB(this Random random, int count, int minValue, int maxValue)
{
// Algorithm B: O(.5*count^2), Ω(count), ε(.5*count^2)
Node head = null;
for (int i = 0; i < count; i++) // Θ(count)
{
int roll = random.Next(minValue, maxValue - i);
Node node = head;
Node previous = null;
while (!(node is null)) // O(count / 2), Ω(0), ε(count / 2)
{
if (node.Value > roll)
break;
roll++;
previous = node;
node = node.Next;
}
yield return roll;
if (previous is null)
head = new Node() { Value = roll, Next = head, };
else
previous.Next = new Node() { Value = roll, Next = previous.Next };
}
}
internal class Node
{
internal int Value;
internal Node Next;
}
Analyzing Algorithm Complexities
These two implementations have different runtime algorithm complexities. The #1/A option has a runtime complexity of Θ(range + count)
while the #2/B option has a runtime complexity of O(.5*count^2), Ω(count), ε(.5*count^2)
where O
is the worst case, Θ
is every case, Ω
is the best case, and ε
is the expected average between the worst and the best cases. Long story short, when the count
is relatively low you want to use algorithm #2/B. When the count
is relatively high you want to use algorithm #1/A. So if you want to make a faster all-around general purpose method, you need to combine the two approaches. But how do you do that? You need to find the intersection of the two runtime complexities of each method.
If you treat both algorithm complexities as linear functions and do some simple math to find the intesection, you will find that they intersect at about √(maxValue - minValue)
the square root of the range of random generation relative to count
. Now that you know the intersection point, you can combine the two algorithms into a single function that is optimized for more use cases:
Combination of #1/A and #2/B
public static IEnumerable<int> NextUnique(this Random random, int count, int minValue, int maxValue)
{
if (count > Math.Sqrt(maxValue - minValue))
{
// Algorithm A
}
else
{
// Algorithm B
}
}
Benchmarks
Here are some benchmarks so you can compare theory with practical results. Note that for these benchmarks I used a random generation range of 0..10000
. So minValue = 0
and maxValue = 10000
. From the theory, I knew that the intersection of the runtime algorithms is roughly √(maxValue - minValue) = √(100000 - 0) = √(100000) = 100
. So when count
is less than ~100
the #2/B algorithm should be faster, but when count
is greater than ~100
algorithm #1/A should be faster.
Benchmarks Source Code
Note: This code uses the BenchmarkDotNet nuget package. It also uses the Plots exporter to generate the graphs, which requires R be installed.
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System;
using System.Linq;
using System.Collections.Generic;
using BenchmarkDotNet.Exporters;
using BenchmarkDotNet.Configs;
namespace UniqueRandomGeneration
{
class Program
{
static void Main(string[] args) => BenchmarkRunner.Run<Benchmarks>();
}
[RPlotExporter]
[Config(typeof(Config))]
public class Benchmarks
{
public class Config : ManualConfig
{
public Config() => Add(RPlotExporter.Default);
}
const int MinValue = 0;
const int MaxValue = 10000;
[Params(25, 50, 75, 100, 125, 150, 175, 200, 225, 250)]
public int Count;
readonly Random random = new Random();
[Benchmark] public void RandomUniqueA() => random.NextUniqueA(Count, MinValue, MaxValue).ToArray();
[Benchmark] public void RandomUniqueB() => random.NextUniqueB(Count, MinValue, MaxValue).ToArray();
[Benchmark] public void RandomUnique() => random.NextUnique(Count, MinValue, MaxValue).ToArray();
}
public static class Extensions
{
internal class Node
{
internal int Value;
internal Node Next;
}
public static IEnumerable<int> NextUnique(this Random random, int count, int minValue, int maxValue)
{
if (maxValue < minValue)
throw new ArgumentOutOfRangeException(nameof(minValue) + " > " + nameof(minValue));
if (count < 0)
throw new ArgumentOutOfRangeException(nameof(count) + " < 0");
if (maxValue - minValue < count)
throw new ArgumentOutOfRangeException(nameof(count) + " is larger than " + nameof(maxValue) + " - " + nameof(minValue) + ".");
if (count > Math.Sqrt(maxValue - minValue))
{
// Algorithm A: Θ(range + count)
int pool = maxValue - minValue;
int[] array = new int[pool];
for (int i = 0, j = minValue; j < maxValue; i++, j++) // Θ(range)
array[i] = j;
for (int i = 0; i < count; i++) // Θ(count)
{
int rollIndex = random.Next(0, pool);
int roll = array[rollIndex];
array[rollIndex] = array[--pool];
yield return roll;
}
}
else
{
// Algorithm B: O(.5*count^2), Ω(count), ε(.5*count^2)
Node head = null;
for (int i = 0; i < count; i++) // Θ(count)
{
int roll = random.Next(minValue, maxValue - i);
Node node = head;
Node previous = null;
while (!(node is null)) // O(count / 2), Ω(0), ε(count / 2)
{
if (node.Value > roll)
break;
roll++;
previous = node;
node = node.Next;
}
yield return roll;
if (previous is null)
head = new Node() { Value = roll, Next = head, };
else
previous.Next = new Node() { Value = roll, Next = previous.Next };
}
}
}
public static IEnumerable<int> NextUniqueA(this Random random, int count, int minValue, int maxValue)
{
if (maxValue < minValue)
throw new ArgumentOutOfRangeException(nameof(minValue) + " > " + nameof(minValue));
if (count < 0)
throw new ArgumentOutOfRangeException(nameof(count) + " < 0");
if (maxValue - minValue < count)
throw new ArgumentOutOfRangeException(nameof(count) + " is larger than " + nameof(maxValue) + " - " + nameof(minValue) + ".");
// Algorithm A: Θ(range + count)
int pool = maxValue - minValue;
int[] array = new int[pool];
for (int i = 0, j = minValue; j < maxValue; i++, j++) // Θ(range)
array[i] = j;
for (int i = 0; i < count; i++) // Θ(count)
{
int rollIndex = random.Next(0, pool);
int roll = array[rollIndex];
array[rollIndex] = array[--pool];
yield return roll;
}
}
public static IEnumerable<int> NextUniqueB(this Random random, int count, int minValue, int maxValue)
{
if (maxValue < minValue)
throw new ArgumentOutOfRangeException(nameof(minValue) + " > " + nameof(minValue));
if (count < 0)
throw new ArgumentOutOfRangeException(nameof(count) + " < 0");
if (maxValue - minValue < count)
throw new ArgumentOutOfRangeException(nameof(count) + " is larger than " + nameof(maxValue) + " - " + nameof(minValue) + ".");
// Algorithm B: O(.5*count^2), Ω(count), ε(.5*count^2)
Node head = null;
for (int i = 0; i < count; i++) // Θ(count)
{
int roll = random.Next(minValue, maxValue - i);
Node node = head;
Node previous = null;
while (!(node is null)) // O(count / 2), Ω(0), ε(count / 2)
{
if (node.Value > roll)
break;
roll++;
previous = node;
node = node.Next;
}
yield return roll;
if (previous is null)
head = new Node() { Value = roll, Next = head, };
else
previous.Next = new Node() { Value = roll, Next = previous.Next };
}
}
}
}
Benchmarks Results Table
BenchmarkDotNet=v0.12.0, OS=Windows 10.0.18363
Intel Core i7-4790K CPU 4.00GHz (Haswell), 1 CPU, 8 logical and 4 physical cores
.NET Core SDK=3.1.100
[Host] : .NET Core 3.1.0 (CoreCLR 4.700.19.56402, CoreFX 4.700.19.56404), X64 RyuJIT
DefaultJob : .NET Core 3.1.0 (CoreCLR 4.700.19.56402, CoreFX 4.700.19.56404), X64 RyuJIT
Method | Count | Mean | Error | StdDev |
---|---|---|---|---|
RandomUniqueA | 25 | 7.406 us | 0.1576 us | 0.1876 us |
RandomUniqueB | 25 | 1.307 us | 0.0256 us | 0.0414 us |
RandomUnique | 25 | 1.300 us | 0.0046 us | 0.0043 us |
RandomUniqueA | 50 | 7.761 us | 0.0263 us | 0.0246 us |
RandomUniqueB | 50 | 2.893 us | 0.0084 us | 0.0079 us |
RandomUnique | 50 | 3.022 us | 0.0755 us | 0.0707 us |
RandomUniqueA | 75 | 8.387 us | 0.1317 us | 0.1232 us |
RandomUniqueB | 75 | 5.180 us | 0.1070 us | 0.1001 us |
RandomUnique | 75 | 5.298 us | 0.0634 us | 0.0593 us |
RandomUniqueA | 100 | 9.151 us | 0.1800 us | 0.2340 us |
RandomUniqueB | 100 | 8.013 us | 0.0994 us | 0.0930 us |
RandomUnique | 100 | 8.246 us | 0.1205 us | 0.1127 us |
RandomUniqueA | 125 | 9.335 us | 0.1533 us | 0.1434 us |
RandomUniqueB | 125 | 11.291 us | 0.1093 us | 0.1023 us |
RandomUnique | 125 | 9.394 us | 0.1773 us | 0.1658 us |
RandomUniqueA | 150 | 9.509 us | 0.0323 us | 0.0302 us |
RandomUniqueB | 150 | 14.486 us | 0.0653 us | 0.0610 us |
RandomUnique | 150 | 9.355 us | 0.0428 us | 0.0400 us |
RandomUniqueA | 175 | 9.972 us | 0.0319 us | 0.0298 us |
RandomUniqueB | 175 | 18.778 us | 0.0792 us | 0.0741 us |
RandomUnique | 175 | 9.789 us | 0.0593 us | 0.0554 us |
RandomUniqueA | 200 | 10.270 us | 0.0297 us | 0.0278 us |
RandomUniqueB | 200 | 23.384 us | 0.0623 us | 0.0520 us |
RandomUnique | 200 | 10.112 us | 0.0499 us | 0.0466 us |
RandomUniqueA | 225 | 10.724 us | 0.1754 us | 0.1465 us |
RandomUniqueB | 225 | 28.748 us | 0.0872 us | 0.0816 us |
RandomUnique | 225 | 10.612 us | 0.0430 us | 0.0381 us |
RandomUniqueA | 250 | 11.144 us | 0.0518 us | 0.0484 us |
RandomUniqueB | 250 | 34.320 us | 0.1249 us | 0.1107 us |
RandomUnique | 250 | 10.891 us | 0.0826 us | 0.0772 us |
Benchmarks Results Bar Graph
Probability/Validity Testing
Here is some validity testing to hopefully make sure I am not full of shit. In this code, I generate 5
values in the 0..26
range. I do this 1000000
times, and I calculate the mean that each values was randomly selected. All the values should be selected an average of 5 / 26 = 0.1923...
times.
Validity Testing Source Code
using System;
using System.Linq;
using System.Collections.Generic;
namespace ValidityTesting
{
class Program
{
static void Main()
{
Console.WriteLine(nameof(Extensions.NextUnique));
PrintPriorities(CalculateProbabilities(Extensions.NextUnique));
Console.WriteLine();
Console.WriteLine(nameof(Extensions.NextUniqueA));
PrintPriorities(CalculateProbabilities(Extensions.NextUniqueA));
Console.WriteLine();
Console.WriteLine(nameof(Extensions.NextUniqueB));
PrintPriorities(CalculateProbabilities(Extensions.NextUniqueB));
Console.WriteLine();
}
internal static void PrintPriorities(Dictionary<int, double> results)
{
foreach (var keyVal in results.OrderBy(x => x.Key))
Console.WriteLine(keyVal.Key + ": " + keyVal.Value);
}
internal static Dictionary<int, double> CalculateProbabilities(Func<Random, int, int, int, IEnumerable<int>> algorithm)
{
Random random = new Random();
const int size = 5;
const int min = 0;
const int max = 26;
const int iterations = 1000000;
var counts = new Dictionary<int, int>();
for (int i = min; i < max; i++)
counts.Add(i, 0);
for (int i = 0; i < iterations; i++)
foreach (int j in algorithm(random, size, min, max))
counts[j]++;
var results = new Dictionary<int, double>();
foreach (var keyVal in counts)
results.Add(keyVal.Key, keyVal.Value / (double)iterations);
return results;
}
}
public static class Extensions
{
internal class Node
{
internal int Value;
internal Node Next;
}
public static IEnumerable<int> NextUnique(this Random random, int count, int minValue, int maxValue)
{
if (maxValue < minValue)
throw new ArgumentOutOfRangeException(nameof(minValue) + " > " + nameof(minValue));
if (count < 0)
throw new ArgumentOutOfRangeException(nameof(count) + " < 0");
if (maxValue - minValue < count)
throw new ArgumentOutOfRangeException(nameof(count) + " is larger than " + nameof(maxValue) + " - " + nameof(minValue) + ".");
if (count > Math.Sqrt(maxValue - minValue))
{
// Algorithm A: Θ(range + count)
int pool = maxValue - minValue;
int[] array = new int[pool];
for (int i = 0, j = minValue; j < maxValue; i++, j++) // Θ(range)
array[i] = j;
for (int i = 0; i < count; i++) // Θ(count)
{
int rollIndex = random.Next(0, pool);
int roll = array[rollIndex];
array[rollIndex] = array[--pool];
yield return roll;
}
}
else
{
// Algorithm B: O(.5*count^2), Ω(count), ε(.5*count^2)
Node head = null;
for (int i = 0; i < count; i++) // Θ(count)
{
int roll = random.Next(minValue, maxValue - i);
Node node = head;
Node previous = null;
while (!(node is null)) // O(count / 2), Ω(0), ε(count / 2)
{
if (node.Value > roll)
break;
roll++;
previous = node;
node = node.Next;
}
yield return roll;
if (previous is null)
head = new Node() { Value = roll, Next = head, };
else
previous.Next = new Node() { Value = roll, Next = previous.Next };
}
}
}
public static IEnumerable<int> NextUniqueA(this Random random, int count, int minValue, int maxValue)
{
if (maxValue < minValue)
throw new ArgumentOutOfRangeException(nameof(minValue) + " > " + nameof(minValue));
if (count < 0)
throw new ArgumentOutOfRangeException(nameof(count) + " < 0");
if (maxValue - minValue < count)
throw new ArgumentOutOfRangeException(nameof(count) + " is larger than " + nameof(maxValue) + " - " + nameof(minValue) + ".");
// Algorithm A: Θ(range + count)
int pool = maxValue - minValue;
int[] array = new int[pool];
for (int i = 0, j = minValue; j < maxValue; i++, j++) // Θ(range)
array[i] = j;
for (int i = 0; i < count; i++) // Θ(count)
{
int rollIndex = random.Next(0, pool);
int roll = array[rollIndex];
array[rollIndex] = array[--pool];
yield return roll;
}
}
public static IEnumerable<int> NextUniqueB(this Random random, int count, int minValue, int maxValue)
{
if (maxValue < minValue)
throw new ArgumentOutOfRangeException(nameof(minValue) + " > " + nameof(minValue));
if (count < 0)
throw new ArgumentOutOfRangeException(nameof(count) + " < 0");
if (maxValue - minValue < count)
throw new ArgumentOutOfRangeException(nameof(count) + " is larger than " + nameof(maxValue) + " - " + nameof(minValue) + ".");
// Algorithm B: O(.5*count^2), Ω(count), ε(.5*count^2)
Node head = null;
for (int i = 0; i < count; i++) // Θ(count)
{
int roll = random.Next(minValue, maxValue - i);
Node node = head;
Node previous = null;
while (!(node is null)) // O(count / 2), Ω(0), ε(count / 2)
{
if (node.Value > roll)
break;
roll++;
previous = node;
node = node.Next;
}
yield return roll;
if (previous is null)
head = new Node() { Value = roll, Next = head, };
else
previous.Next = new Node() { Value = roll, Next = previous.Next };
}
}
}
}
Validity Testing Console Output
NextUnique
0: 0.192031
1: 0.19208
2: 0.192785
3: 0.192287
4: 0.192593
5: 0.192686
6: 0.191937
7: 0.191998
8: 0.19244
9: 0.192112
10: 0.19212
11: 0.192707
12: 0.192877
13: 0.192086
14: 0.192277
15: 0.191907
16: 0.192366
17: 0.192386
18: 0.192423
19: 0.19229
20: 0.192017
21: 0.191901
22: 0.19285
23: 0.19282
24: 0.191902
25: 0.192122
NextUniqueA
0: 0.192705
1: 0.191956
2: 0.192313
3: 0.192613
4: 0.193418
5: 0.191957
6: 0.192116
7: 0.192217
8: 0.192773
9: 0.191862
10: 0.19233
11: 0.192145
12: 0.192044
13: 0.192749
14: 0.192097
15: 0.19224
16: 0.192526
17: 0.19225
18: 0.191653
19: 0.192546
20: 0.192156
21: 0.191686
22: 0.192908
23: 0.19174
24: 0.192605
25: 0.192395
NextUniqueB
0: 0.192408
1: 0.191776
2: 0.191809
3: 0.192623
4: 0.192399
5: 0.192451
6: 0.192135
7: 0.1931
8: 0.192594
9: 0.19232
10: 0.192281
11: 0.191723
12: 0.192358
13: 0.19216
14: 0.192697
15: 0.192975
16: 0.192357
17: 0.192507
18: 0.191768
19: 0.191874
20: 0.192803
21: 0.191934
22: 0.192075
23: 0.192704
24: 0.192155
25: 0.192014
Top comments (0)