Some years ago, a customer requested in my job to generate a million unique codes for a promotion. Each code had to be a sixteen digits alphanumeric value following somes rules like having a number of digits and so on.
It was not a difficult task, but I was impressed when the customer told another company would need 24 hours of processing to generate the codes. 24 hours??? really?
May be the codes were being generated using this algorithm:
while count < 1000000
create a code
check if code exists by check one by one on a database
if not exists, store in the data base
count++
end
In that time I could not know how was it done, but seemed an O^2 algorithm.
While contemplating how to improve the algorithm, an idea occurred to me: what about using a hash? Each code could be stored as a key, and accessing a key in a hash is an O(1) operation (really it depends on handling collisions, which data structure is used, etc).
This idea seemed promising, so I decided to give it a try and try by writing and implementing this algorithm in python:
values = {}
while count < 1000000
create a code
if values[code] not exists
values[code] = true
count++
end
end
It took around one minute to generate the million codes, and taking into account it was executed in an old Dell XPS laptop, was not bad at all.
Once codes were generated, a coworker did a test to ensure unicity: he loaded the codes into a database and executed a query like SELECT DISTINCT(code) FROM values
It just worked: there were a million of different codes.
He looked amazed by the results.
Obviously, there are more details to take into consideration: how to generate an unique code (generate random string libraries), code format, etc, but worked really nice.
As mentioned earlier, I did it some years ago, but my curiosity lead me to try again using different languajes, like Ruby and Rust, and check the execution time for them.
These are the results using a Ryzen 5 3600 CPU running in Debian:
Ruby
require 'securerandom'
h = {}
count = 0
while count < 1000000
random_string = SecureRandom.hex
if !h[random_string]
h[random_string] = 1
count += 1
end
end
doc@Mimzy:~/tmp/keys $ time ruby keys.rb
real 0m1,526s
user 0m1,188s
sys 0m0,336s
Rust
use std::collections::HashMap;
use rand::distributions::{Alphanumeric, DistString};
fn main() {
let mut map = HashMap::new();
let mut count = 0;
while count < 1_000_000 {
let s = Alphanumeric.sample_string(&mut rand::thread_rng(), 16);
if !map.contains_key(&s.clone()) {
map.insert(s.clone(), 1);
count += 1;
}
}
}
doc@Mimzy:~/tmp/keys $ time ./target/release/keys
real 0m0,461s
user 0m0,449s
sys 0m0,012s
As you can see, there is a great improvement, but take into consideration that Rust code is compiled and optimized, while Ruby code is interpreted.
In conclusion: explore new and different ways to solve problems, employing creative thinking in the process.
Top comments (0)