
Randomness with JavaScript

Kailash Sankar (@ksankar) ・3 min read

A quick guide to writing a random generator in JavaScript.

We'll start with Math.random(), which returns a pseudo-random floating-point number from 0 (inclusive) up to, but not including, 1.

> Math.random();
0.9352792976305455

In a lot of cases what we need is a mix of characters and numbers. To get that, we can use toString().

> Math.random().toString(36);
"0.e8inoxu3leo"

toString(36) converts the generated random number to base 36, which uses the digits 0-9 and the letters a-z.
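As a quick sanity check of what base 36 means, here is a small sketch that lists the single-digit values. The variable names are just for illustration.

```javascript
// Integers 0 to 35 map to the single base-36 digits 0-9 and a-z.
const digits = [];
for (let n = 0; n < 36; n++) {
  digits.push(n.toString(36));
}
const alphabet = digits.join("");

console.log(alphabet); // "0123456789abcdefghijklmnopqrstuvwxyz"
console.log((255).toString(36)); // "73", since 7 * 36 + 3 = 255
console.log(parseInt("73", 36)); // 255, the reverse conversion
```

So a string produced this way can only ever contain lowercase letters and digits.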

Take a substring to get rid of the leading '0.'

> Math.random().toString(36).substring(2);
"8yx36o08vqq"

What about generating long random strings?
We can call the above snippet multiple times and join the results. But what if you need to control the character set? For example:

  • only letters
  • only numbers
  • letters + numbers + symbols

Let's write a function that accepts a charset and a size and generates a random string. The string should be a combination of characters from the charset; Math.random decides which character we pick at each step.

Let's say we need a random string of size 10. We start with "" and, in each iteration, pick a random character from the charset and append it to our string.

Math.random gives us a number from 0 up to (but not including) 1. Multiplying it by 10 (the charset size) and flooring the result gives an index between 0 and 9.

const idx = Math.floor(Math.random() * 10);
// at max 0.99 => 9.9 => 9
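To convince yourself the index never goes out of range, a rough sketch like the one below counts how often each index comes up. The charsetSize of 10 and the 100000 iterations are arbitrary choices for this demo.

```javascript
const charsetSize = 10; // hypothetical charset size, matching the example above
const counts = new Array(charsetSize).fill(0);

for (let i = 0; i < 100000; i++) {
  const idx = Math.floor(Math.random() * charsetSize);
  counts[idx]++;
}

// If idx ever reached 10, the array would grow past 10 entries.
console.log(counts.length);
// Each index should show up roughly 10000 times.
console.log(counts);
```

The counts will differ from run to run, but they should all land in the same ballpark.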

The full function,

function generateId(size, charset) {
  const max = charset.length;
  let rstr = "";

  // append one randomly picked character per iteration
  for (let i = size; i > 0; i--) {
    const idx = Math.floor(Math.random() * max); // index in 0..max-1
    rstr += charset[idx];
  }
  return rstr;
}

> generateId(10,"0123abcxyz-_");
"3x-b-yz1x1"

> generateId(4,"0123456789");
"0973"

Define default parameters to cover common use-cases for ease of use.

// leave out the symbols if you want a URL-friendly string
const _CHARSET =
 "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz-_@%$#*&";
const _SMP_SZ = 10;
function generateId(size = _SMP_SZ, charset = _CHARSET) {
 // rest is same as above
}

> generateId();
"BZXQ3CH9Lr"

> generateId(30);
"8Uk9JN8-tP59m*yKtLCoaUnkP#x_Ak"

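If you find yourself passing the same charsets around, one option is to name them. This is only a sketch: CHARSETS and its keys are made-up names, and generateId is repeated here so the snippet runs on its own.

```javascript
// Hypothetical presets; the names are illustrative only.
const CHARSETS = {
  letters: "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz",
  digits: "0123456789",
};
CHARSETS.alphanumeric = CHARSETS.digits + CHARSETS.letters;
CHARSETS.withSymbols = CHARSETS.alphanumeric + "-_@%$#*&";

// Same logic as generateId, with the presets as defaults.
function generateId(size = 10, charset = CHARSETS.withSymbols) {
  let rstr = "";
  for (let i = 0; i < size; i++) {
    rstr += charset[Math.floor(Math.random() * charset.length)];
  }
  return rstr;
}

console.log(generateId(6, CHARSETS.digits)); // only numbers
console.log(generateId(8, CHARSETS.letters)); // only letters
console.log(generateId(12)); // letters + numbers + symbols
```
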
Randomness does not mean uniqueness

Even though collisions are unlikely with a large charset and size, nothing here checks that the generated output is unique. The easiest way to make sure a generated string is unique is to keep track of the previous outputs.
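To get a feel for how unlikely a collision is, here is a rough estimate based on the birthday-problem approximation. It assumes every id is drawn uniformly and independently, which Math.random only approximates, and collisionProbability is a name made up for this sketch.

```javascript
// Approximate chance that at least two of `count` random ids collide,
// assuming each id is drawn uniformly from charsetSize ** size possibilities.
function collisionProbability(charsetSize, size, count) {
  const space = Math.pow(charsetSize, size); // number of possible ids
  const exponent = (count * (count - 1)) / (2 * space);
  // 1 - e^(-x), computed with expm1 to keep precision when x is tiny
  return -Math.expm1(-exponent);
}

console.log(collisionProbability(3, 3, 25)); // tiny space: close to 1
console.log(collisionProbability(70, 10, 1000)); // large space: close to 0
```

Either way, the estimate is only a guide. Guaranteeing uniqueness still means tracking the previous outputs.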

For that we can write a wrapper function around generateId that keeps the history of outputs in a closure.

Let's declare a hist object in the wrapper function. Every time a random id is generated, we check whether it's already in hist: if not, we add it to hist and return the id; otherwise we try again. We also need a retry limit to avoid infinite loops.

function uniqueIdFactory(retryLimit = 5) {
  // previous ids, kept in the closure; a Set avoids false hits on
  // inherited keys such as "constructor" that `in` finds on a plain object
  const hist = new Set();
  return (size = _SMP_SZ, charset = _CHARSET) => {
    let retryCounter = retryLimit;
    // retry until a non-duplicate id is found
    // break after retryLimit is hit
    while (retryCounter > 0) {
      const r = generateId(size, charset);
      if (hist.has(r)) {
        retryCounter--;
      } else {
        hist.add(r);
        return r;
      }
    }
    // let the caller do the rest
    // change charset or increase size
    return null;
  };
}

Test the function out by giving it a small size and charset and running it in a loop.

const genUniqueId = uniqueIdFactory();


> genUniqueId();
"I4fOEqwj4y"

// you will see null after a few runs
for (let i = 0; i < 25; i++) {
  console.log("->", genUniqueId(3, "abc"));
}

The history is kept in memory only as long as the function returned by uniqueIdFactory is alive. This approach is fine for light usage, but don't use it in scenarios where the hist object could grow too big.

Usage depends on the scenario. If you are looping through 1000 records and want to assign each one a random unique id (other than an index), this would work. But if you only need unique ids occasionally, spread over time, you can also rely on an epoch timestamp plus a short random string.

function epochId() {
  const epochStr = new Date().getTime();
  const randStr = generateId();
  return `${epochStr}-${randStr}`;
}

> epochId();
"1592166792073-kIVGNaPlYQ"

All the above code is available here

For production usage, consider packages like nanoid, shortid or equivalent.
(Updated based on input from the comments; do check out their implementations.)

Further reading,

  • Lot of cool and quirky approaches in this gist
  • For cryptographically secure random values, use Crypto
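As a starting point for the crypto route, here is a minimal sketch for Node.js using randomInt from the built-in crypto module. It is Node-only (in browsers you would reach for crypto.getRandomValues instead), and secureGenerateId is a made-up name that mirrors generateId.

```javascript
// Node.js only: randomInt draws from a cryptographically secure source.
const { randomInt } = require("crypto");

// Hypothetical name; same shape as generateId in the post.
function secureGenerateId(size, charset) {
  let rstr = "";
  for (let i = 0; i < size; i++) {
    // randomInt(max) returns an integer n with 0 <= n < max
    rstr += charset[randomInt(charset.length)];
  }
  return rstr;
}

console.log(secureGenerateId(16, "0123456789abcdef"));
```

This only swaps the source of randomness. It still does not check for uniqueness.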

That's all folks.


Discussion

I actually use Math.random().toString(36).substr(2) quite often, when I don't want to name things.

But when I want to ensure no collisions, I generally use nanoid or uuid/v4. shortid is nice, but it may have more collisions than nanoid and is slower to generate. (Actually, I read that some parts of the code are shared between shortid and nanoid.)

 

Thanks for the input! Updated the post to link nanoid.
Yeah, looks like it's published by the same folks, and shortid has a dependency on nanoid.