Massimo Artizzu

Posted on Sep 10, 2021

Let's develop a QR Code Generator, part IX: structuring larger versions

#javascript #tutorial #qrcode

The cover image will make sense later, I swear! 😅

In the last part, we came to know how to split our data and error correction codewords for QR Codes of larger versions. But how can we choose the right version for our content?

QR Code capacity

The answer lies in that big table we've seen. Thanks to that, we can get to know how many codewords are reserved for data for a given version and error correction level and, given the encoding mode, compute the maximum length of the content we can write.

Let's have a look at the columns of that table:

(1) Number of error correction codewords per block
(2) Number of blocks in Group 1
(3) Number of data codewords in blocks of Group 1
(4) Number of blocks in Group 2
(5) Number of data codewords in blocks of Group 2

Let's recall that the data codewords of a QR Code are split into blocks, and each block belongs to group 1 or 2 depending on its size. For each data block there's an error correction block.

We also know that:

the value in (5) is just the value in (3) plus 1;
the value in (3) is actually the number of data codewords divided by (2) + (4) (i.e., the total number of blocks), rounded down to the previous integer;
the number of data codewords is the total number of codewords minus the number of error correction codewords;
the number of error correction codewords is (1) multiplied by the number of blocks;
(4) is actually the number of data codewords modulo the number of blocks.

In order to get the total number of codewords, we can use our function getAvailableModules from part 8 and divide the result by 8 (or shift to the right by 3).

In the end, for every version and error level, we just need two values:

the number of error correction codewords per block;
the number of blocks.
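
To make the relations above concrete, here is a minimal sketch that rebuilds all five columns from just those two values plus the total number of codewords. The function name getBlockStructure and the shape of the object it returns are assumptions made for illustration; the total of 196 codewords for version 7 is passed in by hand rather than computed.

```javascript
/**
 * Hypothetical helper (not part of the series' code): derives the five
 * table columns from the total codewords, the EC codewords per block
 * and the number of blocks.
 */
function getBlockStructure(totalCodewords, ecBlockSize, blocks) {
  const dataCodewords = totalCodewords - ecBlockSize * blocks;
  // Column (3): data codewords per block in group 1, rounded down
  const group1BlockSize = Math.floor(dataCodewords / blocks);
  // Column (4): the remainder tells how many blocks get one extra codeword
  const group2Blocks = dataCodewords % blocks;
  return {
    ecBlockSize,                          // (1)
    group1Blocks: blocks - group2Blocks,  // (2)
    group1BlockSize,                      // (3)
    group2Blocks,                         // (4)
    // (5) only makes sense when group 2 is not empty
    group2BlockSize: group2Blocks > 0 ? group1BlockSize + 1 : 0
  };
}

// Version 7, quartile correction: 196 total codewords,
// 18 EC codewords per block, 6 blocks
const structure = getBlockStructure(196, 18, 6);
console.log(structure);
// group1Blocks: 2, group1BlockSize: 14, group2Blocks: 4, group2BlockSize: 15
```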

This should be our table, where each entry holds those two values in that order:

const EC_TABLE = [
  { L: [7, 1],   M: [10, 1],  Q: [13, 1],  H: [17, 1] },
  { L: [10, 1],  M: [16, 1],  Q: [22, 1],  H: [28, 1] },
  { L: [15, 1],  M: [26, 1],  Q: [18, 2],  H: [22, 2] },
  { L: [20, 1],  M: [18, 2],  Q: [26, 2],  H: [16, 4] },
  { L: [26, 1],  M: [24, 2],  Q: [18, 4],  H: [22, 4] },
  { L: [18, 2],  M: [16, 4],  Q: [24, 4],  H: [28, 4] },
  { L: [20, 2],  M: [18, 4],  Q: [18, 6],  H: [26, 5] },
  { L: [24, 2],  M: [22, 4],  Q: [22, 6],  H: [26, 6] },
  { L: [30, 2],  M: [22, 5],  Q: [20, 8],  H: [24, 8] },
  { L: [18, 4],  M: [26, 5],  Q: [24, 8],  H: [28, 8] },
  { L: [20, 4],  M: [30, 5],  Q: [28, 8],  H: [24, 11] },
  { L: [24, 4],  M: [22, 8],  Q: [26, 10], H: [28, 11] },
  { L: [26, 4],  M: [22, 9],  Q: [24, 12], H: [22, 16] },
  { L: [30, 4],  M: [24, 9],  Q: [20, 16], H: [24, 16] },
  { L: [22, 6],  M: [24, 10], Q: [30, 12], H: [24, 18] },
  { L: [24, 6],  M: [28, 10], Q: [24, 17], H: [30, 16] },
  { L: [28, 6],  M: [28, 11], Q: [28, 16], H: [28, 19] },
  { L: [30, 6],  M: [26, 13], Q: [28, 18], H: [28, 21] },
  { L: [28, 7],  M: [26, 14], Q: [26, 21], H: [26, 25] },
  { L: [28, 8],  M: [26, 16], Q: [30, 20], H: [28, 25] },
  { L: [28, 8],  M: [26, 17], Q: [28, 23], H: [30, 25] },
  { L: [28, 9],  M: [28, 17], Q: [30, 23], H: [24, 34] },
  { L: [30, 9],  M: [28, 18], Q: [30, 25], H: [30, 30] },
  { L: [30, 10], M: [28, 20], Q: [30, 27], H: [30, 32] },
  { L: [26, 12], M: [28, 21], Q: [30, 29], H: [30, 35] },
  { L: [28, 12], M: [28, 23], Q: [28, 34], H: [30, 37] },
  { L: [30, 12], M: [28, 25], Q: [30, 34], H: [30, 40] },
  { L: [30, 13], M: [28, 26], Q: [30, 35], H: [30, 42] },
  { L: [30, 14], M: [28, 28], Q: [30, 38], H: [30, 45] },
  { L: [30, 15], M: [28, 29], Q: [30, 40], H: [30, 48] },
  { L: [30, 16], M: [28, 31], Q: [30, 43], H: [30, 51] },
  { L: [30, 17], M: [28, 33], Q: [30, 45], H: [30, 54] },
  { L: [30, 18], M: [28, 35], Q: [30, 48], H: [30, 57] },
  { L: [30, 19], M: [28, 37], Q: [30, 51], H: [30, 60] },
  { L: [30, 19], M: [28, 38], Q: [30, 53], H: [30, 63] },
  { L: [30, 20], M: [28, 40], Q: [30, 56], H: [30, 66] },
  { L: [30, 21], M: [28, 43], Q: [30, 59], H: [30, 70] },
  { L: [30, 22], M: [28, 45], Q: [30, 62], H: [30, 74] },
  { L: [30, 24], M: [28, 47], Q: [30, 65], H: [30, 77] },
  { L: [30, 25], M: [28, 49], Q: [30, 68], H: [30, 81] }
];

Using this table, we can compute the number of codewords reserved for data:

function getDataCodewords(version, errorLevel) {
  const totalCodewords = getAvailableModules(version) >> 3;
  // Each table entry is [EC codewords per block, number of blocks]
  const [ecBlockSize, blocks] = EC_TABLE[version - 1][errorLevel];
  return totalCodewords - ecBlockSize * blocks;
}
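
Since getAvailableModules lives in part 8, here is a hedged, self-contained sketch for checking the numbers. The function countAvailableModules below is a stand-in written for this example, obtained by subtracting the function patterns from the full matrix; it is an assumption and may differ from the actual implementation in part 8. The two values for 7-Q are copied from the table above.

```javascript
/**
 * Stand-in for `getAvailableModules` (assumption: the real function
 * from part 8 may be written differently). Counts the modules left
 * for codewords after removing the fixed patterns.
 */
function countAvailableModules(version) {
  const size = version * 4 + 17;
  // Alignment pattern coordinates per row/column; version 1 has none
  const alignmentCount = version === 1 ? 0 : Math.floor(version / 7) + 2;
  // Three positions of the grid are taken by the finder patterns
  const alignmentPatterns = Math.max(0, alignmentCount ** 2 - 3);
  // Alignment patterns sitting on the two timing patterns
  const timingOverlaps = Math.max(0, alignmentCount - 2) * 2;
  return size ** 2
    - 3 * 8 * 8                       // finder patterns with separators
    - alignmentPatterns * 5 * 5       // alignment patterns
    - 2 * (size - 16)                 // timing patterns
    + timingOverlaps * 5              // timing modules counted twice
    - 2 * 15                          // format information
    - 1                               // dark module
    - (version > 6 ? 2 * 3 * 6 : 0);  // version information
}

const totalCodewords = countAvailableModules(7) >> 3;
console.log(totalCodewords); // 196

// Entry for 7-Q in the table: 18 EC codewords per block, 6 blocks
const [ecBlockSize, blocks] = [18, 6];
const dataCodewords = totalCodewords - ecBlockSize * blocks;
console.log(dataCodewords); // 88
```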

This is not enough, though, because part of the space in these data codewords is reserved for:

the encoding mode block;
the content length.

While the former always takes 4 bits/modules, the latter is variable in bit length, so we'll use the function getLengthBits that we created back in part 2.

In the end we have a certain amount of available bits, but as we've seen in part 7, each encoding mode uses those bits differently.

Let's imagine we have 4 different functions (one for each encoding mode) that, given a certain amount of bits, return the length of the content that can be contained in those bits for a certain encoding mode:

const capacityFnMap = {
  [0b0001]: getNumericCapacity,
  [0b0010]: getAlphanumericCapacity,
  [0b0100]: getByteCapacity,
  [0b1000]: getKanjiCapacity
};

We'll end up with something like this:

function getCapacity(version, errorLevel, encodingMode) {
  const dataCodewords = getDataCodewords(version, errorLevel);
  const lengthBits = getLengthBits(encodingMode, version);
  const availableBits = (dataCodewords << 3) - lengthBits - 4;
  return capacityFnMap[encodingMode](availableBits);
}

Again, this is a pure function that we can memoize, but we can also precompute a table that we can use later.
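
As a rough illustration of the memoization option, the sketch below caches results by joining the arguments into a key. Here memoize is a hypothetical helper, and slowSquare merely stands in for a pure function like getCapacity so that the block runs on its own.

```javascript
/** Hypothetical helper: caches the results of a pure function whose
    arguments are all primitives (numbers and strings) */
function memoize(fn) {
  const cache = new Map();
  return (...args) => {
    const key = args.join('|');
    if (!cache.has(key)) {
      cache.set(key, fn(...args));
    }
    return cache.get(key);
  };
}

// Stand-in for a pure function such as `getCapacity`
let calls = 0;
const slowSquare = value => {
  calls++;
  return value * value;
};

const fastSquare = memoize(slowSquare);
fastSquare(12); // computed
fastSquare(12); // served from the cache
console.log(calls); // 1
```

With the real functions, this would look something like memoize(getCapacity), since the version, the error level and the encoding mode are all primitives.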

Numeric mode capacity

As we've seen in part 7, we can store three digits in 10 bits, two digits in 7 and one digit in 4. So we need to compute the bits modulo 10 and add the digits that fit in the remainder at the end:

function getNumericCapacity(availableBits) {
  const remainderBits = availableBits % 10;
  return Math.floor(availableBits / 10) * 3 +
    (remainderBits > 6 ? 2 : remainderBits > 3 ? 1 : 0);
}

Alphanumeric mode capacity

Similarly to numeric mode, we can store two characters in 11 bits and one in 6:

function getAlphanumericCapacity(availableBits) {
  return Math.floor(availableBits / 11) * 2 +
    (availableBits % 11 > 5 ? 1 : 0);
}

Byte mode capacity

This is easy, as 1 character = 8 bits, flat.

function getByteCapacity(availableBits) {
  return availableBits >> 3;
}

Kanji mode capacity

This is also easy, as each character needs 13 bits:

function getKanjiCapacity(availableBits) {
  return Math.floor(availableBits / 13);
}
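
Going in the opposite direction can be a handy sanity check: the sketch below computes how many bits a content of a given length needs in each mode, using the same figures as the capacity functions. The name bitsNeededFnMap and the functions it holds are assumptions introduced for this example, not part of the series' code.

```javascript
/** Hypothetical inverse of the capacity functions: bits needed to
    encode `length` characters, keyed by encoding mode */
const bitsNeededFnMap = {
  // 10 bits per group of 3 digits, then 4 or 7 bits for 1 or 2 leftovers
  [0b0001]: length => Math.floor(length / 3) * 10 + [0, 4, 7][length % 3],
  // 11 bits per pair of characters, then 6 bits for a leftover one
  [0b0010]: length => (length >> 1) * 11 + (length & 1) * 6,
  // 8 bits per byte
  [0b0100]: length => length << 3,
  // 13 bits per Kanji character
  [0b1000]: length => length * 13
};

console.log(bitsNeededFnMap[0b0001](54)); // 180
console.log(bitsNeededFnMap[0b0010](5));  // 28
console.log(bitsNeededFnMap[0b0100](83)); // 664
console.log(bitsNeededFnMap[0b1000](3));  // 39
```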

The best QR Code

Now we've got everything to know which version we must choose for our content: we aim for the smallest version and highest error correction possible. The only additional complexity may come from the fact that we want a certain minimum error correction level.

For example, if we have a 54-digit number (like the 10th perfect number), we could use a version 2 QR Code with medium error correction (as getCapacity(2, 'M', 0b0001) === 63), but if we want high error correction we have to use version 3 (since getCapacity(3, 'H', 0b0001) === 58).
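
These two figures can be double-checked by hand. The sketch below hardcodes its inputs instead of calling the functions from the earlier parts: 28 and 26 data codewords for 2-M and 3-H (total codewords minus the error correction ones from the table), and a 10-bit length field, which is what numeric mode uses for versions up to 9. The helper name numericCapacityFor is an assumption for this example.

```javascript
/** Hypothetical helper: numeric capacity for versions 1 to 9,
    given the number of data codewords */
function numericCapacityFor(dataCodewords) {
  const MODE_BITS = 4;
  const LENGTH_BITS = 10; // numeric mode, versions 1 to 9
  const availableBits = dataCodewords * 8 - MODE_BITS - LENGTH_BITS;
  const remainderBits = availableBits % 10;
  return Math.floor(availableBits / 10) * 3 +
    (remainderBits > 6 ? 2 : remainderBits > 3 ? 1 : 0);
}

// 2-M: 44 total codewords - 16 EC codewords = 28 data codewords
console.log(numericCapacityFor(28)); // 63 (210 bits: 21 groups of 3 digits)
// 3-H: 70 total codewords - 2 * 22 EC codewords = 26 data codewords
console.log(numericCapacityFor(26)); // 58 (194 bits: 19 groups plus 1 digit)
```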

So we can use something like this:

function getVersionAndErrorLevel(encodingMode, contentLength, minErrorLevel = 'L') {
  // The error levels we're going to consider
  const errorLevels = 'HQML'.slice(0, 'HQML'.indexOf(minErrorLevel) + 1);
  for (let version = 1; version <= 40; version++) {
    // You can iterate over strings in JavaScript 😁
    for (const errorLevel of errorLevels) {
      const capacity = getCapacity(version, errorLevel, encodingMode);
      if (capacity >= contentLength) {
        return [version, errorLevel];
      }
    }
  }
}

If it doesn't return anything (i.e., it returns undefined), it means the content is too long for any version at the requested error correction level.
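
The slice on the first line of the function is worth a closer look, since it decides which error levels are tried and in what order. Here is a small sketch of that step on its own; getCandidateLevels is a hypothetical name, and the explicit check for an unknown level is an addition that the function above doesn't perform (there, an unknown level produces an empty list and therefore no result).

```javascript
/** Hypothetical helper: error levels to try, from strongest to weakest */
function getCandidateLevels(minErrorLevel = 'L') {
  const levels = 'HQML';
  const index = levels.indexOf(minErrorLevel);
  if (index < 0) {
    throw new RangeError(`Unknown error level: ${minErrorLevel}`);
  }
  return levels.slice(0, index + 1);
}

console.log(getCandidateLevels('Q')); // HQ
console.log(getCandidateLevels());    // HQML
```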

Shuffling the codewords!

Let's suppose we have to encode… a snippet of JavaScript code, for a change:

['give you up','let you down','run around and desert you'].map(x=>'Never gonna '+x)

It's 83 bytes long, but we want a QR Code with quartile error correction at minimum. Calling getVersionAndErrorLevel(0b0100, 83, 'Q') returns [7, 'Q'], so we're going to need a version 7 QR Code.

We also know that getDataCodewords(7, 'Q') === 88, and we'll have to split these 88 codewords reserved for data into 2 blocks of 14 codewords (group 1), then another 4 blocks of 15 codewords each (group 2). Using the getData function from the last part, we get:

> getData(snippet, 8, 88)
< Uint8Array(88) [69, 53, 178, 118, 118, 151, 102, 82, 7, 150, 247, 82, 7, 87, 2, 114, 194, 118, 198, 87, 66, 7, 150, 247, 82, 6, 70, 247, 118, 226, 114, 194, 119, 39, 86, 226, 6, 23, 38, 247, 86, 230, 66, 6, 22, 230, 66, 6, 70, 87, 54, 87, 39, 66, 7, 150, 247, 82, 117, 210, 230, 214, 23, 2, 135, 131, 211, 226, 116, 230, 87, 102, 87, 34, 6, 118, 246, 230, 230, 18, 2, 114, 183, 130, 144, 236, 17, 236]

These codewords should be split like this (hex values):

Block	Bytes
G1-B1	`45 35 B2 76 76 97 66 52 07 96 F7 52 07 57`
G1-B2	`02 72 C2 76 C6 57 42 07 96 F7 52 06 46 F7`
G2-B1	`76 E2 72 C2 77 27 56 E2 06 17 26 F7 56 E6 42`
G2-B2	`06 16 E6 42 06 46 57 36 57 27 42 07 96 F7 52`
G2-B3	`75 D2 E6 D6 17 02 87 83 D3 E2 74 E6 57 66 57`
G2-B4	`22 06 76 F6 E6 E6 12 02 72 B7 82 90 EC 11 EC`

Now, instead of placing them one after the other, we take the first codewords from each block (first from group 1, then group 2), then the second codewords, and so on, until the 14th codewords, which are followed by the 15th codewords of the blocks of group 2 (the only ones that have a 15th codeword). In short, we need to interleave the blocks. In the end, we'll end up with this sequence:

45 02 76 06 75 22 35 72 ... 57 EC 57 F7 E6 F7 66 11 42 52 57 EC
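
A direct, if naive, way to check this sequence is to interleave the six blocks from the table exactly as described. The names parseHex, toHex and interleaveBlocks in this sketch are assumptions made for the example.

```javascript
/** Hypothetical helper: turns a string of hex bytes into an array of numbers */
const parseHex = hex => hex.split(' ').map(byte => parseInt(byte, 16));

/** Hypothetical helper: formats a codeword as two uppercase hex digits */
const toHex = codeword => codeword.toString(16).toUpperCase().padStart(2, '0');

/** Hypothetical helper: takes one codeword from each block in turn,
    skipping the blocks that have run out of codewords */
function interleaveBlocks(blocks) {
  const longest = Math.max(...blocks.map(block => block.length));
  const result = [];
  for (let offset = 0; offset < longest; offset++) {
    for (const block of blocks) {
      if (offset < block.length) {
        result.push(block[offset]);
      }
    }
  }
  return result;
}

// The six data blocks from the table above
const dataBlocks = [
  '45 35 B2 76 76 97 66 52 07 96 F7 52 07 57',
  '02 72 C2 76 C6 57 42 07 96 F7 52 06 46 F7',
  '76 E2 72 C2 77 27 56 E2 06 17 26 F7 56 E6 42',
  '06 16 E6 42 06 46 57 36 57 27 42 07 96 F7 52',
  '75 D2 E6 D6 17 02 87 83 D3 E2 74 E6 57 66 57',
  '22 06 76 F6 E6 E6 12 02 72 B7 82 90 EC 11 EC'
].map(parseHex);

const interleaved = interleaveBlocks(dataBlocks).map(toHex);
console.log(interleaved.slice(0, 8).join(' '));
// 45 02 76 06 75 22 35 72
console.log(interleaved.slice(-12).join(' '));
// 57 EC 57 F7 E6 F7 66 11 42 52 57 EC
```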

In code

We can either modify getData, or keep it as it is, but then we'll need another helper function to reorder the codewords we got. This function should take:

the codewords returned from getData;
the number of blocks we should use to split the data.

Something like this:

function reorderData(data, blocks) {
  /** Codewords in data blocks (in group 1) */
  const blockSize = Math.floor(data.length / blocks);
  /** Blocks in group 1 */
  const group1 = blocks - data.length % blocks;
  /** Starting index of each block inside `data` */
  const blockStartIndexes = Array.from(
    { length: blocks },
    (_, index) => index < group1
      ? blockSize * index
      : (blockSize + 1) * index - group1
  );
  return Uint8Array.from(data, (_, index) => {
    /** Index of the codeword inside the block */
    const blockOffset = Math.floor(index / blocks);
    /** Index of the block to take the codeword from
      If we're at the end (`blockOffset === blockSize`), then we take
      only from the blocks of group 2 */
    const blockIndex = (index % blocks)
      + (blockOffset === blockSize ? group1 : 0);
    /** Index of the codeword inside `data` */
    const codewordIndex = blockStartIndexes[blockIndex] + blockOffset;
    return data[codewordIndex];
  });
}

This function is supposed to be used like this:

const rawData = getData(snippet, 8, 88);
const orderedData = reorderData(rawData, 6);
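
To see where each block starts and ends, it can help to look at the split on its own, before any reordering. The following sketch is not used by reorderData; splitIntoBlocks is a hypothetical helper and the input is just a dummy array of 88 codewords.

```javascript
/** Hypothetical helper: splits the data codewords into blocks,
    the shorter ones (group 1) first */
function splitIntoBlocks(data, blocks) {
  const blockSize = Math.floor(data.length / blocks);
  const group1 = blocks - data.length % blocks;
  const result = [];
  let start = 0;
  for (let index = 0; index < blocks; index++) {
    // Blocks in group 2 hold one more codeword
    const end = start + blockSize + (index < group1 ? 0 : 1);
    result.push(data.subarray(start, end));
    start = end;
  }
  return result;
}

// Dummy data: 88 codewords, as in a 7-Q QR Code
const dummyData = new Uint8Array(88);
const split = splitIntoBlocks(dummyData, 6);
console.log(split.map(block => block.length).join(', '));
// 14, 14, 15, 15, 15, 15
console.log(split.map(block => block.byteOffset).join(', '));
// 0, 14, 28, 43, 58, 73
```

The starting offsets match the values that blockStartIndexes holds for the same input.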

Error correction

The error correction part is similar to the data part, in that the error correction codewords are also split into blocks. It's just a little easier because all the error correction blocks have the same size.

So, for a 7-Q QR Code, the table above says we have 18 codewords for each error correction block. Each of these blocks is computed from the respective data block. So, the first error correction block is made up of the error correction codewords for the first data block of group 1. Basically, it's this:

const rawData = getData(snippet, 8, 88);
const firstBlock = rawData.subarray(0, 14);
// => 69 53 178 118 118 151 102 82 7 150 247 82 7 87
const firstECBlock = getEDC(firstBlock, 14 + 18);
// => 63 102 26 192 65 106 117 90 107 88 138 42 103 127 227 86 189 1

The final part consists in interleaving the error correction blocks, and we're done.

In code

Given the instructions above, we can come up with the following helper function that wraps and replaces the old getEDC:

function getECData(data, blocks, ecBlockSize) {
  /** Codewords in data blocks (in group 1) */
  const dataBlockSize = Math.floor(data.length / blocks);
  /** Blocks in group 1 */
  const group1 = blocks - data.length % blocks;
  const ecData = new Uint8Array(ecBlockSize * blocks);
  for (let offset = 0; offset < blocks; offset++) {
    const start = offset < group1
      ? dataBlockSize * offset
      : (dataBlockSize + 1) * offset - group1;
    const end = start + dataBlockSize + (offset < group1 ? 0 : 1);
    const dataBlock = data.subarray(start, end);
    const ecCodewords = getEDC(dataBlock, dataBlock.length + ecBlockSize);
    // Interleaving the EC codewords: we place one every `blocks`
    ecCodewords.forEach((codeword, index) => {
      ecData[index * blocks + offset] = codeword;
    });
  }
  return ecData;
}

For our example, we should get the following result:

const rawData = getData(snippet, 8, 88);
const ecData = getECData(rawData, 6, 18);
// => 63 55 231 201 50 250 102 104 ... 7 15 1 181 202 64 199 23

for a total of 6*18 = 108 error correction codewords.

Wrapping everything up

So we have all that we need for data and error correction:

function getCodewords(content, minErrorLevel = 'L') {
  const encodingMode = getEncodingMode(content);
  const [version, errorLevel] = getVersionAndErrorLevel(
    encodingMode,
    content.length,
    minErrorLevel
  );
  const lengthBits = getLengthBits(encodingMode, version);

  const dataCodewords = getDataCodewords(version, errorLevel);
  const [ecBlockSize, blocks] = EC_TABLE[version - 1][errorLevel];

  const rawData = getData(content, lengthBits, dataCodewords);
  const data = reorderData(rawData, blocks);
  const ecData = getECData(rawData, blocks, ecBlockSize);

  const codewords = new Uint8Array(data.length + ecData.length);
  codewords.set(data, 0);
  codewords.set(ecData, data.length);

  return {
    codewords,
    version,
    errorLevel,
    encodingMode
  };
}

The function above should return the codewords - both data and error correction - ready to be placed in our QR Code! 🙌

And we're… not done?

Unfortunately there's still a small step to do, and we're going to see it in the next part. We have to fix the functions that return the sequence of module placements in the matrix and that actually place the modules, then also add the format information areas.

See you then and happy coding! 👋🎉
