Now comes the hard part.

Most of the Math in QR codes in performed in the Galois Field of order 2^{8} = 256. In this set, denoted as GF(256):

- includes the numbers from 0 to 255;
- has an "addition" operation, which is actually the binary XOR and not the usual sum (so the "sum" of two elements will still be part of GF(256));
- has a "multiplication" operation, which is
*similar*to the usual arithmetic multiplication but with some differences so that multiplying two elements will still give us an element of GF(256) (the neutral element is still 1).

The algorithm chosen for EDC in QR codes is the Reed-Solomon error correction, which is widely used for streaming data (e.g. CDs, wireless communications) because it allows to correct errors found in *bursts*, rather than single isolated cases. I won't go into details, but we're stuck with this kind of odd arithmetic.

## Operations on GF(256)

The "addition" (XOR'ing) is quite simple. The neutral element with relation to XOR is still 0, as *a* ^ 0 = *a*. Also every element is the *opposite of itself*, since *a* ^ *a* = 0.

And since "subtraction" is defined as adding the opposite of the second term, this also means that the "subtraction" is equivalent of the "addition"! In fact: *a* - *b* = *a* ^ (-*b*) = *a* ^ *b*.

Now, about the multiplication. A Galois Field is *cyclic*, meaning that every non-zero element can be expressed as the power of a "primitive element" *α*. So, in GF(256), if *a* = *α*^{n} and *b* = *α*^{m}, then *a* ⋅ *b* = *α*^{n} ⋅ *α*^{m} = *α*^{n + m}.

But, as we said, a Galois Field is cyclic, so *α*^{256} = *α*. This means that we can take the exponent *n* + *m* *modulo* 255, so we can simplify our computations a bit. In the end, *a* ⋅ *b* = *α*^{(n + m) % 255} (if both *a* and *b* are non-zero; the result is of course 0 otherwise).

This also means that for every *a*, *a*^{256} = *a*, and then *a*^{255} = 1, therefore *a*^{254} = *a*^{-1}, i.e. is the *inverse* of *a*. So now we have a way to do divisions: *a* / *b* = *α*^{n} / *α*^{m} = *α*^{n}(*α*^{m})^{254} = *α*^{(n + m * 254) % 255}.

## Operations in code

XOR'ing is no sweat for JavaScript or any other capable language, but multiplication is another story. The easiest thing to do is creating logarithmic and exponential tables, so it'll be easy to convert a number from and to its exponential notation.

But how do we find *α*? It's not so hard, as there are *φ*(255) = 192 primitive elements in GF(256), where *φ* is Euler's totient function. For the sake of simplicity, we can take *α* = 2.

Since we're dealing with values all below 256, we can use JavaScript's `Uint8Array`

s, but if you wish you can use just regular arrays:

```
const LOG = new Uint8Array(256);
const EXP = new Uint8Array(256);
for (let exponent = 1, value = 1; exponent < 256; exponent++) {
value = value > 127 ? ((value << 1) ^ 285) : value << 1;
LOG[value] = exponent % 255;
EXP[exponent % 255] = value;
}
```

We just start at 1, then double `value`

at each iteration (shifting by 1 to the left). If `value`

goes over 255, we XOR it with 285. Why 285? I won't go into details (if you're curious, you can find them here), as it has something to do with the relation between elements of a Galois field and polynomials, but rest assured that we'll get all 255 non-zero elements like this.

In the end we'll have:

```
> LOG
< Uint8Array(256) [0, 0, 1, 25, 2, 50, 26, 198, 3, 223, 51, 238, ...]
> EXP
< Uint8Array(256) [1, 2, 4, 8, 16, 32, 64, 128, 29, 58, 116, 232, ...]
```

Now we can implement the functions for multiplication and division:

```
function mul(a, b) {
return a && b ? EXP[(LOG[a] + LOG[b]) % 255] : 0;
}
function div(a, b) {
return EXP[(LOG[a] + LOG[b] * 254) % 255];
}
```

But how will that serve us for error correction? Let's see...

## Polynomials in GF(256)

Yes, the Reed-Solomon algorithm uses polynomials! You've probably seen them since high school, and have this form:

*a*_{n}*x*^{n} + *a*_{n - 1}*x*^{n - 1} + ... + *a*_{1}*x* + *a*_{0}

where *a*_{0}, ..., *a*_{n} are the *coefficients*, while *x* is the *variable*. You've probably seen (and solved, in the form of equations) them in the field of *real numbers*, with either real or complex solutions.

But coefficients, exponents and variables could be meant to be in any other field (ring would be enough, actually), even GF(256), inheriting its operations too. So, the "addition" is GF(256)'s addition, i.e. XOR, while multiplication is the one seen above. Exponentiation is just repeated multiplication by itself, as usual.

The good news here is that, as long as our concern is just generation, we do *not* need to solve any equation!

### Polynomial multiplication

Addition is a *commutative* operation, meaning that *a* + *b* = *b* + *a*. It is in GF(256) too, because *a* ^ *b* = *b* ^ *a*. And multiplication is too, but it's also *distributive* over the addition, meaning that *a*(*b* + *c*) = *ab* + *ac*. And this holds in GF(256) too.

This basically means that we can multiply *polynomials* between them like we used to do with polynomials on real numbers. Suppose we have

*p*_{1}(*x*) = *a*_{n}*x*^{n} + *a*_{n - 1}*x*^{n - 1} + ... + *a*_{1}*x* + *a*_{0}

*p*_{2}(*x*) = *b*_{m}*x*^{m} + *b*_{m - 1}*x*^{m - 1} + ... + *b*_{1}*x* + *b*_{0}

Take the first term of *p*_{1}(*x*), i.e. *a*_{n}*x*^{n}, then multiply it with all the terms of *p*_{2}(*x*):

*a*_{n}*x*^{n}⋅*p*_{2}(*x*) = *a*_{n}*b*_{m}*x*^{n + m} + *a*_{n}*b*_{m - 1}*x*^{n + m - 1} + … + *a*_{n}*b*_{1}*x*^{n + 1} + *a*_{n}*b*_{0}*x*^{n}

Then do the same with the second term of *p*_{1}(*x*), then the third, and so on. Finally, sum them all together. If this makes your head spin, let's start with and example: *x*^{2} + 3*x* + 2 and 2*x*^{2} + *x* + 7. As we've said above, we have to do the following:

(*x*^{2} + 3*x* + 2)(2*x*^{2} + *x* + 7)

= *x*^{2}(2*x*^{2} + *x* + 7) + 3*x*(2*x*^{2} + *x* + 7) + 2(2*x*^{2} + *x* + 7)

= 2*x*^{4} + *x*^{3} + 7*x*^{2} + 6*x*^{3} + 3*x*^{2} + 21*x* + 4*x*^{2} + 2*x* + 14

= 2*x*^{4} + (6 + 1)*x*^{3} + (7 + 3 + 4)*x*^{2} + (21 + 2)*x* + 14

= 2*x*^{4} + 7*x*^{3} + 14*x*^{2} + 23*x* + 14

We end up with a polynomial with 5 terms, which is the sum of the amount of terms of both polynomials, minus 1.

#### In code

We can represent a polynomial with the array of its coefficients, so that *x*^{2} + 3*x* + 2 could be translated to `[1, 3, 2]`

. Again, since the coefficients can't go over 255, we can use `Uint8Array`

to optimize performances.

Of course all the operations are meant to be done in GF(256), so we're using XOR for addition and the `mul`

function defined above.

Please read the comments in the code snippet below carefully 😁

```
function polyMul(poly1, poly2) {
// This is going to be the product polynomial, that we pre-allocate.
// We know it's going to be `poly1.length + poly2.length - 1` long.
const coeffs = new Uint8Array(poly1.length + poly2.length - 1);
// Instead of executing all the steps in the example, we can jump to
// computing the coefficients of the result
for (let index = 0; index < coeffs.length; index++) {
let coeff = 0;
for (let p1index = 0; p1index <= index; p1index++) {
const p2index = index - p1index;
// We *should* do better here, as `p1index` and `p2index` could
// be out of range, but `mul` defined above will handle that case.
// Just beware of that when implementing in other languages.
coeff ^= mul(poly1[p1index], poly2[p2index]);
}
coeffs[index] = coeff;
}
return coeffs;
}
```

### Polynomial divisions

Ooooh boy. Remember long divisions in high school? Same thing here. (Except we'll just need the *rest* of the division, not the quotient, but let's save that for later.)

Let't take a *dividend* polynomial 4*x*^{3} + 4*x*^{2} + 7*x* + 5, and a *divisor* polynomial 2*x* + 1. Basically these are the steps:

- divide the
*first term*of the dividend polynomial (4*x*^{3}) with the*first term*of the divisor (2*x*, and get 2*x*^{2}); - multiply the divisor polynomial by the above quotient (you'll get 4
*x*^{3}+ 2*x*^{2}); - get the rest by subtracting the result from the dividend (you'll get 2
*x*^{2}+ 7*x*+ 5); - if the degree of the rest is lower than the degree of the divisor, you're done; otherwise, the rest becomes your new dividend and you go back to step 1.

For the division above (in the field of real numbers), you'll get a polynomial quotient of 2*x*^{2} + *x* + 3, and a rest of 2. Now let's do this in JavaScript, and in GF(256).

#### In code

The quotient polynomial will always be long the difference in length of the dividend and the divisor, plus one.

But it turns out that we don't *need* the quotient for the Reed-Solomon error correction algorithm, just the rest. So we're defining a function that returns only the rest of the division. The size of the quotient is needed just to count the steps to do.

The code below *should* be self-explanatory given the example above (it really just does the steps above), but if it's not feel free to ask in the comments:

```
function polyRest(dividend, divisor) {
const quotientLength = dividend.length - divisor.length + 1;
// Let's just say that the dividend is the rest right away
let rest = new Uint8Array(dividend);
for (let count = 0; count < quotientLength; count++) {
// If the first term is 0, we can just skip this iteration
if (rest[0]) {
const factor = div(rest[0], divisor[0]);
const subtr = new Uint8Array(rest.length);
subtr.set(polyMul(divisor, [factor]), 0);
rest = rest.map((value, index) => value ^ subtr[index]).slice(1);
} else {
rest = rest.slice(1);
}
}
return rest;
}
```

## Now what?

Theory says that a Reed-Solomon error correction data sequence spanning *n* codewords allows to recover up to *n*/2 unreadable codewords, they being among the data sequence *or* in error correction sequence itself (!). Kinda cool, is it?

Remember the error correction table from the first part?

Level | Letter | Data recovery |
---|---|---|

Low | L | ~7% |

Medium | M | ~15% |

Quartile | Q | ~25% |

High | H | ~30% |

Those percentages are not results, but rather *goals*: for example, we want the quartile level of correction to be able to recover 25% (a quarter) of the codewords. This means that for this level of correction, **there must be as many error correction codewords as data codewords.**

For example, a version 2 QR code contains 44 codewords in total. We want to recover up to 11 (25%) of them, which means that we must reserve 22 codewords for EDC. If it looks expensive, it's because it is... but necessary if we want our QR codes to be readable even when damaged.

(The above applies for smaller QR codes. For larger ones, data is often split into two *groups*, and each group into several *blocks* - up to 67. Each block has its own error correction sequence, but while data blocks for the second group are always one codeword larger then the blocks from the first group, error correction sequences are all long the same and sized for the *larger* block, so even for quartile level EDC sequences could be slighly more in total codewords than data. We'll discuss about spliting data into blocks later in the series.)

From this, it's also clear we can't do much better than level H of error correction. If, for example, we wanted 18 codewords to be recoverable out of 44, then we had to use 36 codewords just for error correction, leaving just 8 codewords for data - i.e. less than 18! It's clear it makes little sense, as we'd be better off just repeating the data.

Now let's focus on how to get those error correction codewords out of our data.

## Working with (big) polynomials

In the second part, we've sequenced our data (the string `https://www.qrcode.com/`

) into an array of bytes (or codewords, in QR code jargon). Now we've treated polynomials as arrays of values between 0 and 255, so basically using `Uint8Array`

s for both of them. And that's handy, since for error correction we have to view our data *as a polynomial with the codewords as coefficients*. Perfect!

Basically, we have our data that becomes this polynomial, called the **message polynomial**:

65*x*^{27} + 118*x*^{26} + 135*x*^{25} + 71*x*^{24} + … + 17*x* + 236

But we have 44 total codewords in our version 2 QR code, so we have to multiply this by *x* to the power of the error correction codewords, i.e. 16. In the end we have:

65*x*^{43} + 118*x*^{42} + 135*x*^{41} + 71*x*^{40} + … + 17*x*^{17} + 236*x*^{16}

Now that we have our big polynomial, we have to divide it by... something, and take the rest of this division: the coefficients of the rest polynomial are going to be our error correction codewords!

But what's this divisor polynomial? Also called…

### The generator polynomial

If we have to fill *n* codewords with error correction data, we need the generator polynomial to be of degree *n*, so that the rest is of degree *n* - 1 and so the coefficients are exactly *n*. What we're going to compute is a polynomial like this:

(*x* - *α*^{0})(*x* - *α*^{1})(*x* - *α*^{2})…(*x* - *α*^{n - 2})(*x* - *α*^{n - 1})

Now, as we've said, in GF(256) subtraction is the same as addition, and we've also chosen *α* to be 2. Finally, there are 16 codewords for medium correction in a version 2 QR code, so our generator polynomial is this one:

(*x* + 1)(*x* + 2)(*x* + 4)(*x* + 8)(*x* + 16)(*x* + 32)(*x* + 64)(*x* + 128)(*x* + 29)(*x* + 58)(*x* + 116)(*x* + 232)(*x* + 205)(*x* + 135)(*x* + 19)(*x* + 38)

The values in the factors are basically the ones from the `EXP`

table computed before. Anyway, let's get our `polyMul`

function rolling!

```
function getGeneratorPoly(degree) {
let lastPoly = new Uint8Array([1]);
for (let index = 0; index < degree; index++) {
lastPoly = polyMul(lastPoly, new Uint8Array([1, EXP[index]]));
}
return lastPoly;
}
```

Normally, you'd want to pre-compute or cache these polynomials instead of generating them each time. Anyway, our polynomial will be this one:

```
getGeneratorPoly(16);
// Uint8Array(17) [1, 59, 13, 104, 189, 68, 209, 30, 8, 163, 65, 41, 229, 98, 50, 36, 59]
```

*Finally*, we're getting our EDC codewords, by dividing our message polynomial with the generator polynomial:

```
function getEDC(data, codewords) {
const degree = codewords - data.length;
const messagePoly = new Uint8Array(codewords);
messagePoly.set(data, 0);
return polyRest(messagePoly, getGeneratorPoly(degree));
}
```

In the end:

```
const data = getByteData('https://www.qrcode.com/', 8, 28);
getEDC(data, 44);
// Uint8Array(16) [52, 61, 242, 187, 29, 7, 216, 249, 103, 87, 95, 69, 188, 134, 57, 20]
```

And we're done! 🙌 It's been a long, but fundamental chapter.

… at least for now. Because *much* still has to be done in order to create a working QR code.

Stay tuned for the next part, which will be a shorter one. We'll define some details around error correction, and learn how to actually displace all the codewords in the grid. In the following part, we'll talk about masking.

## Top comments (8)

Very nice article! A question though: IIUC, GF(2^8) marks a set of polynomials with coefficients taken from GF(2), i.e. the coefficients can only be 0 or 1. Yet the article keeps referring to 255 as being the top value for a coefficient. How come?

Where did you get that the coefficients can only be 0 or 1?

Someirreducible polynomials used for the algorithm have only 0 or 1 as coefficients, but not in the general case.I was reading up on finite fields (in preparation for QR code generation, obviously :)) and in these lecture notes - engineering.purdue.edu/kak/compsec/ (Finite Fields PART 1-4) - the field GF(2^8) is described as comprising elements that are polynomials (up to a certain degree) with coefficients in the set {0,1} (because the coeffs themselves are from GF(2)).

The

irreduciblepolynomials will always have 0,1 as coeffs, IIUC, (see e.g. codyplanteen.com/assets/rs/gf256_p...) but thegeneratorpolynomial(s) given in the QR standard have coefficients that are powers of the primitive element (2, i.e. the polynomial x). Which is confusing to me. Maybe it boils down to the same thing in the end, if one were to rewrite the generator polynomials without the primitive element.I guess I'm having a bit of trouble merging the view of the fields and their arithmetics as given in the lecture notes with that of, it seems, the implementations I've found so far, with regards to some aspects.

I'm impressed the lengths you're covering to fully understand the algorithm behind - I have a degree in Mathematics and yet I stopped at some point 😅 Props to you!

And now I notice how miserably worded this phrase is:

What I mean that some polynomials have only 0 and 1 coeffs, and happen to be irreducible.

Thank you! And props to you too! I must say it's very nice to have trailblazers such as you that not only do the implementation but also write about it!

Ah, I think I am beginning to understand the source of my confusion. I interpreted the (QR) standard's wording on the RS encoding as:

"Treat the message codewords as coefficients of a polynom that is an element of GF(2^8)"

But it should be, I guess, (although quite obtusely worded, IMHO):

"Use the message codewords (as coefficients) to form a polynomial with coefficients that are elements of GF(2^8)".

That would explain much. Such a relief. I thought I was going to have to go mad.

Yes! I agree, that's kind of obscure.

But then again, the whole algorithm is kind of obscure haha

I'm at a point in my implementation where I get the same result for the polynomial division as in this tutorial (for the very same example), and I must take a moment to give your credit for the clean code you have produced. My own code is generally blargh, in comparison, with complicated constructs of indices and unnecessary array constructs (very nice use of slice in the polyRest function!). And ofc, overall impressive how you seem to be so confident with the arithmetics involved. Very valuable to follow your example. I imagine it would have taken me three days to figure out which k the standard refers to when it says "if you do this by long division, you must multiply the data codewords by x^k". I'll see if I arrive at the same symbol in the end... :)