Floating Point Quirks (2 Part Series)
In short: Don't use the language-provided equality test, and don't use language-provided "epsilon" constants as your "tolerance" for errors. Instead, choose your own tolerance.
Take this ("bad") code, which addresses the classic floating point problem of
(0.1 + 0.2) == 0.3 returning false:
let f1 = 0.1 + 0.2;
let f2 = 0.3;
console.log(Math.abs(f1 - f2) < Number.EPSILON); // 'True - Yippeee!!!'
Ok, so far so good. But it fails with other inputs:
let f1 = 1000000.1 + 0.2;
let f2 = 1000000.3;
console.log(Math.abs(f1 - f2) < Number.EPSILON); // '!!!!!! false !!!!!!!'
The basic pattern being used is sound: avoid a direct equality comparison, and check that your two numbers are within some tolerable difference. However, the tolerance used is badly chosen.
It's actually very dangerous to use Number.EPSILON as a "tolerance" for number comparisons.
Other languages have a similar construct (C and C++ have DBL_EPSILON, for example; the .NET languages have double.Epsilon, which is smaller still, being the smallest positive value a double can hold). If you check any sound documentation for such constants, they tend to come with a warning not to use the "floating point epsilon" for comparisons.
The "epsilon" provided by the language is simply the smallest possible "increment" you can represent at 1.0 with that particular floating point type, i.e. the gap between 1 and the next representable number above it. For IEEE double-precision numbers, that number (Number.EPSILON) is minuscule: 2^-52, or roughly 2.2e-16!
The problem with using it for comparisons is that floating point numbers are implemented like scientific notation, where you have a small(ish) number of significant digits, and an exponent which moves the decimal point left or right (possibly a loooooooooooong way left or right). That means the gap between one representable number and the next gets bigger as the numbers themselves get bigger.
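Number.EPSILON is waaaaay smaller than the gap between neighbouring doubles up around 1,000,000 (roughly .0000000001) - so while the first example works with a "tolerance" of Number.EPSILON (because the numbers being compared are all smaller than 1.0), the second example breaks: even the tiniest possible rounding error at that size is already bigger than Number.EPSILON.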
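To see how that gap grows, here's a rough sketch. Note that gapAbove is a made-up helper for illustration (it's not built into JavaScript), and it assumes a positive, finite input. It bumps the raw bit pattern of the number by one to find the next representable double:

```javascript
// Hypothetical helper (not part of JavaScript): returns the gap between a
// positive, finite double x and the next representable double above it.
function gapAbove(x) {
  const view = new DataView(new ArrayBuffer(8));
  view.setFloat64(0, x);
  // Adding 1 to the raw 64-bit pattern steps to the next double up.
  view.setBigUint64(0, view.getBigUint64(0) + 1n);
  return view.getFloat64(0) - x;
}

console.log(gapAbove(1) === Number.EPSILON);      // true
console.log(gapAbove(1000000) > Number.EPSILON);  // true
console.log(gapAbove(1000000) / Number.EPSILON);  // 524288
```

So around one million, two neighbouring doubles are already about half a million "epsilons" apart, which is why the Number.EPSILON test can't pass there unless the two values are bit-for-bit identical.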
If you go hunting online, there's a fair bit of discussion on how to choose a suitable epsilon (or tolerance) for performing comparisons. After all the discussion, and some very clever code that has a good shot at figuring out a "dynamically calculated universal epsilon" (based on the largest number being compared) it always ends up boiling back down to this:
YOU need to choose the tolerance that makes sense for your application!
The reason dynamically calculated tolerances (based on the scale of the numbers being compared) aren't a universal solution is that when a collection of numbers being compared vary wildly in size it's easy to end up with a situation that breaks one of the most important rules of equality: "equality must be transitive". i.e.
if a == b, and b == c, then a == c must evaluate as TRUE as well!
Using a tolerance that changes with every single equality test in your program is a very good route to having a != c somewhere when you would reasonably expect a and c to be equal. You can also guarantee this will happen at annoyingly "random" times. Thar be the way to Bug Island me-hearties: enter if ye dare and may the almighty have mercy on yer soul ... arrrrrrrr**!!!
** actually ... "arrrghhhhhhhh!!!" is more appropriate
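Here's a small sketch of one way that can play out. scaledEqual is a hypothetical comparison (the name and the default 1e-3 relative tolerance are just assumptions for the demo) whose tolerance is recalculated from the larger of its two inputs on every call:

```javascript
// Hypothetical comparison whose tolerance scales with the larger input.
function scaledEqual(a, b, relTol = 1e-3) {
  const tolerance = relTol * Math.max(Math.abs(a), Math.abs(b));
  return Math.abs(a - b) <= tolerance;
}

const a = 1000, b = 1001, c = 1002;
console.log(scaledEqual(a, b)); // true
console.log(scaledEqual(b, c)); // true
console.log(scaledEqual(a, c)); // false
```

Each neighbouring pair passes, but the two ends of the chain don't, so a "equals" b and b "equals" c without a "equalling" c.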
So, how do you select a suitable tolerance for your program? I'm glad you asked! ...
Let's assume you're holding dimensions of a building in millimetres (where a 20 metre long building would be 20,000). Do you really care if that dimension is within .0000000001 of a millimetre of some other dimension when you're comparing? - probably not!
In this case a sensible epsilon (or tolerance) might be .01 or .001**. Plug that into the Math.abs(f1 - f2) < tolerance expression instead. Definitely do NOT use Number.EPSILON for this application.
** things will tend to work out even cleaner if you use tolerances that can be represented precisely in binary. Some nice simple options are powers of two. e.g. 0.5 (2^-1), 0.25 (2^-2), 0.125 (2^-3), 0.0625 (2^-4) etc.
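As a minimal sketch of the building example, assuming a tolerance of one thousandth of a millimetre is what you actually care about (nearlyEqual and TOLERANCE_MM are names I've made up for illustration):

```javascript
// Assumed tolerance for the building example: one thousandth of a millimetre.
const TOLERANCE_MM = 0.001;

// Hypothetical helper: true when a and b differ by less than the tolerance.
function nearlyEqual(a, b, tolerance = TOLERANCE_MM) {
  return Math.abs(a - b) < tolerance;
}

console.log(nearlyEqual(0.1 + 0.2, 0.3));              // true
console.log(nearlyEqual(1000000.1 + 0.2, 1000000.3));  // true
console.log(nearlyEqual(20000, 20000.5));              // false

// A power-of-two tolerance of similar size: 2^-10 = 0.0009765625
console.log(nearlyEqual(1000000.1 + 0.2, 1000000.3, 2 ** -10)); // true
```

Both of the earlier examples now compare as equal, while a genuine half-millimetre difference is still caught.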
Incidentally, if you didn't care whether your measurements in the previous example were any closer than 1mm to each other, then you should probably just use an integer type and be done with it.
Likewise (still on the "building model" example above), if you only cared whether your measurements were within 0.1mm of each other, then you could use a "decimal" type (if your language supports it), or just store all your measurements internally as integers representing tenths of millimetres (e.g. 20 metre building = 200,000 "tenth-millimetres" internally).
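A rough sketch of the "tenth-millimetres" idea, assuming measurements arrive as floating point millimetres and get rounded once on the way in (toTenthsOfMm is a hypothetical helper name):

```javascript
// Hypothetical conversion: millimetres -> whole tenths of a millimetre.
function toTenthsOfMm(mm) {
  return Math.round(mm * 10);
}

const wallA = toTenthsOfMm(20000);          // 20 metre building
const wallB = toTenthsOfMm(19999.9 + 0.1);  // same length, built up from parts
console.log(wallA);            // 200000
console.log(wallA === wallB);  // true
```

Once everything is an integer, plain old === does what you'd expect and there's no tolerance to choose at all.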
Floating point numbers are great for what they were designed for (complex modelling of real-world measurements or coordinates), but they introduce weirdness into calculations involving money, or other things we expect to "be nice and even".
If you're not already familiar with the old 0.1+0.2 != 0.3 mind-bender, then I've thrown together a quick primer on the way floating point numbers work, which will shed some light on the madness.