Kevin Lamping

# Stop lying to yourself when testing

The New York Times had a quiz a while back with a simple task: figure out the mathematical rule behind a sequence of numbers.

As a reader, you would enter your own numbers and see if they pass or fail your guess at the rule.

In my mind, this is very similar to the way we unit test. Provide an input, validate the output. Here, you provide the input (the numbers), and the output is whether the number sequence matches the rule.

You can imagine the unit test for it:

```javascript
function checkRule (numbers) {
  // some magical formula we're supposed to figure out
}

describe('my mathematical equation', function () {
  it('should pass when doubling each number', function () {
    expect(checkRule([1, 2, 4])).toBeTrue();
    expect(checkRule([2, 4, 8])).toBeTrue();
    expect(checkRule([3, 6, 12])).toBeTrue();
    expect(checkRule([5, 10, 20])).toBeTrue();
  });
});
```

Looking at this code, it’s easy to assume that the rule is “Each number should double the previous one”. After all, our four assertions pass, so we’ve got green tests!

The trick with the quiz is that the mathematical rule is very simple: each number must be larger than the previous one.

This broad rule means that it’s easy for people to assume their complex solution is the correct one. Every input they give to validate their rule returns true, so it must be right.

Yet there’s a flaw to this testing methodology, as the article points out:

> Remarkably, 78 percent of people who have played this game so far have guessed the answer without first hearing a single no. A mere 9 percent heard at least three nos — even though there is no penalty or cost for being told no, save the small disappointment that every human being feels when hearing “no.”

The article attributes this to “confirmation bias,” which partially applies. But a better description is a lesser-known bias called congruence bias (one I was unaware of before hearing about this article on The Skeptic’s Guide to the Universe).

This bias is “the tendency to test hypotheses exclusively through direct testing, in contrast to tests of possible alternative hypotheses.”

In our tests above, we’re only checking for positive results. We never ask “does this fail if I provide data which contradicts the rule?”

Every suite of unit tests should include negation checks. A simple `expect(checkRule([2,4,6])).toBeFalse();` would have broken our tests, showing us that the rule passes despite 6 not being twice 4.
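To see why that negative check matters, here's a minimal sketch in plain JavaScript, assuming the quiz's actual rule (each number must be larger than the previous one). This `checkRule` is my own illustrative implementation, not code from the quiz:

```javascript
// Hypothetical implementation of the quiz's actual rule:
// each number must be larger than the one before it.
function checkRule(numbers) {
  return numbers.every((n, i) => i === 0 || n > numbers[i - 1]);
}

// Every "doubling" sequence passes, so the doubling hypothesis looks right...
console.log(checkRule([1, 2, 4]));  // true
console.log(checkRule([3, 6, 12])); // true

// ...but a negative check exposes it: if doubling really were the rule,
// this sequence should fail. It doesn't.
console.log(checkRule([2, 4, 6])); // true, so doubling is not the rule
```

Only the disconfirming input (a sequence that increases without doubling) tells us anything new; the positive cases just keep agreeing with us.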

Again, from the article:

> When you want to test a theory, don’t just look for examples that prove it. When you’re considering a plan, think in detail about how it might go wrong.

That second part rings especially true for testing.

It’s easy to assume that because your tests pass, the code and the tests are working as expected. But we must remember what Edsger Dijkstra said long ago:

> Testing shows the presence, not the absence of bugs.

Think about confirmation and congruence bias next time you’re testing your code.

Keep in mind the phrase “fail fast”. Prove that your code really does what it claims, and always keep a skeptical mind when coding.

Don’t wait until it’s too late to learn the harsh truth.

As Richard Feynman put it:

> The first principle is that you must not fool yourself — and you are the easiest person to fool.

For more analysis of the quiz itself, and thoughts on congruence bias, check out these discussions:

Header Photo by Hans-Peter Gauster on Unsplash

Joe DuVall

Great post!

One thing I find easy to forget about TDD is that you do not merely start with a test. You start by writing a FAILING test. Witnessing a test fail is essential so that you see the test doing its job. If you never saw your tests fail, how do you know they work?
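Joe's red-then-green point can be sketched with a deliberately failing stub (all names here are illustrative, not from any framework):

```javascript
// Step 1 (red): start with a stub that can't possibly be right,
// so the test fails and proves it is actually checking something.
function checkRule(numbers) {
  return false; // deliberate stub: forces the first run to fail
}

// A minimal stand-in for an expect-style assertion.
function expectTrue(actual, label) {
  console.log(`${label}: ${actual === true ? 'PASS' : 'FAIL'}`);
}

expectTrue(checkRule([1, 2, 4]), 'increasing sequence'); // FAIL — the test works

// Step 2 (green): implement just enough to make the test pass.
function checkRuleFixed(numbers) {
  return numbers.every((n, i) => i === 0 || n > numbers[i - 1]);
}

expectTrue(checkRuleFixed([1, 2, 4]), 'increasing sequence'); // PASS
```

If the first run had passed, the test would be suspect: it might never have been exercising the code at all.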

Jason C. McDonald • Edited

Excellent post, and quite timely! Just made the same point in a comment elsewhere, and then came across this in my feed.

As I mentioned in an earlier article, I recommend "testing blind". Even if you write tests first, rewriting them later without looking at your code helps overcome that little subconscious bias we have of tiptoeing around error traps.

Also, as Jeff Atwood says, Do Terrible Things To Your Code.

ItsASine (Kayla)

To a certain extent, I love unhappy paths, but too often I see someone who tests that way manually apply it incorrectly to code.

I know some QAs that would very happily write that the sky isn't green and the sky isn't purple and the sky isn't red. When they could just write that the sky is blue and be done with it.

While that may seem silly with that example, consider an API integration test that does something like

`expect(Object.keys(getResponse)).not.toContain('links');`

Lovely, we no longer use HATEOAS links. Good job, team. But also, that getResponse doesn't contain `'puppies'` or `'kittens'` properties either. It might have a `'link'` property because someone got cheeky and changed the name on us.

`expect(Object.keys(getResponse)).toEqual(['thing', 'stuff', 'bunnies']);` gives us better info. The negative would be testing that being unauthorized gives us `['error', 'message']`, but too many people seem to think it means looking for puppies and kittens and verifying they're not there.

Somewhat related, but using De Morgan's Laws in tests is fun, too, especially during code review. `.not.toBeLessThan` is the same as `.toBeGreaterThanOrEqual` but that second function is only in more recent versions of Jasmine.
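The De Morgan equivalence Kayla mentions is easy to demonstrate outside any test framework (plain JavaScript, illustrative names):

```javascript
// "not less than" and "greater than or equal" are the same predicate
// for ordinary numbers, by De Morgan / negation of comparisons.
const notLessThan = (a, b) => !(a < b);
const greaterThanOrEqual = (a, b) => a >= b;

// Check the equivalence across the three interesting cases:
// strictly less, equal, strictly greater.
for (const [a, b] of [[1, 2], [2, 2], [3, 2]]) {
  console.log(notLessThan(a, b) === greaterThanOrEqual(a, b)); // true each time
}
```

So `.not.toBeLessThan(x)` and `.toBeGreaterThanOrEqual(x)` assert the same thing; the latter just reads more directly.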

Kevin Lamping

Some very good points! I think this would make a great follow up post, going through different scenarios and how to ensure you've covered the major scenarios, but not gone too far in testing unneeded scenarios.

ItsASine (Kayla)

I think it stems from more tests being assumed to always mean better. And having lots of tests obviously means we'll never have bugs ever.

Bugs will always happen, but good tests mean they get caught earlier and hopefully before things get merged to the main codebase, so yay! But that doesn't mean there are no bugs. And that doesn't mean that there are no missing requirements. It just means tests passed :P

And there will always be someone writing a test to verify the sky isn't red, and someone looking at the metrics and being happy test count and coverage increased.

Jilles van Gurp

A good test is like a scientific experiment. You are not trying to prove your code works; you are trying to falsify the assumptions that need to be true for it to work. Then when it breaks later, it will be because of something you did not think of: an assumption you made that turned out to be wrong and that you did not test.

Jason Steinhauser

This post is a realization of how I feel about testing. Don't just do happy path! Life doesn't follow a happy path always, and neither should your testing. Great write-up!

AoDev

People have come up with an interesting approach for this, mutation testing. For example check stryker. stryker-mutator.io/
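Mutation testing automates exactly this kind of skepticism: a tool like Stryker makes small changes ("mutants") to your code and checks whether the test suite notices. The idea can be sketched by hand (illustrative code, not Stryker output; `checkRule` here implements the quiz's increasing-sequence rule):

```javascript
// Original code under test.
function checkRule(numbers) {
  return numbers.every((n, i) => i === 0 || n > numbers[i - 1]);
}

// Mutant: '>' flipped to '>=', the kind of change a mutation tool makes.
function checkRuleMutant(numbers) {
  return numbers.every((n, i) => i === 0 || n >= numbers[i - 1]);
}

// A suite with only strictly-increasing positive cases can't tell them apart...
console.log(checkRule([1, 2, 4]) === checkRuleMutant([1, 2, 4])); // true

// ...but an input with a repeated number "kills" the mutant.
console.log(checkRule([2, 2, 4]));       // false
console.log(checkRuleMutant([2, 2, 4])); // true — only a negative case catches it
```

A surviving mutant is a sign that some behavior of your code is not pinned down by any assertion.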

Ben Halpern

Yes, this is an incredibly important point. Part testing concepts we should pay attention to, part basic self-awareness about our own biases.

Pretty cool article, and yes, sometimes we have to break our heads before we find a negative test case to pass. That helps you understand the code better.

Hila Berger

Great article, Kevin! Brilliant!