Lars Frantzen

Posted on Apr 14, 2019

Nondeterminism In Testing - How To Do It Wrong

#testing #webdev #beginners #computerscience

One of the most misunderstood concepts in testing is nondeterminism. So let's first quickly clarify some notions. We test a system by executing test cases on it. The test cases are derived from some kind of specification or model of the system. A test case gives inputs to the system and observes its outputs. Based on the outputs, the test case either passes or fails. Failing means that an output has been observed which does not conform to the specification from which the test case was derived.

A system is always in one or more states. Such a state comprises all aspects which are relevant for the observable behavior of the system. For instance, a coffee machine may be turned on and have enough water and beans to serve a coffee. That is a state of the coffee machine. If I now press the coffee button, I expect to get a coffee. If, instead, the machine is turned off, or there is no water, I expect not to get a coffee when pressing the button. These are other states of the coffee machine.

A system is called deterministic when it is always in exactly one state. A system is called nondeterministic when it can be in several states at the same time. And this is where the confusion starts.

People may say "every system is deterministic by definition". This is true in some sense, but wrong in many pragmatic testing situations. For instance, imagine that the coffee machine is a commercial one in some office, where you cannot even see if there is enough water and beans in it. So even though the machine is in one specific state, you simply do not know that state. But you can still deal with it: you press the button, and when you do not get coffee, you ask the service to refill the water and beans (or you just kick the machine and walk away). Doing so is what you can call a nondeterministic test case. It is a test case which can deal with several states of the system: depending on what the test case observes (coffee or no coffee), it proceeds in one direction or another.

Let's look at some more common examples. For instance, consider you are testing an online shop. So you query a product. Now you may get two answers (i.e., outputs): the item is in stock, or it is sold out. When it is in stock, you proceed with testing the order process. If it is sold out, you try another product. Also here, the system is in one state, but you do not know that state. To be able to deal with both states, you need nondeterminism.
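To make the shop example a bit more concrete, here is a minimal sketch of such a nondeterministic test case in Python. Everything in it is an assumption for illustration: `FakeShop`, its `query` and `order` methods, and the verdict strings are made up and stand in for whatever your real shop and test framework offer.

```python
from dataclasses import dataclass


@dataclass
class FakeShop:
    """Hypothetical stand-in for the shop under test; the test never reads `stock` to decide what to do."""
    stock: dict

    def query(self, product: str) -> str:
        # The only observable output of a query: "in stock" or "sold out".
        return "in stock" if self.stock.get(product, 0) > 0 else "sold out"

    def order(self, product: str) -> str:
        if self.stock.get(product, 0) <= 0:
            return "rejected"
        self.stock[product] -= 1
        return "confirmed"


def check_order_flow(shop: FakeShop, candidates: list) -> str:
    """Nondeterministic test case: the next step depends on the observed output."""
    for product in candidates:
        answer = shop.query(product)
        # Both outputs are allowed by the specification; anything else fails.
        assert answer in ("in stock", "sold out"), f"unexpected output: {answer}"
        if answer == "in stock":
            # Known state reached: from here on only one output is allowed.
            assert shop.order(product) == "confirmed"
            return "pass"
    # No candidate was available, so the order process could not be exercised.
    return "inconclusive"
```

Note that the branching only covers the part we really cannot know (the stock). Once an item is reported as in stock, the test is strict again.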

Or, consider you are testing a system at some observation point where you do not know the exact order in which events occur. This is very common for complex systems. Often you also simply do not care in which order some events occur. Whether, when booting your operating system, your messenger starts first and then the calendar application, or the other way round, does not matter. But both must start, that matters. Here, too, you need nondeterminism to deal with it.
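A small sketch of such an order-insensitive check, again in Python and again with made-up names (`check_boot_events` and the event labels are assumptions, not part of any real tool):

```python
def check_boot_events(observed: list, required: set) -> bool:
    """Pass if every required event was observed exactly once, in any order."""
    # Equal length plus equal sets rules out missing, extra, and duplicate events.
    return len(observed) == len(required) and set(observed) == required
```

With `required = {"messenger", "calendar"}`, both `["messenger", "calendar"]` and `["calendar", "messenger"]` pass, while a run where only one of the two starts fails.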

So, nondeterminism is a very important and useful means to specify and test systems. Unfortunately, many tools are not capable of dealing with nondeterminism. You can only specify one allowed output in your test case, and all others are considered wrong. This is a very fundamental limitation of these tools.

But, even worse, when the tool does support nondeterminism, people tend to use it wrongly. Look at an example from a project I worked on as a test architect: we were developing a web page which allows two ways to log into the site, either via a customer number or via email. You can switch between the two options by clicking a GUI element in the login module. When the page is loaded, the module is initially always in the "login via customer number" option.

While I was working on a test case which first logs a member in (via customer number) I checked the test code for this login method, and saw there basically something like this:

function loginMemberWithCustomerNumber($member) 
{
  if (the login module expects an email)
  {
    click the GUI element to switch to login via customer number;
  }
  login the $member via its customer number;
}

Why did the tester do this? Simple, this function is robust in the sense that it can deal with both cases:

the login module expects an email
the login module expects a customer number

Take a second to think why this may be bad testing code, though!

First of all, this code introduces nondeterminism to the test. We can now deal with a system being in two different states. Depending on which state we are in, we act differently. But we have also said that this is only useful when we do not know which state the system is in! For our web page example, however, we do know the state of the login module, since initially it is always in the "login via customer number" state. This is even a requirement: most customers use this variant to log in, so it must be the default. So it is actually a defect when we load the page and the login module is not in the "login via customer number" state, but in the "login via customer email" state. The problem is that, by adding nondeterminism to the test code, we lost the power to find this defect. If the login module is in the wrong state, the test case corrects it by switching to the correct login method.
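The difference can be sketched in a few lines of Python. This is not the project's real test code: `FakeLoginModule`, `login_tolerant` and `login_strict` are hypothetical names, and the actual login step is reduced to a return value.

```python
class FakeLoginModule:
    """Hypothetical stand-in for the login module; `mode` is what the page shows after loading."""

    def __init__(self, initial_mode: str):
        self.mode = initial_mode  # "customer_number" or "email"

    def switch_mode(self) -> None:
        self.mode = "email" if self.mode == "customer_number" else "customer_number"


def login_tolerant(module: FakeLoginModule) -> str:
    # Mirrors the helper shown above: silently repairs an unexpected state.
    if module.mode == "email":
        module.switch_mode()
    return "logged in"


def login_strict(module: FakeLoginModule) -> str:
    # Treats the required default as an expectation, so a wrong default fails the test.
    assert module.mode == "customer_number", "login module must default to customer number"
    return "logged in"
```

For a page that wrongly loads in email mode, `login_tolerant` still returns "logged in" and the defect goes unnoticed, whereas `login_strict` raises an `AssertionError` and reports it.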

Let me end this example with a little bit of theory. We are in the lucky position to have very solid test theories at hand to give all these considerations some foundation. The next picture shows the above example by referring to transition system based theories, like the most famous one - ioco - by Jan Tretmans.

You see here a specification saying "!x must come first" (read !x as the login module being in the "login via customer number" state). Next to that you see two implementations: Impl1 doing it right, and Impl2 doing it wrong by starting with "!y must come first" (read !y as the login module being in the "login via customer email" state). In a conformance relation like ioco, of course, Impl1 would be correct and Impl2 wrong. But now look at the nondeterministic test case. It does what our function does: it allows both states ("tau" means an unobservable step). With this test case, Impl2 now passes as well, so we lost our power to spot that Impl2 is wrong.

This may appear a bit nitpicky, but I find this wrong approach all the time, and I always need to argue, since the tester thinks she or he wrote a very fancy method which can deal with all kinds of states of the system. And this is true! But it sucks for testing when we do not really need it, since it removes defect detection power.

Another very common example is test data. When testing a system, you should do everything you can to control your test data (like which customers are in the system, what kinds of transactions the customer has done, which offers are shown to one specific customer, etc.). Only when you really cannot control this do you need nondeterminism - and then it is very, very useful! But when you can control your test data, forget about nondeterminism here! And again, testers tend to write very fancy methods to somehow discover the available test data by analyzing the web page or database, trying for instance to find a specific customer transaction to continue the test with, etc. And then you always see disappointed faces when saying: we do not need that! We know the test data, we can even hardcode it in our tests, we know the state of the system, we do not need to learn it! Learning always means allowing more states than just one.
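As a rough sketch of what "controlling the test data" can look like, assume the test is allowed to reset and seed the data store before each run. The dictionary used as a database, `transaction_history` as the function under test, and the customer and transaction IDs are all invented for this example.

```python
# Hypothetical fixture: the test owns this data and loads it before each run.
KNOWN_CUSTOMER_ID = "C-1001"
KNOWN_TRANSACTIONS = ["T-2", "T-1"]


def seed(database: dict) -> None:
    """Put the data store into exactly one known state."""
    database.clear()
    database[KNOWN_CUSTOMER_ID] = list(KNOWN_TRANSACTIONS)


def transaction_history(database: dict, customer_id: str) -> list:
    """Hypothetical function under test: returns a customer's transactions, sorted."""
    return sorted(database.get(customer_id, []))


def check_transaction_history(database: dict) -> None:
    seed(database)
    # No discovery, no branching: the expected output is hardcoded.
    assert transaction_history(database, KNOWN_CUSTOMER_ID) == ["T-1", "T-2"]
```

There is no conditional left in the test: whatever was in the store before is gone, and only one output is accepted.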

Nondeterminism always adds complexity to your test code: you get all these conditionals, maybe exception handling, quite complex SQL statements, etc. So you end up needing tests for your tests, and you lose stability and readability of your test code.

Summarized:

keep test code as simple as possible
control as much as you can
avoid nondeterminism whenever possible
love nondeterminism when you really need it

Top comments (3)

amorganPD • Apr 16 '19

I think this article was interesting, but I'm not sure I agree with your definition of a deterministic system:

A system is called deterministic when it is always in exactly one state. A system is called nondeterministic when it can be in several states at the same time. And this is where the confusion starts

I would refer to: en.m.wikipedia.org/wiki/Determinis...

Which states

a deterministic system is a system in which no randomness is involved in the development of future states of the system

I don't think this negates the content you discussed, but it seems like the article is more about removing the dependency on state where possible.

Essentially replacing "nondeterminism" with "state dependency", I think, would make this article more clear.

Lars Frantzen • Apr 16 '19

Hi @amorganpd , thanks for your comment. You are fully right that there are several definitions of (non)deterministic systems. I am not too happy with the Wikipedia definition for the purpose of testing, since "randomness" is a bit vague here. It does e.g. not differentiate between choices done by the environment (external choice) or choices performed by the system (internal choice). This distinction is very clear when you model systems e.g. via process algebras / Labelled Input Output Transition Systems.

But there is no need to get too formal here, a proper notion (for testing) of a deterministic system says that whenever you provide an input (sequence) to the system, you always observe the same outputs. At the end the "randomness" or "internal choice" of the system is just observed randomness. In reality it will never be random, but dependent on internal details you abstract away from when modeling the system.

Another common definition of a nondeterministic test case is a test case that sometimes passes and sometimes fails (see Martin Fowler). This is a different, but related topic. Such a test case is also referred to as a flaky test. Being flaky could be caused by two things:

the system really shows different behaviour when running the test several times, even though the environment and system setup is unchanged (so it is nondeterministic in the sense of this post)
the test case is not able to bring the system into the same state at each step in every run

The flaky notion of a nondeterministic test case as used by Fowler refers to the latter meaning, not to the first meaning I use in this post.

So, the notion of nondeterminism is quite overloaded unfortunately. So adding something like your suggestion of "state dependency" could definitely help.

amorganPD • Apr 17 '19 • Edited

For sure, I was even having a discussion with a colleague the other day in regards to deterministic behavior and had to align on whether or not we were discussing the time domain or just the result of the output.

If we are talking about result / output, I would simply define it as: If the output is always repeatable, given the same inputs, then it is deterministic.

DEV Community
