When dealing with test automation - especially web testing through a full-blown browser interface - one of the biggest issues is what is called flaky tests. You may also hear them called unstable tests, and this notion is often overloaded with several meanings. Let us define a flaky test as a test that yields different verdicts when run several times against an unchanged environment and system.
Such behaviour can mean one of two things:
- the system really shows different behaviour when the test is run several times, even though the environment and system setup are unchanged - so it is nondeterministic in the sense of this post
- the test case is not able to bring the system into the same state for each step on every run
The first option means that the system shows nondeterministic behaviour. This is a special topic, and usually not what is meant by a flaky test. A flaky test usually refers to the second option, and that option can have many root causes. A common (bad) approach to dealing with flaky tests is to rerun failed test cases and see if they pass the second (or a later) time. This costs time and is not the way to go. Instead, make tests stable. In this post I will note some practical findings on why test cases may be flaky, and what you can do about it.
The root cause I am looking at in this post is external dependencies within your test cases. In web testing, that could mean your test cases access not only the web pages and corresponding systems you want to test, but also other pages or services you utilise to execute your tests.
As an example, I was asked to have a look at a test case which was quite flaky. The test case had been migrated from a former test tool to our current, PHP-based tool Codeception. The test case has to compute a hash code for some string. Since the former test tool had no option to compute hashes, the writer of the test case solved the problem by accessing an external web page to compute the hash there. Since the test case was migrated as-is to Codeception, it did the same there.
Using this external page was the root cause of the flakiness. Waiting for and reading the computed result from that external page was very unstable. So people started adding more waits, and all kinds of weird workarounds, which ended up in a flaky helper method to compute hashes. Furthermore, you add a dependency on an external page: if that page is down, your tests cannot run.
So what was the solution? In this case, simply use plain PHP to compute the hash. This is one of the great advantages of a test tool built on a general-purpose programming language: you have the full power of the language available for your tests. On the other hand, this is also a danger, since people may introduce overly complex and unreadable code. But that is a different topic.
In the end, I could replace this messy code with one line of PHP:
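A minimal sketch of what such a replacement looks like. The post does not say which hash algorithm the test needed, so MD5 is assumed here as a placeholder; the function name `computeHash` is likewise hypothetical:

```php
<?php
// Hypothetical helper replacing the flaky external-page lookup.
// MD5 is an assumption - swap in whatever algorithm the test
// actually requires, e.g. hash('sha256', $input).
function computeHash(string $input): string
{
    return md5($input);
}
```

Because this runs entirely in-process, there is nothing to wait for and no external page that can be down - the whole class of flakiness disappears.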
This may appear to be a very obvious and easy-to-solve case. And it is, but in general, external dependencies can contribute a lot to unstable tests. They often involve network latency, unforeseen changes in the external system, and so on. So, get rid of external dependencies as much as possible.
There are many more root causes for flaky tests, such as wrong session handling, different browser behaviours, failure to handle failed test cases correctly, and so on. I will write about those topics in upcoming posts.