Intro to Property-Based Testing
Jason Steinhauser Jun 9
Property-based tests make statements about the output of your code based on the input, and these statements are verified for many different possible inputs. - Jessica Kerr (@jessitron)
Property-based testing is generative testing. You do not supply specific example inputs with expected outputs as with unit tests. Instead, you define properties about the code and use a generative-testing engine (e.g., QuickCheck) to create randomized inputs to ensure the defined properties are correct.
What is the purpose of generative testing?
Generally speaking, property-based tests require only a few lines of code (like unit tests), but unlike unit tests they test a different set of inputs each time. Because of this, you end up covering more domain space with roughly the same amount of test code.
Property-based testing also promotes a more in-depth understanding of the function under test. Sure, we know that for addition 2 + 2 = 4, but how can you say that you truly have implemented an addition function correctly with that one example?
From a business standpoint, a better understanding of the functionality of your code will lead to more accurate determination of requirements being met in your code.
Does it replace unit tests?
While it is tempting to say that unit tests could be completely replaced by high quality property-based tests, unit tests still have their place in the software development cycle. Example-based testing is fantastic for early stages of TDD. These serve as anchor points to ensure that your development efforts are proceeding as desired. For example:
- You can ensure that your sine function is generating proper values (
sin(0) = 0,
sin(π/2) = 1, etc.)
HTTP POSTs to your
/loginroute without proper authentication headers/cookies results in a
However, eventually these example-based tests end up being passive tests; the tests become part of a regression suite and provide no new information about the functionality. Property-based tests, however, are always active tests as they generate new data each time the test suite is run. This can help root out issues that developers overlook. For example, you may expect that the output of your sine function for any floating-point representation to be in the set [-1,-1], but the developer may not think to test for ±∞ or NaN. To this end, when a failing case is identified, this unexpected error can be used as a future unit test to validate fixes were made. This reason alone makes example-based testing an effective practice for enhancing property-based test suites.
"Don't write tests. Generate them!" - John Hughes, co-author of Haskell's original QuickCheck
What are some common properties?
If you correlate software functions with mathematical functions, then you can apply some standard mathematical properties to some expected functionality. However, some of those properties aren't as apparent in non-mathematical applications. Here are a few example mathematical properties translated to function properties, ignoring performance differences:
a + (b + c) = (a + b) + c
users.Sort().Filter(x => !x.IsAdmin) = (users.Filter(x => !x.IsAdmin)).Sort()
a + b = b + a
image.flipX().flipY() = image.flipY().flipX()
a(b + c) = ab + ac
title.ToUpper() + author.ToUpper() = (title + author).ToUpper()
a = a * 1 = a * 1 * 1
title.Trim() = title.Trim().Trim()
list.Sort() = list.Sort().Sort()
There are additional, commonly-tested properties that are not necessarily rooted in math, but are equally as useful:
- Bilbo Testing (aka, There And Back Again)
list = list.Reverse().Reverse()
obj = JSON.parse(JSON.stringify(obj))
- No Unexpected Changes
list.Length = list.Sort().Length
- Hard to Prove, Easy to Verify
- Sorting: each element should be greater than or equal to the previous element
- Tokenizing: concatenating tokens with separator interposed should equal the original string, number of tokens should equal number of separators in original string - 1
Is PBT actually used in the "real world"?
In short, yes. Property-based testing is absolutely used to solve real-world problems.
Volvo has employed QuviQ's QuickCheck to validate third-party components conforming to communication bus standards. John Hughes and his team analyzed 3k pages of requirements, wrote 20k lines of QuickCheck tests, and applied those properties to 1MM+ lines of vendor code. By checking their properties generated around the specifications, they found 200 issues that had slipped through the test fixtures that the vendors had generated... and 100 of those issues were in the actual specifications themselves. By refining the code analyzed (as well as the defining specifications), property-based testing has saved lives.
Clojure (a LISP variant on the JVM) uses immutable data structures by default. However, mutable datatypes are available for performance increases. Using property-based testing, an issue was found when converting to/from transient (mutable) and persistent (immutable) structures that none of the example-based tests (re: unit tests). However, property-based testing was able to readily reproduce the issue. A diff patch was provided to Cognitect (the developers/maintainers of Clojure), and it was incorporated in the Clojure 1.6 release.
I've also used property-based testing on a project where we were calculating network performance by various metrics (QoS, packet type, packet size, etc.). I had to prove that my packet classification and matching algorithm wouldn't provide false matches, so I set my CI server up to generate 100 million different packets, distributed between the different packet types we were interested in. As of now, we have randomly generated over 10 billion packets and have not had a single false matches.
What are some awesome parts to look for in PBT suites?
While large inputs may produce the errors, several property-based test suites (most QuickCheck variants) will attempt to shrink the input sequence to the smallest possible that will reproduce the error. The smaller the input, the easier it is to reproduce and fix.
Race conditions are notoriously hard to set up via example-based testing. Some PBT suites (notably, QuviQ's QuickCheck in Erlang) can perform many actions in parallel in order to test if the actions, performed in any serial combination, would produce the same output. If the parallelized version can not be matched to a serialized version, then the race condition is identified and shrunk to the smallest possible set of operations to reproduce the error.
You may want to limit the input domain you are testing, or have data structures constructed in a particular fashion. Most PBT suites will allow you to create custom generators, and many common custom ones are generally included as well – limiting strings to printable characters, constraining floating point values, etc.
If you're migrating from one algorithm to another, a PBT test suite that passes the same input to both the old and new implementations can be used to ensure that the new implementation produces the same output.
Where can I learn more?
Thankfully, there are several good learning resources for property-based testing lurking around on the Internet. Here are some of the ones I've found to be extremely helpful.