Discussion on: My Tests Are Being Maintained by Artificial Intelligence


I would struggle to trust the results of something like this. I defer to my bible of unit testing - Osherove's The Art of Unit Testing:

UPDATED AND FINAL DEFINITION 1.2: A unit test is an automated piece of code that invokes the unit of work being tested, and then checks some assumptions about a single end result of that unit. A unit test is almost always written using a unit testing framework. It can be written easily and runs quickly. It’s trustworthy, readable, and maintainable. It’s consistent in its results as long as production code hasn’t changed.

If both your code-under-test and the test itself are changing at the same time, how do you know which change is causing the pass/fail result?

Ben Klein • Oct 17 '18

This isn't a unit test, it's an end-to-end integration test. So the unit test definition doesn't apply here :) Just sayin #noharm

setagana • Oct 18 '18

I was hoping that the reason Osherove included the bold sentence in his definition would be clear to everyone, but apparently that needs to be explained to people who are more interested in semantics.

The principle can and should be abstracted to any automated test and even beyond computer science. One-Factor-at-a-Time (OFAT) is a paradigm of experiment design that sees use in almost every branch of science and engineering. The primary arguments against OFAT are that:

1) It fails to identify interaction effects that result from combined inputs in a multi-factor system.
2) It's inefficient in situations where data is costly.

Point 2 doesn't apply to automated testing because the cost of acquiring more data is simply waiting for your tests to run.

Point 1 makes an interesting case for why I would argue against self-changing tests. In the case of self-changing tests we have two factors that vary - the code-under-test and the tests themselves. We could state that we don't want there to be an interaction between the tests and the code because we don't want our code to perform one way in testing and another way when given to users.

But how could you check that no such interaction exists? You would need to have a representative sample of the range of values that both factors could assume, and analyze the results of the varying combinations. In the case of self-changing tests, you have no ability to make the system try out various values and present you with its findings for a given value of code-under-test, nor do you have any way to conceptualize what range of parameters the system is considering changing.
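To make that concrete, here is a minimal sketch of what checking every combination would look like if you could enumerate the versions yourself. Everything in it is hypothetical and only for illustration: `add_v1`/`add_v2` stand in for two versions of the code-under-test (the second has a regression), and `check_v1`/`check_v2` stand in for two versions of a test (the second has been "adapted" to whatever the new code returns).

```python
from itertools import product

# Hypothetical versions of the code-under-test; v2 contains a regression.
def add_v1(a, b):
    return a + b

def add_v2(a, b):
    return a + b + 1

# Hypothetical versions of the test; v2 stands in for a test that was
# automatically adapted to match the new code's behavior.
def check_v1(add):
    return add(2, 2) == 4

def check_v2(add):
    return add(2, 2) == 5

CODE_VERSIONS = {"code_v1": add_v1, "code_v2": add_v2}
TEST_VERSIONS = {"check_v1": check_v1, "check_v2": check_v2}

def run_grid():
    """Run every test version against every code version."""
    return {
        (code_name, test_name): check(code)
        for (code_name, code), (test_name, check)
        in product(CODE_VERSIONS.items(), TEST_VERSIONS.items())
    }

if __name__ == "__main__":
    for (code_name, test_name), passed in run_grid().items():
        print(f"{code_name} x {test_name}: {'pass' if passed else 'fail'}")
```

If you only ever see the two cells where code and test changed together (`code_v1` with `check_v1`, then `code_v2` with `check_v2`), both pass and the regression is invisible. It only shows up in the off-diagonal cells, which are exactly the ones a self-changing test system never shows you.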

At least when you write tests yourself, you can form some idea of the range of possible test parameters and use your knowledge of the domain to tease out any possible interactions between your test set and the code-under-test.