What should we expect from tests?

#testing

The other day I spoke at a meetup for the first time. It went well, so I thought the talk would make a fitting first post here on Dev.to as well. It is about testing. As it is too big for one post, I will split it into three:

This one, the first, is a brief introduction to the purpose and characteristics of tests.
The second is about how to efficiently work with tests.
The third is about how to build better tests, avoiding some of the most common antipatterns.

Here we go.

What should we expect from tests?

Tests give us confidence in the code we write: confidence that it works and that it does not break code that was working before.

In our first steps as developers, we go all in with manual testing. But this is not a valid strategy when an application grows even just a little.

When there is some complexity, manual testing becomes not only inefficient but also unreliable. There are too many things to keep in mind in a complex application: background processes that are not easy to trigger, interactions between components, asynchronous processes, specific data setups, etc. It is also extremely time-consuming.

This does not mean that manual testing is a waste of time. It simply has other purposes than checking that all our code behaves as expected.

That's why we want a good suite of tests that provides us with fast feedback loops and detects defects early, when they are still cheap to fix. That's the whole point of tests: getting feedback about the correctness of our application, and getting it fast.

Characteristics of automated tests

Let's take for granted that an automated test is, well, automated. There are two more characteristics that we want to enforce: being deterministic and being fast. Let's summarise what these three characteristics mean and imply:

Being automated means that the tests should be easy and fast to set up, and that we can run them with the minimum possible human intervention. This is the key to getting the most out of our tests: running them often and including them in CI processes.
Being deterministic means that our tests always produce the same result with the same input, no matter how many times we run them. To accomplish this, tests must deal with factors that we cannot control in real life, such as randomness, vendor APIs and so on. Although some types of tests can tolerate some lack of determinism, the more deterministic they are, the better. We will talk about this in a moment.
Being fast means that they run in a short period of time, so we can get our feedback as soon as possible. We tend to lose focus when something takes too long, so if our tests are slow to run, we simply will not run them.
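To make the deterministic point a bit more concrete, here is a minimal Python sketch. It assumes a hypothetical `make_coupon` function (invented for this illustration, not taken from any library) that depends on both the current date and randomness. Passing the date and the random generator in as parameters, instead of reading them from the system, is one possible way to let a test pin both down.

```python
import random
import unittest
from datetime import date, timedelta

ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ"


def make_coupon(today, rng, valid_days=30):
    """Hypothetical example: build a coupon code and its expiry date.

    `today` and `rng` are injected rather than taken from the system clock
    and the global random module, so a test fully controls both inputs.
    """
    code = "".join(rng.choice(ALPHABET) for _ in range(6))
    return {"code": code, "expires": today + timedelta(days=valid_days)}


class MakeCouponTest(unittest.TestCase):
    def test_expiry_is_computed_from_the_injected_date(self):
        # A fixed date gives the same expiry on every run.
        coupon = make_coupon(date(2019, 1, 18), random.Random(42))
        self.assertEqual(coupon["expires"], date(2019, 2, 17))

    def test_same_seed_gives_same_code(self):
        # Two generators with the same seed produce the same sequence.
        first = make_coupon(date(2019, 1, 18), random.Random(42))
        second = make_coupon(date(2019, 1, 18), random.Random(42))
        self.assertEqual(first["code"], second["code"])
        self.assertEqual(len(first["code"]), 6)
```

Saved in a file, these tests can be run with `python -m unittest`. In production code the caller would pass `date.today()` and a regular `random.Random()` instead of the fixed values.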

In general, the less deterministic and the slower a type of test is, the fewer tests of that type we want in our suite.

The test pyramid

Automated and deterministic are quite objective concepts. But fast needs some reference to compare with.

Compared to manual testing, all (decent) automated tests are fast. However, some tests are considerably faster than others. There is also a correlation between how fast a test runs, its cost and the type of functionality that it covers. This is summarised in the so-called test pyramid.

The test pyramid is a simplistic representation of the proportion of each type of test that we usually want to have for our applications. Although it does not cover all the complexity inherent in test automation, it provides a good picture for an introduction like this.

A simplistic test pyramid. Yes, I drew it. And yes, my phone camera sucks

The main message is that the cheaper a type of test is to write and the more resilient it is to change, the more tests of that type we want. Usually, this means we want a pyramid composed of, from bottom to top, unit, integration and functional tests:

Unit tests: They cover the behaviour of a single unit of our code, hence the name. What "unit" means is not trivial, and it can have an important impact on the quality and consistency of our test suite. We will talk about this topic in the third post of this series.

Unit tests are blazingly fast to execute, taking just a few milliseconds each. A big suite should run in no more than 1-2 seconds. It is hard to speak of absolutes here, though, since metrics vary a lot between setups. If your unit tests take too long to run, consider reviewing what is happening there: maybe they are not unit tests at all!

They are at the bottom of the pyramid because they are the cheapest to write and the fastest to run. That is why we want lots of them: they are the quickest to provide feedback.
Integration tests: They cover how two different components of the system work together. The classic example is the interaction with a database, but other examples are interactions with another service that we own (e.g. a payments or permission management service) or with a vendor API (e.g. signing in with Facebook).

Integration tests are usually much slower than unit tests because they rely on some costly service being up and running (or, at least, costlier than a regular class in the same package), such as a database, a message broker or a web server.

We want fewer of them than unit tests because they are more time-consuming and because they cover a type of interaction that is less represented in the application. There should be more logic in the components themselves than in the interactions between them.
Functional tests: They mimic real end-user interactions with the system, whatever "end user" means (a human, a push notification triggering stuff, a hippopotamus, etc.), testing the system as a whole through one of its entry points (some UI, an API endpoint, etc.).

They are slow compared to the lower parts of the pyramid: a lot of services have to be set up just for a single test, and we have to wait for long flows to complete.

Because there are so many things happening, they require a lot of setup and they are prone to breaking for reasons that are hard to spot: some source of non-determinism that cannot be properly controlled, a button that no longer has a certain ID in the UI, and so on.

A good rule is to write functional tests only for the most important parts of the application, and not to cover edge cases with them. Functionally test what makes the money. This is sometimes called smoke testing.
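The three levels described above can be sketched side by side in Python. Everything here is hypothetical and made up for illustration: `apply_discount`, `PriceRepository` and `handle_request` are invented names, an in-memory SQLite database stands in for the "costly service", and a plain function stands in for a real entry point such as an API endpoint. A real functional test would go through the actual UI or HTTP layer and would be far slower than this one.

```python
import sqlite3
import unittest


def apply_discount(price_cents, percent):
    """Pure logic with no I/O, so it can be unit tested in isolation."""
    return price_cents - (price_cents * percent) // 100


class PriceRepository:
    """Talks to a database, so testing it needs a real connection."""

    def __init__(self, connection):
        self.connection = connection
        self.connection.execute(
            "CREATE TABLE IF NOT EXISTS prices (sku TEXT PRIMARY KEY, cents INTEGER)"
        )

    def save(self, sku, cents):
        self.connection.execute(
            "INSERT OR REPLACE INTO prices (sku, cents) VALUES (?, ?)", (sku, cents)
        )

    def find(self, sku):
        row = self.connection.execute(
            "SELECT cents FROM prices WHERE sku = ?", (sku,)
        ).fetchone()
        return None if row is None else row[0]


def handle_request(repository, sku, percent):
    """Stand-in for an entry point of the system (e.g. an API endpoint)."""
    cents = repository.find(sku)
    if cents is None:
        return {"status": 404}
    return {"status": 200, "cents": apply_discount(cents, percent)}


class UnitLevelTest(unittest.TestCase):
    def test_discount_is_subtracted(self):
        # One unit, no collaborators: this runs in well under a millisecond.
        self.assertEqual(apply_discount(1000, 25), 750)


class IntegrationLevelTest(unittest.TestCase):
    def test_saved_price_can_be_read_back(self):
        # Our code plus the database driver, working together.
        connection = sqlite3.connect(":memory:")
        self.addCleanup(connection.close)
        repository = PriceRepository(connection)
        repository.save("book", 1000)
        self.assertEqual(repository.find("book"), 1000)


class FunctionalLevelTest(unittest.TestCase):
    def test_discounted_price_is_served(self):
        # The whole flow, driven from the entry point: only the main path.
        connection = sqlite3.connect(":memory:")
        self.addCleanup(connection.close)
        repository = PriceRepository(connection)
        repository.save("book", 1000)
        response = handle_request(repository, "book", 25)
        self.assertEqual(response, {"status": 200, "cents": 750})
```

Note how the edge cases (rounding, odd percentages) would naturally live in more unit tests around `apply_discount`, while the functional level only checks that the main flow holds together.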

Summarising

Testing is about getting fast feedback on whether our application behaves as expected.
Manual testing is costly and not a reliable way to ensure that an application behaves as expected. That's why we want automated tests.
We must strive for our automated tests to be deterministic and fast to execute.
There are three main types of tests, from fastest and cheapest to slowest and most expensive: unit, integration and functional.

In the next post, we will see ways to improve how tests play a core role in our development workflow (because they do, right?).

Testing 101 is a longer and way more precise introduction. Worth a read. This post feels like a TL;DR version of it, actually.
The book Growing Object-Oriented Software, Guided by Tests by Steve Freeman and Nat Pryce, besides being amazing and focused on TDD, includes a great introduction to testing and object communication. It also spoils a big part of the next posts in this series!
The Practical Test Pyramid by Ham Vocke is a long and complete article about testing.
This test pyramids post by Alister Scott has plenty of pyramid graphs.

Top comments (3)

Jason C. McDonald • Jan 18 '19 • Edited

For whatever reason, I've never bought into TDD philosophy, although I'm glad it works for some people, and won't begrudge them for following it. I do test, quite a lot, actually, but my philosophy is a bit different: I write my tests to follow the abstraction layers of my application.

This means I'm probably testing important features four different ways, but if goal X works through layers A and B, but fails through layer C, then I know the problem is C. Technically, some TDD expert could go through and classify my tests as "integration" or "unit," but I think that overcomplicates things, and leaves me more vulnerable to compartmentalization. I just think, "if A does X, Y, Z (in any manner), I'll test X, Y, Z through A's interface. Then, if B does W, X, Y, Z, test those through B's interface." I ensure that what is supposed to work works, and what isn't supposed to work fails.

That also means that my current test suite takes about four seconds to run on one current Python project... but I have also caught things that typical unit testing would probably have missed. By the time I'm done, I foresee a runtime of somewhere around 30 seconds, but I come from C++, where I spend that much time or more just waiting for code to compile. I'm patient, and don't tend to skip things that take time.

Side note: my main beef with TDD per se is "write the tests, then the code," which is the opposite of how I'm hardwired to think. I figure out what I want to happen, and refine it as I work. I write the code and tests in tandem with one another, modifying both as my design changes. (These days, that takes place in a split editor.) That doesn't work for everyone, but it's served me well for many years.

Carlos Gándara • Jan 19 '19

In my experience, a test-first approach produces better code and tests (there will be a few words about it in the next posts of the series), but this does not have to work the same way for all of us. Not following TDD is a totally fair approach as long as your code ends up well tested.

However, TDD and the type of test you end up writing and running are not correlated. The type of a test, at least in the sense described in the post, is related to the type of interaction tested, no matter whether you wrote it before, after or along with the implementation.