Saving time with contract testing

Nicola Apicella
Software dev engineer at AWS. Java, js and container enthusiast. Love automation in general. Opinions are my own.
・5 min read

Hi everyone! Today's topic is Contract driven testing.

The way we spend our time is important.
It is important in our life and it is important for the company we work with.
I believe it's in our best interest to minimize the time we spend on unproductive activities and maximize the time we spend building new ideas.

Ok, this is pretty general and someone would even say it's just common sense. What does it have to do with Contract testing?

Here I am gonna argue that in many circumstances, we can write unit tests instead of writing an end to end test or doing manual tests.
We can and should speed up the feedback loop (the amount of time needed to build confidence that the code we wrote actually works as expected), by testing the integration of different components as part of the unit tests suite.

Contract testing 101

When two entities communicate, changes to the supplier’s API can cause failures for all its consumers.

I partially touched on this topic when I discussed the Robustness Principle. Indeed, I think contract testing is a way to make our services more robust without giving up development speed.

What do we do about it? We write integration tests/end to end tests/manual tests.
It is unquestionable that those kinds of tests help catch bugs before they end up in production, but there are cons.

  1. Running them requires setting up infrastructure, deploying the application, etc. They are way slower than unit tests, since this is where we call other services, make network calls and use the database. Because they are slow and need everything in place, we cannot run them as often as we run unit tests.
  2. An implication of the first point is that they lengthen the dev feedback loop.
  3. Since we share the same pipeline with other developers, a failing integration test does not necessarily mean our change broke the pipeline, so more time is needed to investigate what happened.

Before looking into a concrete example of contract tests, let's take a look at the test pyramid.

Test pyramid

The pyramid is a graphical representation of how much testing of each type we should be doing. At the bottom we have unit tests which means in proportion we should write more unit tests than anything else:

  • they make sure our code runs correctly when it runs in isolation
  • they are easy to write and fast to run

We write lots of them and execute them every time we make a code change to the codebase or when we update one of our dependencies.

Based on what I said about integration tests and end to end tests, it's not a big surprise that they have been placed at the top of the pyramid.


Let's see a concrete example of using contract tests instead of an end to end test.

Example 1

Context: client to service communication
Scenario: A ReactJs application which manages a ToDo list for our users. The ToDo list is serialized and sent to a server, which stores the information to S3.
What we want to test: no change in the code causes a regression in the system, namely we are still able to deserialize the ToDo list we receive from the server and show it in a React component.

This is what the ToDo list might look like:

export class TodoList {
    items: Item[]

    constructor(items: Item[] = []) {
        this.items = items;
    }

    // ......
}

// And this is an item of our TodoList
export class Item {
    constructor(public description: string = '',
                public isChecked: boolean = false) {}
}

Somewhere in the code we make an http request to get the TodoList, deserialize it and update the state of the view.
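
A minimal sketch of that deserialization step, assuming the server answers with JSON shaped like `{ "items": [{ "description": "...", "isChecked": true }] }`. The `parseTodoList` helper is hypothetical (it is not part of the original code), and the two classes are repeated only so the snippet stands alone:

```typescript
class Item {
    constructor(public description: string = '',
                public isChecked: boolean = false) {}
}

class TodoList {
    items: Item[];

    constructor(items: Item[] = []) {
        this.items = items;
    }
}

// Hypothetical helper: turns the raw response body into a TodoList.
// Throws if the payload does not have the assumed shape.
function parseTodoList(body: string): TodoList {
    const raw: unknown = JSON.parse(body);
    if (typeof raw !== 'object' || raw === null) {
        throw new Error('Unexpected ToDo list payload');
    }
    const rawItems = (raw as { items?: unknown }).items;
    if (!Array.isArray(rawItems)) {
        throw new Error('Unexpected ToDo list payload');
    }
    const items = rawItems.map((entry: unknown) => {
        if (typeof entry !== 'object' || entry === null) {
            throw new Error('Unexpected ToDo item payload');
        }
        const candidate = entry as { description?: unknown; isChecked?: unknown };
        if (typeof candidate.description !== 'string' || typeof candidate.isChecked !== 'boolean') {
            throw new Error('Unexpected ToDo item payload');
        }
        return new Item(candidate.description, candidate.isChecked);
    });
    return new TodoList(items);
}

const todoList = parseTodoList('{"items":[{"description":"contract tests","isChecked":true}]}');
```

This is the step a server-side change can silently break, which is why it is the one we want covered by a fast test.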

Approach 1 (Bad)

We could write an end to end test which:

  • opens the browser (using protractor, selenium or similar)
  • requests the React application
  • adds some items to the ToDo list
  • saves the ToDo list
  • fetches the ToDo list again
  • asserts that the information in the ToDo widget is displayed correctly and no other error occurred.

This is exactly the kind of test we want to avoid writing; it is slow and fragile.
We can avoid the end to end tests by using contract testing.

Approach 2 (Good)

First let's create a contract for our ToDo list.
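Tools we are going to use: typescript-json-schema to generate the contract, a JSON Schema validator to check objects against it, and a test runner.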

Let's define a utility function which creates the contract the first time we run the test:

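const getOrCreateContract = (instance, filename) => {
    if (schemaDoesNotExist(filename)) {
        // TJS comes from the Typescript-json-schema lib
        const program = TJS.getProgramFromFiles([resolve(filename)], {}, basePath);
        // generateSchema expects the type name as a string, so derive it from the instance
        const schema = TJS.generateSchema(program, instance.constructor.name, settings);
        // Assumption: saveSchema takes the schema to persist as its last argument
        saveSchema(CONTRACT_FOLDER, filename, schema);

        return schema;
    }

    return getSchema(CONTRACT_FOLDER, filename);
};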

The contract generated for our ToDo list looks like this:

{
    "$schema": "",
    "definitions": {
        "Item": {
            "properties": {
                "description": {
                    "default": "",
                    "type": "string"
                },
                "isChecked": {
                    "default": false,
                    "type": "boolean"
                }
            },
            "type": "object"
        }
    },
    "properties": {
        "items": {
            "items": {
                "$ref": "#/definitions/Item"
            },
            "type": "array"
        }
    },
    "type": "object"
}
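
To get a feel for what this contract accepts and rejects, here is a rough, hand-rolled sketch. The `violations` function and the `ContractSchema` type are hypothetical and only understand the four keywords that appear in the contract (`type`, `properties`, `items`, `$ref`); it is not a replacement for a real JSON Schema validator. The `$schema` and `default` entries are left out because they do not affect validation here.

```typescript
// Subset of JSON Schema used by the ToDo list contract.
interface ContractSchema {
    type?: 'object' | 'array' | 'string' | 'boolean';
    properties?: Record<string, ContractSchema>;
    items?: ContractSchema;
    $ref?: string;
    definitions?: Record<string, ContractSchema>;
}

const contract: ContractSchema = {
    definitions: {
        Item: {
            type: 'object',
            properties: {
                description: { type: 'string' },
                isChecked: { type: 'boolean' },
            },
        },
    },
    type: 'object',
    properties: {
        items: { type: 'array', items: { $ref: '#/definitions/Item' } },
    },
};

// Returns human-readable violations; an empty list means the value respects the contract.
function violations(value: unknown, schema: ContractSchema,
                    root: ContractSchema = schema, path: string = '$'): string[] {
    if (schema.$ref !== undefined) {
        const name = schema.$ref.replace('#/definitions/', '');
        const target = root.definitions ? root.definitions[name] : undefined;
        return target ? violations(value, target, root, path)
                      : [`${path}: unknown reference ${schema.$ref}`];
    }
    const actual = Array.isArray(value) ? 'array' : value === null ? 'null' : typeof value;
    if (schema.type !== undefined && actual !== schema.type) {
        return [`${path}: expected ${schema.type}, got ${actual}`];
    }
    const found: string[] = [];
    if (schema.properties && actual === 'object') {
        const record = value as Record<string, unknown>;
        for (const key of Object.keys(schema.properties)) {
            // As in JSON Schema, an absent property is not a violation unless it is listed in `required`.
            if (key in record) {
                found.push(...violations(record[key], schema.properties[key], root, `${path}.${key}`));
            }
        }
    }
    const itemSchema = schema.items;
    if (itemSchema && Array.isArray(value)) {
        value.forEach((entry: unknown, index: number) => {
            found.push(...violations(entry, itemSchema, root, `${path}[${index}]`));
        });
    }
    return found;
}

const ok = violations({ items: [{ description: 'contract tests', isChecked: true }] }, contract);
const broken = violations({ items: [{ description: 'contract tests', isChecked: 'yes' }] }, contract);
// ok is empty; broken is ['$.items[0].isChecked: expected boolean, got string']
```

Note that the contract as shown has no `required` list, so under JSON Schema rules a missing field would pass; what it does catch is a field whose type changes.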

Now, let's write the contract test:

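describe('ToDo List', () => {
    test('respect contract', () => {
        let todo = new TodoList([
            new Item('contract tests', true)
        ]);

        let contract = getOrCreateContract(todo, 'TodoList.schema.json');
        let contractValidator = new Validator();
        let respectContract = () => {
            // Assumption: Validator comes from the jsonschema library, whose validate()
            // only throws on a violation when { throwError: true } is passed.
            contractValidator.validate(todo, contract, { throwError: true });
        };

        // Fail the test if the object no longer matches the stored contract
        expect(respectContract).not.toThrow();
    });
});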

This test gives us exactly the same confidence as we might get from the end to end test above, but it's way faster and does not need to communicate with the real dependencies.

Obviously, there are cases in which we want to update the contract. For example we could add a command line argument to override the existing contract.
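
One possible shape for such an override, as a sketch only. The flag name `--update-contracts`, the `getOrUpdateContract` function and the in-memory `storedContracts` object are all hypothetical; the in-memory store stands in for the contract folder so that the snippet runs on its own.

```typescript
// In-memory stand-in for the contract folder (assumption, keeps the sketch self-contained).
const storedContracts: Record<string, string> = {};

// Hypothetical flag name used to request regeneration of stored contracts.
const UPDATE_FLAG = '--update-contracts';

function shouldUpdateContracts(argv: string[]): boolean {
    return argv.indexOf(UPDATE_FLAG) !== -1;
}

// Returns the stored contract, or (re)generates and stores it when it is
// missing or when the update flag is present in argv.
function getOrUpdateContract(name: string, generate: () => object, argv: string[]): object {
    const existing = storedContracts[name];
    if (existing !== undefined && !shouldUpdateContracts(argv)) {
        return JSON.parse(existing);
    }
    const fresh = generate();
    storedContracts[name] = JSON.stringify(fresh);
    return fresh;
}

const first = getOrUpdateContract('TodoList', () => ({ type: 'object' }), []);
const cached = getOrUpdateContract('TodoList', () => ({ type: 'array' }), []);
const updated = getOrUpdateContract('TodoList', () => ({ type: 'array' }), [UPDATE_FLAG]);
// cached still holds the first contract; updated holds the regenerated one
```

In a real run the argument list would come from the process arguments or from an environment variable, depending on what the test runner passes through.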

Example 2

Context: service to service communication

I figured the post is getting a bit too long. Service to service communication requires introducing a few more concepts, so I'll write that example in a follow up post.


Integration tests have an important place in the test pyramid, but sometimes we overuse them.
Contract tests can save us time!

If you are interested in the topic, let me know!
This will help me understand if I should write a follow up post with service to service communication.


Discussion (2)

Jilles van Gurp

Put differently, in the test pyramid, your most valuable tests are at the top and bottom. I tend to emphasize making the tests at the top count because they are expensive. So I focus on breadth and scope of those tests to cover as much as possible of the scenarios I'm interested in. With unit tests I want to cover logic. Anything that looks like an algorithm: unit test it. Regular expressions? Unit test those. Complex if/else? Unit test it. You can run thousands of unit tests in seconds and they tend to be a few lines of code at most.

People use confusing terminology like integration tests to mean lots of different things. The strict interpretation is simply: "not a unit test", where unit tests are strictly defined as tests that cover small units like procedures, individual class methods, algorithms, lambda functions, etc. This however is not very useful because there are many shades of grey when it comes to integration tests.

A more useful distinction is white box and black box testing. Unit tests are a good example of pure white box tests: they are tied to and based on the implementation. You want every code path accounted for and you focus on covering relevant ranges of parameters and any exceptions that might spoil your day. The goal is catching logic bugs. Given that complexity of each unit is low (i.e. your design doesn't suck), you can easily cover all relevant code paths. The flip side is you don't test how units interact (cue endless amounts of gifs where unit tests passed but integration tests did not).

Integration tests near the bottom of the pyramid also tend to be white box tests. They tend to be confused with unit tests even by less experienced developers (not the same thing). Simply put, if it writes to a database and you are poking around in it, it's a white box test. If it relies on different parts of your code base and you are stubbing or mocking some bits out, it's a white box test. A white box test gets you poor coverage (combinatorial explosion of code paths and parameters) and at the same time poorly reflects what your users will do (use contractually defined APIs to interact with your system).

Also, whitebox tests are deeply tied to your implementation. Meaning that if you refactor, your whitebox tests will need attention. Having a lot of them can become an obstacle for that. I've seen projects where refactoring was infeasible because any meaningful change would require hundreds of tests to be refactored. Bad stuff. Avoid such tests and minimize their use. If you have a lot of whitebox tests that look like developers are copy pasting bits of tests: you have a huge problem in the form of technical debt.

Contract and e2e tests are typically black box tests. You test the system against published APIs (aka its contract), typically from the outside, and preferably in a way that is similar or close to how this will be done in a production deployment. The big difference is whether you stub out other microservices, infrastructure components and third party stuff or not. In a way contract testing is similar to microservice level unit testing.

It's useful but it doesn't replace e2e or scenario oriented (or manual) testing where you verify your constellation of microservices actually gets a job done end to end for the user. Cue the aforementioned gifs; exactly the same problems.

For me the benchmark in this is testing things like user signup flows. This typically involves writing to databases, interacting with queues, external mail services, SMS, and time sensitive things like tokens. You might have all your contract tests passing and still have users unable to sign up because something went wrong with external services, something timed out, got firewalled off, etc. You cannot discover this with just contract tests. E2e tests are the only automated way you can ensure this keeps working. Essentially all projects I've been on that had a user signup flow at some point ended up with repeated production issues related to users not being able to get into the system. In some of these projects we ended up with browser based tests (selenium) and/or deep e2e tests against realistically configured systems designed to mimic as closely as possible the sequence of events in real life. E2e tests are complex and expensive; you don't want to have a lot of them but you typically need some.

Nicola Apicella Author • Edited

Great overview of the test pyramids :)
I completely share your thoughts about the need for some e2e tests which cover all the layers of the system.
Having e2e is important to make sure all the pieces work nicely together (configs, certificates, throttling, etc).
This is definitely something you can only test by running the whole stack.

In many cases though, e2e and integration tests are overused to compensate for what seems to be a lack of abstraction in the system.

In other cases, it's about not being accustomed to a richer set of testing strategies which can speed up the development process when used in the right context.
For example, I am pretty sure that lots of integration tests I wrote a couple of years ago could've been contract tests XD