
Matheus Mesquita


Delivering untested code with safety

Intro

A while ago, my team at B2W Digital, the biggest e-commerce company in Latin America, decided to remove the e2e testing step from one of our project's CI pipelines. Our tests, running on a Selenium-based architecture, were taking too long to complete and were harming our development cycle.

This project is the oldest one in our team; more than 10 developers have already made changes to its codebase, and more will help in the future. It is the core part of the entire e-commerce process, helping thousands of customers complete their shopping daily. We're talking about the checkout screen, a place where errors can make our company lose money.

Our decision to drop the e2e testing was a risky one. Moving the responsibility of testing to someone in the team, often our PM or team leader, created a dependency on how much we could trust the error-prone human doing it.

Without a doubt, we lost this battle many times, but in compensation, without the useless tests we had a greater delivery rate. That decreased the pressure we were under and let us address the other technical debts of the project, which was created at a time when SPAs were new and Angular 1 was starting to be a thing.

Looking back at our decisions, some questions came up.

  1. How important are e2e tests?
  2. How can we add e2e testing to our workflow without it being a burden to us?
  3. Which tool should we use for keeping our tests suite decoupled from the project's framework?

I'll try to express my opinions on those three questions in this post series. I hope you enjoy it. :)

How important are e2e tests?

Definitions

Before we dive into our discussion, let's give some meaning to our words so that everyone reads in the same language.

As Simon Stewart said in this blog post, naming test types is a hard job. There isn't a consensus in the software development community, and, most probably, everyone reading this post has heard multiple names for the same concept.

  • End-to-end tests are the ones that verify that all parts of our application are working together, from side to side. Here we test every aspect of the user interaction in our application, e.g. in a login page the user clicks on the email field, types in his email address, clicks on the password field, types in his password and clicks on the login button; the page sends a request to an API and the API answers with a 200 status code.
  • UI tests are similar to e2e (end-to-end) tests, but they exclusively test how our interface displays the data gathered from an API and handles the user interactions in the UI. Note that we don't necessarily use the real API's response; we can achieve the same result using test doubles (stubs and spies), e.g. in a login page, after focusing and inserting the values in our inputs, when the user clicks on the login button our spy ensures that the request was sent and our stub answers with the 200 status code. (There is a sketch of this right after this list.)
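
To make the UI-test flavor concrete, here is a minimal sketch of that login scenario. I'm assuming Cypress purely for illustration (which tool to pick is the subject of a later post in this series), and the /api/login route and the data-test selectors are made up:

```typescript
// UI test for the login example: the request is stubbed, so no real API
// is involved. Cypress, the route and the selectors are all assumptions.
describe('login page', () => {
  it('sends the login request and handles a 200 response', () => {
    // Stub: answer the login request with 200 without hitting a back end.
    cy.intercept('POST', '/api/login', {
      statusCode: 200,
      body: { token: 'fake-token' },
    }).as('login');

    cy.visit('/login');
    cy.get('[data-test=email]').type('user@example.com');
    cy.get('[data-test=password]').type('s3cret');
    cy.get('[data-test=submit]').click();

    // Spy: ensure the request was actually sent with the typed email.
    cy.wait('@login')
      .its('request.body')
      .should('deep.include', { email: 'user@example.com' });
  });
});
```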

With the definition cards visible to us, let's take a look at an old concept around testing.

The Good Old Test Pyramid

UI and e2e tests commonly overlap at the top of the Test Pyramid, which is a good place to start seeking an answer to the first question (How important are e2e tests?). Even though it doesn't show literally all the possible testing categories, it summarizes the most common ones used in modern software development, and that will be enough for us.

(Test Pyramid image from https://martinfowler.com/articles/practical-test-pyramid.html#TheTestPyramid)

As seen above, the pyramid creates the perfect analogy for how we should prioritize our test cycles, from bottom to top. We should rather test the smallest parts of our system, the units, which are more isolated, provide quick feedback and need fewer iterations, than the biggest one, the UI, which has a slower feedback cycle and is hard to craft because user interactions are inherently complex.

It easily shows that if we focus only on unit tests, our development cycle becomes faster, and in the agile world speed is usually preferred over quality.

(Testing pyramids image from https://watirmelon.blog/testing-pyramids/)

The worst scenario is when our pyramid meets Stranger Things and turns upside down. When this anti-pattern happens, our application starts playing against us. E2E and UI tests have slower execution times, making our entire CI/CD process slow down and impacting our team's productivity directly. Besides, depending on how you implemented your tests, they can prove to be a nest of false positives, removing all the confidence your team had in them when you started developing.

Writing e2e and UI tests isn't an easy task, but it is easy to lose control over your testing flows and create a gigantic technical debt in the worst place possible: the place that was supposed to bring safety to our application, our test tasks.

This was the scenario we experienced, to the point that we simply removed the tests, because fixing them wouldn't be valuable enough in the project's current condition.

Are unit tests enough?

So, with all those downsides, it looks like we already have our answer about the highest layer of testing, right? Let's just drop it!!!

Problem solved.

(100% code coverage comic from https://victor.kropp.name/blog/100-percent-code-coverage/)

No! Every time you think unit tests are enough, you've just made your application an unsafe place. :(

How can we add e2e testing to our workflow without it being a burden to us?

Let's take a look at the problem from another point of view.

When developing software, we should always aim to have orthogonal systems; keeping the parts of our application decoupled brings safety and predictability. Taking this statement as an axiom, we should reevaluate both alternatives for the top of our pyramid.

End-to-End Testing

As mentioned in Definitions, e2e testing exercises our application from side to side, including the API's responses and complex steps to reproduce a specific scenario, like those edge cases where the state created by the user was only possible after a meaningful number of different interactions in our UI.

Does anything sound weird to you? If you said yes, you are correct. Having a test in a front-end project verify a real API response couples your assertion to whether that API is up or not, and if the API goes offline, your test job will fail with a false positive caused by a problem that your current project/code has nothing to do with. To be clear, in this scenario your UI test would break locally while the actual problem is located in another system, the API.

The front-end test suite shouldn't depend on the back-end environment. Although the back end is extremely necessary in production, for testing purposes we can stub the responses and trust that the team responsible for the API will deliver the data correctly to our front-end application and guarantee its uptime.
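
As a minimal sketch of that idea, the test below stubs the global fetch so the suite passes even if the real API is offline. I'm assuming Jest and a runtime with a global fetch (Node 18+); loadCheckout and the /api/checkout route are hypothetical:

```typescript
// Hypothetical data-layer function under test.
async function loadCheckout(): Promise<{ total: number }> {
  const res = await fetch('/api/checkout');
  return res.json();
}

test('the front-end suite does not depend on the back end being up', async () => {
  // Stub: fetch resolves with canned data, so an offline API cannot
  // turn this test into a false positive.
  const fetchSpy = jest
    .spyOn(globalThis, 'fetch')
    .mockResolvedValue(new Response(JSON.stringify({ total: 42 }), { status: 200 }));

  const data = await loadCheckout();

  expect(fetchSpy).toHaveBeenCalledWith('/api/checkout');
  expect(data.total).toBe(42);
});
```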

From this perspective, e2e testing doesn't look like a good approach for modern front-end applications, since our view layer is truly decoupled from our data layer, sometimes even deployed in a different environment.

UI Testing

What really matters to a front-end application is testing whether the behavior of our components is correct, whether the data displayed is in the right place and whether our actions are called on user input.

UI testing addresses exactly those points, no more, no less. Stubbing our requests and having access to our state, we can imperatively enforce scenarios that a normal user session (like the ones we would implement in e2e tests) would take seconds or even minutes to reach because of the interaction with our APIs.
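
For example, here is a sketch of imperatively forcing a state that would otherwise take a long chain of interactions to reach. Again I'm assuming Cypress, and the route, the response body and the localStorage convention are all invented for illustration:

```typescript
// Jump straight to a "cart with 3 items" state instead of clicking
// through the whole shopping journey. Everything named here is assumed.
it('shows the payment step for a cart with three items', () => {
  // Stub the cart endpoint with a canned response.
  cy.intercept('GET', '/api/cart', {
    statusCode: 200,
    body: { items: ['book', 'pen', 'mug'] },
  });

  cy.visit('/checkout', {
    onBeforeLoad(win) {
      // Hypothetical convention: the app reads the cart from localStorage.
      win.localStorage.setItem('cart', JSON.stringify(['book', 'pen', 'mug']));
    },
  });

  cy.contains('Payment');
});
```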

Which parts should we test?

The correct amount of tests

It's hard to choose which scenarios we must test. Remember from Pareto's Principle that the first 20% of our effort will probably cover 80% of the scenarios worth testing, while the remaining scenarios will consume the other 80% of our effort, hence being counter-productive, not to mention that growing our coverage won't guarantee more quality.

Note, I'm not telling you that 20% is enough; it depends on your project, and only you and your team can decide the right amount of coverage to target. Just keep in mind that numbers bigger than 40% are high, and maybe you should reevaluate what you are testing.

For the task of choosing which parts to test I have some suggestions.

Avoid duplication

Although there are applications where a big number of interactions is necessary, we should check whether this is a problem of our tests' design; it easily happens with test duplication, e.g. you have two tests reproducing the login interaction before reaching internal pages. Remember, the DRY principle should also be applied to our tests: if you have a behavior that must be reproduced in multiple scenarios, create a separate module for it and import it in the necessary places, as in the sketch below.
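
A minimal sketch of such a shared module, again assuming Cypress and reusing the hypothetical selectors from earlier:

```typescript
// support/login.ts: the login flow lives in one place; every spec that
// needs an authenticated user imports it instead of duplicating the steps.
export function login(email: string, password: string): void {
  cy.intercept('POST', '/api/login', { statusCode: 200 }).as('login');
  cy.visit('/login');
  cy.get('[data-test=email]').type(email);
  cy.get('[data-test=password]').type(password);
  cy.get('[data-test=submit]').click();
  cy.wait('@login');
}
```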

Furthermore, when you duplicate your tests, you create a hard job for some future coworker, or even for yourself: when the behavior of that specific part changes, the effort to correct those tests is multiplied by the number of times it was replicated in your test base.

If it is hard for you, it is most probably hard for them too

There is no reason for a scenario that only 1% of your users go through to break your application's pipeline. Sure, you want to have 100% scenario coverage; what better way to guarantee safety than testing everything, am I right? No! You shouldn't be focusing on that, potentially creating incidental coverage in your project. 100% coverage should be a by-product of testing your specifications well or of developing just enough software in a TDD cycle.

With that in mind, we should focus on the core features of our product; taking an e-commerce application as an example, the user must be able to finalize an order.

A new hope

Ok, chasing one hundred percent test coverage is an anti-pattern, but wouldn't it be wonderful if something warned us about our errors before someone came to our desk, or e-mailed us, to say that something that used to work stopped working after the last feature update?

Of course, it shouldn't happen in well-designed software, but sometimes you weren't around during the early stages of the software you are working on right now, and bad decisions made by you or your coworkers led to a state where a change in a module that was supposed to be localized triggers an unexpected behavior in another place of the application.

Analyzing the root cause of this problem, we will be faced with technical debt, which has its ways of being "paid back". But as my experience with legacy projects has shown me, even when paying it back we often see buggy behaviors that were previously fixed come back to haunt us because some change has just triggered them again.

Was it a problem in our fix? Or did someone just unfix our fix?

I would guess neither. We would again be looking at the wrong place when searching for someone to blame; looking for culprits will not make our project better nor fix the bug. It will bring only new problems and can be harmful to the relationships between your team members and with the other teams in your company.

UI Tests as part of your development cycle

Keep your friends close, but your enemies closer - Michael Corleone

We learned from the previous topics that testing every UI aspect of our application can be harmful enough to make us paranoid about our test suite; it can also be counter-productive, harming our entire development process, slowing our deliveries, demotivating our days and stalling our workflow. However, just dropping our tests removes any safety we had, making us deliver lower-quality products and decreasing the company's trust in our product, with new bugs popping up from every deployment we make.

I'm proposing here that you should keep aiming at 100% coverage for your UI tests while splitting core from non-core scenarios. A text divergence shouldn't stop your pipeline, but a missing essential button should.

After splitting your two test suites, which I'll discuss how to do in the next post, you can make the core suite a mandatory step of your CI cycle, while the other runs after your code is shipped to production, in a post-production step. The reason for this is quite simple: if our non-core scenarios take, say, longer than 30 minutes to run, in the worst case we will be warned 30 minutes after shipping and will be able to roll back to the last stable deploy before anyone reaches us about the features we broke with the last deployment.
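
As a rough sketch of the split (the next post covers the real setup), two Cypress config files could point at separate spec folders. Everything here, from Cypress itself to the paths, is an assumption:

```typescript
// cypress.core.config.ts: the mandatory suite that blocks the pipeline.
import { defineConfig } from 'cypress';

export default defineConfig({
  e2e: {
    // Core scenarios only, e.g. "the user must finalize an order".
    specPattern: 'cypress/e2e/core/**/*.cy.ts',
  },
});

// A cypress.non-core.config.ts twin would point at cypress/e2e/non-core/**
// and run in a post-production CI job via:
//   cypress run --config-file cypress.non-core.config.ts
```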

After fixing a bug, whether warned or reported, remember to create a new UI test to prevent it from happening again. Don't stop delivering features just to map scenarios from your project; it should happen progressively if you follow the boy scout rule: always leave the code you found a little better. By doing so, your coworkers, clients and stakeholders will thank you, and you will be able to deliver "buggy code" without causing big harm or slowing down your workflow.

Looking Ahead

In the next post of this series, I'm going to discuss which tools we can use to keep our business logic decoupled from the tools we use to accomplish it in our project. I'll also explain how you can do it in GitLab CI, an awesome open-source tool that we use at my company to create CI/CD pipelines.

I hope you enjoyed it; it was a pleasure for me to write after so long. I used to have a blog but stopped writing when impostor syndrome hit me and made me frustrated about the stuff I used to write about. So thank you for your time and patience reading this article. If you have any question or criticism, or just want to say hello, I'll be happy to talk in the comments section.

If you want to be notified when the next post comes up, just follow me and dev.to will do the rest. Thank you again and have a good week. :)

References

https://en.wikipedia.org/wiki/Human_error
https://martinfowler.com/articles/practical-test-pyramid.html#UiTests
https://testing.googleblog.com/2010/12/test-sizes.html
https://dzone.com/articles/should-you-aim-for-100-percent-test-coverage
https://betterexplained.com/articles/understanding-the-pareto-principle-the-8020-rule/
https://jasonrudolph.com/blog/2008/06/17/testing-anti-patterns-incidental-coverage/
https://en.wikipedia.org/wiki/Test-driven_development
https://en.wikipedia.org/wiki/Don't_repeat_yourself
https://dev.to/caiorcferreira/using-technical-debt-as-your-next-tool-1bp6
https://medium.com/@biratkirat/step-8-the-boy-scout-rule-robert-c-martin-uncle-bob-9ac839778385
https://www.youtube.com/watch?v=DfHJDLoGInM
https://about.gitlab.com/product/continuous-integration/
