Corentin Doue for Serverless By Theodo

5 control points to implement serverless integration tests like a boss 👔

Serverless applications are composed of many services interacting with one another, far more than traditional web applications.

Service configurations replace some application code. Parts of the application are black boxes managed by a cloud provider. These configurations and their effects can only be tested with integration tests.

That's why it is crucial for all serverless applications to have relevant integration test coverage.

Without integration tests, you expose your application to bugs that are hard to fix because they come not from your application code but from a misconfiguration.

🤔 Intent

There is no single perfect way to implement integration tests. It depends on the context, such as the application or the team.

But after 2 years of experimenting with serverless integration tests on several projects, I have seen some patterns emerge. I summed them up in 5 control points.

I will illustrate each one of them with OK ✅ and KO ❌ real-life examples. Those examples are meant to help you form your own opinion on the subject and implement your own integration tests according to your context.

🎯 The 5 controls points

  1. Write one test case per integration ⛓️
  2. Test deployed resources πŸš€
  3. Keep the feedback loop below 1 min ♻️
  4. Don’t create side effects πŸ’₯
  5. Run integration tests in CI before merge 🚦

Let's detail them.

1. Write one test case per integration ⛓️

To ensure good coverage without writing too many tests, determine which services communicate with each other and test every interaction once.

✍️ Real-life example

Let's take an async lambda triggered by SQS which gets an item from DynamoDB and then puts a generated file in an S3 bucket.

[Diagram: lambda triggered by SQS, interacting with DynamoDB and S3]

There are 5 configurations to test:

  • invoke: SQS ➡️ Lambda
  • getItem: Lambda ➡️ DynamoDB
  • putObject: Lambda ➡️ S3
  • retry: SQS ➡️ Lambda
  • onMaxRetry: SQS ➡️ DLQ

❌ Bad

Forgetting to test that a malformed message ends up in the DLQ

⇒ ❌ The retry and onMaxRetry configurations are not tested.

Testing that different files are generated if you put different fixtures in the table

⇒ The first test already covers invoke, getItem and putObject.

⇒ ❌ Subsequent tests add no value because they test the same configurations again, and an integration test shouldn't test all the business logic.

⇒ ❌ Moreover, it slows down your test phase because integration tests are slower than unit tests. If you write 10 integration test cases per lambda when only 2 are required, your whole test suite will be 5 times slower than it could be.

💡 You should use unit tests to test the business logic.

✅ Good

Two test cases:

  • A first test with a fixture item in the table, a real event that triggers the lambda, and an assertion on the object creation in S3

⇒ ✅ Test invoke & getItem & putObject

  • A malformed event that triggers the lambda and ends up in the DLQ after 3 retries

⇒ ✅ Test retry & onMaxRetry
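
One way to keep this discipline visible is to list the configurations next to the test cases that exercise them, then check that each configuration is covered exactly once. The sketch below is only illustrative: the test case names and the `coverageReport` helper are hypothetical and not part of any library.

```typescript
// Configurations of the example lambda (names taken from the list above).
type Configuration = "invoke" | "getItem" | "putObject" | "retry" | "onMaxRetry";

interface IntegrationTestCase {
  name: string; // hypothetical test name
  covers: Configuration[];
}

const configurations: Configuration[] = ["invoke", "getItem", "putObject", "retry", "onMaxRetry"];

// The two test cases of the "Good" example.
const testCases: IntegrationTestCase[] = [
  { name: "generates a file from a fixture item", covers: ["invoke", "getItem", "putObject"] },
  { name: "sends a malformed event to the DLQ", covers: ["retry", "onMaxRetry"] },
];

// Lists the configurations exercised by no test case or by more than one.
function coverageReport(
  configs: Configuration[],
  cases: IntegrationTestCase[],
): { untested: Configuration[]; redundant: Configuration[] } {
  const untested: Configuration[] = [];
  const redundant: Configuration[] = [];
  configs.forEach((config) => {
    const count = cases.filter((testCase) => testCase.covers.indexOf(config) !== -1).length;
    if (count === 0) untested.push(config);
    if (count > 1) redundant.push(config);
  });
  return { untested, redundant };
}

const report = coverageReport(configurations, testCases);
console.log(report); // both lists are empty for the two test cases above
```

An untested entry points to a missing integration test; a redundant entry usually points to business logic that belongs in a unit test.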


2. Test deployed resources 🚀

To test the effect of a configuration on a service, the service must be deployed in real conditions.

✍️ Real-life example

❌ Bad

Mock / emulate a service locally

[Diagram: integration tests with mocked services]

⇒ ❌ Open source mocks can't perfectly reproduce the behavior of a cloud provider service or keep up with its new features.

Import and execute a lambda handler locally in the test

[Diagram: integration tests with local execution]

⇒ ❌ The upstream event is not tested.

⇒ ❌ The lambda configuration is not tested.


✅ Good

Deploy your resources in a real stack and let your test code interact with the interfaces.

[Diagram: integration tests against deployed resources]

⇒ ✅ The configuration of the upstream service is tested.

⇒ ✅ The configuration of the lambda is tested.

⇒ ✅ The code of the lambda is tested.

⇒ ✅ The configurations of the downstream services are tested.

💡 For lambdas interacting with external services, use the real services if possible. If they are unstable or too expensive, consider replacing the URL of the external service with the URL of a mock.
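
Swapping the URL can be as simple as reading it from the lambda's environment. Here is a minimal sketch, assuming a hypothetical `PAYMENT_API_URL` variable set per stack by your infrastructure as code; the variable name, the URLs and the helper are all illustrative.

```typescript
// Hypothetical environment shape: in a lambda this would be process.env.
type Env = { [key: string]: string | undefined };

// Returns the base URL of the external service for the current stack.
// Failing fast on a missing variable surfaces the misconfiguration in the test itself.
function externalServiceUrl(env: Env): string {
  const url = env["PAYMENT_API_URL"];
  if (url === undefined || url === "") {
    throw new Error("PAYMENT_API_URL is not configured for this stack");
  }
  return url.replace(/\/+$/, ""); // normalize trailing slashes
}

// Test stack: points to a mock you deploy and control (illustrative URL).
const testStackEnv: Env = { PAYMENT_API_URL: "https://mock.example.com/payments/" };
// Staging or production stack: points to the real service (illustrative URL).
const realStackEnv: Env = { PAYMENT_API_URL: "https://api.example.com/payments" };

console.log(externalServiceUrl(testStackEnv)); // https://mock.example.com/payments
console.log(externalServiceUrl(realStackEnv)); // https://api.example.com/payments
```

The lambda code stays identical in every stack; only the configuration changes, which is what the integration test is meant to exercise.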

3. Keep the feedback loop below 1 min ♻️

Tests must be easy to write and debug if you want your team to write and maintain them.

A new change must be deployed as quickly as possible to be tested without delay.

A test must be quick and painless to run.

✍️ Real-life example

❌ Bad

Use separate stacks for e2e and integration testing

[Diagram: split test and dev stacks]

⇒ ❌ Keeping two stacks synced is a pain for developers.

💡 This doesn't mean you should use the same resources for your integration tests and your e2e usage. But they should be in the same stack to synchronize the deployment.

Trigger a full deployment between two development iterations

For example, use sls deploy each time you change your code.

⇒ ❌ Those deployments are too long to iterate quickly.


✅ Good

Use real-time deployment tools

Serverless frameworks provide tools to keep your dev stack synced with your code in real time:

  • ✅ sst start for SST
  • ✅ cdk watch for CDK
  • ⚠️ sls deploy -f myFunctionName for Serverless Framework

ℹ️ sls deploy -f requires you to manually trigger the deployment after each change

4. Don't create side effects 💥

To run your tests in parallel and without breaking your development environment, you need to isolate the tests from the rest of the app.

✍️ Real-life example

❌ Bad - Code Level

Require an empty state to run

// ❌ Requires running the tests synchronously
beforeEach(async () => truncateAllData());

it('creates a user', async () => {
  await axios.post(`${URL}/users`, user);

  const users = await User.findAll();
  // ❌ Breaks if another user is in the database
  expect(users).toEqual([user]);
});

✅ Good - Code Level

Create isolated data

import axios from 'axios';
import { v4 as uuidV4 } from 'uuid';

const userId = uuidV4(); // ✅ No collision
const user = { userId, /* ... */ };

afterEach(async () => deleteUser(userId)); // ✅ Delete only test data

it('creates a user', async () => {
  await axios.post(`${URL}/users`, user);

  const users = await User.findAll();
  // ✅ Another user from another test or e2e data can be in the list
  // without impacting the test result
  expect(users).toContainEqual(user);
});

❌ Bad - Infrastructure level

Let async test events propagate and break downstream components in event driven architectures

ℹ️ This only concerns event driven architectures

event driven architecture with side effects 1

β‡’ Test 1 will create events that trigger the lambda 2 & 3 in an uncontrolled context.

β‡’ ❌ If want to control the context of lambda 2 & 3 while testing lambda 1, the complexity explodes with the size of the async flow 🀯. It becomes an e2e test, that's not what we want here.

β‡’ ❌ If you let the events propagate, you will end with lots of β€œnormal” errors that will prevent you to detect real errors

event driven architecture with side effects 2 β‡’ ❌ If the async flow can affect upstream data, race conditions will appear in the tests


✅ Good - Infrastructure level

Break the async flows in event-driven architectures to isolate tests

ℹ️ This only concerns event-driven architectures

[Diagram: event-driven architecture with duplicated resources]

⇒ Duplicate the instances of the event services and plug each lambda into the right services through environment variables.

⇒ ✅ The side effects created by a lambda are not propagated to the other lambdas.
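
One possible wiring is sketched below. It assumes each event bus exists twice in the stack and that the handler reads the name of its output bus from an environment variable; the naming scheme, the `OUTPUT_EVENT_BUS_NAME` variable and both helpers are hypothetical.

```typescript
// Hypothetical resource names: each event service exists twice in the stack,
// once for the application flow and once for isolated integration tests.
interface EventBusPair {
  app: string;
  test: string;
}

type Wiring = "app" | "test";

function eventBusPair(stackName: string, busName: string): EventBusPair {
  return {
    app: `${stackName}-${busName}`,
    test: `${stackName}-${busName}-test`,
  };
}

// Environment variables injected into a lambda at deploy time.
// OUTPUT_EVENT_BUS_NAME is an assumed variable that the handler reads before publishing.
function lambdaEnvironment(pair: EventBusPair, wiring: Wiring): { [key: string]: string } {
  return { OUTPUT_EVENT_BUS_NAME: wiring === "test" ? pair.test : pair.app };
}

const orders = eventBusPair("pr-42", "orders");
// The lambda under test publishes to the duplicated bus, which only the test listens to...
console.log(lambdaEnvironment(orders, "test")); // { OUTPUT_EVENT_BUS_NAME: 'pr-42-orders-test' }
// ...while the application flow keeps targeting the bus that triggers lambdas 2 and 3.
console.log(lambdaEnvironment(orders, "app")); // { OUTPUT_EVENT_BUS_NAME: 'pr-42-orders' }
```

The handler code never knows whether it runs in a test: isolation comes from configuration only, so no production code has to filter out test events.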

⚠️ This can be a little scary but the other alternatives are: letting the event propagate (❌ cf above) or modifying production code to filter out tests events
πŸ’‘ I’m working on the automation of the duplication and the wiring of environment variables to abstract this away from the developer. Stay tuned πŸ”Š

5. Run integration tests in CI before merge 🚦

Integration tests break often, and that's a good thing: they catch all sorts of mistakes for you. You need to run them as early as possible in your automated test suite.

Even if it requires more configuration at first, take the time to make them run in your CI for every pull request/merge request.

You also need to keep your integration test job fast enough that waiting for it is painless.

✍️ Real-life example

❌ Bad

Run automated integration tests in preprod/staging

  1. Write a new service integration & the corresponding test on your developer environment
  2. Push your code & open a pull request/merge request
    1. run unit tests in CI
    2. review code
  3. After the merge on the staging branch, deploy the code on the staging environment
  4. Run all integration tests on the staging environment

⇒ Developers will not run every test on every change in their developer environment, so sometimes unexpected side effects will break unexpected tests.

⇒ ❌ In that case, the tests will fail on the staging branch, which will impact all the other developers who need this environment to be as stable as possible.

⇒ ❌ The fix must be done quickly and is expensive because it requires starting the development flow from scratch: create a new branch, make the fix, test it properly, open a new pull request, pass the CI, pass the code review and then merge.


✅ Good

Deploy and run integration tests on every pull request/merge request in a dedicated environment

ℹ️ Serverless heavily relies on infrastructure as code and pay-as-you-use resources. Therefore it's easy and cheap to deploy multiple environments.

  1. Write a new service integration & the corresponding test on your developer environment
  2. Push your code & open a pull request/merge request
    1. run unit tests in CI
    2. deploy a dedicated test environment & run all integration tests on it
    3. review code
  3. After the merge on the staging branch, deploy the code on the staging environment

⇒ ✅ If a test fails, you only need to push a fix to the existing branch.

⇒ ✅ It checks that the infrastructure as code works properly, so you can be confident that the staging deployment will succeed.

💡 To speed up test environment deployment, you can work with a pool of test stacks and only deploy changes on them.

⇒ More details about how to implement it:

theodo / test-stack-orchestrator

Orchestrate your serverless test stacks

stack-orchestrator

An easy way to manage the availability of multiple serverless stacks.

Use case

This API helps implement integration or e2e testing per feature branch. It lets you request a stack for a specific branch, deploy the app on this stack, test on it, then release the stack for the next feature branch.

A stack is a group of resources that can be identified by a string, its stack name.

If you use the Serverless Framework, you can deploy your app for a specific stack using serverless deploy --stage $stackName. Most of the resources created will be marked with the stackName.

Routes

Postman documentation

Request stack

Gets an available stack, locks it and returns its stack name and last deployed commit.

The returned stack is

  1. one of the same branch if it exists and is available
  2. the oldest stack available (based on the last requested date)
  3. a…
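
To make the idea of a stack pool concrete, the first two selection rules above could be sketched as follows. This is not the orchestrator's actual code: the `Stack` record, `pickStack` and `requestStack` are hypothetical, and the third rule is left out because it is truncated in the excerpt.

```typescript
// Hypothetical record kept by the orchestrator for each test stack.
interface Stack {
  stackName: string;
  branch: string | null; // branch that last used the stack
  lastDeployedCommit: string | null;
  lastRequestedAt: number; // timestamp in milliseconds
  locked: boolean;
}

// Implements only the first two rules listed above.
function pickStack(stacks: Stack[], branch: string): Stack | null {
  const available = stacks.filter((stack) => !stack.locked);
  // Rule 1: reuse an available stack that already served this branch.
  const sameBranch = available.filter((stack) => stack.branch === branch);
  if (sameBranch.length > 0) return sameBranch[0];
  // Rule 2: otherwise take the available stack that was requested the longest time ago.
  const byAge = available.slice().sort((a, b) => a.lastRequestedAt - b.lastRequestedAt);
  return byAge.length > 0 ? byAge[0] : null;
}

// Locks the chosen stack and returns its name and last deployed commit.
function requestStack(
  stacks: Stack[],
  branch: string,
  now: number,
): { stackName: string; lastDeployedCommit: string | null } | null {
  const stack = pickStack(stacks, branch);
  if (stack === null) return null;
  stack.locked = true;
  stack.branch = branch;
  stack.lastRequestedAt = now;
  return { stackName: stack.stackName, lastDeployedCommit: stack.lastDeployedCommit };
}

const pool: Stack[] = [
  { stackName: "stack-a", branch: "feat/a", lastDeployedCommit: "abc123", lastRequestedAt: 300, locked: false },
  { stackName: "stack-b", branch: "feat/b", lastDeployedCommit: "def456", lastRequestedAt: 100, locked: false },
  { stackName: "stack-c", branch: "feat/c", lastDeployedCommit: null, lastRequestedAt: 50, locked: true },
];

const first = requestStack(pool, "feat/a", 1000);
console.log(first); // stack-a: an available stack of the same branch wins over the older stack-b
```

Returning the last deployed commit lets the CI job deploy only the changes since that commit instead of redeploying the whole stack.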

Conclusion

I hope those 5 control points will help you to write better tests and to have a better development experience.

Keep them in mind and try to apply them as early as possible in a project's life cycle: tests are a pain to implement or refactor afterward. The bad examples mostly come from projects I've worked on. I know how painful it is to have to refactor tests because your team is stuck. Don't wait until it's too late 😉

Feel free to share your thoughts and questions in the comments below. I will be happy to answer them.
