
Discussion on: Effective Test Strategies for Front-end Applications using LaunchDarkly Feature Flags and Cypress. Part2: testing

Yoz Grahame • Edited

Hi, I'm a Solutions Architect at LaunchDarkly. Thanks so much for writing this! I love how thoroughly you explore the options and explain what you're doing on the way.

Automated testing is a complex topic, and the use of flags piles on even more complexity. We recently published a general guide about this, but it doesn't go into nearly as much detail as yours. I'm interested to hear what you think of it: docs.launchdarkly.com/guides/best-...

Murat K Ozcan • Edited

Thanks Yoz. I saw your comment on Gleb's blog, and the statelessness concern inspired the solution.

I will read the guide and send you my comments. If you want to use this guide for LD's benefit, you have full rights to it. You can also have access to the repo.

I am in contact with Sophia C. from LD. Perhaps we can meet up in an online meeting.
Here is my LinkedIn linkedin.com/in/murat-ozcan-3489898/ and my work email murat@extend.com.

edit: I read the guide. We should definitely meet and talk about testing feature flags. I will have a more complete idea once we tackle the backend/service testing.


In a traditional development process, a quality assurance (QA) team does testing on staging,

I understand where the overview is going, but this sentence derailed me.

This has to be a very traditional development process. Most of the industry strives to shift left, be it localhost (for apps), LocalStack (AWS) / an ephemeral instance, or at worst a sandbox, before deployment to the default branch on dev. Staging is right after dev. I would advise against skipping testing on any deployment, but perhaps a subset of a test suite can be covered. There are many ways to go about that, from "best guess" to theoretical math.


Full support for "testing prod before prod".


Testing types, understood.


Instead, choose a defined set of scenarios to limit the number of different flag states to test.
As you saw in the blog post, we can model the combinations instead of a random choice of flag states. Combinatorial explosion is not a problem that way; the bigger the data the better. Check out this book: github.com/NoriSte/ui-testing-best... .
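
To make "model the combinations" concrete, here is a minimal sketch that enumerates every flag-state combination and drives one parametrized test through them. The flag keys, the /checkout route, and the cy.stubFlags command are made-up placeholders; any stubbing approach (network intercept or hook stub) can sit behind that helper.

```ts
// Sketch: enumerate all flag-state combinations instead of sampling them randomly.
// Flag keys, the route, and cy.stubFlags are hypothetical placeholders.
const flagStates: Record<string, boolean[]> = {
  'new-checkout': [true, false],
  'promo-banner': [true, false],
}

// Cartesian product of every flag with every value: 2 flags x 2 values = 4 combos.
const combinations = Object.entries(flagStates).reduce<Record<string, boolean>[]>(
  (acc, [flag, values]) =>
    values.flatMap((value) => acc.map((combo) => ({ ...combo, [flag]: value }))),
  [{}]
)

describe('checkout under every flag combination', () => {
  combinations.forEach((combo) => {
    it(`works with flags ${JSON.stringify(combo)}`, () => {
      cy.stubFlags(combo) // hypothetical custom command that stubs these flag values
      cy.visit('/checkout')
      cy.get('[data-cy=checkout]').should('be.visible')
    })
  })
})
```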

I understand the bullet points and what to test, but I believe I may have convinced you about a different approach. In short it was:

  • CT model + unit test a lean & confident subset
  • Full-on flags with Cypress ui-integration tests (stub the network, test in full; see the sketch after this list). This solves a few problems, like any deployment being able to run the same version of the test. I believe it also helps with not tapping the MAUs needlessly.
  • CT model + ui-e2e test, and even a subset is possible with coverage analysis.
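
For the second bullet, this is roughly what I mean by stubbing the network: a sketch assuming the default LaunchDarkly client-side endpoints (verify the hostnames in your own network tab), with made-up flag keys and selectors. Because nothing reaches LD, the same spec runs against any deployment and no MAUs are consumed.

```ts
// Sketch: a ui-integration test with the LD network stubbed out entirely.
// Hostnames are the usual LD client-side endpoints; the flag payload below is a
// minimal shape (the real evalx response carries more fields per flag).
const stubbedFlags: Record<string, boolean> = {
  'new-checkout': true,
  'promo-banner': false,
}

describe('checkout with stubbed flags', () => {
  beforeEach(() => {
    // Serve our canned flag values instead of letting the SDK hit LaunchDarkly.
    cy.intercept('GET', 'https://app.launchdarkly.com/sdk/evalx/**', {
      body: Object.fromEntries(
        Object.entries(stubbedFlags).map(([key, value]) => [key, { value }])
      ),
    })
    // Silence streaming updates and event reporting so nothing leaks out.
    cy.intercept('GET', 'https://clientstream.launchdarkly.com/**', { statusCode: 404 })
    cy.intercept('POST', 'https://events.launchdarkly.com/**', { statusCode: 202 })
    cy.visit('/checkout')
  })

  it('shows the new checkout when the flag is on', () => {
    cy.get('[data-cy=new-checkout]').should('be.visible')
  })
})
```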

Unit & Mock testing: at this level you cannot connect to the LD SDK, can you? All you can do is test the various flag states of the component, local to the code. I have only used it in React so far, and unit seems irrelevant to that. I related to mock testing and this sentence while reading: "You can also use a test data source to mock the behavior of a LaunchDarkly SDK so it has predictable behavior when evaluating flags." Effectively we stub out the hook and test the flag states.
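
For the record, this is the kind of hook-stubbing I mean, as a rough Jest / React Testing Library sketch. It assumes the React SDK's useFlags hook and @testing-library/jest-dom being set up; the Checkout component and the flag key are invented for the example.

```tsx
// Sketch: test flag states local to the code by stubbing the flag hook.
import React from 'react'
import { render, screen } from '@testing-library/react'
import { useFlags } from 'launchdarkly-react-client-sdk'
import Checkout from './Checkout' // hypothetical component under test

// Replace the real SDK hook with a controllable mock.
jest.mock('launchdarkly-react-client-sdk', () => ({
  useFlags: jest.fn(),
}))
const mockedUseFlags = useFlags as jest.Mock

test('renders the new checkout when the flag is on', () => {
  mockedUseFlags.mockReturnValue({ newCheckout: true })
  render(<Checkout />)
  expect(screen.getByTestId('new-checkout')).toBeInTheDocument()
})

test('renders the legacy checkout when the flag is off', () => {
  mockedUseFlags.mockReturnValue({ newCheckout: false })
  render(<Checkout />)
  expect(screen.getByTestId('legacy-checkout')).toBeInTheDocument()
})
```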


Integration (reading flags from a file): I think this maps 1:1 to the ui-(component)-integration tests in the blog post. Integration is an inflated term. Here is the test terminology I live by: dev.to/muratkeremozcan/mostly-inco... .

Using a dedicated LD environment: OK if idempotent (reading only), but if writing flags it's a state nightmare. I figured out a way in the UI because of an LD behavior (LD writes a user to local storage when there is no user targeting), but I am very curious how we would handle this statefulness challenge with our Node / lambda backend. If we cannot write to LD state, then we have to do conditional testing per deployment, because the same suite cannot be applied to different deployments; i.e., get the flag and run the test suite accordingly, or only run a spec with a certain flag. We haven't gotten there yet, maybe by next month. You will know!
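
Here is a rough sketch of the read-only, conditional variant: ask LD what the flag is and skip the spec when it does not apply. The task name follows the cypress-ld-control plugin from the blog post; the 'test' environment key, the flag key, and the shape of the returned flag object are assumptions to double-check.

```ts
// Sketch: conditional testing per deployment, read-only so it stays idempotent.
// The task name comes from the cypress-ld-control plugin; the 'test' environment
// key, flag key, and the shape of the returned flag object are assumptions.
describe('new checkout flow', () => {
  before(function () {
    cy.task('cypress-ld-control:getFeatureFlag', 'new-checkout').then((flag: any) => {
      const isOn = flag?.environments?.test?.on ?? false
      // Skip the whole suite when this deployment is not serving the variation under test.
      if (!isOn) this.skip()
    })
  })

  it('completes a purchase through the new checkout', () => {
    cy.visit('/checkout')
    cy.get('[data-cy=new-checkout]').should('be.visible')
  })
})
```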

The further I read, the more I think that there needs to be a test guide per LD usage environment. The rules are just different; browser testing isn't the same as service testing.


Cypress testing: you can use the part of the blog about making the tests stateless. Feel free to refer to it. I have not figured out what we would do if we had to have user targeting, because then LD does not create a user in local storage. Maybe we can discuss this further: "How to make Cypress LD control plugin tests stateless when having LD user targeting".

Maybe there is a way you know of to set up multiple targeted users; then there are possibilities. Let's say you have 10 test users that are targeted. Randomly pick one at the start of the run. There is still a chance that concurrent tests pick the same test user. If there were 10 test users, and 2 entities running this same test each randomly picked one of those users at the start of a run, the chance for them to clash would be (1 - .9^2), which is 19%. For 3 entities this would be (1 - .9^3), which is 27.1%. For 10 entities this would be (1 - .9^10), which is 65%. As you can tell, this approach does not scale well, but it is a possibility. The proper solution to this challenge would be randomization: we would need a way for cypress-ld-control to create an LD user on the fly, set the flag for that user, and remove the flag and the user at the end of the test.
Gleb would need to add user CRUD support to the plugin; then you could add your own user and target that.
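
Until user CRUD exists in the plugin, the interim pool idea would look roughly like this; the user keys and the login command are invented for illustration, and the clash odds above still apply.

```ts
// Sketch: a pool of pre-targeted test users, one picked at random per run.
// This only reduces the clash probability discussed above; it does not remove it.
const targetedTestUsers = [
  'test-user-01',
  'test-user-02',
  'test-user-03',
  // ...up to test-user-10
]

// Pick once per spec run; every test in the spec reuses the same user.
const runUser = targetedTestUsers[Math.floor(Math.random() * targetedTestUsers.length)]

describe('flag behavior for a targeted user', () => {
  beforeEach(() => {
    // Hypothetical app-specific command that starts a session as the given user,
    // so the LD client initializes with that user key.
    cy.loginAs(runUser)
    cy.visit('/')
  })

  it('sees the variation targeted at the test-user pool', () => {
    cy.get('[data-cy=new-checkout]').should('be.visible')
  })
})
```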