DEV Community

Staging Environments Are Overlooked - Here's Why They Matter

Jason Skowronski on May 31, 2019

Many development teams skip having a staging environment for their applications. They often submit a PR, potentially run tests in a CI system, merg...


Austin Standing • May 31 '19 • Edited

👏Best👏in👏class👏behavior👏

This is the principal value of having a staging environment: keeping your breaking changes from going directly to production by providing a mirror environment in which to test and validate your changes.

We do this at Impartner (staging environments), and I can't tell you how many headaches are prevented by sticking to the process. I might just have to write an article about my team's environment strategy, because if you aren't convinced yet, my work isn't done.

We have a Dev/Stage/Prod setup (well, Dev/Stage/UAT/Prod, but that will have to wait for the article), and I like to explain it this way:

Dev is my domain as the developer. Expect it to be "broken" half the time, expect users to have access to things they shouldn't, and expect to see features that aren't fully fleshed out yet. Never demo Dev.
Stage is your domain as product owner/project manager/internal stakeholder. This is what you look to as a barometer for progress, and where you go to demo functionality. It should mirror Prod except for the latest stable updates from Dev, but never touch Prod data.
Prod is the client's/user's domain. Don't touch it if you can help it.

In my current role all of my projects integrate with platforms as a backend, such as our PRM, or client CRMs. Because of this, it makes a world of difference when each environment I work in has a matching backend environment. Having that extra step between Dev and Prod can be the difference between finding out the easy way or the hard way that the sandbox I had to work with hadn't been updated in 6 months. It's best to have a second sandbox that is kept refreshed/updated as frequently as possible.

Lastly, limit anything environment-specific, like your app's URL, to a config file! I have to admit item 1 gave me some heartburn.
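As a rough illustration of keeping environment-specific values in one place, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `APP_ENV` variable name, the `EnvConfig` fields, and the `example.com` URLs are hypothetical and not taken from any real setup described in this thread.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvConfig:
    """Values that differ per environment (hypothetical fields)."""
    app_url: str
    backend_url: str


# Hypothetical settings: each environment gets its own matching backend.
CONFIGS = {
    "dev": EnvConfig("https://dev.example.com", "https://crm-dev.example.com"),
    "stage": EnvConfig("https://stage.example.com", "https://crm-stage.example.com"),
    "prod": EnvConfig("https://www.example.com", "https://crm.example.com"),
}


def load_config(env_name=None):
    """Return the config for env_name, or for $APP_ENV, defaulting to dev."""
    name = env_name or os.environ.get("APP_ENV", "dev")
    try:
        return CONFIGS[name]
    except KeyError:
        raise ValueError(
            f"Unknown environment {name!r}; expected one of {sorted(CONFIGS)}"
        ) from None
```

Application code would then call `load_config()` once at startup and read `app_url` from the result instead of hard-coding it, so the same build can move from Dev to Stage to Prod unchanged.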

Jason Skowronski • May 31 '19

I love your breakdown between environments and user personas! I would have loved to use stage for demos at a prior company, but it was broken half the time.

By the way, Heroku also offers a concept called a "Review app" that is automatically created for each PR. This can be useful for product owners and testers to see changes in isolation, whereas the full staging environment would have the integrated and latest set of changes.

Also, good work on setting up matching backends for your CRM and PRM. If this uses non-prod data, even better, because you aren't exposing customer information when testing/demoing.

Kyle Boe • May 31 '19

We stress the importance of a staging environment to all of our clients. I'll definitely be referencing the thoughtful reasoning in this post when answering them in the future. Thanks for the helpful post!

Guru Kalle • Jun 20 '19

The right way to set up the architecture is to have a fix environment for SIT, then a UAT environment along with a UAT-fix environment, and then a Prod-Stage environment, a Prod environment, and a Prod-Fix environment for emergency fixes. These are required if you are a large organization with a large environment. If you want SIT itself to serve as a combined SIT plus UAT environment, you would need a SIT-fix environment. This SIT-fix environment would serve the purpose of applying fixes for issues found in SIT/UAT testing.

SyntaxSeed (Sherri W) • Jun 1 '19

My staging env doubles as a 'live' copy of the site where clients can play with & try out the new feature to approve it for launch. I suppose you could skip it if you don't have external stakeholders... but it still seems risky to skip that step.

Jason Skowronski • Jun 4 '19

It can also be useful for product managers to demo new features and get feedback.

Francisco Quintero 🇨🇴 • Jun 1 '19

Since I started working professionally and learnt about staging and production environments, in every project I've been part of there's always a staging environment.

For many of them, I was responsible for setting it up and keeping it working.

And in the projects I've worked on, there's no way everything goes straight to production.

It is a no-brainer. Dev -> staging -> production.

Not having a staging area represents a huge risk to project health, because staging is the closest space to production where one can freely break things without fearing the results.

Nice article and ways to think about these environments.

Jason Skowronski • Jun 4 '19

I would say if you can do performance testing in a staging environment, that's great! For this to make sense, you would need similar hardware and similar data to test with. You can use apps like loader.io to do benchmarks. APM tools like New Relic or AppOptics can also be added to staging so you can measure latency, etc. More testing is better, but you can't anticipate everything. Post-deployment monitoring is equally important. You can do things like a canary or blue-green deployment to see how a release handles load before deploying everywhere.

Lukáš Doležal • Oct 20 '19 • Edited

What do you think about using feature flags/feature toggles as a sort of "staging" environment? That is, having only one production environment and deploying risky and big changes behind a feature flag, enabling the flag only for a test user or a small number of beta customers.

And what about staging giving a "false sense of safety"? I've seen devs test thoroughly on staging many times and then think they can "push to production and go home", when they had actually missed a bug that was only present in production because of specific production data, setup, or scale.

IMHO even with staging you still need a way to limit blast radius with production deployments. And if you have that, do you need staging any more?
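To make the feature-flag idea concrete, here is a minimal Python sketch of a flag that is always on for named test users and otherwise on for a fixed percentage of users. It is only an illustration under assumptions: the function names, the flag name, and the hashing scheme are hypothetical and not taken from any particular feature-flag library.

```python
import hashlib


def rollout_bucket(flag_name, user_id):
    """Map a (flag, user) pair to a stable bucket from 0 to 99."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag_name, user_id, test_users=frozenset(), rollout_percent=0):
    """Return True if the flag is on for this user.

    Test users always get the feature; everyone else gets it only if
    their bucket falls below rollout_percent.
    """
    if user_id in test_users:
        return True
    return rollout_bucket(flag_name, user_id) < rollout_percent
```

Because the bucket is derived from a hash rather than a random draw, a given user sees the same behavior on every request, and raising `rollout_percent` only ever adds users to the enabled group.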

Jason Skowronski • Oct 21 '19

Lukáš, these are all great issues to point out! IMHO phased releases using feature flags are a great way to test out major functionality changes. In fact, there is a popular library called scientist that does just that in a controlled way: github.com/github/scientist.

You're also right that it's important to minimize the blast radius of missed problems. Using feature flags exposes customers to potentially broken code, and possibly to incompatible versions of dependent services. It can even lead to data loss or data corruption. Having a staging environment with proper test cases that closely mirrors your production environment will hopefully catch the most obvious problems before they impact customers, and before your dev team leaves for the day.

It's great that your dev team trusts your staging environment so well that it gives them a false sense of safety. However, it's only as good as the test cases and how well it mirrors production. You can test for "known knowns" but tests miss "unknown unknowns". These missed cases can and do cause major production outages, even at established companies.

A common rule of thumb is to never make major changes and leave, especially on a Friday if you want to enjoy your weekend! I think it's important to have post-deployment monitoring in place. If there is a critical change in KPIs/SLOs, the team should be alerted so they can fix the problem or roll back ASAP.
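As a rough sketch of what a post-deployment check might look like, the Python function below compares an error rate measured after a deploy against the pre-deploy baseline and says whether to alert or roll back. The function name and the threshold defaults are assumptions chosen for illustration; real thresholds depend on your own KPIs.

```python
def should_roll_back(baseline_error_rate, current_error_rate,
                     max_absolute=0.05, max_relative_increase=2.0):
    """Decide whether a deploy looks unhealthy based on error rates.

    Rates are fractions of failed requests (0.01 means 1%).
    The default thresholds are hypothetical.
    """
    # Hard ceiling: too many errors regardless of the baseline.
    if current_error_rate > max_absolute:
        return True
    # Relative check: errors grew by more than the allowed factor.
    if baseline_error_rate > 0:
        return current_error_rate / baseline_error_rate > max_relative_increase
    return False
```

For example, going from a 1% to a 3% error rate stays under the 5% ceiling but triples the baseline, so the check would flag it.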

If you're at a small startup, you might not have all this infrastructure and these policies in place, and it's more common to cut corners for efficiency. It becomes more critical at larger companies, where major outages or bugs can cause thousands or millions of dollars of damage through lost sales, damaged reputation, etc.