DEV Community

Discussion on: Staging Environments Are Overlooked - Here's Why They Matter

Collapse
 
mostlyjason profile image
Jason Skowronski

Lukáš these are all great issues to point out! IMHO phased releases using future flags are a great way to test major functionality changes out. In fact there is a popular library called scientist that does just that in a controlled way github.com/github/scientist.

You're also right that's important to minimize the blast radius of missed problems. Using future flags exposes customers to potentially broken code, and possibly incompatible versions of dependent services. It can even lead to data loss or data corruption. Having a staging environment with proper test cases and that closely mirrors your production environment will hopefully catch the most obvious problems before they impact customers, and before your dev team leaves for the day.

It's great that your dev team trusts your staging environment so well that it gives them a false sense of safety. However, it's only as good as the test cases and how well it mirrors production. You can test for "known knowns" but tests miss "unknown unknowns". These missed cases can and do cause major production outages, even at established companies.

A common rule of thumb is to never make major changes and leave, especially on a Friday if you want to enjoy your weekend! I think it's important to have post-deployment monitoring in place. If there is a critical change in KPIs/SROs they should be alerted to fix the problem or rollback ASAP.

If you're at a small startup you might not have all this infrastructure and policies in place and its more common to cut corners for efficiency. It becomes more critical at larger companies where major outages or bugs can cause thousands or millions of dollars of damage through lost sales, damaged reputation, etc.