We get a lot of questions from folks about how to better manage notifications and alerts for their services on PagerDuty. Dynamic Notifications and Support Hours will help you do exactly that, but you might have missed them on your first journey through PagerDuty. Let’s take a look at what they can do for your team.
Dynamic Notifications can help you better manage what actions are taken when an alert is received on a particular service. They are linked to the service itself, so your team can customize actions to meet each service’s specific needs and prioritize the most important services for immediate response.
There might be any number of reasons why a service doesn’t require 24x7 high-priority response:
- The service isn’t in production. Your definition of “production” can include internal applications and tools!
- The service has minimal use during off hours in the same timezone as your staff.
- The service is only in limited production use, maybe a beta or a limited-release feature for a subset of users.
- The service isn’t in a critical path for all users.
- The service has a graceful degradation state that doesn’t adversely affect user experience.
Resolving a fault state the following morning can be acceptable for these and any other reasons your team decides on. Establishing a good practice for incident urgency requires some setup in your alerts, your services, and in your team’s individual user profiles.
Don’t Alert for Low-Urgency Issues
Plenty of teams have monitoring services that are chatty - they produce notifications about potential problems in advance of anything having an impact on user experience. This is great; your team has an opportunity to fix potential issues before an emergency happens. It might not be so great if those warning notifications come in the middle of the night.
All alerts that come into PagerDuty can have an associated severity. This is a field in the alert that establishes how important the alert is. The severities available in PagerDuty are based on industry-standard levels (note: they are case sensitive!), so hopefully they’re already familiar to you:
- critical
- error
- warning
- info
PagerDuty uses these levels during the processing of event rules, and will also assign a default behavior to the alert if there are no applicable rules. Make sure your monitoring systems and other inputs are sending alerts at the appropriate severity level so your team isn’t bombarded with arbitrarily inflated alerts.
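To make the severity field concrete, here is a minimal sketch of building an alert body in the shape of the PagerDuty Events API v2, with a check that the severity is one of the four lowercase values (critical, error, warning, info). The function name, the placeholder routing key, and the example summary and source are assumptions for illustration; nothing is sent over the network, so check the Events API documentation before wiring this into a real monitoring tool.

```python
import json

# Severity values accepted for an event; they are lowercase and case sensitive.
VALID_SEVERITIES = ("critical", "error", "warning", "info")


def build_trigger_event(routing_key, summary, source, severity):
    """Return a dict shaped like an Events API v2 "trigger" event.

    Hypothetical helper for illustration; it only builds the body.
    """
    if severity not in VALID_SEVERITIES:
        raise ValueError(
            f"severity must be one of {VALID_SEVERITIES}, got {severity!r}"
        )
    return {
        "routing_key": routing_key,
        "event_action": "trigger",
        "payload": {
            "summary": summary,
            "source": source,
            "severity": severity,
        },
    }


if __name__ == "__main__":
    event = build_trigger_event(
        routing_key="YOUR_INTEGRATION_KEY",  # placeholder, not a real key
        summary="Disk usage above 80% on staging-db-1",  # made-up example
        source="staging-db-1",
        severity="warning",
    )
    print(json.dumps(event, indent=2))
```

Rejecting a value like "Warning" before the event leaves your monitoring tool is one way to catch the case-sensitivity problem early, before an alert arrives with an unexpected severity.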
Don't Alert Overnight for Everything
One of the worst parts about owning services in production is that sometimes things happen when you really want to be asleep. Waking up to deal with incidents overnight is hard; it’s hard on your team to lose sleep, it’s hard to get folks mobilized, and it’s hard to resolve issues when folks are trying to wake up enough to concentrate on the issue. So you want to make sure that the problems the team is being alerted for overnight really require an immediate response.
Dynamic Notifications are found in the service configuration under “Settings”. The first section of the page is “Assign and Notify”:
The default action here is to just assign the service to an escalation policy and move on, but there is so much more you can do. When you click the “Edit” button, you have an option with a dropdown menu: “How should responders be notified?”. Here’s where things get interesting.
These settings give you the ability to increase or decrease the urgency of the notifications related to this service. The default configuration is “High-urgency”, which might not be necessary for all of your services. Low urgency can absolutely be appropriate for some services. The third setting, “Dynamic notifications based on alert severity”, will allow you to make use of the behaviors outlined above, assigning high urgency or low urgency based on the alert severity.
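As a rough mental model of the severity-based option, the sketch below maps an alert severity to an incident urgency. The mapping shown (critical and error become high urgency, warning and info become low urgency) is my understanding of the default behavior, and the function is a hypothetical illustration, not PagerDuty’s implementation; confirm the mapping against the support knowledge base for your account.

```python
# Assumed mapping from alert severity to incident urgency.
HIGH_URGENCY_SEVERITIES = frozenset({"critical", "error"})
LOW_URGENCY_SEVERITIES = frozenset({"warning", "info"})


def urgency_for_severity(severity):
    """Return "high" or "low" for a lowercase severity string."""
    if severity in HIGH_URGENCY_SEVERITIES:
        return "high"
    if severity in LOW_URGENCY_SEVERITIES:
        return "low"
    # This sketch is strict about unknown or mis-cased values.
    raise ValueError(f"unknown severity: {severity!r}")


if __name__ == "__main__":
    for level in ("critical", "error", "warning", "info"):
        print(level, "->", urgency_for_severity(level))
```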
The bottom option is “Based on support hours”, and choosing that option presents another set of useful options.
Now the magic happens. You have the option to configure your “Support Hours”. These aren’t just for Support teams, they’re for any service your team has that doesn’t warrant urgent overnight alerts! The schedule here allows you to be flexible with the days and hours - though the hours will be the same each day. The default is M-F, 9:00am to 5:00pm in your default timezone.
You’ll also see the checkbox to “Raise urgency of unacknowledged incidents to high” when support hours start. Any alerts received during off hours that would have been high-severity and haven’t been acknowledged will be raised to your team the next working day, so you won’t miss things. Additionally, if the team did happen to ack the alerts during off hours, they won’t be re-raised in the morning.
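The support-hours behavior can be sketched as a small model. It assumes one possible configuration: high urgency during support hours, low urgency outside them, and the “raise urgency” checkbox enabled. The class, the function names, and the dictionary shape used for incidents are all hypothetical, and the sketch uses UTC where the product would use your configured timezone.

```python
from dataclasses import dataclass
from datetime import datetime, time, timezone, tzinfo


@dataclass(frozen=True)
class SupportHours:
    days: frozenset  # weekday numbers, Monday = 0
    start: time      # same start time every selected day
    end: time        # same end time every selected day (exclusive)
    tz: tzinfo

    def contains(self, moment):
        """True if a timezone-aware datetime falls inside support hours."""
        local = moment.astimezone(self.tz)
        return local.weekday() in self.days and self.start <= local.time() < self.end


# Mirrors the default described above: Monday to Friday, 9:00am to 5:00pm.
DEFAULT_HOURS = SupportHours(frozenset(range(5)), time(9), time(17), timezone.utc)


def urgency_at(moment, hours=DEFAULT_HOURS):
    """Assumed policy: high urgency in support hours, low urgency outside."""
    return "high" if hours.contains(moment) else "low"


def raise_unacknowledged(incidents):
    """Model of the checkbox: when support hours start, unacknowledged
    incidents become high urgency; acknowledged ones are left alone."""
    return [
        dict(inc) if inc["acknowledged"] else {**inc, "urgency": "high"}
        for inc in incidents
    ]


if __name__ == "__main__":
    overnight = datetime(2024, 1, 2, 3, 0, tzinfo=timezone.utc)  # a Tuesday
    print(urgency_at(overnight))
    queue = [
        {"id": "A", "urgency": "low", "acknowledged": False},
        {"id": "B", "urgency": "low", "acknowledged": True},
    ]
    print(raise_unacknowledged(queue))
```

The acknowledged incident stays at low urgency, which matches the point above that alerts the team already acknowledged overnight are not re-raised in the morning.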
Make Use of Non-Immediate Notifications
Once your input alerts are setting severities and your service notification rules are set to use them, your team can make judicious use of the low-urgency settings to make alerts more manageable. The behavior for low-urgency alerts is a user setting, which each user will find in their profile under Notification Rules:
Depending on your account configuration, your notifications can be a mix of email, SMS, push notifications to the PagerDuty App, or phone calls. Allowing your team to use less-distracting notifications like email for low-urgency alerts will help your team focus on regular work and urgent issues, as well as getting a good night’s rest.
Summary
Defend your team against burnout! Actively managing alert severity is a key tool for making sure your team is getting the right alerts at the right time. By lowering the severity of alerts that aren’t really high-urgency, you give your team back their time and make on-call duties less of a burden. Alert severity can always be adjusted if the settings aren’t helping you maintain your reliability goals. For more on incident priorities, check out some additional posts on Determining Incident Priority and Cutting Alert Fatigue in Modern Ops. Our support knowledge base has more on using Dynamic Notifications and service settings.
If you have questions about these features or others in PagerDuty, join our community forum!