Eyal Estrin for AWS Community Builders

Posted on • Originally published at Medium

Introduction to Break-Glass in Cloud Environments

Modern cloud environments, and production environments in particular, reduce the need for human access.

It makes sense for developers to have access to Dev or Test environments, but in a properly designed production environment, everything should be automated, from deployment and observability to self-healing. In most cases, no human access is required.

Production environments serve customers, require zero downtime, and in most cases contain customers' data.

There are cases, such as emergency scenarios, where human access is required.

In mature organizations, this type of access is handled by the site reliability engineering (SRE) team.

The term break-glass is an analogy to breaking a glass to pull a fire alarm, which is supposed to happen only in case of emergency.

In the following blog post, I will review the different alternatives AWS gives its customers to handle break-glass scenarios.

Ground rules for using break-glass accounts

Before reviewing how AWS handles break-glass, it is important to be clear – break-glass accounts should be used in emergency cases only.

  • Authentication – All access through the break-glass mechanism must be authenticated, preferably against a central identity provider rather than local accounts
  • Authorization – All access must be authorized using role-based access control (RBAC), following the principle of least privilege
  • MFA – Since most break-glass scenarios require highly privileged access, it is recommended to enforce multi-factor authentication (MFA) for any interactive access
  • Just-in-time access – All access through break-glass mechanisms must be granted temporarily and must be revoked after a pre-defined amount of time or when the emergency is declared over
  • Approval process – Access through a break-glass mechanism should be manually approved
  • Auditing – All access through break-glass mechanisms must be audited and kept as evidence for further investigation
  • Documented process – Organizations must have a documented and tested process for requesting, approving, using, and revoking break-glass accounts
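The approval, just-in-time, and revocation rules above can be sketched as a small data model. This is only an illustration of the ground rules, not an AWS feature: the `BreakGlassGrant` class and its field names are hypothetical, and a real implementation would persist grants and write every change to an audit log.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class BreakGlassGrant:
    """One approved, time-boxed break-glass grant (hypothetical model)."""
    requester: str
    approver: str
    reason: str
    granted_at: datetime
    ttl: timedelta
    revoked_at: Optional[datetime] = None

    def __post_init__(self) -> None:
        # Approval process: a second person must approve the request.
        if self.approver == self.requester:
            raise ValueError("requester cannot approve their own access")
        if self.ttl <= timedelta(0):
            raise ValueError("ttl must be positive")

    @property
    def expires_at(self) -> datetime:
        return self.granted_at + self.ttl

    def is_active(self, now: datetime) -> bool:
        """Active only inside the time window and before any revocation."""
        if self.revoked_at is not None and now >= self.revoked_at:
            return False
        return self.granted_at <= now < self.expires_at

    def revoke(self, now: datetime) -> None:
        """Call when the emergency is declared over."""
        if self.revoked_at is None:
            self.revoked_at = now
```

A grant created with a one-hour `ttl` stops being active after an hour even if nobody remembers to call `revoke`, which is the point of a pre-defined time limit.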

Handling break-glass scenarios in AWS

Below is a list of best practices provided by AWS for handling break-glass scenarios:

Identity Management

Identities in AWS are managed using AWS Identity and Access Management (IAM).

When working with AWS Organizations, customers have the option for central identity management for the entire AWS Organization using AWS IAM Identity Center – a single-sign-on (SSO) and federated identity management service (working with Microsoft Entra ID, Google Workspace, and more).

Since there might be a failure with a remote identity provider (IdP) or with AWS IAM Identity Center, AWS recommends creating two IAM users on the root of the AWS Organizations tree, and an IAM break-glass role on each of the accounts in the organization, to allow access in case of emergency.

The break-glass IAM users need to have console access, as explained in the documentation.

Authentication Management

When creating IAM accounts, enforce the use of a strong password policy, as explained in the documentation.
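As a minimal sketch, an account-wide password policy can be applied through the IAM `update_account_password_policy` API. The specific values below (length, reuse history, maximum age) are assumptions for illustration, not AWS recommendations, and `iam_client` is expected to be a `boto3.client("iam")` created by the caller.

```python
# Assumed values; align them with your organization's password standard.
PASSWORD_POLICY = {
    "MinimumPasswordLength": 14,
    "RequireSymbols": True,
    "RequireNumbers": True,
    "RequireUppercaseCharacters": True,
    "RequireLowercaseCharacters": True,
    "PasswordReusePrevention": 24,  # remember the last 24 passwords
    "MaxPasswordAge": 90,           # days
}


def apply_password_policy(iam_client, policy=PASSWORD_POLICY):
    """Apply the IAM password policy for the current account.

    `iam_client` is assumed to be boto3.client("iam"); it is passed in
    so the function can be exercised with a stub in tests.
    """
    iam_client.update_account_password_policy(**policy)
    return policy
```

Passing the client in, rather than creating it inside the function, keeps credentials handling outside this helper.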

Passwords for the break-glass IAM accounts must be stored in a secured vault, and once the work on the break-glass accounts is over, the passwords must be replaced immediately to avoid reuse.

AWS recommends enforcing the use of MFA for any privileged access, as explained in the documentation.

Access Management

AWS recommends creating a break-glass IAM role, as explained in the documentation.
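One way to tie the break-glass role to the emergency IAM users and to MFA is through the role's trust policy. The sketch below only builds the policy document; the user names, the example account ID `111122223333`, and the one-hour MFA age limit are assumptions. The condition keys `aws:MultiFactorAuthPresent` and `aws:MultiFactorAuthAge` are standard IAM global condition keys.

```python
import json


def break_glass_trust_policy(user_arns, max_mfa_age_seconds=3600):
    """Trust policy letting only the given IAM users assume the role,
    and only when they signed in with MFA recently."""
    if not user_arns:
        raise ValueError("at least one break-glass user ARN is required")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowBreakGlassUsersWithMFA",
                "Effect": "Allow",
                "Principal": {"AWS": sorted(user_arns)},
                "Action": "sts:AssumeRole",
                "Condition": {
                    "Bool": {"aws:MultiFactorAuthPresent": "true"},
                    "NumericLessThan": {
                        "aws:MultiFactorAuthAge": str(max_mfa_age_seconds)
                    },
                },
            }
        ],
    }


# Hypothetical user names in an example account.
policy_json = json.dumps(
    break_glass_trust_policy(
        [
            "arn:aws:iam::111122223333:user/break-glass-1",
            "arn:aws:iam::111122223333:user/break-glass-2",
        ]
    ),
    indent=2,
)
```

The resulting JSON would be supplied as the role's trust (assume-role) policy; the permissions the role grants are defined separately and should follow least privilege.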

Access using break-glass IAM users must be temporary, as explained in the documentation.
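Temporary access maps naturally onto short-lived STS credentials. The sketch below assumes the operator assumes the break-glass role with an MFA code through the STS `AssumeRole` API; the helper name, the session-name format, and the one-hour upper bound are assumptions, and `sts_client` is expected to be a `boto3.client("sts")`.

```python
def assume_break_glass_role(
    sts_client,
    role_arn,
    operator,
    incident_id,
    mfa_serial,
    mfa_code,
    duration_seconds=900,
):
    """Return short-lived credentials for the break-glass role.

    900 seconds is the minimum STS allows; the 3600 ceiling is an
    assumed internal limit, not an AWS one.
    """
    if not 900 <= duration_seconds <= 3600:
        raise ValueError("duration_seconds must be between 900 and 3600")
    # The session name is recorded in CloudTrail, so it carries who and why.
    session_name = f"breakglass-{operator}-{incident_id}"[:64]
    response = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName=session_name,
        DurationSeconds=duration_seconds,
        SerialNumber=mfa_serial,
        TokenCode=mfa_code,
    )
    return response["Credentials"]
```

The returned credentials expire on their own, so access ends even if the operator forgets to sign out.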

Auditing

By default, AWS CloudTrail records management events (control-plane API calls) in each account and keeps them in the event history for 90 days. Data events are recorded only when a trail or event data store is configured to capture them.

As a best practice, send all CloudTrail logs from the entire AWS Organization to a central S3 bucket, as explained in the documentation.

Since audit trail logs contain sensitive information, it is recommended to encrypt all data at rest using customer-managed encryption keys (as explained in the documentation) and limit access to the log files to the SOC team for investigation purposes.
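A minimal sketch of an organization-wide, KMS-encrypted trail using the CloudTrail `CreateTrail` and `StartLogging` APIs follows. It assumes the call is made from an account allowed to manage organization trails, that the S3 bucket and KMS key already exist with policies permitting CloudTrail to use them, and that `cloudtrail_client` is a `boto3.client("cloudtrail")`. The trail, bucket, and key names are placeholders supplied by the caller.

```python
def create_org_trail(cloudtrail_client, trail_name, bucket_name, kms_key_arn):
    """Create a multi-Region organization trail and start logging."""
    cloudtrail_client.create_trail(
        Name=trail_name,
        S3BucketName=bucket_name,      # central log bucket
        IsMultiRegionTrail=True,       # capture every Region
        IsOrganizationTrail=True,      # capture every member account
        EnableLogFileValidation=True,  # digest files to detect tampering
        KmsKeyId=kms_key_arn,          # customer-managed key for encryption
    )
    # A newly created trail does not deliver logs until logging is started.
    cloudtrail_client.start_logging(Name=trail_name)
```

Log file validation is not mentioned in the text above; it is included here as an assumption because audit logs kept as evidence benefit from integrity checks.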

CloudTrail events can be continuously analyzed for suspicious activity using Amazon GuardDuty, as explained in the documentation.
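Alongside managed detection, it can help to list every use of the break-glass role directly from the CloudTrail log files. The sketch below parses the JSON content of one (already decompressed) log file and returns the `AssumeRole` events that target a role whose ARN contains a marker string. The marker `break-glass` is an assumed naming convention; the field names are standard CloudTrail record fields.

```python
import json


def find_break_glass_assumptions(log_file_text, role_name_marker="break-glass"):
    """Return who assumed a break-glass role, when, and from where."""
    records = json.loads(log_file_text).get("Records", [])
    hits = []
    for record in records:
        if record.get("eventSource") != "sts.amazonaws.com":
            continue
        if record.get("eventName") != "AssumeRole":
            continue
        # requestParameters can be null on failed calls.
        role_arn = (record.get("requestParameters") or {}).get("roleArn", "")
        if role_name_marker not in role_arn:
            continue
        hits.append(
            {
                "time": record.get("eventTime"),
                "caller": (record.get("userIdentity") or {}).get("arn"),
                "role": role_arn,
                "source_ip": record.get("sourceIPAddress"),
            }
        )
    return hits
```

Each hit should map back to an approved request; an `AssumeRole` event without a matching approval is worth investigating.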

Resource Access

To allow secured access to EC2 instances, AWS recommends using EC2 Instance Connect or AWS Systems Manager Session Manager.

To allow secured access to Amazon EKS nodes, AWS recommends using AWS Systems Manager Agent (SSM Agent).

To allow secured access to Amazon ECS container instances, AWS recommends using AWS Systems Manager, and for debugging purposes, AWS recommends using Amazon ECS Exec.

To allow secured access to Amazon RDS, AWS recommends using AWS Systems Manager Session Manager.
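For RDS, a common pattern is Session Manager port forwarding through an SSM-managed EC2 instance that can reach the database. The sketch below only builds the AWS CLI command; it assumes such an instance exists, that the Session Manager plugin is installed locally, and that the instance ID and database endpoint are supplied by the caller. `AWS-StartPortForwardingSessionToRemoteHost` is an AWS-provided Session Manager document.

```python
import json
import shlex


def rds_port_forward_command(instance_id, db_host, db_port=5432, local_port=5432):
    """Build the `aws ssm start-session` command that forwards a local
    port to the database through an SSM-managed instance."""
    parameters = {
        "host": [db_host],
        "portNumber": [str(db_port)],
        "localPortNumber": [str(local_port)],
    }
    return [
        "aws", "ssm", "start-session",
        "--target", instance_id,
        "--document-name", "AWS-StartPortForwardingSessionToRemoteHost",
        "--parameters", json.dumps(parameters),
    ]


def as_shell(command):
    """Render the argument list as a copy-pasteable shell line."""
    return shlex.join(command)
```

While the session is open, a database client connects to `localhost` on the local port, no inbound port is opened on the database's security group for the operator, and the session itself is recorded by Systems Manager.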

Summary

In this blog post, we have reviewed what break-glass accounts are and how AWS recommends securing them (authentication, authorization, auditing, and secure access to cloud resources).

I recommend that any organization managing production environments on AWS follow AWS's security best practices and keep the production environment secure.

About the Author

Eyal Estrin is a cloud and information security architect, the owner of the blog Security & Cloud 24/7 and the author of the book Cloud Security Handbook, with more than 20 years in the IT industry.

Eyal has been an AWS Community Builder since 2020.

You can connect with him on Twitter

Opinions are his own and not the views of his employer.
