Passing the AWS Certified DevOps Engineer - Professional exam

Introduction

I recently passed the AWS Certified DevOps Engineer - Professional exam, and I've put this post together to outline how I prepared and to share the notes I took along the way. My primary motivation for taking the exam was that my AWS Certified Developer - Associate certification was expiring in February 2024. Passing the DevOps Engineer - Professional exam would renew this and my AWS Certified SysOps Administrator - Associate certification for another 3 years. I like this approach to certifications as it allows you to build on your experience and training as you progress with AWS. Check out a previous article that I wrote on this:

https://dev.to/aws-builders/which-aws-certification-exam-should-i-sit-hah

In effect, passing this one exam would gain me one new certification and renew my Developer - Associate, SysOps Administrator - Associate and Cloud Practitioner certifications for another 3 years.

Where to begin

Every AWS certification has a page on the AWS certification website and I always find it the best place to start.

https://aws.amazon.com/certification/certified-devops-engineer-professional/

You will find details about the exam here, along with a study guide and sample questions. You'll also find links to the FAQs for the services and whitepapers to read. You might be tempted to ignore these but they are worth reading.

From there, you can attend the free Exam Prep Standard Course: AWS Certified DevOps Engineer - Professional (DOP-C02 - English) provided by AWS on their skillbuilder site.

This course is not a complete guide to passing, but it is useful in giving you an outline of where you should focus. The exam breaks down into the following 6 domains, and the exam readiness course goes through the services that you need to study in each domain.

Domain                                     % of Exam
1.0 SDLC Automation                        22%
2.0 Configuration Management and IaC       17%
3.0 Resilient Cloud Solutions              15%
4.0 Monitoring and Logging                 15%
5.0 Incident and Event Response            14%
6.0 Security and Compliance                17%

Most of the links posted below come from this course.

The certification page also links to sample exam questions. Getting some practice with these questions will really help with your preparation for the exam.
The AWS Certified DevOps Engineer - Professional Exam Guide also recommends these two courses.

Advanced Testing Practices Using AWS DevOps Tools

Advanced CloudFormation: Macros

I found the first one really good if you are looking for an overview of the different AWS CI/CD services, with some hands-on exercises at the end.

And if you are lucky enough to have a subscription to Skill Builder, this practice exam is really good.

Exam Prep Official Practice Exam: AWS Certified DevOps Engineer - Professional (DOP-C02 - English)

It's 75 questions with a 3-hour timer to mimic the exam. I sat it 2 weeks before my exam date and it gave a good indication of where I needed to focus more time. You can review your answers, and the answers link to AWS documentation where you can dive into what you got wrong.

What follows are my notes and links to resources provided by AWS to help you pass. It is not a definitive guide to passing and should only be considered an aid by the reader.
The most important part of what follows is the links. These come recommended by the AWS training materials. I believe that if you have relevant hands-on experience with AWS services and use the links below to go deep into the theory behind them, you should have enough to pass.
Everything outside the links is my own rough notes that I took as I studied. I have tidied them up as well as I can, but apologies if you find them confusing.
My article covering the AWS Certified SysOps Administrator - Associate is still very relevant to a lot of areas covered in this professional exam.

https://dev.to/aws-builders/how-i-passed-the-aws-certified-sysops-exam-3if

Domain 1 - SDLC Automation (22%)

Services in scope

AWS CodePipeline - understand each stage
AWS CodeBuild
AWS CodeDeploy
AWS CodeCommit
AWS CodeArtifact
Amazon S3
Amazon Elastic Container Registry [Amazon ECR]
AWS Lambda
EC2 Image Builder
AWS CodeStar
AWS Secrets Manager
AWS Systems Manager Parameter Store

Task Statement 1.1: Implement CI/CD pipelines

This domain is all about getting a good handle on the Amazon CI/CD offerings.


Source

AWS CodeCommit is AWS's managed Git repository hosting service. It is integrated with IAM, and you should understand the different levels of permissions that can be granted to users of the service. Know what the AWSCodeCommitPowerUser policy is for.
You can receive notifications via SNS to invoke downstream actions in AWS CodeBuild, AWS CodePipeline or competing third-party services.

S3 and ECR can also be sources for CodePipeline.

Build & Test

AWS CodeBuild is a fully managed continuous integration service that compiles source code, runs tests, and produces software packages that are ready to deploy. It is important to point out that you can use CodeBuild for running tests. Test results can be captured via report groups which can be accessed via the CodeBuild API or the AWS CodeBuild console. You can also export test results to S3.
If there is any build output, the build environment uploads its output to an S3 bucket.
CodeBuild can integrate with CodeCommit, GitHub, Bitbucket, GitHub Enterprise and S3 as sources.

CodeBuild needs a build project to define how to gather application dependencies, run tests, and build the output to be used in preparing the deployment. A project includes information such as:

  • source code location
  • build environment to use
  • build commands to run
  • storage of build output

A build environment is the combination of operating system, programming language runtime, and tools used by CodeBuild to run a build.
You use a buildspec file to specify build commands. The buildspec is a YAML-formatted file that includes a collection of build commands and related settings that CodeBuild uses to run a build.
You can include a buildspec as part of the source code or define a buildspec when you create a build project.
When included as part of the source code, the buildspec file is named buildspec.yml and is located in the root of the source directory. It can be overridden with create-project or update-project commands. There are several sections of a buildspec file:

  • Version
  • Phases
    • Install
    • Pre-Build
    • Build
    • Post-build
    • Finally blocks
  • Artifacts
  • Reports
  • Cache

Here is an example buildspec.yml file:

version: 0.2

env:
  variables:
    JAVA_HOME: "/usr/lib/jvm/java-8-openjdk-amd64"
  parameter-store:
    LOGIN_PASSWORD: /CodeBuild/dockerLoginPassword

phases:
  install:
    commands:
      - echo Entered the install phase...
      - apt-get update -y
      - apt-get install -y maven
    finally:
      - echo This always runs even if the update or install command fails 
  pre_build:
    commands:
      - echo Entered the pre_build phase...
      - docker login -u User -p $LOGIN_PASSWORD
    finally:
      - echo This always runs even if the login command fails 
  build:
    commands:
      - echo Entered the build phase...
      - echo Build started on `date`
      - mvn install
    finally:
      - echo This always runs even if the install command fails
  post_build:
    commands:
      - echo Entered the post_build phase...
      - echo Build completed on `date`

reports:
  arn:aws:codebuild:your-region:your-aws-account-id:report-group/report-group-name-1:
    files:
      - "**/*"
    base-directory: 'target/tests/reports'
    discard-paths: no
  reportGroupCucumberJson:
    files:
      - 'cucumber/target/cucumber-tests.xml'
    discard-paths: yes
    file-format: CucumberJson # default is JunitXml
artifacts:
  files:
    - target/messageUtil-1.0.jar
  discard-paths: yes
  secondary-artifacts:
    artifact1:
      files:
        - target/artifact-1.0.jar
      discard-paths: yes
    artifact2:
      files:
        - target/artifact-2.0.jar
      discard-paths: yes
cache:
  paths:
    - '/root/.m2/**/*'    

You can use CloudWatch to monitor and troubleshoot the progress of your CodeBuild project, and EventBridge and SNS for notifications.

CodeBuild concepts is a good resource to find out more.

CodeBuild Concepts

AWS CodeArtifact is a fully managed artifact repository service that can be used by organizations to securely store, publish, and share software packages used in their software development process. CodeArtifact can be configured to automatically fetch software packages and dependencies from public artifact repositories so developers have access to the latest versions.

Deploy

AWS CodeDeploy is a deployment service that automates application deployments to Amazon EC2 instances, on-premises instances, serverless Lambda functions, or Amazon ECS services. A compute platform is a platform on which CodeDeploy deploys an application. There are three compute platforms:

  • EC2/On-Premises
  • AWS Lambda
  • Amazon ECS

CodeDeploy runs deployments under an application, which functions as a container that ensures the correct combination of revision, deployment configuration, and deployment group is referenced during a deployment.
A deployment configuration is a set of deployment rules and deployment success and failure conditions used by CodeDeploy during a deployment. Predefined configurations include:

  • Canary10Percent30Minutes
  • Canary10Percent5Minutes
  • Canary10Percent10Minutes
  • Linear10PercentEvery10Minutes
  • Linear10PercentEvery1Minute
  • Linear10PercentEvery2Minutes
  • Linear10PercentEvery3Minutes
  • AllAtOnce

A revision is a version of your application.

The storage location for files required by CodeDeploy is called a repository. Use of a repository depends on which compute platform your deployment uses. You can use S3 for all 3 compute platforms, and GitHub and Bitbucket for EC2/On-Premises deployments.

A deployment group defines the targets of a deployment, and what it contains depends on the compute platform: EC2 or on-premises instances for the EC2/On-Premises compute platform, or an Amazon ECS service for ECS deployments.
In an Amazon ECS deployment, a deployment group specifies the Amazon ECS service, load balancer, optional test listener, and two target groups.
In an AWS Lambda deployment, a deployment group defines a set of CodeDeploy configurations for future deployments of an AWS Lambda function.
In an EC2/On-Premises deployment, a deployment group is a set of individual instances targeted for a deployment. A deployment group contains individually tagged instances, Amazon EC2 instances in Amazon EC2 Auto Scaling groups, or both. For both EC2 and on-premises instances, the CodeDeploy agent must be installed and the instances must be tagged.
For EC2 instances, an IAM role must be attached to the instance and it must have the correct access permissions.
For on-premises instances, you can use an IAM role or an IAM user to register the instance with CodeDeploy. A role used with AWS Security Token Service (AWS STS) is the preferred method. Once access is in place and the AWS CLI is installed, you need to register the instance with CodeDeploy.
It is an involved process, so it is worth reading over the official AWS instructions:
https://docs.aws.amazon.com/codedeploy/latest/userguide/on-premises-instances-register.html

Monitoring deployments in CodeDeploy

You can monitor CodeDeploy deployments using the following CloudWatch tools: Amazon CloudWatch Events, CloudWatch alarms, and Amazon CloudWatch Logs.
You can create a CloudWatch alarm for an instance or Amazon EC2 Auto Scaling group you are using in your CodeDeploy operations. An alarm watches a single metric over a time period you specify and performs one or more actions based on the value of the metric relative to a given threshold over a number of time periods.
You can use Amazon EventBridge to detect and react to changes in the state of an instance or a deployment (an "event") in your CodeDeploy operations. Then, based on rules you create, EventBridge will invoke one or more target actions when a deployment or instance enters the state you specify in a rule.
CodeDeploy is integrated with CloudTrail, a service that captures API calls made by or on behalf of CodeDeploy in your AWS account and delivers the log files to an Amazon S3 bucket you specify. CloudTrail captures API calls from the CodeDeploy console, from CodeDeploy commands through the AWS CLI, or from the CodeDeploy APIs directly.
You can add triggers to a CodeDeploy deployment group to receive notifications about events related to deployments or instances in that deployment group. These notifications are sent to recipients who are subscribed to an Amazon SNS topic you have made part of the trigger's action.
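The alarm behaviour described above (one metric compared with a threshold over a number of periods) can be sketched roughly in Python. This is only a simplified model that I put together for illustration: the function name and sample values are made up, and real CloudWatch alarms also support "M out of N" datapoints and configurable handling of missing data.

```python
def alarm_state(datapoints, threshold, evaluation_periods):
    """Simplified model of a CloudWatch alarm on a single metric.

    Returns "ALARM" only when each of the most recent `evaluation_periods`
    datapoints is above the threshold.
    """
    recent = datapoints[-evaluation_periods:]
    if len(recent) < evaluation_periods:
        # Not enough data collected yet to evaluate the alarm.
        return "INSUFFICIENT_DATA"
    if all(value > threshold for value in recent):
        return "ALARM"
    return "OK"


# Hypothetical CPU utilisation samples, one per period.
cpu_samples = [40, 85, 90, 95]
state = alarm_state(cpu_samples, threshold=80, evaluation_periods=3)
```

In a CodeDeploy context, an alarm in the ALARM state can be used to stop a deployment, which is why it helps to be clear on when an alarm actually changes state.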

CodeDeploy Rolling Deployments

  • OneAtATime
  • HalfAtATime
  • Custom

CodeDeploy can deploy your application on Amazon EC2 instances, Amazon Elastic Container Service (Amazon ECS) containers, Lambda functions, and even an on-premises environment.
When using CodeDeploy, there are two types of deployments available to you: in-place and blue/green. All Lambda and Amazon ECS deployments are blue/green. An Amazon EC2 or on-premises deployment can be in-place or blue/green.
When using a blue/green deployment, you have several options for shifting traffic to the new green environment.

Canary - You can choose from predefined canary options that specify the percentage of traffic shifted to your updated application version in the first increment. Then the interval, specified in minutes, indicates when the remaining traffic is shifted in the second increment.
Linear - Traffic is shifted in equal increments with an equal number of minutes between each increment. You can choose from predefined linear options that specify the percentage of traffic shifted in each increment and the number of minutes between each increment.
All-at-once - All traffic is shifted from the original environment to the updated environment at once.
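To make the three traffic-shifting options concrete, here is a small Python sketch that turns a predefined configuration name into a schedule of (minute, percentage of traffic on the new version). It is my own illustration rather than anything CodeDeploy exposes: the function name is hypothetical, and it assumes the first increment is shifted at minute 0.

```python
import re


def traffic_schedule(config_name):
    """Return a list of (minute, percent_on_new_version) steps."""
    if config_name == "AllAtOnce":
        return [(0, 100)]

    canary = re.fullmatch(r"Canary(\d+)Percent(\d+)Minutes", config_name)
    if canary:
        percent, minutes = int(canary.group(1)), int(canary.group(2))
        # Two increments: the canary first, then everything else.
        return [(0, percent), (minutes, 100)]

    linear = re.fullmatch(r"Linear(\d+)PercentEvery(\d+)Minutes?", config_name)
    if linear:
        percent, minutes = int(linear.group(1)), int(linear.group(2))
        steps, shifted, minute = [], 0, 0
        # Equal increments with an equal interval between them.
        while shifted < 100:
            shifted = min(100, shifted + percent)
            steps.append((minute, shifted))
            minute += minutes
        return steps

    raise ValueError(f"Unrecognised configuration name: {config_name}")


canary_steps = traffic_schedule("Canary10Percent5Minutes")
linear_steps = traffic_schedule("Linear10PercentEvery1Minute")
```

The canary schedule has exactly two steps, while the linear schedule has ten equal steps, which is the distinction the exam questions tend to probe.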

An AppSpec file is a YAML or JSON file used in CodeDeploy. The AppSpec file is used to manage each deployment as a series of lifecycle event hooks, which are defined in the file.

Canary deployments are a type of segmented deployment. You deploy a small part of your application (called the canary) with the rest of the application following later. What makes a canary deployment different is you test your canary with live production traffic. With canary deployments in CodeDeploy, traffic is shifted in two increments.

  • The first shifts some traffic to the canary.
  • The next shifts all traffic to the new application at the end of the selected interval.

Elastic Beanstalk Deployments

Expect one or two questions on Elastic Beanstalk deployment options. It is worth understanding the different options as per this table:
https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/using-features.deploy-existing-version.html

Orchestration

AWS CodePipeline is a fully managed continuous delivery service that helps you automate your release pipelines for fast and reliable application and infrastructure updates.


Concepts

Pipelines

A pipeline is a workflow construct that describes how software changes go through a release process. Each pipeline is made up of a series of stages.

Stages

A stage is a logical unit you can use to isolate an environment and to limit the number of concurrent changes in that environment. Each stage contains actions that are performed on the application artifacts. Your source code is an example of an artifact. A stage might be a build stage, where the source code is built and tests are run. It can also be a deployment stage, where code is deployed to runtime environments. Each stage is made up of a series of serial or parallel actions.

Actions

An action is a set of operations performed on application code and configured so that the actions run in the pipeline at a specified point. Valid CodePipeline action types are source, build, test, deploy, approval, and invoke. Each action can be fulfilled by multiple different providers.

Source can be Amazon S3, ECR, CodeCommit or other version control sources like Bitbucket Cloud, GitHub, GitHub Enterprise Server, or GitLab.com.
Build can be CodeBuild, Custom CloudBees, Custom Jenkins or Custom TeamCity.
Test can be CodeBuild, AWS Device Farm, ThirdParty GhostInspector, Custom Jenkins, ThirdParty Micro Focus StormRunner Load or ThirdParty Nouvola.
Deploy can be S3, CloudFormation, CloudFormation StackSets, CodeDeploy, ECS, ECS (Blue/Green), Elastic Beanstalk, AppConfig, OpsWorks, Service Catalog, Alexa or Custom XebiaLabs.
Approval can be Manual.
Invoke can be AWS Lambda or AWS Step Functions.

A crucial point to note here is that CodePipeline can be used without any of the other 3 aforementioned CI/CD services, CodeCommit, CodeBuild or CodeDeploy. Each can be an action provider in one or more stages but you can build a pipeline without any of them. A good example would be CloudFormation. You can use S3 as a source action and CloudFormation as a Deploy action.
Another would be ECR to ECS. ECR can be a source action and ECS a deploy action. It may use some of the 3 services in the background but you don't need to be aware of it.
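As a rough illustration of the S3 to CloudFormation example, here is a pipeline definition written as a Python dictionary in the general shape of the CodePipeline pipeline structure. Treat it as a sketch: the pipeline name, bucket names, stack name, role ARNs and account ID are all made-up placeholders, and I have left out optional settings.

```python
# Hypothetical pipeline: S3 source action followed by a CloudFormation deploy action.
pipeline = {
    "name": "cfn-only-pipeline",  # placeholder name
    "roleArn": "arn:aws:iam::111122223333:role/ExamplePipelineRole",  # placeholder
    "artifactStore": {"type": "S3", "location": "example-artifact-bucket"},
    "stages": [
        {
            "name": "Source",
            "actions": [
                {
                    "name": "TemplateSource",
                    "actionTypeId": {
                        "category": "Source",
                        "owner": "AWS",
                        "provider": "S3",
                        "version": "1",
                    },
                    "configuration": {
                        "S3Bucket": "example-source-bucket",
                        "S3ObjectKey": "templates.zip",
                    },
                    "outputArtifacts": [{"name": "SourceOutput"}],
                    "runOrder": 1,
                }
            ],
        },
        {
            "name": "Deploy",
            "actions": [
                {
                    "name": "DeployStack",
                    "actionTypeId": {
                        "category": "Deploy",
                        "owner": "AWS",
                        "provider": "CloudFormation",
                        "version": "1",
                    },
                    "configuration": {
                        "ActionMode": "CREATE_UPDATE",
                        "StackName": "example-stack",
                        "TemplatePath": "SourceOutput::template.yaml",
                        "RoleArn": "arn:aws:iam::111122223333:role/ExampleCfnRole",
                    },
                    "inputArtifacts": [{"name": "SourceOutput"}],
                    "runOrder": 1,
                }
            ],
        },
    ],
}


def action_providers(definition):
    """List the provider of every action, in stage order."""
    return [
        action["actionTypeId"]["provider"]
        for stage in definition["stages"]
        for action in stage["actions"]
    ]


providers = action_providers(pipeline)
```

Note that none of the providers is CodeCommit, CodeBuild or CodeDeploy, which is the point being made above.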

Monitoring deployments in CodePipeline

EventBridge event bus events — You can monitor CodePipeline events in EventBridge, which detects changes in your pipeline, stage, or action execution status.
Notifications for pipeline events in the Developer Tools console — You can monitor CodePipeline events with notifications that you set up in the console and then create an Amazon Simple Notification Service topic and subscription for.
AWS CloudTrail — Use CloudTrail to capture API calls made by or on behalf of CodePipeline in your AWS account and deliver the log files to an Amazon S3 bucket.
Note: no integration with CloudWatch Logs, Metrics or Alarms.
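For the EventBridge option, it helps to picture what a rule's event pattern looks like. The sketch below shows a pattern for failed pipeline executions and a deliberately simplified matcher written in Python. The matcher is my own stand-in that only handles exact-value matching; real EventBridge patterns support much more (prefix matching, numeric ranges and so on), and the sample events are made up.

```python
# Event pattern for failed pipeline executions.
failed_pipeline_pattern = {
    "source": ["aws.codepipeline"],
    "detail-type": ["CodePipeline Pipeline Execution State Change"],
    "detail": {"state": ["FAILED"]},
}


def matches(pattern, event):
    """Simplified matcher: every pattern key must exist in the event, and
    the event value must be one of the values listed in the pattern."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern, e.g. the "detail" section.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


# Hypothetical events for illustration.
failed_event = {
    "source": "aws.codepipeline",
    "detail-type": "CodePipeline Pipeline Execution State Change",
    "detail": {"pipeline": "example-pipeline", "state": "FAILED"},
}
succeeded_event = {
    "source": "aws.codepipeline",
    "detail-type": "CodePipeline Pipeline Execution State Change",
    "detail": {"pipeline": "example-pipeline", "state": "SUCCEEDED"},
}
```

A rule with this pattern would typically target an SNS topic or a Lambda function so that someone, or something, reacts to the failure.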

Task Statement 1.2: Integrate automated testing into CI/CD pipelines


You'll need to know when to run different types of tests. For example, you should run your unit tests before you open a PR. This is where a service like Amazon CodeGuru Reviewer can be integrated. CodeGuru Reviewer provides automated code reviews for static code analysis. It works with Java and Python code. It also integrates with AWS Secrets Manager to provide a secrets detector that finds unprotected secrets in your code.

Do not confuse it with Amazon CodeGuru Profiler, which provides visibility into and recommendations about application performance during runtime.

CodeBuild can also be used to run tests and save the results in the location specified in the reports section of the buildspec.

Task Statement 1.3: Build and manage artifacts

Use CodeArtifact, S3 and ECR as artifact repositories.

Know how to automate EC2 instances and container image build processes.

EC2 Image Builder simplifies the building, testing, and deployment of Virtual Machine and container images for use on AWS or on-premises.
It is a fully managed service to automate the creation, management and deployment of customized, secure and up-to-date server images. It also integrates with AWS Resource Access Manager (RAM) and AWS Organizations to share images within an account or organization. Understand the concepts:

  • managed image
  • image recipe
  • container recipe
  • base image
  • component document
  • runtime stages
  • configuration phases
  • build and test phases

EC2 Image Builder in conjunction with AWS VM Import/Export (VMIE) allows you to create and maintain golden images for Amazon EC2 (AMI) as well as on-premises VM formats (VHDX, VMDK, and OVF).
An Image Builder recipe is a file that represents the final state of the images produced by automation pipelines and enables you to deterministically repeat builds. Recipes can be shared, forked, and edited outside the Image Builder UI. You can use your recipes with your version control software to maintain version-controlled recipes that you can use to share and track changes.
Images can be shared as AMIs and container images can be shared via ECR.

Task Statement 1.4: Implement deployment strategies for instance, container, and serverless environments.

I already covered a lot of this in the section on CodeDeploy, but there is no harm in repeating it.

Understand these deployment strategies in theory. These terms or some variant of them often repeat themselves between services. Understand which are Mutable vs Immutable deployments.

Blue/green
Canary
Immutable rolling
Rolling with additional batch
In-Place
Linear
All-at-Once

Deployments for ECS

  • Rolling update
  • Blue/green deployment with CodeDeploy
  • External deployment

PreTraffic and PostTraffic hooks

Understand how you can use lifecycle hooks during the deployments. These can be used to stop deployments and trigger a rollback. There are different lifecycle hooks depending on the service being deployed. EC2 vs Lambda vs ECS.

Look at the BeforeAllowTraffic hook in the appspec.yml file.
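Since an AppSpec file can be YAML or JSON, here is a sketch that builds a Lambda AppSpec as a Python dictionary and serialises it to JSON, so you can see where the hooks sit. The resource name, function name, alias, version numbers and validator function names are all placeholders I made up.

```python
import json

# Hypothetical AppSpec for a Lambda deployment.
appspec = {
    "version": 0.0,
    "Resources": [
        {
            "myLambdaFunction": {
                "Type": "AWS::Lambda::Function",
                "Properties": {
                    "Name": "example-function",
                    "Alias": "live",
                    "CurrentVersion": "1",
                    "TargetVersion": "2",
                },
            }
        }
    ],
    "Hooks": [
        # Lambda function invoked before traffic is shifted to the new version.
        {"BeforeAllowTraffic": "examplePreTrafficValidator"},
        # Lambda function invoked after all traffic has been shifted.
        {"AfterAllowTraffic": "examplePostTrafficValidator"},
    ],
}

appspec_json = json.dumps(appspec, indent=2)
hook_names = [name for hook in appspec["Hooks"] for name in hook]
```

Each validator function reports success or failure back to CodeDeploy, and a failure is what stops the deployment and triggers the rollback.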

Deployment strategies for serverless

  • CloudFormation
  • AWS Serverless Application Model (SAM)
  • All-at-once
  • Blue/green
  • Canary
  • Linear
  • Lambda versions and aliases
  • CodeDeploy

Dive deeper into Lambda versions and aliases

  • Every Lambda function can have a number of versions and aliases associated with it
  • Versions are immutable snapshots of a function, including code and configuration
  • Versions can be used effectively with an alias
  • An alias is a pointer to a version
  • An alias has a name and an ARN
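The points above can be modelled in a few lines of Python. This is only a toy model to show the relationship between versions and aliases: the class names, the `pick_version` method and the weights are my own invention, although the idea of an alias sending a weighted share of traffic to a second version is how canary and linear shifting work for Lambda.

```python
from dataclasses import dataclass, field, FrozenInstanceError


@dataclass(frozen=True)
class Version:
    """Immutable snapshot of code and configuration."""
    number: str
    code_sha: str
    memory_mb: int


@dataclass
class Alias:
    """Named pointer to a version, optionally splitting traffic."""
    name: str
    version: str
    additional_weights: dict = field(default_factory=dict)  # version -> weight (0..1)

    def pick_version(self, roll):
        """Choose a version for one invocation; `roll` is a number in [0, 1)."""
        cumulative = 0.0
        for version, weight in self.additional_weights.items():
            cumulative += weight
            if roll < cumulative:
                return version
        return self.version


v1 = Version(number="1", code_sha="abc123", memory_mb=128)
live = Alias(name="live", version="1", additional_weights={"2": 0.1})

# Roughly 10% of invocations go to version 2, the rest stay on version 1.
canary_choice = live.pick_version(0.05)
stable_choice = live.pick_version(0.5)

# Completing the shift is just repointing the alias.
live.version, live.additional_weights = "2", {}
```

Callers keep invoking the same alias throughout, which is why aliases are what deployments shift traffic on.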

Additional Resources

https://aws.amazon.com/devops/continuous-integration/
https://aws.amazon.com/devops/continuous-delivery/
https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/Welcome.html
https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/high_availability_origin_failover.html
https://docs.aws.amazon.com/lambda/latest/dg/lambda-edge.html
https://docs.aws.amazon.com/codepipeline/latest/userguide/best-practices.html
https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/continuous-delivery-codepipeline.html
https://docs.aws.amazon.com/codecommit/latest/userguide/welcome.html
https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/Welcome.html
https://docs.aws.amazon.com/apigateway/latest/developerguide/welcome.html
https://docs.aws.amazon.com/systems-manager/latest/userguide/what-is-systems-manager.html
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/Welcome.html
https://docs.aws.amazon.com/xray/latest/devguide/aws-xray.html
https://docs.aws.amazon.com/codedeploy/latest/userguide/reference-appspec-file-structure-hooks.html#appspec-hooks-ecs
https://docs.aws.amazon.com/codedeploy/latest/userguide/deployment-steps.html#deployment-steps-what-happens
https://docs.aws.amazon.com/codedeploy/latest/userguide/reference-appspec-file-structure-hooks.html#appspec-hooks-server

Domain 2 - Configuration Management and IaC (17%)

Services in scope

AWS Serverless Application Model [AWS SAM]
AWS CloudFormation
AWS Cloud Development Kit [AWS CDK]
AWS OpsWorks
AWS Systems Manager
AWS Config
AWS AppConfig
AWS Service Catalog
AWS IAM Identity Center (formerly known as SSO)

Task Statement 2.1: Define cloud infrastructure and reusable components to provision and manage systems throughout their lifecycle.

Knowledge of infrastructure as code (IaC) is essential for anyone sitting the AWS exams. CloudFormation is AWS's foundational IaC option, and most of the others build on it. You therefore need to understand the different sections in a CloudFormation template. The AWS recommended course will really help with that:

Advanced CloudFormation: Macros

AWS SAM is a less verbose way of deploying serverless applications, but it still uses CloudFormation in the background. When deploying a SAM template, you will still see it deployed via CloudFormation. And for services that SAM does not support, you can add CloudFormation YAML directly into the same file.
AWS Cloud Development Kit (CDK) is an option for generating CloudFormation code using languages such as TypeScript, Python, Java or .NET.

Task Statement 2.2: Deploy automation to create, onboard, and secure AWS accounts in a multi-account/multi-Region environment.

You need a good understanding of how to manage a multi-account setup in AWS. AWS Organizations is a good place to start as it is the overall container for all the accounts. Understand that an Organizational Unit (OU) is a sub-division of accounts within an Organization. OUs can be set up in a hierarchy. Service Control Policies (SCPs) are essential to understand. These are policies set at the organization and OU level to manage which permissions are available or not available to the accounts within the OU. SCPs never grant permissions; instead they specify the maximum permissions for the accounts in scope. If an SCP is set at a higher level in the OU hierarchy, it cannot be overridden at a lower level. For example, if you deny access to Amazon Redshift at the root level, adding an Allow further down will never apply.
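The SCP behaviour is easier to remember with a small model. The Python sketch below is a simplification I wrote for study purposes, not how AWS evaluates policies internally: each level in the hierarchy is reduced to a set of allowed and denied action patterns, and the function name and sample policies are hypothetical.

```python
from fnmatch import fnmatchcase


def scp_permits(action, scp_chain):
    """Check an action against SCPs from the root down to the account.

    Each entry in `scp_chain` is {"allow": [...], "deny": [...]}.
    The action must be allowed at every level and denied at none.
    """
    for policy in scp_chain:
        if any(fnmatchcase(action, pattern) for pattern in policy["deny"]):
            return False  # an explicit deny at any level always wins
        if not any(fnmatchcase(action, pattern) for pattern in policy["allow"]):
            return False  # not allowed at this level, so filtered out
    return True


# Hypothetical hierarchy: Redshift denied at the root, allowed again lower down.
root_scp = {"allow": ["*"], "deny": ["redshift:*"]}
ou_scp = {"allow": ["*", "redshift:*"], "deny": []}

redshift_permitted = scp_permits("redshift:CreateCluster", [root_scp, ou_scp])
s3_permitted = scp_permits("s3:GetObject", [root_scp, ou_scp])
```

Even when this returns True, the user still needs an IAM policy that grants the action, because SCPs only set the ceiling.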
AWS Control Tower is AWS's golden path to a multi-account setup with AWS Organizations. It gives you out-of-the-box governance to enforce best practices and standards.
AWS Service Catalog can be used in a multi-account setup to provide these guardrails in child accounts via CloudFormation templates. These templates get published as products in Service Catalog that can be used in accounts as a standard for deploying infrastructure via IaC. Understand the different constraints that apply. For example, a launch constraint can allow users to run the template without having permissions to the services involved. That way, a user can deploy the stack without using their own IAM credentials.
And understand how to apply AWS CloudFormation StackSets across multiple accounts and AWS Regions.
AWS IAM Identity Center works well with AWS Organizations to manage multiple AWS accounts. You can define permission sets to limit the users' permissions when they sign in to a role. You'll need to understand how it works with other IdPs like Microsoft Active Directory.
AWS Config is a service that will come up a lot in the exam and is well worth going deep on. In the context of this domain, you need to understand how AWS Config works in a multi-account setup to continuously detect nonconformance with AWS Config rules.

Task Statement 2.3: Design and build automated solutions for complex tasks and large-scale environments.

Additional Resources

https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/what-is-sam.html
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Introduction.html
https://docs.aws.amazon.com/secretsmanager/latest/userguide/intro.html
https://docs.aws.amazon.com/codedeploy/latest/userguide/reference-appspec-file-structure-hooks.html#reference-appspec-file-structure-environment-variable-availability
https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/field-level-encryption.html#field-level-encryption-setting-up
https://docs.aws.amazon.com/lambda/latest/dg/configuration-aliases.html
https://docs.aws.amazon.com/systems-manager/latest/userguide/systems-manager-managedinstances.html
https://docs.aws.amazon.com/application-discovery/latest/userguide/what-is-appdiscovery.html
https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-global-database.html#aurora-global-database.advantages
https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/AuroraMySQL.Replication.CrossRegion.html
https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/AnalyzingLogData.html
https://aws.amazon.com/solutions/implementations/real-time-web-analytics-with-kinesis/
https://docs.aws.amazon.com/organizations/latest/userguide/orgs_manage_ous.html
https://docs.aws.amazon.com/config/latest/developerguide/WhatIsConfig.html

Domain 3 - Resilient Cloud Solutions (15%)

Implement highly available solutions to meet resilience and business requirements

  • multi-Availability Zone and multi-Region deployments
  • replication and failover methods for your stateful services
  • techniques to achieve high availability

Implement solutions that are scalable to meet business requirements

  • determine appropriate metrics for scaling
  • using loosely coupled and distributed architectures
  • serverless architectures
  • container platforms

Global Scalability

  • Route 53
  • CloudFront
  • Secrets Manager
  • CloudTrail
  • Security Hub
  • Amazon ECR
  • AWS Transit Gateway
  • AWS IAM

Implement automated recovery processes to meet RTO/RPO requirements


  • disaster recovery concepts
  • choose backup and recovery strategies
  • and needed recovery procedures

This domain will focus on DynamoDB, Amazon RDS, Route 53, Amazon S3, CloudFront, load balancers, Amazon ECS, Amazon EKS, API Gateway, Lambda, Fargate, AWS Backup, Systems Manager

Route 53 Application Recovery Controller can help you manage and coordinate failover for your application recovery across multiple Regions, Availability Zones and on-premises too.

AWS Elastic Disaster Recovery

Auto-scaling - https://aws.amazon.com/ec2/autoscaling/

Scaling options

Manual Scaling

Dynamic Scaling

Check out how to scale based on Amazon SQS. The solution is to use a backlog per instance metric, with the target value being the acceptable backlog per instance to maintain.
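The arithmetic behind the backlog per instance metric is simple enough to sketch in Python. The function names and the numbers are mine and purely illustrative; the idea is that the acceptable backlog is the longest latency you can tolerate divided by the average time to process one message.

```python
import math


def acceptable_backlog_per_instance(max_latency_seconds, avg_processing_seconds):
    """Target value: how many queued messages one instance may have waiting."""
    return max_latency_seconds / avg_processing_seconds


def backlog_per_instance(visible_messages, in_service_instances):
    """Custom metric: queue depth divided by the current fleet size."""
    return visible_messages / max(in_service_instances, 1)


def instances_needed(visible_messages, target_backlog):
    """Fleet size that would bring the backlog back to the target."""
    return math.ceil(visible_messages / target_backlog)


# Illustrative numbers: 10 s acceptable latency, 0.5 s per message,
# 1500 messages waiting and 10 instances in service.
target = acceptable_backlog_per_instance(10, 0.5)
current = backlog_per_instance(1500, 10)
needed = instances_needed(1500, target)
```

Here the current backlog per instance (150) is well above the target (20), so a target tracking policy on this metric would scale the group out.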

Step scaling
Target Tracking

Predictive Scaling

Scheduled Scaling

Scale out aggressively, scale back in slowly. This prevents thrashing.

Lifecycle hooks can pause an instance during launch (scale-out) or termination (scale-in) so that custom actions can run.
Pending --> Pending:Wait --> Pending:Proceed
--> InService
--> Terminating --> Terminating:Wait --> Terminating:Proceed
--> Terminated
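The state flow above can be written as a small Python function, which I find helps when memorising it. It only covers the straightforward path (no standby, warm pools or abandoned launches), and the function name is hypothetical.

```python
def lifecycle_path(launch_hook=True, terminate_hook=True):
    """Return the ordered states an instance passes through.

    The Wait and Proceed states only appear when a lifecycle hook is
    configured for that transition.
    """
    path = ["Pending"]
    if launch_hook:
        # The instance pauses here until the hook completes or times out.
        path += ["Pending:Wait", "Pending:Proceed"]
    path.append("InService")
    path.append("Terminating")
    if terminate_hook:
        # Gives you time to drain connections or copy logs before shutdown.
        path += ["Terminating:Wait", "Terminating:Proceed"]
    path.append("Terminated")
    return path


with_hooks = lifecycle_path()
without_hooks = lifecycle_path(launch_hook=False, terminate_hook=False)
```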

Scaling cooldowns - https://docs.aws.amazon.com/autoscaling/ec2/userguide/ec2-auto-scaling-scaling-cooldowns.html

ASG Warm Pools
Termination policies

Load-balancing

DynamoDB Global Tables

RTO and RPO

Additional Resources

https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Welcome.html
https://docs.aws.amazon.com/config/latest/developerguide/aws-config-landing-page.html
https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/AuroraMySQL.Replication.CrossRegion.html
https://aws.amazon.com/dynamodb/global-tables/
https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/dns-failover-configuring.html
https://docs.aws.amazon.com/apigateway/latest/developerguide/canary-release.html

Domain 4 - Monitoring and Logging (15%)

Configure the collection, aggregation, and storage of logs and metrics

CloudWatch - Metrics, logs and events
CloudWatch collects metrics, monitors those metrics and takes actions based on those metrics.
Custom metrics
Namespaces - each namespace is a container for metrics, and metrics in different namespaces are isolated from each other. All AWS service metrics are contained in a namespace named AWS/service. So for EC2, it would be AWS/EC2.
Default namespace for the CloudWatch agent is CWAgent.
Cannot push custom metrics to CloudWatch Events.
--statistic-values parameter (put-metric-data) for publishing pre-aggregated statistics
awslogs log driver - passes logs from Docker containers to CloudWatch Logs.
CloudWatch Logs subscriptions.
CloudWatch Events cannot match API call events on its own, which is why you need a CloudTrail trail for those events to be delivered.
CloudWatch Logs Insights can be used to search CloudWatch Logs.
CloudWatch Log Group retention.
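On the --statistic-values note: instead of publishing every datapoint of a custom metric, you can publish a pre-aggregated statistic set. The Python sketch below shows how such a set could be computed from raw samples and formatted in the shorthand that the parameter accepts. The helper names and sample values are my own.

```python
def statistic_values(samples):
    """Aggregate raw samples into the four values of a statistic set."""
    return {
        "SampleCount": len(samples),
        "Sum": sum(samples),
        "Minimum": min(samples),
        "Maximum": max(samples),
    }


def to_cli_shorthand(stats):
    """Format the statistic set as key=value pairs separated by commas."""
    return ",".join(f"{key}={value}" for key, value in stats.items())


# Hypothetical page load times collected over one minute.
stats = statistic_values([2, 4, 9])
shorthand = to_cli_shorthand(stats)
```

The resulting string is what would follow --statistic-values on a put-metric-data call, alongside the namespace and metric name of your custom metric.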
CloudTrail


  • All activity in your AWS account is logged by CloudTrail as a CloudTrail event. By default, CloudTrail stores the last 90 days of events in the event history. There are two types of events that CloudTrail logs: management events and data events. By default, CloudTrail only logs management events, but you can choose to start logging data events if needed. There is an additional cost for logging data events. To keep more than 90 days of history, create a trail that delivers the logs to an S3 bucket. You can also send these logs to CloudWatch Logs. With AWS Organizations, you can create an organization trail from your organization's management account, which logs all events for the organization. A trail can apply to all Regions or to a single Region; a single-Region trail only logs events for the Region it was created in.

Audit, monitor, and analyze logs and metrics to detect issues

CloudWatch ServiceLens
The DevOps Monitoring Dashboard on AWS solution automates setting up a dashboard for monitoring and visualising CI/CD metrics
CloudWatch Anomaly Detection

X-Ray for distributed tracing
Install the X-Ray daemon to capture traces from ECS containers. The X-Ray daemon listens for traffic on UDP port 2000, uses the default network mode of bridge, gathers raw segment data and sends it to the X-Ray API.
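
To show what "listens for traffic on UDP port 2000" means in practice, here is a minimal sketch that builds the kind of datagram the daemon accepts: a small JSON header line followed by a segment document. In real applications the X-Ray SDK does this for you. The service name and the 127.0.0.1 address are assumptions for illustration, and send_segment is defined but not called here.

```python
import json
import os
import socket
import time

DAEMON_ADDRESS = ("127.0.0.1", 2000)  # assumed local daemon address
HEADER = b'{"format": "json", "version": 1}\n'


def new_trace_id(now=None):
    """Trace IDs look like 1-<8 hex digits of epoch seconds>-<24 random hex digits>."""
    epoch = int(now if now is not None else time.time())
    return f"1-{epoch:08x}-{os.urandom(12).hex()}"


def build_datagram(name, start, end, trace_id=None):
    """Build the header plus a minimal segment document as bytes."""
    segment = {
        "name": name,
        "id": os.urandom(8).hex(),  # 16 hex digit segment ID
        "trace_id": trace_id or new_trace_id(start),
        "start_time": start,
        "end_time": end,
    }
    return HEADER + json.dumps(segment).encode("utf-8")


def send_segment(datagram, address=DAEMON_ADDRESS):
    """Fire-and-forget UDP send, which is why the daemon must be reachable."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(datagram, address)


now = time.time()
datagram = build_datagram("checkout-service", now - 0.25, now)  # hypothetical service name
print(datagram.decode("utf-8"))
```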

For end-to-end views, use Amazon CloudWatch ServiceLens.

To monitor sites, API endpoints, and web workflows, check out Amazon CloudWatch Synthetics.

Ensure you know which CloudWatch metrics to track for different AWS services. For example, if you use Route 53 health checks,

Exit codes - use Systems Manager, specifically Run Command, to specify exit codes.

Automate monitoring and event management of complex environments.

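AWS CloudTrail log file integrity validation. CloudTrail delivers digest files containing hashes of the log files so you can detect whether a log file was modified or deleted after delivery.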
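
The idea behind digest files can be sketched in a few lines: each digest entry records a hash of a log file, and recomputing the hash reveals tampering. This is a simplified illustration that assumes a digest entry with a SHA-256 hashValue computed over the uncompressed log content. It does not verify the digest file's own signature, and the log content is made up.

```python
import gzip
import hashlib
import json


def log_file_matches_digest(gzipped_log, digest_entry):
    """Recompute the SHA-256 of the uncompressed log and compare it to the digest entry."""
    content = gzip.decompress(gzipped_log)
    return hashlib.sha256(content).hexdigest() == digest_entry["hashValue"]


# Hypothetical log file and the digest entry that would describe it.
original = json.dumps({"Records": [{"eventName": "DeleteBucket"}]}).encode("utf-8")
delivered = gzip.compress(original)
entry = {
    "hashAlgorithm": "SHA-256",
    "hashValue": hashlib.sha256(original).hexdigest(),
}

tampered = gzip.compress(original.replace(b"DeleteBucket", b"ListBuckets"))

print(log_file_matches_digest(delivered, entry))
print(log_file_matches_digest(tampered, entry))
```

For real trails, the aws cloudtrail validate-logs command performs the full validation, including the digest signatures.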

CloudWatch, EventBridge, Kinesis, CloudTrail
You can only create a metric filter for CloudWatch log groups.
AWS Config

CloudWatch Logs agent to send logs from on-premises servers
AWS Systems Manager agent to manage on-premises servers

Ensure you know how to automate health checks for your applications.

  • Load balancers determine the health of a target before sending traffic to it.
  • SQS consumers can check their own health before they pull more work from the queue.
  • Route 53 checks the health of your endpoint, the status of other health checks and the status of any CloudWatch alarms.
  • CodeDeploy can roll back a deployment automatically when alarm thresholds are met.

Additional resources

https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/SubscriptionFilters.html#LambdaFunctionExample
https://docs.aws.amazon.com/guardduty/latest/ug/what-is-guardduty.html
https://docs.aws.amazon.com/awscloudtrail/latest/userguide/cloudtrail-user-guide.html
https://docs.aws.amazon.com/config/latest/developerguide/evaluate-config_use-managed-rules.html
https://docs.aws.amazon.com/health/latest/ug/cloudwatch-events-health.html
https://docs.aws.amazon.com/codedeploy/latest/userguide/monitoring-cloudwatch-events.html
https://docs.aws.amazon.com/opensearch-service/latest/developerguide/what-is.html

Domain 5 - Incident and Event Response (14%)

Manage event sources to process, notify, and take action in response to events.

AWS Health, CloudTrail and EventBridge

Implement configuration changes in response to events.

AWS Systems Manager, AWS Auto Scaling

RDS event notifications - get notified about database instances that are being created, restarted or deleted, as well as low storage, Multi-AZ failovers and configuration changes.
AWS Health --> CloudWatch
AWS Config --> CloudWatch

Auto Scaling events for EC2 instances:

  • EC2 Instance-launch Lifecycle Action
  • EC2 Instance Launch Successful
  • EC2 Instance Launch Unsuccessful
  • EC2 Instance-terminate Lifecycle Action
  • EC2 Instance Terminate Successful
  • EC2 Instance Terminate Unsuccessful
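
As a sketch of how the lifecycle action events above might be handled, the function below turns an Auto Scaling lifecycle event into the parameters you would pass to the CompleteLifecycleAction API. It only builds the parameters and makes no AWS call. The function name, the healthy flag and the sample event values are hypothetical.

```python
LIFECYCLE_DETAIL_TYPES = {
    "EC2 Instance-launch Lifecycle Action",
    "EC2 Instance-terminate Lifecycle Action",
}


def build_lifecycle_response(event, healthy):
    """Return CompleteLifecycleAction parameters, or None for unrelated events."""
    if event.get("source") != "aws.autoscaling":
        return None
    if event.get("detail-type") not in LIFECYCLE_DETAIL_TYPES:
        return None
    detail = event["detail"]
    return {
        "AutoScalingGroupName": detail["AutoScalingGroupName"],
        "LifecycleHookName": detail["LifecycleHookName"],
        "LifecycleActionToken": detail["LifecycleActionToken"],
        "InstanceId": detail["EC2InstanceId"],
        # CONTINUE lets the instance proceed, ABANDON stops the launch.
        "LifecycleActionResult": "CONTINUE" if healthy else "ABANDON",
    }


# Hypothetical, trimmed-down event for illustration.
launch_event = {
    "source": "aws.autoscaling",
    "detail-type": "EC2 Instance-launch Lifecycle Action",
    "detail": {
        "AutoScalingGroupName": "web-asg",
        "LifecycleHookName": "bootstrap-hook",
        "LifecycleActionToken": "example-token",
        "EC2InstanceId": "i-0123456789abcdef0",
        "LifecycleTransition": "autoscaling:EC2_INSTANCE_LAUNCHING",
    },
}

print(build_lifecycle_response(launch_event, healthy=True))
```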

Systems Manager Fleet Manager - remotely manage your nodes running on AWS or on premises.

Troubleshoot system and application failures.

Systems Manager, Kinesis, SNS, Lambda, CloudWatch (especially alarms), EventBridge, X-Ray

You can use Amazon CloudWatch Synthetics to create canaries (configurable scripts that run on a schedule) to monitor your endpoints and APIs. They follow the same routes and perform the same actions as a customer. By using canaries, you can discover issues before your customers do. CloudWatch Synthetics canaries are scripts written in Node.js or Python. They create Lambda functions in your account that use Node.js or Python as a runtime.

Understand CloudWatch Synthetics monitoring and how it integrates with CloudWatch ServiceLens to trace the cause of impacts. Understand how ServiceLens integrates with AWS X-Ray to provide an end-to-end view of your application.

Know how to process CloudWatch Logs in real time using subscription filters, which can deliver log events to a Kinesis data stream, a Kinesis Data Firehose delivery stream or a Lambda function for processing and analysis.
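
When a subscription filter targets Lambda, the log events arrive base64-encoded and gzip-compressed under awslogs.data. The sketch below decodes that payload and picks out error lines. The log group name, messages and the "ERROR" keyword are hypothetical, and the sample event is built locally so the example runs without AWS.

```python
import base64
import gzip
import json


def decode_subscription_event(event):
    """Unpack the base64 + gzip payload that CloudWatch Logs sends to Lambda."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    return json.loads(gzip.decompress(compressed))


def error_messages(payload):
    """Return the messages of log events that contain the word ERROR."""
    return [e["message"] for e in payload["logEvents"] if "ERROR" in e["message"]]


# Build a hypothetical event in the same shape for local testing.
sample_payload = {
    "messageType": "DATA_MESSAGE",
    "logGroup": "/ecs/orders",
    "logStream": "orders/app/abc123",
    "logEvents": [
        {"id": "1", "timestamp": 1700000000000, "message": "INFO started"},
        {"id": "2", "timestamp": 1700000001000, "message": "ERROR payment failed"},
    ],
}
encoded = base64.b64encode(gzip.compress(json.dumps(sample_payload).encode("utf-8")))
sample_event = {"awslogs": {"data": encoded.decode("ascii")}}

decoded = decode_subscription_event(sample_event)
print(error_messages(decoded))
```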

Need to be able to analyze incidents regarding failed processes for ECS and EKS.

ECS

  • Service event log
  • Fargate tasks
  • Other ECS tasks

EKS

  • Container insights to collect, aggregate, and summarize metrics and logs from your containerized applications and microservices.
  • CloudWatch alarms

You can view container logs in CloudWatch Logs if you have a log driver configured. For Fargate tasks, the awslogs log driver passes these logs from Docker to CloudWatch Logs.
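
For reference, this is roughly what the awslogs configuration looks like inside an ECS container definition, sketched as a Python dictionary that can be dumped to JSON. The log group, Region, prefix, container name and task ID are placeholders of my own, and the stream name helper reflects my understanding of the prefix/container/task-id naming.

```python
import json


def awslogs_configuration(log_group, region, stream_prefix):
    """Build the logConfiguration block for a container definition."""
    return {
        "logDriver": "awslogs",
        "options": {
            "awslogs-group": log_group,
            "awslogs-region": region,
            "awslogs-stream-prefix": stream_prefix,
        },
    }


def expected_stream_name(stream_prefix, container_name, task_id):
    """Log streams are named prefix/container-name/task-id when a prefix is set."""
    return f"{stream_prefix}/{container_name}/{task_id}"


config = awslogs_configuration("/ecs/orders", "eu-west-1", "orders")  # placeholder values
print(json.dumps({"logConfiguration": config}, indent=2))
print(expected_stream_name("orders", "app", "0123456789abcdef"))
```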

Container Insights is available for ECS, EKS and Kubernetes platforms on EC2.

AWS Systems Manager OpsCenter is used to investigate and remediate operational issues or interruptions, which are tracked as OpsItems. You can then run Systems Manager Automation runbooks to resolve OpsItems.

Use a Guard custom policy to create an AWS Config custom rule.

Additional resources

https://docs.aws.amazon.com/vpc/latest/userguide/vpc-network-acls.html
https://docs.aws.amazon.com/vpc/latest/userguide/vpc-security-groups.html
https://docs.aws.amazon.com/lambda/latest/dg/services-cloudwatchevents.html
https://docs.aws.amazon.com/codedeploy/latest/userguide/deployment-steps.html#deployment-steps-what-happens
https://docs.aws.amazon.com/whitepapers/latest/aws-best-practices-ddos-resiliency/welcome.html
https://docs.aws.amazon.com/storagegateway/latest/APIReference/API_RefreshCache.html

Domain 6 - Security and Compliance (17%)

Services in scope

IAM
AWS IAM Identity Center
AWS Organizations
Security Hub
AWS WAF
VPC Flow Logs
Certificate Manager
AWS Config
Amazon Inspector
GuardDuty
Macie

  • IAM, managed policies, AWS Security Token Service, KMS, Organizations, AWS Config, AWS Control Tower, AWS Service Catalog, Systems Manager, AWS WAF, Security Hub, GuardDuty, security groups, network ACLs, Amazon Detective, Network Firewall, and more.

Implement techniques for identity and access management at scale

Secrets Manager
Permission boundaries
Predictive scaling for service-linked roles
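
Permission boundaries are easier to remember with a small model: an identity can only perform actions allowed by both its identity-based policy and its boundary. The sketch below is a deliberate simplification that treats policies as plain sets of action names and ignores wildcards, conditions, resource policies and SCPs. The action lists are hypothetical.

```python
def effective_actions(identity_allows, boundary_allows, explicit_denies=()):
    """Intersection of identity policy and boundary, minus any explicit denies."""
    allowed = set(identity_allows) & set(boundary_allows)
    return allowed - set(explicit_denies)


identity_policy = {"s3:GetObject", "ec2:RunInstances", "iam:CreateUser"}
boundary = {"s3:GetObject", "ec2:RunInstances", "cloudwatch:PutMetricData"}

# iam:CreateUser is dropped because the boundary does not allow it, and
# cloudwatch:PutMetricData is dropped because the identity policy never grants it.
print(sorted(effective_actions(identity_policy, boundary)))
```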

Apply automation for security controls and data protection.

Security Hub sends all findings to EventBridge.
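
A common automation built on this is a rule that reacts only to serious findings. The sketch below filters a "Security Hub Findings - Imported" event for HIGH and CRITICAL severity labels, as a Lambda target might. The function name, the chosen labels and the sample findings are my own assumptions for illustration.

```python
def high_severity_findings(event, labels=("HIGH", "CRITICAL")):
    """Return the findings in a Security Hub event whose severity label is in labels."""
    if event.get("source") != "aws.securityhub":
        return []
    if event.get("detail-type") != "Security Hub Findings - Imported":
        return []
    findings = event.get("detail", {}).get("findings", [])
    return [f for f in findings if f.get("Severity", {}).get("Label") in labels]


# Hypothetical, trimmed-down event for illustration.
sample_event = {
    "source": "aws.securityhub",
    "detail-type": "Security Hub Findings - Imported",
    "detail": {
        "findings": [
            {"Title": "S3 bucket is public", "Severity": {"Label": "CRITICAL"}},
            {"Title": "Unused security group", "Severity": {"Label": "LOW"}},
        ]
    },
}

for finding in high_severity_findings(sample_event):
    print(finding["Title"])
```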

Control Tower integrates with Organizations, Service Catalog and IAM Identity Center.

Implement security monitoring and auditing solutions.

PII = keyword for Macie
Audit or Auditing = keyword for AWS Config

AWS Trusted Advisor

AWS Systems Manager services and features

Additional resources

https://docs.aws.amazon.com/systems-manager/latest/userguide/patch-manager.html
https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles.html
https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp.html
https://docs.aws.amazon.com/AmazonS3/latest/userguide/example-bucket-policies.html
https://docs.aws.amazon.com/systems-manager/latest/userguide/sysman-compliance-about.html#sysman-compliance-custom
https://docs.aws.amazon.com/servicecatalog/latest/adminguide/catalogs_constraints_template-constraints.html

Additional resources

These are links to additional resources that AWS recommends to pass the exam. They are definitely worth going through.

Whitepapers
https://docs.aws.amazon.com/whitepapers/latest/running-containerized-microservices/introduction.html
https://docs.aws.amazon.com/whitepapers/latest/introduction-devops-aws/infrastructure-as-code.html
https://docs.aws.amazon.com/whitepapers/latest/disaster-recovery-workloads-on-aws/disaster-recovery-workloads-on-aws.html
https://docs.aws.amazon.com/whitepapers/latest/aws-multi-region-fundamentals/aws-multi-region-fundamentals.html

FAQs
https://aws.amazon.com/ec2/autoscaling/faqs/?devops=sec&sec=prep
https://aws.amazon.com/elasticloadbalancing/faqs/?devops=sec&sec=prep
https://aws.amazon.com/elasticbeanstalk/faqs/?devops=sec&sec=prep
https://aws.amazon.com/cloudwatch/faqs/
https://aws.amazon.com/eventbridge/faqs/

Other resources
https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/what-is-sam.html
https://aws.amazon.com/blogs/apn/implementing-serverless-tiering-strategies-with-aws-lambda-reserved-concurrency/

Summary

I am glad I sat and passed the exam but it may not be for everyone. I have been working with AWS for over 5 years now and have sat and passed 8 different certification exams. Of those exams, I found this exam to be the toughest. It covers a lot of services (58 at last count) that I do not work with and probably never will. It covers a huge range of services so it can be difficult to know where to focus. Taking the practice exams really helped me to find the areas I needed to drill into and put more structure around my revision.
