Amador Criado for AWS Community Builders

Posted on May 10, 2021 • Edited on Oct 8, 2021

10 Unusual concepts in AWS Solutions Architect Associate certification

#aws #cloudskills #certification #architecture

What can we find here?

This post is not the typical one that shows a list of the well known topics that the exam covers like for Example EC2, VPC, etc... This is the result after completing several practice tests in the most known platforms to achieve the AWS Solutions Architect certification. After all, I've noticed that there are several 'secondary' topics that are also important to learn.

Here you're going to find a personal selection of the most valuable exotic topics with a short description that can give you a Top Score in the exam.

VPC peering
Transit Gateway
RDS read replica replication lag
Origin Access Identity
AWS Direct Connect
HPC
ENA/ENI/EFA
CloudTrail Log file integrity
Glacier VS Deep Archive Retrievals
EC2 instance saving plans

VPC peering

A VPC peering connection is a networking connection between two VPCs that enables you to route traffic between them using private IPv4 addresses or IPv6 addresses.
Instances in either VPC can communicate with each other as if they are within the same network
You can create a VPC peering connection between your own VPCs, or with a VPC in another AWS account. The VPCs can be in different regions (also known as an inter-region VPC peering connection).

A VPC peering connection helps you to facilitate the transfer of data. For example, if you have more than one AWS account, you can peer the VPCs across those accounts to create a file sharing network. You can also use a VPC peering connection to allow other VPCs to access resources you have in one of your VPCs.
You can establish peering relationships between VPCs across different AWS Regions (also called Inter-Region VPC Peering). This allows VPC resources including EC2 instances, Amazon RDS databases and Lambda functions that run in different AWS Regions to communicate with each other using private IP addresses, without requiring gateways, VPN connections, or separate network appliances. The traffic remains in the private IP space. All inter-region traffic is encrypted with no single point of failure, or bandwidth bottleneck. Traffic always stays on the global AWS backbone, and never traverses the public internet, which reduces threats, such as common exploits, and DDoS attacks. Inter-Region VPC Peering provides a simple and cost-effective way to share resources between regions or replicate data for geographic redundancy.

Transit Gateway

A transit gateway is a network transit hub that you can use to interconnect your virtual private clouds (VPCs) and on-premises networks.

A transit gateway acts as a Regional virtual router for traffic flowing between your virtual private clouds (VPCs) and on-premises networks. A transit gateway scales elastically based on the volume of network traffic. Routing through a transit gateway operates at layer 3, where the packets are sent to a specific next-hop attachment, based on their destination IP addresses.

RDS read replica replication lag

Amazon RDS uses the MariaDB, Microsoft SQL Server, MySQL, Oracle, and PostgreSQL DB engines' built-in replication functionality to create a special type of DB instance called a read replica from a source DB instance. The source DB instance becomes the primary DB instance. Updates made to the primary DB instance are asynchronously copied to the read replica. You can reduce the load on your primary DB instance by routing read queries from your applications to the read replica. Using read replicas, you can elastically scale out beyond the capacity constraints of a single DB instance for read-heavy database workloads.

In the general process for promoting a read replica to a DB instance it's needed to stop any transactions from being written to the primary DB instance, and then wait for all updates to be made to the read replica. Database updates occur on the read replica after they have occurred on the primary DB instance, and this replication lag can vary significantly. Use the Replica Lag metric to determine when all updates have been made to the read replica.

Monitoring replication lag

You can monitor replication lag in Amazon CloudWatch by viewing the Amazon RDS ReplicaLag metric.
For MySQL and MariaDB, the ReplicaLag metric reports the value of the Seconds_Behind_Master field of the SHOW SLAVE STATUS command. Common causes for replication lag for MySQL and MariaDB are the following:

A network outage.
Writing to tables with indexes on a read replica. If the read_only parameter is not set to 0 on the read replica, it can break replication.
Using a nontransactional storage engine such as MyISAM. Replication is only supported for the InnoDB storage engine on MySQL and the XtraDB storage engine on MariaDB.

When the ReplicaLag metric reaches 0, the replica has caught up to the primary DB instance. If the ReplicaLag metric returns -1, then replication is currently not active. ReplicaLag = -1 is equivalent to Seconds_Behind_Master = NULL.

Origin Access Identity

To restrict access to content that you serve from Amazon S3 buckets, follow these steps:
1. Create a special CloudFront user called an origin access identity (OAI) and associate it with your distribution.
2. Configure your S3 bucket permissions so that CloudFront can use the OAI to access the files in your bucket and serve them to your users. Make sure that users can’t use a direct URL to the S3 bucket to access a file there.
After you take these steps, users can only access your files through CloudFront, not directly from the S3 bucket. Requiring CloudFront URLs isn't necessary, but it’s recommended to prevent users from bypassing the restrictions that you specify in signed URLs or signed cookies.

AWS Direct Connect

AWS Direct Connect links your internal network to an AWS Direct Connect location over a standard Ethernet fiber-optic cable. One end of the cable is connected to your router, the other to an AWS Direct Connect router. With this connection, you can create virtual interfaces directly to public AWS services (for example, to Amazon S3) or to Amazon VPC, bypassing internet service providers in your network path.

Direct Connect gateways

Direct Connect allowed AWS users to connect their AWS environment to AWS. However connecting from a single Direct Connect location to multiple AWS VPCs wasn't so straight forward. AWS Direct Connect gateway is aimed at making it easier to connect from a single Direct Connect location to multiple AWS regions or VPCs.

HPC

AWS ParallelCluster is an AWS supported open source cluster management tool that helps you to deploy and manage high performance computing (HPC) clusters in the AWS Cloud. Built on the open source CfnCluster project, AWS ParallelCluster enables you to quickly build an HPC compute environment in AWS. It automatically sets up the required compute resources and shared filesystem. You can use AWS ParallelCluster with batch schedulers, such as AWS Batch and Slurm. AWS ParallelCluster facilitates quick start proof of concept deployments and production deployments. You can also build higher level workflows, such as a genomics portal that automates an entire DNA sequencing workflow, on top of AWS ParallelCluster.

ENA/ENI/EFA

ENI → An elastic network interface is a logical networking component in a VPC that represents a virtual network card.

ENA → ENA is a custom network interface optimized to deliver high throughput and packet per second (PPS) performance, and consistently low latencies on EC2 instances. Using ENA, customers can utilize up to 20 Gbps of network bandwidth on certain EC2 instance types.

EFA → Elastic Fabric Adapter (EFA) is a network interface for Amazon EC2 instances that enables customers to run applications requiring high levels of inter-node communications at scale on AWS. Its custom-built operating system (OS) bypass hardware interface enhances the performance of inter-instance communications, which is critical to scaling these applications. With EFA, High Performance Computing (HPC) applications using the Message Passing Interface (MPI) and Machine Learning (ML) applications using NVIDIA Collective Communications Library (NCCL) can scale to thousands of CPUs or GPUs. As a result, you get the application performance of on-premises HPC clusters with the on-demand elasticity and flexibility of the AWS cloud.

CloudTrail Log file integrity

To determine whether a log file was modified, deleted, or unchanged after CloudTrail delivered it, you can use CloudTrail log file integrity validation. This feature is built using industry standard algorithms: SHA-256 for hashing and SHA-256 with RSA for digital signing.

Glacier VS Deep Archive Retrievals

EC2 instance saving plans

Savings Plans are a flexible pricing model that offer low prices on EC2, Lambda, and Fargate usage, in exchange for a commitment to a consistent amount of usage (measured in $/hour) for a 1 or 3 year term. When you sign up for a Savings Plan, you will be charged the discounted Savings Plans price for your usage up to your commitment. AWS offers two types of Savings Plans.

EC2 Instance Savings Plans provide the lowest prices, offering savings up to 72% in exchange for commitment to usage of individual instance families in a region (e.g. M5 usage in N. Virginia). This automatically reduces your cost on the selected instance family in that region regardless of AZ, size, OS or tenancy. EC2 Instance Savings Plans give you the flexibility to change your usage between instances within a family in that region. For example, you can move from c5.xlarge running Windows to c5.2xlarge running Linux and automatically benefit from the Savings Plan prices.

Conclusion

Hope that this post help to understand better those unusual topics. In my opinion are interesting not only to pass the certification, also to learn more AWS.
You as a Solutions Architect must master not only the most used concepts but also have the ability to enhance the details and provide the quality that differentiates you from the rest.

DEV Community