In the previous article (part 1/2) we created an S3 static website as the origin for a CloudFront distribution, then updated the Route 53 configuration to add a new A record pointing to the static website with a failover routing policy.
In this part we will continue deploying serverless services to enable traffic monitoring: an S3 bucket to store the CloudFront access logs, a Lambda function to reduce the log file count, and Athena to query the access log data.
Traffic Monitoring
We will create a new S3 bucket to store the CloudFront access logs and another bucket for Athena query results, then enable CloudFront standard logging. Next, we will create a new database and table in Athena to query and analyze the access logs.
- Go to AWS S3 console
- Create an access log bucket for CloudFront, enable ACLs, then create a log folder "cf_accesslogs"
- Create another bucket for Athena to store query results
- Go to AWS CloudFront console
- Select the distribution then click Edit
- Turn on Standard logging, then select the access log S3 bucket you just created
- Prefix: the logs folder "cf_accesslogs"
- Save changes
- Wait until the CloudFront distribution is deployed
- If the web app is down, access your web app domain and it will fail over to the CloudFront distribution
- or access the CloudFront distribution URL directly
- Either way, this will generate logs inside the S3 logs bucket
- Go to AWS Athena console
- Make sure you are using the same region as the buckets you created
- Under Settings, select the Athena bucket you created to store the query results
- Add the following to the query editor to create an accesslogs database
create database accesslogs;
- Click Run
- Select the + sign to add a new query tab
- Add the following to create a table (replace YourS3LogsBucket/prefix with the bucket name and prefix)
CREATE EXTERNAL TABLE IF NOT EXISTS accesslogs.cloudfront_logs (
`date` DATE,
time STRING,
location STRING,
bytes BIGINT,
request_ip STRING,
method STRING,
host STRING,
uri STRING,
status INT,
referrer STRING,
user_agent STRING,
query_string STRING,
cookie STRING,
result_type STRING,
request_id STRING,
host_header STRING,
request_protocol STRING,
request_bytes BIGINT,
time_taken FLOAT,
xforwarded_for STRING,
ssl_protocol STRING,
ssl_cipher STRING,
response_result_type STRING,
http_version STRING,
fle_status STRING,
fle_encrypted_fields INT,
c_port INT,
time_to_first_byte FLOAT,
x_edge_detailed_result_type STRING,
sc_content_type STRING,
sc_content_len BIGINT,
sc_range_start BIGINT,
sc_range_end BIGINT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LOCATION 's3://YourS3LogsBucket/prefix'
TBLPROPERTIES ( 'skip.header.line.count'='2' );
- Run the query
- Add another query tab to query the data from the table using the following
SELECT * FROM "accesslogs"."cloudfront_logs" LIMIT 30;
- Or select some of the important columns
SELECT "date", time, location, request_ip, user_agent FROM "accesslogs"."cloudfront_logs" ORDER BY "date" DESC, time DESC LIMIT 30;
- Run the query to show the data
- You will see useful data points such as date/time, location, request IP, method, status, and host header
Table fields description: https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/AccessLogs.html#access-logs-choosing-s3-bucket
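If you want to sanity-check a log file outside Athena, the records are plain tab-delimited text and can be parsed with a few lines of Python. A minimal sketch (`parse_cf_log_line` is a hypothetical helper name); the field order mirrors the CREATE TABLE statement above:

```python
# Field order follows the CREATE TABLE statement (33 columns)
CF_LOG_FIELDS = [
    "date", "time", "location", "bytes", "request_ip", "method", "host",
    "uri", "status", "referrer", "user_agent", "query_string", "cookie",
    "result_type", "request_id", "host_header", "request_protocol",
    "request_bytes", "time_taken", "xforwarded_for", "ssl_protocol",
    "ssl_cipher", "response_result_type", "http_version", "fle_status",
    "fle_encrypted_fields", "c_port", "time_to_first_byte",
    "x_edge_detailed_result_type", "sc_content_type", "sc_content_len",
    "sc_range_start", "sc_range_end",
]

def parse_cf_log_line(line: str) -> dict:
    """Split one tab-delimited CloudFront standard log record into a dict.
    Lines starting with '#' are the two header lines that Athena skips."""
    return dict(zip(CF_LOG_FIELDS, line.rstrip("\n").split("\t")))
```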
A Lambda Function to reduce the log files
We will reduce the number of files inside the S3 access logs bucket by creating a Lambda function that keeps only the latest log files generated by CloudFront (30 in the code below, controlled by the xlogs variable). The function will be triggered whenever a new log file is created.
- Go to AWS Lambda console
- Click Create function
- Function name: CF-Latest-logs
- Runtime: Python 3.10
- Architecture: x86_64
- Click Create function
- Replace the lambda function code with the following
import boto3

def lambda_handler(event, context):
    s3 = boto3.client('s3')
    bucket_name = '<YourBucketName>'  # replace <YourBucketName> with the access logs bucket name
    folder_name = '<LogsFolder>'  # replace <LogsFolder> with the folder name "cf_accesslogs"
    xlogs = 30  # number of the latest log files to keep

    # Get a list of the objects in the folder
    # (list_objects_v2 returns at most 1,000 keys per call)
    objects = s3.list_objects_v2(Bucket=bucket_name, Prefix=folder_name).get('Contents', [])

    # Sort the objects by last modified date, with the most recent objects first
    objects.sort(key=lambda x: x['LastModified'], reverse=True)

    # Delete all objects in the folder except the most recent xlogs objects
    for obj in objects[xlogs:]:
        s3.delete_object(Bucket=bucket_name, Key=obj['Key'])

    return {
        'statusCode': 200,
        'body': f'Deleted all objects in the folder except the latest {xlogs}'
    }
- Click Deploy
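The retention logic can be exercised locally, without touching AWS, by factoring the "keep the newest N" selection into a pure helper. A sketch (`keys_to_delete` is a hypothetical name, not part of the deployed function):

```python
def keys_to_delete(objects, keep=30):
    """Given S3 object records (dicts with 'Key' and 'LastModified'),
    return the keys of everything except the `keep` newest objects."""
    newest_first = sorted(objects, key=lambda o: o["LastModified"], reverse=True)
    return [o["Key"] for o in newest_first[keep:]]
```

Feeding it synthetic records makes it easy to verify that exactly the oldest objects are selected for deletion before wiring the logic to a real bucket.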
Add S3 Bucket Permissions
- Go to Configuration tab inside the Lambda function
- Select Permissions from the left-side menu
- Click the Role name to add new permissions
- Click Add permissions then Create inline policy
- Click on JSON tab then replace the content with the following
- Replace "CF-ACCESSLOGS-BUCKET" with the respective value
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:DeleteObject"
            ],
            "Resource": "arn:aws:s3:::CF-ACCESSLOGS-BUCKET/*"
        },
        {
            "Effect": "Allow",
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::CF-ACCESSLOGS-BUCKET"
        }
    ]
}
- Click Review policy
- Enter a Name then click Create policy
Trigger The Lambda Function
- Go to AWS S3 console
- Select the CloudFront logs bucket you have created previously
- Click on Properties
- Click Create event notification under Event notifications
- Event name: cf-latest-logs
- Event types: select All object create events
- Destination: select Lambda function
- Select the lambda function "CF-Latest-logs" from the drop-down menu
- Click on Save changes
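The console steps above amount to attaching a notification configuration to the bucket. A sketch that builds the equivalent payload (the Lambda ARN is a placeholder); note that when you do this through the API instead of the console, you must also grant S3 permission to invoke the function, which the console handles for you:

```python
def build_notification_config(function_arn: str) -> dict:
    """Notification configuration equivalent to the console steps:
    invoke the Lambda function on every object-created event."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "Id": "cf-latest-logs",
                "LambdaFunctionArn": function_arn,
                "Events": ["s3:ObjectCreated:*"],
            }
        ]
    }
```

The resulting dict can be passed to the S3 client's `put_bucket_notification_configuration` call as the `NotificationConfiguration` argument.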
Conclusion
While the web app is under maintenance, requests to the web app domain name will be redirected to the static website through the CloudFront distribution, and CloudFront will save the traffic logs in the access logs bucket. Each new log file triggers the Lambda function, which prunes the bucket down to the latest xlogs log files. Lastly, we can run Athena queries to display the latest logs and analyze the data.