Intro
Due to the lockdown in place because of COVID-19, I found myself with more spare time than usual, so I decided to put it to better use. I'd built plenty of Lambdas over the years but I'd never built a fully serverless solution...so whatever I was building was going to be serverless.
I was browsing Github for COVID-19 datasets and came across a repository that contained data for the UK and was updated daily...then I had a lightbulb moment!
I'd seen a ton of amazing dashboards tracking cases all over the globe but my angle was different. I'd ingest the UK data from Github and somehow send out an email showing the increase in confirmed cases, tests and deaths.
I put my AWS cap on and came up with a basic solution...
- A Lambda that ingests a CSV document from Github and stores it in S3, triggered by a CloudWatch event
- A Lambda that reads the CSV doc from S3 and stores the latest data in DynamoDB, utilising DynamoDB as a materialised view
- Another Lambda which is triggered by a DynamoDB Stream and calculates the difference between the old data and new data then sends out an email notification advising of the changes
Then I put the wheels in motion...
First steps - Installing AWS SAM and Creating our Project
We are going to be using the AWS SAM framework to build the pipeline. It's pretty straightforward to get installed and the instructions can be found here.
With SAM installed, we can create a new project by running sam init and following the CLI prompts.
▶ sam init
Which template source would you like to use?
1 - AWS Quick Start Templates
2 - Custom Template Location
Choice: 1
Which runtime would you like to use?
1 - nodejs12.x
2 - python3.8
3 - ruby2.7
4 - go1.x
5 - java11
6 - dotnetcore3.1
7 - nodejs10.x
8 - python3.7
9 - python3.6
10 - python2.7
11 - ruby2.5
12 - java8
13 - dotnetcore2.1
14 - dotnetcore2.0
15 - dotnetcore1.0
Runtime: 8
Project name [sam-app]: Serverless-Covid-19-Pipeline
Cloning app templates from https://github.com/awslabs/aws-sam-cli-app-templates.git
AWS quick start application templates:
1 - Hello World Example
2 - EventBridge Hello World
3 - EventBridge App from scratch (100+ Event Schemas)
Template selection: 1
-----------------------
Generating application:
-----------------------
Name: Serverless-Covid-19-Pipeline
Runtime: python3.7
Application Template: hello-world
Output Directory: .
Next steps can be found in the README file at ./Covid-19-Serverless-Pipeline/README.md
I chose to write the Lambdas in Python because I'm a big fan of the language and it's something I want to use more in my projects, even though I'm still learning it.
With the project created we should have a folder structure like the following...
Serverless-Covid-19-Pipeline/
├── README.md
├── events/
│   └── event.json
├── HelloWorld/
│   ├── __init__.py
│   ├── app.py
│   └── requirements.txt
├── template.yaml
├── .gitignore
└── tests/
We can trigger a build by running the sam build command.
▶ sam build
Building resource 'HelloWorldFunction'
Running PythonPipBuilder:ResolveDependencies
Running PythonPipBuilder:CopySource
Build Succeeded
Built Artifacts : .aws-sam/build
Built Template : .aws-sam/build/template.yaml
Commands you can use next
=========================
[*] Invoke Function: sam local invoke
[*] Deploy: sam deploy --guided
As you can see, our first build succeeded. In order to test the function we need Docker installed on our machine; we'll do this later when we have our first Lambda function written.
As this project is going to contain multiple Lambdas, we have to create the directories for the two other Lambdas, add the relevant files and rename the existing Lambda folder.
Serverless-Covid-19-Pipeline/
├── README.md
├── events/
│   └── event.json
├── data_ingestion_function/
│   ├── __init__.py
│   ├── app.py
│   └── requirements.txt
├── s3_processing_function/
│   ├── __init__.py
│   ├── app.py
│   └── requirements.txt
├── ddb_stream_notification/
│   ├── __init__.py
│   ├── app.py
│   └── requirements.txt
├── template.yaml
├── .gitignore
└── tests/
I'm not the best person when it comes to naming things. The data_ingestion_function will contain the Lambda which downloads the data from Github and saves it to S3, the s3_processing_function will contain the Lambda which downloads the data from S3 and saves it to DynamoDB, and lastly the ddb_stream_notification will contain the Lambda which receives the DynamoDB stream and sends the message to SNS.
First Function - Data Ingestion
The first Lambda function we have to build is responsible for downloading the data from the Github repository and saving it to S3. This Lambda is going to be triggered by a CloudWatch Scheduled Event, so it can be invoked at the same time every day.
We're using the requests and boto3 libraries in this example, so make sure they are included in the requirements.txt file and install them.
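For reference, the requirements.txt for this function only needs to list the two packages (I've left versions unpinned here, pin them if you prefer):

requests
boto3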
The first thing we want to do is open the app.py file and replace the boilerplate code with our function code.
When the function is invoked it downloads the raw data from Github, writes the data to a CSV file in the tmp folder and then uploads this file to the specified S3 Bucket.
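The full gist isn't reproduced here, but a minimal sketch of what the app.py could look like is below. It assumes the DATA_URL and AWS_DESTINATION_BUCKET environment variables we'll add to the template shortly, and an object key of covid-19-totals-uk.csv, which is my own choice for illustration:

import logging
import os

import boto3
import requests

logger = logging.getLogger()
logger.setLevel(logging.INFO)

s3 = boto3.client("s3")

# Environment variables defined in template.yaml (assumed names)
DATA_URL = os.environ["DATA_URL"]
DESTINATION_BUCKET = os.environ["AWS_DESTINATION_BUCKET"]
FILE_PATH = "/tmp/covid-19-totals-uk.csv"


def lambda_handler(event, context):
    # Download the raw CSV data from Github
    logger.info("Requesting data from Github")
    response = requests.get(DATA_URL)
    response.raise_for_status()
    logger.info("Data downloaded successfully")

    # Write the data to a CSV file in the tmp folder
    with open(FILE_PATH, "w") as csv_file:
        csv_file.write(response.text)

    # Upload the file to the destination S3 Bucket (key name is an assumption)
    s3.upload_file(FILE_PATH, DESTINATION_BUCKET, "covid-19-totals-uk.csv")
    logger.info("Data uploaded to S3")

    return {"message": "Lambda completed"}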
One of the amazing features of the AWS SAM Framework is the ability to test Lambda functions locally before deploying them to AWS.
In order to test our function, we need a mock scheduled event. We can create one by running sam local generate-event cloudwatch scheduled-event. This will generate a JSON snippet which we can modify as necessary and paste into a new JSON file named scheduled-event.json within the events folder of the parent directory.
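The generated snippet is a standard EventBridge scheduled event and looks roughly like the following (the id, account, time and rule ARN below are placeholder values):

{
  "version": "0",
  "id": "53dc4d37-cffa-4f76-80c9-8b7d4a4d2eaa",
  "detail-type": "Scheduled Event",
  "source": "aws.events",
  "account": "123456789012",
  "time": "2020-04-14T14:45:00Z",
  "region": "eu-west-2",
  "resources": ["arn:aws:events:eu-west-2:123456789012:rule/DataIngestionScheduledRule"],
  "detail": {}
}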
But, we have a little problem.
Creating an S3 Bucket with CloudFormation
When we test this Lambda it will try to upload the file to our S3 Bucket...which we don't have yet. We need to modify the template.yaml and deploy our CloudFormation Stack to create the S3 Bucket.
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: >
Serverless-Covid-19-Pipeline
SAM Template for Serverless-Covid-19-Pipeline
Parameters:
S3Bucket:
Type: String
Default: danieljameskay-data-ingestion-bucket
Resources:
DataIngestionBucket:
Type: AWS::S3::Bucket
Properties:
BucketName: !Ref S3Bucket
To deploy our CloudFormation Stack, we need to run sam deploy --guided. This will allow us to save deployment values which will be reused for future deployments.
$ sam build && sam deploy --guided
Build Succeeded
Built Artifacts : .aws-sam/build
Built Template : .aws-sam/build/template.yaml
Commands you can use next
=========================
[*] Invoke Function: sam local invoke
[*] Deploy: sam deploy --guided
Configuring SAM deploy
======================
Looking for samconfig.toml : Not found
Setting default arguments for 'sam deploy'
=========================================
Stack Name [sam-app]: serverless-covid-19-pipeline
AWS Region [us-east-1]: eu-west-2
#Shows you resources changes to be deployed and require a 'Y' to initiate deploy
Confirm changes before deploy [y/N]: y
#SAM needs permission to be able to create roles to connect to the resources in your template
Allow SAM CLI IAM role creation [Y/n]: y
Save arguments to samconfig.toml [Y/n]: y
Looking for resources needed for deployment: Found!
Managed S3 bucket: aws-sam-cli-managed-default-samclisourcebucket-1nw0fx8sanitw
A different default S3 bucket can be set in samconfig.toml
Saved arguments to config file
Running 'sam deploy' for future deployments will use the parameters saved above.
The above parameters can be changed by modifying samconfig.toml
Learn more about samconfig.toml syntax at
https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-cli-config.html
Deploying with following values
===============================
Stack name : serverless-covid-19-pipeline
Region : eu-west-2
Confirm changeset : True
Deployment s3 bucket : aws-sam-cli-managed-default-samclisourcebucket-1nw0fx8sanitw
Capabilities : ["CAPABILITY_IAM"]
Parameter overrides : {}
Initiating deployment
=====================
Waiting for changeset to be created..
CloudFormation stack changeset
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Operation LogicalResourceId ResourceType
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Add DataIngestionBucket AWS::S3::Bucket
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Changeset created successfully. arn:aws:cloudformation:eu-west-2:983527076849:changeSet/samcli-deploy1586858125/37a6647c-ef63-4175-8b5c-dd28a8bd201c
Previewing CloudFormation changeset before deployment
======================================================
Deploy this changeset? [y/N]: y
2020-04-14 10:56:15 - Waiting for stack create/update to complete
CloudFormation events from changeset
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
ResourceStatus ResourceType LogicalResourceId ResourceStatusReason
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
CREATE_IN_PROGRESS AWS::S3::Bucket DataIngestionBucket -
CREATE_IN_PROGRESS AWS::S3::Bucket DataIngestionBucket Resource creation Initiated
CREATE_COMPLETE AWS::S3::Bucket DataIngestionBucket -
CREATE_COMPLETE AWS::CloudFormation::Stack serverless-covid-19-pipeline -
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Successfully created/updated stack - serverless-covid-19-pipeline in eu-west-2
So now if we head over to the AWS Console and open the S3 Explorer we should see the newly created Bucket. Good job!
Updating the Lambda configuration in the CloudFormation template
We need to update the CloudFormation template with our Data Ingestion function configuration.
Resources:
DataIngestionFunction:
Type: AWS::Serverless::Function
Properties:
FunctionName: DataIngestionFunction
CodeUri: data_ingestion_function/
Handler: app.lambda_handler
Runtime: python3.7
Policies:
- AmazonS3FullAccess
- AWSLambdaBasicExecutionRole
- AWSXrayWriteOnlyAccess
Tracing: Active
Environment:
Variables:
AWS_DESTINATION_BUCKET: !Ref S3Bucket
DATA_URL: https://raw.githubusercontent.com/tomwhite/covid-19-uk-data/master/data/covid-19-totals-uk.csv
We have assigned some basic policies to the Lambda function and specified the Bucket name and URL of the Github data as environment variables.
Now we should be able to build and test the function...
$ sam build && sam local invoke "DataIngestionFunction" -e events/scheduled-event.json
Building resource 'DataIngestionFunction'
Running PythonPipBuilder:ResolveDependencies
Running PythonPipBuilder:CopySource
Build Succeeded
Built Artifacts : .aws-sam/build
Built Template : .aws-sam/build/template.yaml
Commands you can use next
=========================
[*] Invoke Function: sam local invoke
[*] Deploy: sam deploy --guided
Invoking app.lambda_handler (python3.7)
Fetching lambci/lambda:python3.7 Docker container image......
Mounting /Users/daniel.kay/dev/Serverless-Covid-19-Pipeline/Serverless-Covid-19-Pipeline/.aws-sam/build/DataIngestionFunction as /var/task:ro,delegated inside runtime container
[INFO] 2020-04-14T13:28:13.880Z Found credentials in environment variables.
START RequestId: 46f1780b-7e04-124f-f233-6ce13fb27a11 Version: $LATEST
[INFO] 2020-04-14T13:28:14.452Z 46f1780b-7e04-124f-f233-6ce13fb27a11 Requesting data from Github
[INFO] 2020-04-14T13:28:14.946Z 46f1780b-7e04-124f-f233-6ce13fb27a11 Data downloaded successfully
[INFO] 2020-04-14T13:28:15.758Z 46f1780b-7e04-124f-f233-6ce13fb27a11 Data uploaded to S3
END RequestId: 46f1780b-7e04-124f-f233-6ce13fb27a11
REPORT RequestId: 46f1780b-7e04-124f-f233-6ce13fb27a11 Init Duration: 1513.60 ms Duration: 1310.72 ms Billed Duration: 1400 ms Memory Size: 128 MB Max Memory Used: 43 MB
{"message":"Lambda completed"}
Looking good! We can see from the log lines that the data was downloaded and uploaded to S3. If we navigate to the S3 Explorer and open up the Bucket we should see the CSV file.
We can now download the file and verify the contents. The file should contain data up until the previous day.
Configuring a Scheduled Event
So our Data Ingestion Lambda works perfectly. It downloads the data and uploads it to S3. But we don't want to invoke the function manually; we want it invoked at a particular time of the day. We can utilize a CloudWatch Scheduled Event to invoke the function for us.
We have to update our CloudFormation Template to include the event rule and the permission for EventBridge to invoke the Lambda function.
DataIngestionScheduledRule:
Type: AWS::Events::Rule
Properties:
Name: DataIngestionScheduledRule
Description: Triggers the Data Ingestion lambda once per day
ScheduleExpression: cron(45 14 * * ? *)
State: ENABLED
Targets:
-
Arn: !GetAtt DataIngestionFunction.Arn
Id: DataIngestionScheduledRule
DataIngestionInvokePermission:
Type: AWS::Lambda::Permission
Properties:
FunctionName: !Ref DataIngestionFunction
Action: lambda:InvokeFunction
Principal: events.amazonaws.com
SourceArn: !GetAtt DataIngestionScheduledRule.Arn
We are setting the schedule expression to a cron expression which will invoke the function daily at 14:45 UTC, which is 15:45 UK time thanks to daylight saving 😄. The target is the Data Ingestion function, as this is the function we want invoked.
Right, it's ready to be built and deployed...
$ sam build && sam deploy
Building resource 'DataIngestionFunction'
Running PythonPipBuilder:ResolveDependencies
Running PythonPipBuilder:CopySource
Build Succeeded
Built Artifacts : .aws-sam/build
Built Template : .aws-sam/build/template.yaml
Commands you can use next
=========================
[*] Invoke Function: sam local invoke
[*] Deploy: sam deploy --guided
Deploying with following values
===============================
Stack name : serverless-covid-19-pipeline
Region : eu-west-2
Confirm changeset : True
Deployment s3 bucket : aws-sam-cli-managed-default-samclisourcebucket-1nw0fx8sanitw
Capabilities : ["CAPABILITY_IAM"]
Parameter overrides : {}
Initiating deployment
=====================
Uploading to serverless-covid-19-pipeline/175b90da9d229e544ad442eeb0c9e280.template 1711 / 1711.0 (100.00%)
Waiting for changeset to be created..
CloudFormation stack changeset
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Operation LogicalResourceId ResourceType
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ Add DataIngestionInvokePermission AWS::Lambda::Permission
+ Add DataIngestionScheduledRule AWS::Events::Rule
* Modify DataIngestionFunctionRole AWS::IAM::Role
* Modify DataIngestionFunction AWS::Lambda::Function
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Changeset created successfully. arn:aws:cloudformation:eu-west-2:983527076849:changeSet/samcli-deploy1586873775/cd0112a5-5534-4d7e-95b7-a64503823b64
Previewing CloudFormation changeset before deployment
======================================================
Deploy this changeset? [y/N]: y
2020-04-14 15:16:25 - Waiting for stack create/update to complete
CloudFormation events from changeset
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
ResourceStatus ResourceType LogicalResourceId ResourceStatusReason
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
UPDATE_IN_PROGRESS AWS::Lambda::Function DataIngestionFunction Requested update requires the creation of
a new physical resource; hence creating
one.
UPDATE_COMPLETE AWS::Lambda::Function DataIngestionFunction -
UPDATE_IN_PROGRESS AWS::Lambda::Function DataIngestionFunction Resource creation Initiated
CREATE_IN_PROGRESS AWS::Events::Rule DataIngestionScheduledRule -
CREATE_IN_PROGRESS AWS::Events::Rule DataIngestionScheduledRule Resource creation Initiated
CREATE_COMPLETE AWS::Events::Rule DataIngestionScheduledRule -
CREATE_IN_PROGRESS AWS::Lambda::Permission DataIngestionInvokePermission Resource creation Initiated
CREATE_IN_PROGRESS AWS::Lambda::Permission DataIngestionInvokePermission -
CREATE_COMPLETE AWS::Lambda::Permission DataIngestionInvokePermission -
UPDATE_COMPLETE_CLEANUP_IN_PROGRESS AWS::CloudFormation::Stack serverless-covid-19-pipeline -
UPDATE_COMPLETE AWS::CloudFormation::Stack serverless-covid-19-pipeline -
DELETE_COMPLETE AWS::Lambda::Function DataIngestionFunction -
DELETE_IN_PROGRESS AWS::Lambda::Function DataIngestionFunction -
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Successfully created/updated stack - serverless-covid-19-pipeline in eu-west-2
We can see from the above that the invoke permissions and the scheduled rule have been created successfully. To validate this, we can head into CloudWatch Console and check out the Rules within the Events section.
If we navigate to the Lambda Console in AWS and open our Lambda from within the Functions section, we should see that the CloudWatch Event is now set up as a trigger for this Lambda.
So how do we know if this function was in fact triggered at 1545?
We can do this in a few different ways:
- We can check the timestamp on the CSV file that was created when we were testing, as it will have a new last modified date and time (see the sketch after this list)
- If the file wasn't there to begin with, there should now be a CSV file in the bucket
- We can check the CloudWatch logs
- CloudWatch ServiceLens (my favorite)
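For the first check, a few lines of boto3 can pull the object's last modified timestamp. The Bucket name comes from our template parameter and the object key is the one assumed earlier in this post:

import boto3

s3 = boto3.client("s3")

# Fetch the object's metadata without downloading the file itself
response = s3.head_object(
    Bucket="danieljameskay-data-ingestion-bucket",
    Key="covid-19-totals-uk.csv",
)
print(response["LastModified"])  # Should be shortly after 14:45 UTC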
If we check the CloudWatch logs, they will look very similar to the log lines we saw when we were testing earlier.
We can see the data was downloaded and uploaded to S3 without any issues.
CloudWatch ServiceLens gives us complete observability of our applications and services by bringing logs, metrics, tracing and alarms together in one friendly dashboard.
I'm planning on writing another post about ServiceLens at a later date.
Second Function - Data Processing
We are now going to start working on the second Lambda function. This function will be responsible for listening to S3 events from the Bucket we created earlier, using the event to download the CSV file and insert the data for the most recent date into DynamoDB.
As we've covered how we interact with the AWS SAM CLI, these steps won't be as heavily detailed as they were in previous sections.
First things first, we need a DynamoDB table. We can update our CloudFormation template to include a DynamoDB resource type with some basic attributes.
Whilst we are doing this we can also add our Lambda configuration.
DynamoDBTable:
Type: AWS::DynamoDB::Table
Properties:
TableName: danieljameskay-covid19-data
KeySchema:
-
AttributeName: country
KeyType: HASH
AttributeDefinitions:
-
AttributeName: country
AttributeType: S
ProvisionedThroughput:
ReadCapacityUnits: 5
WriteCapacityUnits: 5
StreamSpecification:
StreamViewType: NEW_AND_OLD_IMAGES
S3ProcessingFunction:
Type: AWS::Serverless::Function
Properties:
FunctionName: S3ProcessingFunction
CodeUri: s3_processing_function/
Handler: app.lambda_handler
Runtime: python3.7
Tracing: Active
Environment:
Variables:
AWS_DOWNLOAD_BUCKET: !Ref S3Bucket
DDB_TABLE: danieljameskay-covid19-data
Policies:
- AWSLambdaBasicExecutionRole
- AWSXrayWriteOnlyAccess
- AmazonDynamoDBFullAccess
- AmazonS3FullAccess
Events:
s3Notification:
Type: S3
Properties:
Bucket: !Ref DataIngestionBucket
Events: s3:ObjectCreated:*
We are specifying country as the key of our table. Its type is HASH, which is also known as the partition key. We're also specifying a StreamSpecification of type NEW_AND_OLD_IMAGES, which means that when the values for the country are updated, DynamoDB will stream both the new and old values to any listeners...in our case, our Notification Lambda.
We should now be able to run sam build && sam deploy to deploy our DynamoDB Table.
Now the code...
When the function is invoked it downloads the CSV file from S3 and stores it in the tmp directory, processes it, and inserts the most recent record into DynamoDB along with the date that the data corresponds to.
There's probably a nicer way to write some of this logic but as I'm still learning Python...it does the job 👨💻
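The full gist isn't embedded here, but a rough sketch of the logic is below. The CSV column names (Date, Tests, ConfirmedCases, Deaths) and the date reformatting are assumptions based on the dataset and the item we'll see in DynamoDB shortly:

import csv
import logging
import os
from datetime import datetime

import boto3

logger = logging.getLogger()
logger.setLevel(logging.INFO)

s3 = boto3.client("s3")
dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table(os.environ["DDB_TABLE"])

FILE_PATH = "/tmp/covid-19-totals-uk.csv"


def lambda_handler(event, context):
    # Pull the Bucket and key out of the S3 event that triggered us
    s3_record = event["Records"][0]["s3"]
    bucket = s3_record["bucket"]["name"]
    key = s3_record["object"]["key"]

    logger.info("Downloading CSV file from S3")
    s3.download_file(bucket, key, FILE_PATH)
    logger.info("File downloaded to tmp directory")

    # The last row of the CSV holds the most recent day's totals
    # (column names assumed from the covid-19-totals-uk.csv header)
    with open(FILE_PATH) as csv_file:
        rows = list(csv.DictReader(csv_file))
    latest = rows[-1]

    # Reformat the ISO date (YYYY-MM-DD) into the MM-DD-YYYY format stored in the table
    date = datetime.strptime(latest["Date"], "%Y-%m-%d").strftime("%m-%d-%Y")

    logger.info("Inserting record into DynamoDB...")
    table.put_item(
        Item={
            "country": "UK",
            "date": date,
            "confirmed_cases": latest["ConfirmedCases"],
            "tests": latest["Tests"],
            "deaths": latest["Deaths"],
        }
    )
    logger.info("Record added to DynamoDB")

    return {"message": "Lambda completed"}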
As we did earlier, we can test this Lambda by running sam build "S3ProcessingFunction" && sam local invoke "S3ProcessingFunction" -e events/s3-put-event.json. The s3-put-event.json mock event can be generated with sam local generate-event s3 put and tweaked to point at our Bucket and the CSV file's key.
We can confirm the Lambda has worked correctly by checking the log lines...
[INFO] 2020-04-16T10:25:10.372Z bfcc35ee-2908-1a44-afe0-e02816837dbc Downloading CSV file from S3
[INFO] 2020-04-16T10:25:10.663Z bfcc35ee-2908-1a44-afe0-e02816837dbc File downloaded to tmp directory
[INFO] 2020-04-16T10:25:10.669Z bfcc35ee-2908-1a44-afe0-e02816837dbc Inserting record into DynamoDB...
[INFO] 2020-04-16T10:25:10.937Z bfcc35ee-2908-1a44-afe0-e02816837dbc Record added to DynamoDB
and checking DynamoDB via the AWS CLI...
$ aws dynamodb scan --table-name danieljameskay-covid19-data
{
"Count": 1,
"Items": [
{
"date": {
"S": "04-13-2020"
},
"country": {
"S": "UK"
},
"confirmed_cases": {
"S": "88621"
},
"tests": {
"S": "290720"
},
"deaths": {
"S": "11329"
}
}
],
"ScannedCount": 1,
"ConsumedCapacity": null
}
We can see that our test has successfully run and the data has been inserted into DynamoDB.
We should now be able to run sam build && sam deploy to deploy our new Lambda.
Let's see if what we've done so far works 🤔
So, we've tested our two Lambdas locally and one of them in AWS. We can wait until the time we set earlier to see if our Lambdas work, or we can trigger the Data Ingestion Lambda from the AWS CLI...
$ aws lambda invoke --function-name DataIngestionFunction out --log-type None
{
"ExecutedVersion": "$LATEST",
"StatusCode": 200
}
and then we can check DynamoDB.
$ aws dynamodb scan --table-name danieljameskay-covid19-data
{
"Count": 1,
"Items": [
{
"date": {
"S": "04-15-2020"
},
"country": {
"S": "UK"
},
"confirmed_cases": {
"S": "98476"
},
"tests": {
"S": "313769"
},
"deaths": {
"S": "12868"
}
}
],
"ScannedCount": 1,
"ConsumedCapacity": null
}
Looking good! We manually triggered the Data Ingestion Lambda, which in turn triggered the S3 Processing Lambda via the S3 event, which added the new data to the Table.
Third Function - Notification
We have a Lambda which downloads the data and another which processes this data and saves it to DynamoDB.
The last Lambda in this solution listens to a stream from our DynamoDB Table which contains both the old and the new data. Using this, it works out the percentage difference between the two and builds a message which is then sent to an email address via AWS SNS.
Right, CloudFormation additions...
DDBNotificationFunction:
Type: AWS::Serverless::Function
Properties:
CodeUri: ddb_notification_function
FunctionName: 'DDBNotificationFunction'
Handler: app.lambda_handler
Runtime: python3.7
Tracing: Active
Environment:
Variables:
SNS_TOPIC: !Ref SNSTopic
Policies:
- AmazonSNSFullAccess
- AWSLambdaBasicExecutionRole
- AWSXrayWriteOnlyAccess
Events:
Stream:
Type: DynamoDB
Properties:
Stream: !GetAtt DynamoDBTable.StreamArn
StartingPosition: TRIM_HORIZON
SNSTopic:
Type: AWS::SNS::Topic
Properties:
TopicName: "ddb-stream-covid-19-daily-data"
SNSSubscription:
Type: AWS::SNS::Subscription
Properties:
Endpoint: 4i4nbyt@2go-mail.com
Protocol: email
TopicArn: !Ref 'SNSTopic'
Here we have our last Lambda configuration. The only thing to really point out is that we are specifying DynamoDB as the event source type for this Lambda, along with the Stream ARN of our DynamoDB Table.
Lastly, we are creating an SNS Topic and Subscription. Our Lambda will send the message to the SNS Topic and the Subscription will be responsible for sending it to our temporary email address. (Note that the email Subscription has to be confirmed from the inbox before SNS will deliver anything to it.)
Code time people...
We retrieve the old and new records from the event which triggered the Lambda function, work out the percentage difference between the old and new attributes, and finally publish a message with the percentage change to SNS.
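As a hedged sketch rather than the exact original, the handler could look something like the following; the percentage_change helper and the email wording are my own illustration:

import logging
import os

import boto3

logger = logging.getLogger()
logger.setLevel(logging.INFO)

sns = boto3.client("sns")
SNS_TOPIC = os.environ["SNS_TOPIC"]


def percentage_change(old, new):
    # Percentage increase from the old value to the new value
    if old == 0:
        return 0.0
    return round(((new - old) / old) * 100, 2)


def lambda_handler(event, context):
    for record in event["Records"]:
        # We only care about updates to the existing item
        if record["eventName"] != "MODIFY":
            continue

        old_image = record["dynamodb"]["OldImage"]
        new_image = record["dynamodb"]["NewImage"]

        lines = [f"COVID-19 daily update for {new_image['date']['S']}"]
        for field in ("confirmed_cases", "tests", "deaths"):
            old_value = int(old_image[field]["S"])
            new_value = int(new_image[field]["S"])
            change = percentage_change(old_value, new_value)
            lines.append(f"{field}: {old_value} -> {new_value} ({change}% change)")

        logger.info("Publishing message to SNS")
        sns.publish(
            TopicArn=SNS_TOPIC,
            Subject="COVID-19 Daily Update",
            Message="\n".join(lines),
        )

    return {"message": "Lambda completed"}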
We should now be able to run sam build && sam deploy to deploy our new Lambda.
Testing from the Lambda Console
In the previous sections, we tested our functions from the AWS CLI and AWS SAM CLI, but we can also invoke our function from the Lambda Console.
With the Notification Lambda open in the Lambda Console, we can click "Test" followed by "Configure Test Events". Using the "Hello World" template, we can overwrite the JSON payload with one generated by running sam local generate-event dynamodb update and changing the values.
Below is an example of the event that the DynamoDB Notification Lambda receives when the data is changed.
{
"Records": [
{
"eventID": "24df338c43fa112256a16d7d13c6eb14",
"eventName": "MODIFY",
"eventVersion": "1.1",
"eventSource": "aws:dynamodb",
"awsRegion": "eu-west-2",
"dynamodb": {
"ApproximateCreationDateTime": 1587043428,
"Keys": {
"country": {
"S": "UK"
}
},
"NewImage": {
"date": {
"S": "04-13-2020"
},
"country": {
"S": "UK"
},
"confirmed_cases": {
"S": "98621"
},
"tests": {
"S": "291720"
},
"deaths": {
"S": "12329"
}
},
"OldImage": {
"date": {
"S": "04-12-2020"
},
"country": {
"S": "UK"
},
"confirmed_cases": {
"S": "88621"
},
"tests": {
"S": "290720"
},
"deaths": {
"S": "11329"
}
},
"SequenceNumber": "5571000000000004914106821",
"SizeBytes": 125,
"StreamViewType": "NEW_AND_OLD_IMAGES"
},
"eventSourceARN": "arn:aws:dynamodb:eu-west-2:983527076849:table/danieljameskay-covid19-data/stream/2020-04-15T09:00:54.674"
}
]
}
We can then use this saved event and some fake data to trigger our Lambda.
As our Lambda has successfully run we should have received an email. Let's check our inbox...
Oh yeah 😎😎😎
And there we have it, the email with the percentage changes. It looks amazing, doesn't it?
End to end
We haven't seen the solution work end to end as of yet, let's change that!
I modified the time for the DataIngestionScheduledRule to 10 minutes in the future and altered the DynamoDB record so it held the values for the 15th April, so that when the Data Ingestion Lambda ran it would pull down the data for the 16th April.
So I deployed the changes, made a coffee and came back to see the email in my inbox with the difference between the two days.
I also checked ServiceLens and everything had run smoothly with no errors :)
Wrapping Up
Working on this has been extremely rewarding. I know my way around AWS pretty well, but I'd never used CloudFormation before and I'm still learning Python, which added an extra challenge.
We could have used LocalStack and DynamoDB Local to aid with testing, so we wouldn't have to interact with AWS until we deployed.
Things I would have liked to do and may do in the future:
- Utilize AWS Amplify and build a React/GraphQL application that interacts with the full UK dataset as opposed to the most recent dates
- Use some more datasets for USA and Europe
- Use the X-Ray SDK to track AWS SDK calls
- Learn more about CloudFormation and tidy up the template.yaml file
With everything that's been going on with COVID-19, building this solution and writing it up has kept me preoccupied in these uncertain times. Hopefully, there will be another post or two coming in the future.
Big thanks to Tom White for the data used in this post. For anyone interested in his Github profile, it can be found here.
Cover Photo by Dhaya Eddine Bentaleb on Unsplash.
I hope everyone who has read this post has enjoyed it, and if anyone has any questions, drop me a comment or a tweet!
Stay safe!
Cheers
DK