Build a Serverless Twitter Dashboard using DynamoDB, APIGW, and Highcharts

Victor GRENU (z0ph) ・3 min read

TL;DR

Highcharts -> APIGW -> DynamoDB + Lambda function 🎉 cockpit.zoph.io

Introduction

Lately, I have been working on a new, API-based version of my Twitter Cockpit. In the previous version, Highcharts loaded its data from flat CSV files.

[Image: Cockpit]

The goal of this cockpit is to retrieve and store an unlimited history for specific Twitter accounts, based on a Twitter List. This means you control, from Twitter itself, which accounts have their follower/following history graphed. Twitter Analytics, by contrast, only compares followers (not following) to the previous 28-day period, and its graph is in fact a non-clickable thumbnail image: a very poor experience for free users.

[Image: Twitter Analytics thumbnail]

For me, it was an opportunity to work on DynamoDB data modeling and to craft an API Gateway from scratch. 🧑‍🏭

Technical Stack

  1. CloudFront + S3 Origin + ACM Certificate
  2. API Gateway
  3. DynamoDB Tables
  4. IAM Roles
  5. GitHub Actions as CI/CD Pipeline

Architecture Schema

DynamoDB Tables

You will need to prepare your data model carefully here, and clearly understand how DynamoDB works under the hood.

For my use case, I decided to use two DynamoDB tables: one for user statistics, and another for the user list, used later to populate the <select> in the web app.

Stats table

The data model for this table is pretty simple. I'm using the following key attributes:

  1. scan_date as the hash key
  2. screen_name as the range (sort) key

As scan_date alone is not unique, I needed to add a range key as well.

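Beware of reserved words in DynamoDB: an attribute name that collides with one must be aliased through ExpressionAttributeNames whenever it appears in an expression.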
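
A small illustration of why this matters: when an attribute name is reserved, expressions have to refer to it through a `#placeholder` declared in `ExpressionAttributeNames`. The helper below is a hypothetical sketch, not code from the cockpit; its `RESERVED_SAMPLE` set holds only a handful of entries from DynamoDB's much longer reserved-word list, and values are assumed to be strings.

```python
# A few entries from DynamoDB's reserved-word list (NOT exhaustive).
RESERVED_SAMPLE = {"DATE", "NAME", "STATUS", "TIMESTAMP", "USER"}


def key_condition(attr, value):
    """Build Query parameters for `attr = value`, aliasing reserved names.

    Hypothetical helper for illustration; values are assumed to be strings.
    """
    params = {"ExpressionAttributeValues": {":v1": {"S": value}}}
    if attr.upper() in RESERVED_SAMPLE:
        # Reserved names must go through an ExpressionAttributeNames alias.
        params["KeyConditionExpression"] = "#attr = :v1"
        params["ExpressionAttributeNames"] = {"#attr": attr}
    else:
        params["KeyConditionExpression"] = f"{attr} = :v1"
    return params
```

Names such as scan_date and screen_name are not in this sample, so the helper uses them directly, which matches how the mapping template in this post writes its key condition.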

Now, I need to query the table by user, because my graph is on a per-user basis. To do so, I've used a Global Secondary Index (GSI):

  • screen_name as the hash key
  • scan_date as the sort key, as my data needs to be ordered by date for Highcharts

The advantage of the index is that I only need to specify one attribute when querying it: its hash key, which in my case is screen_name. No scan_date is required, and the results come back ordered by scan_date.
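
As a rough sketch of the key layout described above, the table and its GSI could be declared with parameters like the following, written here as the dictionary one would pass to boto3's `create_table`. The table and index names are the ones used in the mapping template of this post, and the string (`S`) key types match the response template. The billing mode and the `ALL` projection are assumptions, not necessarily what the cockpit uses, and `key_names` is a made-up helper for reading the schema back.

```python
# Sketch of the Stats table definition (billing mode and projection are assumptions).
STATS_TABLE = {
    "TableName": "twitter-cockpit-sandbox",
    "AttributeDefinitions": [
        {"AttributeName": "scan_date", "AttributeType": "S"},
        {"AttributeName": "screen_name", "AttributeType": "S"},
    ],
    # Base table: scan_date (hash) + screen_name (range) make each item unique.
    "KeySchema": [
        {"AttributeName": "scan_date", "KeyType": "HASH"},
        {"AttributeName": "screen_name", "KeyType": "RANGE"},
    ],
    # GSI: same attributes with the roles swapped, so one user's history
    # can be queried by screen_name and comes back ordered by scan_date.
    "GlobalSecondaryIndexes": [
        {
            "IndexName": "screen_name-index",
            "KeySchema": [
                {"AttributeName": "screen_name", "KeyType": "HASH"},
                {"AttributeName": "scan_date", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},  # assumption
        }
    ],
    "BillingMode": "PAY_PER_REQUEST",  # assumption
}


def key_names(key_schema):
    """Return (hash_key, range_key) attribute names from a KeySchema list."""
    by_type = {k["KeyType"]: k["AttributeName"] for k in key_schema}
    return by_type["HASH"], by_type.get("RANGE")
```

With boto3 installed and credentials configured, `boto3.client("dynamodb").create_table(**STATS_TABLE)` would create the table; the dictionary itself needs no AWS access to inspect.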

Users table

The goal of this table is to store, and later expose, the list of users currently crawled by the Lambda function (taken from the Twitter List).

The data model for this table is pretty straightforward: screen_name only, as this attribute will be unique. That's all.

At this stage, for the users table, I'm using a DynamoDB Scan to retrieve the full list of users.
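
One thing worth keeping in mind with Scan: a single call returns at most 1 MB of data, so a full listing may need several pages. Below is a minimal sketch of that loop. The `scan` argument is any callable shaped like boto3's `client.scan` (keyword arguments in, response dictionary out), and the `twitter-cockpit-users` table name is a placeholder, since the real name is not given here.

```python
def list_screen_names(scan, table_name="twitter-cockpit-users"):
    """Collect every screen_name from the users table, following pagination.

    `scan` is expected to behave like boto3's DynamoDB client.scan;
    the default table name is a placeholder, not the real one.
    """
    names = []
    kwargs = {"TableName": table_name}
    while True:
        page = scan(**kwargs)
        names.extend(item["screen_name"]["S"] for item in page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return names
        # Resume the scan where the previous page stopped.
        kwargs["ExclusiveStartKey"] = last_key
```

With boto3 this would be called as `list_screen_names(boto3.client("dynamodb").scan)`. Passing the callable in also makes the loop easy to exercise with a fake in tests.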

API Gateway

To get data from DynamoDB, we need to create a GET method on the API, and apply some transformations using Apache's Velocity Template Language (VTL). Warning: even though the API method is a GET, the HTTP method of the AWS Service integration that talks to DynamoDB must be POST.

Integration Request - Mapping Template

In this mapping, we pass the input parameter to the KeyConditionExpression, and we query the index (GSI).

{
    "TableName": "twitter-cockpit-sandbox",
    "IndexName": "screen_name-index",
    "KeyConditionExpression": "screen_name = :v1",
    "ExpressionAttributeValues": {
        ":v1": {
            "S": "$input.params('user')"
        }
    }
}
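
To make the template less abstract, here is a sketch in Python of the request body it renders for a given `user` parameter, which API Gateway then POSTs to DynamoDB's Query action. The function name is made up for illustration; the table, index, and expression are copied from the template above.

```python
def render_query_request(user):
    """Return the DynamoDB Query body the mapping template produces for `user`."""
    return {
        "TableName": "twitter-cockpit-sandbox",
        "IndexName": "screen_name-index",
        # Only the GSI hash key is constrained; results come back
        # ordered by the GSI sort key (scan_date).
        "KeyConditionExpression": "screen_name = :v1",
        "ExpressionAttributeValues": {":v1": {"S": user}},
    }
```

The same dictionary can be passed to boto3 as `client.query(**render_query_request("some_user"))`, which is a convenient way to try the index outside API Gateway.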

Integration Response

DynamoDB returns each item in its own typed JSON format. Using the Integration Response mapping template, I reformat every item so that the API's output fits the requirements of Highcharts.

#set($inputRoot = $input.path('$'))
{
    "cockpit": [
        #foreach($elem in $inputRoot.Items) {
            "scan_date": "$elem.scan_date.S",
            "screen_name": "$elem.screen_name.S",
            "followers": "$elem.followers.N",
            "following": "$elem.following.N"
        }#if($foreach.hasNext),#end
    #end
    ]
}
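
For readers less familiar with VTL, the same transformation can be sketched in Python: unwrap DynamoDB's typed attributes (`S`, `N`) into a flat list under `cockpit`. The second helper is an assumption about the front end, not something shown in this post: it turns that list into `[label, value]` pairs with numeric values, one plausible shape for a chart series.

```python
def flatten_items(query_response):
    """Mirror the VTL template: typed DynamoDB items -> flat dicts of strings."""
    return {
        "cockpit": [
            {
                "scan_date": item["scan_date"]["S"],
                "screen_name": item["screen_name"]["S"],
                # DynamoDB sends numbers as strings; the template keeps them so.
                "followers": item["followers"]["N"],
                "following": item["following"]["N"],
            }
            for item in query_response.get("Items", [])
        ]
    }


def to_series(flat, field):
    """Hypothetical front-end step: [scan_date, int(value)] pairs for one field."""
    return [[row["scan_date"], int(row[field])] for row in flat["cockpit"]]
```

Note that the template quotes the `N` values, so the counts reach the browser as strings; a conversion like the `int(...)` call in `to_series` has to happen somewhere before plotting.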

FrontEnd

This was the hardest part for me, as I'm more of a backend / infrastructure guy...

I'm only using some HTML and JavaScript (jQuery) + CSS.

Conclusion

The integration between API Gateway and DynamoDB is pretty cool and easy to work with; you don't need a proxy Lambda function for every use case. This architecture is scalable, managed, and cost-efficient.

The source code for this workload is not yet open-sourced, as some elements are not yet fully automated. Please tell me if you are interested in getting early access.

That's all folks! Don't hesitate to Ask Me Anything (AMA) in the comments or on Twitter.

zoph.
