Ogooluwa Akinola

Posted on Mar 25

Building a Serverless Social Media Sentiment Analytics Dashboard on AWS

#aws #serverless #ai #machinelearning

Hey there, fellow AWS explorers! Ever wondered how to turn the chaotic chatter of social media into actionable insights? Today, we're diving headfirst into the world of serverless architecture to build a simple analytics dashboard for social media sentiment data.

In this tutorial, we'll walk through building a complete serverless backend solution for a social media sentiment analytics dashboard. We'll be leveraging AWS Lambda, API Gateway, DynamoDB, Amazon Kinesis, and a few other services to create a scalable, cost-effective system. We will be using AWS CloudFormation templates to define and manage our infrastructure as code, which enables version control, reproducibility, and easier collaboration.

Why Serverless?

Now, you might be asking, "Why go serverless?" Great question! Here's why:

No Servers to Manage: Say goodbye to patching, scaling, and the headache of managing infrastructure.
Pay-as-you-go: Only pay for the compute time you consume.
Scalability: AWS scales the underlying services automatically as demand grows, within your account's service quotas.
Faster Development: Focus on your code, not infrastructure.
Infrastructure as Code: Using CloudFormation, we define and manage our infrastructure in a declarative way, enabling version control, reproducibility, and easier collaboration.

Let’s get our Hands Dirty

Prerequisites

AWS Account: If you don't have one, sign up for a free tier account.
IAM User: Create an IAM user with the necessary permissions for Lambda, API Gateway, DynamoDB, etc.
AWS CLI: Install and configure the AWS CLI for command-line access if you haven’t already.
Mastodon API Credentials: In this project, we’ll use Mastodon to source our social media data. Follow this link to obtain your access token.

Note: You can find the complete code here.

Let's get started! We'll break this down into manageable steps.

1. Creating a secret in AWS Secrets Manager

We’ll create a secret in AWS Secrets Manager to store our Mastodon API credentials.

Navigate to the AWS Secrets Manager console
Click the Store a new secret button
Select Other type of secret
Add your Mastodon API credentials
- MASTODON_INSTANCE_URL
  - We use https://mastodon.social for this project
- MASTODON_ACCESS_TOKEN
Name the secret social-media-analytics-secrets-manager, since this is the SecretId the data collection function looks up later
Save and note the Secret ARN (we will use it later 😉)
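
The secret is stored as a JSON string containing the two keys listed above. As a minimal sketch, a Lambda function could validate that JSON before using it, so a missing key fails loudly instead of silently producing an empty value. The `MastodonSecret` type and `parseMastodonSecret` function are illustrative names, not part of the project code.

```typescript
// Hypothetical helper: validates the JSON string stored in the secret.
interface MastodonSecret {
  instanceUrl: string;
  accessToken: string;
}

function parseMastodonSecret(secretString: string | undefined): MastodonSecret {
  if (!secretString) {
    throw new Error("Secret has no SecretString value");
  }
  const raw = JSON.parse(secretString) as Record<string, unknown>;
  const instanceUrl = raw["MASTODON_INSTANCE_URL"];
  const accessToken = raw["MASTODON_ACCESS_TOKEN"];
  if (typeof instanceUrl !== "string" || instanceUrl.length === 0) {
    throw new Error("MASTODON_INSTANCE_URL is missing from the secret");
  }
  if (typeof accessToken !== "string" || accessToken.length === 0) {
    throw new Error("MASTODON_ACCESS_TOKEN is missing from the secret");
  }
  return { instanceUrl, accessToken };
}
```

The function takes the `SecretString` field of a `GetSecretValueCommand` response and throws if either key is absent.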

2. Creating an S3 bucket for Code storage

We’ll create an S3 bucket to store the code for our Lambda functions and our CloudFormation templates. This is where we will push our compiled Lambda code and the CloudFormation templates for this project. To do this:

Navigate to the Amazon S3 console.
Click Create bucket
Provide a unique name for your bucket and use all default settings.
Your bucket URL is https://{YOUR-UNIQUE-BUCKET-NAME}.s3.amazonaws.com
We will use this URL throughout this project

3. Defining the DynamoDB Table with CloudFormation

We'll use DynamoDB to store our sentiment data. Here's the CloudFormation template (dynamodb-table.yaml) that defines our table:

Resources:
  SentimentDataTable:
    Type: 'AWS::DynamoDB::Table'
    Properties:
      TableName: SentimentDataTable
      AttributeDefinitions:
        - 
          AttributeName: DataId
          AttributeType: S
      KeySchema:
        - 
          AttributeName: DataId
          KeyType: HASH
      ProvisionedThroughput:
        ReadCapacityUnits: 5
        WriteCapacityUnits: 5
Outputs:
  SentimentDataTableName:
    Value: !Ref SentimentDataTable
    Description: Name of the Sentiment data table
    Export:
      Name: SentimentDataTableName
  SentimentDataTableArn:
    Value: !GetAtt SentimentDataTable.Arn
    Description: ARN of the Sentiment data table
    Export: 
      Name: SentimentDataTableArn

Explanation:
- The Resources section defines the resources we want to create. Here, we're creating a DynamoDB table named SentimentDataTable.
- Properties specify the table's configuration, including:
  - TableName: The name of the table.
  - AttributeDefinitions: The attributes that make up the table's key schema. We define DataId as a String attribute.
  - KeySchema: The primary key for the table. We use DataId as the hash (partition) key.
  - ProvisionedThroughput: The read and write capacity for the table. For this tutorial, we'll use a basic configuration of 5 read and 5 write capacity units.
- The Outputs section defines values that are returned when you create the CloudFormation stack. Here, the table name and ARN are exported, which can be used by other stacks.
Deployment: You can deploy this template using the AWS CLI or the AWS Management Console. We will do this later (stay tuned 🙂).
After deployment, navigate to the DynamoDB console to view the SentimentDataTable.
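
To put the ProvisionedThroughput values in context: one write capacity unit covers one write per second for an item of up to 1 KB, and one read capacity unit covers one strongly consistent read per second (or two eventually consistent reads) for an item of up to 4 KB. The sketch below does that arithmetic. The function names and the sample item size and request rates are illustrative assumptions, not measurements from this project.

```typescript
// Rough capacity estimate for a provisioned DynamoDB table.
function writeCapacityUnits(itemBytes: number, writesPerSecond: number): number {
  // Each write consumes 1 WCU per 1 KB of item size, rounded up.
  return Math.ceil(itemBytes / 1024) * writesPerSecond;
}

function readCapacityUnits(
  itemBytes: number,
  readsPerSecond: number,
  stronglyConsistent: boolean
): number {
  // Each strongly consistent read consumes 1 RCU per 4 KB, rounded up;
  // an eventually consistent read costs half as much.
  const unitsPerRead = Math.ceil(itemBytes / 4096) * (stronglyConsistent ? 1 : 0.5);
  return Math.ceil(unitsPerRead * readsPerSecond);
}

// Assumed example: 2 KB items, 2 writes and 10 eventually consistent reads per second.
console.log(writeCapacityUnits(2048, 2)); // 4
console.log(readCapacityUnits(2048, 10, false)); // 5
```

Under these assumed numbers, the 5/5 configuration in the template would be just enough. A real workload should be measured rather than guessed.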

4. Building the Lambda Functions

Lambda is the heart of our serverless backend. We'll create three main Lambda functions:

data-collection-function: This function will fetch data from a social media source (Mastodon) and send it to a Kinesis stream.
sentiment-analysis-function: This function will process the data from the Kinesis stream, analyze the sentiment of the text, and store the results in DynamoDB.
api-handlers-function: This function will handle API requests from the frontend, querying the sentiment data from DynamoDB.

4.1 Data Collection Function

Here's the code for the data-collection-function:

import { KinesisClient, PutRecordCommand, PutRecordCommandInput } from "@aws-sdk/client-kinesis";
import { SecretsManagerClient, GetSecretValueCommand } from "@aws-sdk/client-secrets-manager";
import { Handler } from 'aws-lambda';
import { createRestAPIClient } from 'masto';

const kinesisClient = new KinesisClient({});
const secretsManagerClient = new SecretsManagerClient({});

export const handler: Handler = async (event: any): Promise<{ statusCode: number, body: string }> => {
  try {
    // 1. Retrieve API credentials from Secrets Manager
    const secretResponse = await secretsManagerClient.send(
      new GetSecretValueCommand({ SecretId: "social-media-analytics-secrets-manager" })
    );
    const secrets = JSON.parse(secretResponse.SecretString || "{}");

    const masto = createRestAPIClient({
      url: secrets.MASTODON_INSTANCE_URL || '',
      accessToken: secrets.MASTODON_ACCESS_TOKEN || '',
    });

    // 2. Fetch toots based on your criteria (e.g., keywords, hashtags)
    const toots = await masto.v2.search.list({
      q: "crypto",
      type: "statuses",
      limit: 40
    })

    // 3. Iterate through toots and send them to Kinesis
    const kinesisStreamName = "SocialMediaDataStream";
    const encoder = new TextEncoder()

    // Fall back to an empty list so Promise.allSettled never receives undefined
    const res = await Promise.allSettled((toots.statuses ?? []).map((toot) => {
      const text = toot.content.replace(/<[^>]+>/g, '');
      const postId = toot.id;
      const createdAt = toot.createdAt;
      const authorUsername = toot.account.username;

      const data = {
        PostId: postId,
        Text: text,
        CreatedAt: createdAt,
        AuthorUsername: authorUsername,
      }

      // Prepare data for Kinesis
      const recordParams: PutRecordCommandInput = {
        Data: encoder.encode(JSON.stringify(data)),
        PartitionKey: postId,
        StreamName: kinesisStreamName
      };

      return kinesisClient.send(new PutRecordCommand(recordParams));
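    }));

    // Promise.allSettled never rejects, so check for failed puts explicitly
    const failed = res.filter((r) => r.status === "rejected").length;
    if (failed > 0) {
      console.error(`Failed to send ${failed} of ${res.length} toots to Kinesis`);
    }

    return {
      statusCode: 200,
      body: `Sent ${res.length - failed} toots to Kinesis`,
    };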
  } catch (error) {
    console.error("Error processing toots:", error);
    return {
      statusCode: 500,
      body: "Error processing toots",
    };
  }
};

Explanation:
- The function uses the SecretsManagerClient to retrieve the Mastodon API credentials. This is a best practice for security, as it avoids hardcoding sensitive information in your code.
- It then uses the Masto library to fetch "toots" (posts) from Mastodon. In this case, we are searching for “crypto” related posts.
- For each toot, it extracts the relevant data (post ID, text, creation date, author) and sends it to a Kinesis stream using the KinesisClient. The PartitionKey is set to the postId for even data distribution across Kinesis shards.
- For error handling, we log the error to the console (which ends up in CloudWatch Logs) and return a 500 status code.
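
The field extraction described above can be sketched as two small pure functions, which makes the record shape explicit and easy to test without calling Mastodon or Kinesis. `TootRecord`, `stripHtml`, and `toTootRecord` are illustrative names, and the sample toot is made up.

```typescript
// Shape of the record the data collection function puts on the stream.
interface TootRecord {
  PostId: string;
  Text: string;
  CreatedAt: string;
  AuthorUsername: string;
}

// Removes HTML tags with the same regular expression used in the handler.
// Note: HTML entities such as &amp; are left untouched.
function stripHtml(html: string): string {
  return html.replace(/<[^>]+>/g, "");
}

// Hypothetical helper that maps the fields of a toot to a TootRecord.
function toTootRecord(toot: {
  id: string;
  content: string;
  createdAt: string;
  account: { username: string };
}): TootRecord {
  return {
    PostId: toot.id,
    Text: stripHtml(toot.content),
    CreatedAt: toot.createdAt,
    AuthorUsername: toot.account.username,
  };
}

// Made-up sample toot for illustration.
const sampleRecord = toTootRecord({
  id: "1",
  content: "<p>Bitcoin is <strong>up</strong> today</p>",
  createdAt: "2024-03-25T10:00:00.000Z",
  account: { username: "alice" },
});
console.log(sampleRecord.Text); // Bitcoin is up today
```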

4.2 Sentiment Analysis Function

Here's the code for the sentiment-analysis-function:

import { ComprehendClient, DetectSentimentCommand } from "@aws-sdk/client-comprehend";
import { DynamoDBClient, PutItemCommand } from "@aws-sdk/client-dynamodb";
import { Handler, KinesisStreamEvent } from "aws-lambda";

const comprehendClient = new ComprehendClient({});
const dynamoDBClient = new DynamoDBClient({});

export const handler: Handler = async (event: KinesisStreamEvent) => {
    try {
      // 1. Process each record (toot)
      for (const record of event.Records || []) {
        if (!record?.kinesis?.data) continue

        // 2. Decode the base64-encoded Kinesis payload into a toot
        const toot: {
          PostId: string,
          Text: string,
          AuthorUsername: string,
          CreatedAt: string
        } = JSON.parse(Buffer.from(record.kinesis.data, 'base64').toString())

        // 3. Detect sentiment using Comprehend
        const sentimentResponse = await comprehendClient.send(
          new DetectSentimentCommand({
            LanguageCode: "en",
            Text: toot.Text,
          })
        );

        // 4. Store toot and sentiment in DynamoDB
        const putItemParams = {
          TableName: "SentimentDataTable",
          Item: {
            DataId: { S: toot.PostId },
            Text: { S: toot.Text },
            AuthorUsername: { S: toot.AuthorUsername },
            CreatedAt: { S: toot.CreatedAt },
            Sentiment: { S: sentimentResponse.Sentiment || "UNKNOWN" },
            SentimentScore: {
              M: {
                Positive: { N: sentimentResponse.SentimentScore?.Positive?.toString() || "0" },
                Negative: { N: sentimentResponse.SentimentScore?.Negative?.toString() || "0" },
                Neutral: { N: sentimentResponse.SentimentScore?.Neutral?.toString() || "0" },
                Mixed: { N: sentimentResponse.SentimentScore?.Mixed?.toString() || "0" },
              },
            },
          },
        };
        await dynamoDBClient.send(new PutItemCommand(putItemParams));
      }
    } catch (error) {
      console.error("Error processing records:", error);
    }
};

Explanation:
- This function is triggered by new data arriving in the Kinesis stream. Lambda reads the stream on the function's behalf and passes the records in the event (event.Records), so the function does not need a Kinesis client.
- For each record, it parses the data and extracts the toot information.
- It then uses the ComprehendClient to detect the sentiment of the toot's text.
- Finally, it stores the toot data and the sentiment analysis results in the DynamoDB table using the DynamoDBClient.
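
One detail the handler above does not cover is input size. Amazon Comprehend's DetectSentiment documents a maximum text size of 5 KB of UTF-8, so an unusually long post could be rejected. The sketch below shows one possible guard that truncates text by UTF-8 byte count without splitting a character. The helper names and the exact `MAX_COMPREHEND_BYTES` value are assumptions for illustration, so check the current Comprehend limits before relying on them.

```typescript
// Assumed limit, based on the documented 5 KB maximum for DetectSentiment input.
const MAX_COMPREHEND_BYTES = 5000;

// Number of bytes a single Unicode code point occupies in UTF-8.
function utf8Length(codePoint: number): number {
  if (codePoint < 0x80) return 1;
  if (codePoint < 0x800) return 2;
  if (codePoint < 0x10000) return 3;
  return 4;
}

// Keeps whole characters until adding the next one would exceed maxBytes.
function truncateUtf8(text: string, maxBytes: number): string {
  let out = "";
  let bytes = 0;
  for (const ch of text) {
    const size = utf8Length(ch.codePointAt(0) ?? 0);
    if (bytes + size > maxBytes) break;
    out += ch;
    bytes += size;
  }
  return out;
}

// "é" takes 2 bytes, so only "h" and "é" fit into 3 bytes.
console.log(truncateUtf8("héllo", 3)); // hé
```

In the handler, this would be applied as `truncateUtf8(toot.Text, MAX_COMPREHEND_BYTES)` before calling DetectSentimentCommand.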

4.3 API Handlers Function

Here's the code for the api-handlers-function:

import { APIGatewayProxyEvent, APIGatewayProxyResult, Context } from 'aws-lambda';
import { DynamoDBClient, ExecuteStatementCommand } from '@aws-sdk/client-dynamodb';

const TABLE_NAME = 'SentimentDataTable';
const VALID_SENTIMENTS = ['POSITIVE', 'NEGATIVE', 'NEUTRAL', 'MIXED'];
const dynamoDBClient = new DynamoDBClient({});

export const handler = async (event: APIGatewayProxyEvent): Promise<APIGatewayProxyResult> => {
  // CORS headers returned with every response
  const headers = {
    "Access-Control-Allow-Headers": "Content-Type",
    "Access-Control-Allow-Origin": "*",
    "Access-Control-Allow-Methods": "OPTIONS,POST,GET"
  };

  try {
    // 1. Extract relevant data from the API Gateway event
    const { httpMethod, requestContext, queryStringParameters } = event;

    // 2. Determine the requested action based on the path or method
    if (httpMethod === 'GET' && requestContext.resourcePath === '/sentiment/{keyword}') {
        // 3. Validate keyword parameter
        const keyword = event.pathParameters?.keyword;
        if (!keyword) {
            return { statusCode: 400, headers, body: JSON.stringify({ error: 'Missing keyword' }) };
        }

        // 4. Validate sentiment parameter
        const sentiment = (queryStringParameters?.sentiment || '').toUpperCase();
        if (sentiment && !VALID_SENTIMENTS.includes(sentiment)) {
          return { statusCode: 400, headers, body: JSON.stringify({ error: `Invalid sentiment value: allowed values are ${VALID_SENTIMENTS.join(', ')}` }) };
        }

        // 5. Fetch data from DynamoDB based on the request
        const sentimentData = await getSentimentData(keyword, sentiment);

        // 6. Format the response
        return { statusCode: 200, headers, body: JSON.stringify(sentimentData ?? []) };
    }


    return { statusCode: 404, headers, body: JSON.stringify({ error: 'Not found' }) };
  } catch (error) {
    console.error('Error processing request:', error);
    return { statusCode: 500, headers, body: JSON.stringify({ error: 'Internal server error' }) };
  }
};

const getSentimentData = async (keyword: string, sentiment?: string) => {
    // Use PartiQL placeholders (?) instead of string interpolation so that
    // user-supplied values cannot alter the statement
    const statement = sentiment
        ? `SELECT * FROM "${TABLE_NAME}" WHERE Sentiment = ? AND contains(Text, ?)`
        : `SELECT * FROM "${TABLE_NAME}" WHERE contains(Text, ?)`;
    const parameters = sentiment
        ? [{ S: sentiment }, { S: keyword }]
        : [{ S: keyword }];

    // Execute the PartiQL statement
    const command = new ExecuteStatementCommand({
        Statement: statement,
        Parameters: parameters,
    });

    const response = await dynamoDBClient.send(command);

    return response.Items;
};

Explanation:
- This function handles requests from external users and is triggered by an API Gateway request.
- It extracts the keyword and sentiment parameters from the request.
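- It uses the DynamoDBClient to query the SentimentDataTable using a PartiQL SELECT statement. The query filters the data by the provided keyword and, optionally, sentiment. Because neither filter uses the table's key (DataId), DynamoDB scans the whole table, which is acceptable for the small dataset in this tutorial.
- It then returns the data in JSON format.
- Note: since we are using the AWS_PROXY integration type (refer to api-gateway.yaml), API Gateway passes the Lambda response through unchanged, so it is important to send the CORS headers in the response object to prevent CORS errors in the browser.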
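
Since every response needs the CORS headers, one way to avoid forgetting them on an error path is to build all responses through a single helper. This is a minimal sketch. `ProxyResponse`, `CORS_HEADERS`, and `jsonResponse` are illustrative names and not part of the project code.

```typescript
// Minimal response shape expected by an AWS_PROXY (Lambda proxy) integration.
interface ProxyResponse {
  statusCode: number;
  headers: Record<string, string>;
  body: string;
}

// Same header values as in the handler above.
const CORS_HEADERS: Record<string, string> = {
  "Access-Control-Allow-Headers": "Content-Type",
  "Access-Control-Allow-Origin": "*",
  "Access-Control-Allow-Methods": "OPTIONS,POST,GET",
};

// Hypothetical helper: every response, including errors, carries the CORS headers.
function jsonResponse(statusCode: number, payload: unknown): ProxyResponse {
  return {
    statusCode,
    headers: { ...CORS_HEADERS },
    body: JSON.stringify(payload),
  };
}

console.log(jsonResponse(400, { error: "Missing keyword" }).body);
// {"error":"Missing keyword"}
```

With such a helper, the handler's return statements reduce to calls like `jsonResponse(404, { error: 'Not found' })`.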

5. Setting up API Gateway with CloudFormation

We'll use CloudFormation to define our API Gateway. Here's the api-gateway.yaml template:

Parameters:
  apiGatewayName:
    Type: String
    Default: SentimentAPI
  apiGatewayStageName:
    Type: String
    AllowedPattern: '[a-z0-9]+'
    Default: dev
  apiGatewayHTTPMethod:
    Type: String
    Default: GET

Resources:
  SentimentAPI:
    Type: AWS::ApiGateway::RestApi
    Properties:
      Name: SentimentAPI
      EndpointConfiguration:
        Types:
          - REGIONAL

  SentimentAPIResource:
    Type: AWS::ApiGateway::Resource
    DependsOn:
      - SentimentAPI
    Properties:
      RestApiId: !Ref SentimentAPI
      ParentId: !GetAtt SentimentAPI.RootResourceId
      PathPart: 'sentiment'

  SentimentKeywordResource:
    Type: AWS::ApiGateway::Resource
    DependsOn:
      - SentimentAPI
    Properties:
      RestApiId: !Ref SentimentAPI
      ParentId: !Ref SentimentAPIResource
      PathPart: '{keyword}'

  SentimentAPIMethod:
    Type: AWS::ApiGateway::Method
    DependsOn:
      - SentimentAPI
      - SentimentAPIResource
    Properties:
      AuthorizationType: NONE
      ApiKeyRequired: false
      HttpMethod: !Ref apiGatewayHTTPMethod
      RequestParameters:
        method.request.path.keyword: true
      MethodResponses:
        - StatusCode: 200
          ResponseModels:
            "application/json": "Empty"
      Integration:
        IntegrationHttpMethod: POST
        Type: AWS_PROXY
        Uri: !Sub 
          - arn:aws:apigateway:${AWS::Region}:lambda:path/2015-03-31/functions/${ApiHandlersFunctionArn}/invocations
          - ApiHandlersFunctionArn: !ImportValue ApiHandlersFunctionArn
      RestApiId: !Ref SentimentAPI
      ResourceId: !Ref SentimentKeywordResource

  SentimentAPIDeployment:
    Type: AWS::ApiGateway::Deployment
    DependsOn: SentimentAPIMethod
    Properties:
      RestApiId: !Ref SentimentAPI
      StageName: !Ref apiGatewayStageName

  SentimentAPIPermission:
    Type: AWS::Lambda::Permission
    DependsOn: SentimentAPI
    Properties:
      Action: lambda:InvokeFunction
      FunctionName: !ImportValue ApiHandlersFunctionName
      Principal: apigateway.amazonaws.com
      SourceArn: !Sub arn:aws:execute-api:${AWS::Region}:${AWS::AccountId}:${SentimentAPI.RestApiId}/*/*

Outputs:
  apiGatewayInvokeURL:
    Value: !Sub https://${SentimentAPI}.execute-api.${AWS::Region}.amazonaws.com/${apiGatewayStageName}

Explanation:
- This template defines the API Gateway and its resources.
- It creates a REST API named SentimentAPI.
- It defines a resource /sentiment/{keyword}, where {keyword} is a path parameter.
- It creates a GET method for this resource.
- The Integration property is crucial:
  - Type: AWS_PROXY: This tells API Gateway to forward the entire request to the ApiHandlersFunction Lambda function.
  - Uri: This specifies the ARN of the Lambda function to invoke. The !Sub syntax is used to substitute the actual function ARN, which is obtained from the output value of the api-handlers-function CloudFormation stack.
- AWS::Lambda::Permission: This resource grants API Gateway permission to invoke the Lambda function.
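
To make the two !Sub expressions in the template easier to read, the sketch below builds the same strings in TypeScript. The function names are illustrative, and the region, account ID, API ID, and function ARN in the example are made-up placeholder values.

```typescript
// Builds the same string that the Uri !Sub expression in the template produces.
function lambdaIntegrationUri(region: string, functionArn: string): string {
  return `arn:aws:apigateway:${region}:lambda:path/2015-03-31/functions/${functionArn}/invocations`;
}

// Builds the SourceArn pattern used by the Lambda permission;
// the trailing /*/* wildcards are copied from the template.
function executeApiSourceArn(region: string, accountId: string, restApiId: string): string {
  return `arn:aws:execute-api:${region}:${accountId}:${restApiId}/*/*`;
}

// Placeholder values for illustration only.
const exampleFunctionArn = "arn:aws:lambda:us-east-1:123456789012:function:api-handlers-function";
console.log(lambdaIntegrationUri("us-east-1", exampleFunctionArn));
console.log(executeApiSourceArn("us-east-1", "123456789012", "abc123"));
```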

6. Tying it All Together with a Main Stack

To simplify deployment, we'll create a "main" CloudFormation stack (main-stack.yaml) that references the other stacks. This helps manage dependencies and ensures resources are created in the correct order.

Parameters:
  BucketName:
    Type: String
    Description: Unique name for the S3 bucket
    Default: {YOUR-UNIQUE-BUCKET-NAME}
  BucketURL:
    Type: String
    Description: S3 bucket URL
    Default: https://{YOUR-UNIQUE-BUCKET-NAME}.s3.amazonaws.com
  EnvVariablesAndCredentials:
    Type: String
    Description: Credentials
    Default: {YOUR-SECRETS-MANAGER-ARN}

Resources:
  KinesisDataStreamStack:
    Type: AWS::CloudFormation::Stack
    Properties:
      TemplateURL: !Sub
        - ${BucketURL}/infrastructure/kinesis-data-stream.yaml
        - BucketURL: !Ref BucketURL

  DataCollectionFunctionStack:
    Type: AWS::CloudFormation::Stack
    Properties:
      TemplateURL: !Sub
        - ${BucketURL}/infrastructure/data-collection-function.yaml
        - BucketURL: !Ref BucketURL
      Parameters:
        BucketName: !Ref BucketName
        EnvVariablesAndCredentials: !Ref EnvVariablesAndCredentials
    DependsOn:
      - KinesisDataStreamStack

  DynamoDBTableStack:
    Type: AWS::CloudFormation::Stack
    Properties:
      TemplateURL: !Sub
        - ${BucketURL}/infrastructure/dynamodb-table.yaml
        - BucketURL: !Ref BucketURL

  SentimentAnalysisFunctionStack:
    Type: AWS::CloudFormation::Stack
    Properties:
      TemplateURL: !Sub
        - ${BucketURL}/infrastructure/sentiment-analysis-function.yaml
        - BucketURL: !Ref BucketURL
      Parameters:
        BucketName: !Ref BucketName
    DependsOn:
      - DynamoDBTableStack
      - KinesisDataStreamStack

  ApiHandlersFunctionStack:
    Type: AWS::CloudFormation::Stack
    DependsOn:
      - DynamoDBTableStack
    Properties:
      TemplateURL: !Sub
        - ${BucketURL}/infrastructure/api-handlers-function.yaml
        - BucketURL: !Ref BucketURL
      Parameters:
        BucketName: !Ref BucketName

  ApiGatewayStack:
    Type: AWS::CloudFormation::Stack
    DependsOn:
      - ApiHandlersFunctionStack
    Properties:
      TemplateURL: !Sub
        - ${BucketURL}/infrastructure/api-gateway.yaml
        - BucketURL: !Ref BucketURL

Explanation:
- This template defines the overall application stack.
- It uses AWS::CloudFormation::Stack resources to reference the other CloudFormation templates (for Kinesis, Data Collection, DynamoDB, Sentiment Analysis, API Handlers, and API Gateway).
- The DependsOn property is used to specify dependencies between the stacks, ensuring they are created in the correct order. For example, the SentimentAnalysisFunctionStack depends on the DynamoDBTableStack and KinesisDataStreamStack because the Sentiment Analysis function needs the DynamoDB table and Kinesis stream to be created first.
- Parameters like BucketName, BucketURL, and EnvVariablesAndCredentials are used to pass configuration values to the nested stacks.
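
The effect of DependsOn can be illustrated with a small topological sort over the dependencies declared in main-stack.yaml. This is only a sketch of the ordering constraint, not how CloudFormation is implemented (CloudFormation can also create independent stacks in parallel). The `creationOrder` function is an illustrative name.

```typescript
// DependsOn relationships copied from main-stack.yaml.
const dependsOn: Record<string, string[]> = {
  KinesisDataStreamStack: [],
  DataCollectionFunctionStack: ["KinesisDataStreamStack"],
  DynamoDBTableStack: [],
  SentimentAnalysisFunctionStack: ["DynamoDBTableStack", "KinesisDataStreamStack"],
  ApiHandlersFunctionStack: ["DynamoDBTableStack"],
  ApiGatewayStack: ["ApiHandlersFunctionStack"],
};

// Depth-first topological sort: a stack is emitted only after its dependencies.
function creationOrder(graph: Record<string, string[]>): string[] {
  const order: string[] = [];
  const state = new Map<string, "visiting" | "done">();

  const visit = (name: string): void => {
    if (state.get(name) === "done") return;
    if (state.get(name) === "visiting") {
      throw new Error(`Circular dependency at ${name}`);
    }
    state.set(name, "visiting");
    for (const dep of graph[name] ?? []) visit(dep);
    state.set(name, "done");
    order.push(name);
  };

  for (const name of Object.keys(graph)) visit(name);
  return order;
}

// Prints one valid creation order for the nested stacks.
console.log(creationOrder(dependsOn));
```

Any order in which each stack appears after everything it depends on is valid. A circular DependsOn chain has no such order, which is why the sketch throws in that case.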

7. Deploying to AWS

We’ll deploy our code to AWS using the AWS CLI.

Since our Lambda code is written in TypeScript and has some external dependencies, we will bundle and compile it to JavaScript. We will use esbuild for this, so make sure it is installed globally or in your project dependencies.

esbuild ./src/index.ts --bundle --minify --sourcemap --platform=node --target=es2020 --outfile=dist/index.js

After building the Lambda functions, we’ll zip the bundled code.

cd dist && zip -r {function-name}.zip index.js*

Now, we’ll upload our zip files and CloudFormation templates to S3.

Upload zipped files

aws s3 cp ./dist/{function-name}.zip s3://{YOUR-UNIQUE-BUCKET-NAME}/{function-name}.zip

Upload CloudFormation templates

aws s3 cp backend/infrastructure/lib/ s3://{YOUR-UNIQUE-BUCKET-NAME}/infrastructure/ --recursive

Finally, we will deploy our stack using AWS CloudFormation

aws cloudformation create-stack \
  --stack-name social-sentiment-backend-stack \
  --template-url https://{YOUR-UNIQUE-BUCKET-NAME}.s3.amazonaws.com/infrastructure/main-stack.yaml \
  --capabilities CAPABILITY_NAMED_IAM CAPABILITY_AUTO_EXPAND

8. Connecting the Frontend

We will create a simple analytics dashboard that retrieves the sentiment data through the sentiment API (API Gateway) URL. To connect the frontend to this API, use the URL provided in the API Gateway stack's output (apiGatewayInvokeURL), or navigate to the API Gateway console and copy the invoke URL. For example, if the URL is https://your-api-gateway-id.execute-api.us-east-1.amazonaws.com/dev, a GET request to https://your-api-gateway-id.execute-api.us-east-1.amazonaws.com/dev/sentiment/keyword?sentiment=POSITIVE returns all stored posts that contain "keyword" and were classified as POSITIVE.

Note: this tutorial only supports the crypto keyword.
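
Because the API handler returns the items exactly as the low-level DynamoDB client delivers them, the frontend receives values in DynamoDB's attribute-value format (for example `{ "Sentiment": { "S": "POSITIVE" } }`). The sketch below shows one way a frontend could build the request URL and flatten such items into plain objects. The function names and sample values are illustrative, and only the S, N, and M types used by this table are handled. The `unmarshall` function from `@aws-sdk/util-dynamodb` is a more complete alternative.

```typescript
// DynamoDB attribute values as used by this table: strings, numbers, and maps.
type AttrValue = { S: string } | { N: string } | { M: Record<string, AttrValue> };

// Builds the request URL for a keyword and an optional sentiment filter.
function buildSentimentUrl(baseUrl: string, keyword: string, sentiment?: string): string {
  const url = `${baseUrl.replace(/\/+$/, "")}/sentiment/${encodeURIComponent(keyword)}`;
  return sentiment ? `${url}?sentiment=${encodeURIComponent(sentiment)}` : url;
}

// Converts one attribute value into a plain JavaScript value.
function flattenValue(value: AttrValue): unknown {
  if ("S" in value) return value.S;
  if ("N" in value) return Number(value.N);
  return flattenItem(value.M);
}

// Converts a whole item (attribute name -> attribute value) into a plain object.
function flattenItem(item: Record<string, AttrValue>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(item)) {
    out[key] = flattenValue(value);
  }
  return out;
}

// Made-up item in the shape the sentiment analysis function stores.
const flatItem = flattenItem({
  DataId: { S: "1" },
  Sentiment: { S: "POSITIVE" },
  SentimentScore: { M: { Positive: { N: "0.9" }, Negative: { N: "0.1" } } },
});
console.log(JSON.stringify(flatItem));
```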

9. Clean up:

To avoid unnecessary and unforeseen costs, it is good practice to clean up your AWS resources. We only need to delete our stack by running this 👇🏿 CLI command.

aws cloudformation delete-stack --stack-name social-sentiment-backend-stack

Then, navigate to the AWS console to delete your project bucket and the secret you created for the project in Secrets Manager.

And there you have it! We've built a serverless backend for a social media sentiment analytics dashboard using AWS Lambda, API Gateway, DynamoDB, Amazon Kinesis, EventBridge, and CloudFormation. This is just the beginning. You can further enhance it by adding more features, visualizations, and social media integrations.

Share your thoughts and questions in the comments below.
