Jimmy Dahlqvist for AWS Community Builders

Originally published at jimmydqv.com

Building a serverless file manager

In my previous post, Serverless and event-driven design thinking, I used a fictitious service, a serverless file manager, as an example. In that post we only scratched the surface of the design and building of the service.

In this post we will go deeper into the design and actually build the service using a serverless and event-driven architecture.

File manager service overview

The service will store files in S3 and keep a record of all the files in a DynamoDB table; each file belongs to a specific user. We also need to keep track of each user's total storage size. The system overview looks like this, with an API exposing functionality to the user and the work then carried out in a serverless and event-driven way.

Image showing the file-manager architecture overview.

Required infrastructure

To start with, we should create some of the common infrastructure components that this service needs. We need an S3 bucket for storing files and a custom EventBridge event-bus to act as event-broker. We will be using both a custom event-bus and the default event-bus; more on that design further down. On the S3 bucket we need to enable EventBridge notifications. We will be using an API Gateway HTTP API as the front door that clients interact with.

  EventBridge:
    Type: AWS::Events::EventBus
    Properties:
      Name: !Sub ${Application}-eventbus

  FileBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketEncryption:
        ServerSideEncryptionConfiguration:
          - ServerSideEncryptionByDefault:
              SSEAlgorithm: AES256
      NotificationConfiguration:
        EventBridgeConfiguration:
          EventBridgeEnabled: True
      BucketName: !Sub ${Application}-files

  HttpApi:
    Type: AWS::Serverless::HttpApi
    Properties:
      DefinitionBody:
        Fn::Transform:
          Name: AWS::Include
          Parameters:
            Location: ./api.yaml


Inventory table design

The design of the inventory DynamoDB table is fairly simple, so let's just have a quick look at it. I used a single-table design where all the different kinds of data live in the same table. We need a record of all of a user's files and of the user's total storage. We will be using a composite primary key consisting of a partition key and a sort key. Since I opted for a single-table design I name the partition key PK and the sort key SK. That way I can store different kinds of data without the attribute names causing confusion.

  InventoryTable:
    Type: AWS::DynamoDB::Table
    Properties:
      TableName: !Sub ${Application}-file-inventory
      BillingMode: PAY_PER_REQUEST
      AttributeDefinitions:
        - AttributeName: PK
          AttributeType: S
        - AttributeName: SK
          AttributeType: S
      KeySchema:
        - AttributeName: PK
          KeyType: HASH
        - AttributeName: SK
          KeyType: RANGE

When data is stored, PK will be the username. For a file, SK will be the full file path; for the total storage record, SK will have the static value 'Quota'.
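To make the key design concrete, here is a sketch of what two items could look like, in DynamoDB JSON with illustrative values (the extra attributes match what the update-inventory flow further down writes):

# A file record: PK is the username, SK is the full file path.
file_item = {
    "PK": {"S": "alice"},
    "SK": {"S": "alice/documents/report.pdf"},
    "User": {"S": "alice"},
    "Path": {"S": "alice/documents/report.pdf"},
    "Bucket": {"S": "file-manager-files"},
    "Size": {"N": "52431"},
    "Etag": {"S": "9b2cf535f27731c974343645a3985328"},
}

# The user's total storage record: same PK, static SK of 'Quota'.
quota_item = {
    "PK": {"S": "alice"},
    "SK": {"S": "Quota"},
    "TotalSize": {"N": "52431"},
}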

If you want to learn more about DynamoDB and single-table design I would recommend The DynamoDB Book by Alex DeBrie.

Cloud design

As I mentioned last time, I normally use a simplified version of EventStorming when starting the design. During that process four different flows were identified: Put (Create and Update), Fetch, List, and Delete. The Fetch and List flows will be synchronous request-response designs, Put will contain both a synchronous request-response part and an asynchronous event part, and Delete will be a fully asynchronous event design. In the event design we will be using both commands and events.

Let's now look at each of the flows and how to implement them.

Put (Create and Update)

The put-flow, or create and update, is initiated by a client calling the front door API. The API will then create a pre-signed S3 URL that the client can use to upload the file. This part is a classic request-response model. What is the reason for using a pre-signed S3 URL that the client uploads to? Why not just post the file directly over the API? The primary reason is the Amazon API Gateway payload quota: the max payload size is 10 MB, and to support all kinds of files this would become a limitation. The second reason is that the backend logic can create a structure in S3 and organize files in whatever structure is preferred.

When the client then uploads the file to S3, this generates an event (the part below the dashed line in the image) that invokes a StepFunction which updates the file inventory and the user's total storage. We will dive deep into this StepFunction a bit further down.

Image showing the cloud design for creating a file.

We need to hook up a Lambda function to our API Gateway and implement the logic to pre-sign the S3 URL.

  CreateFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: Api/Create
      Handler: create.handler
      Policies:
        - S3CrudPolicy:
            BucketName: !Ref FileBucket
        - DynamoDBReadPolicy:
            TableName: !Ref InventoryTable
      Environment:
        Variables:
          BUCKET: !Ref FileBucket
      Events:
        Create:
          Type: HttpApi
          Properties:
            Path: /files/create
            Method: post
            ApiId: !Ref HttpApi

import json
import boto3
import os

def handler(event, context):
    body = json.loads(event["body"])
    bucket = os.environ["BUCKET"]

    # Make sure a non-empty path ends with a trailing slash
    if not body["path"].endswith("/") and len(body["path"]) > 0:
        body["path"] = body["path"] + "/"

    # Files are organized as <user>/<path>/<name> in the bucket
    fullpath = body["user"] + "/" + body["path"] + body["name"]

    # Pre-sign a PUT URL, valid for 10 minutes, that the client uploads to
    client = boto3.client("s3")
    response = client.generate_presigned_url(
        "put_object",
        Params={"Bucket": bucket, "Key": fullpath},
        ExpiresIn=600,
    )

    return {"url": response}
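To make the request-response part concrete, a minimal client sketch could look like this (the endpoint URL is a placeholder; the payload fields are taken from the handler above):

import requests

API = "https://abc123.execute-api.eu-west-1.amazonaws.com"  # placeholder endpoint

# Step 1: ask the API for a pre-signed upload URL.
response = requests.post(
    f"{API}/files/create",
    json={"user": "alice", "path": "documents", "name": "report.pdf"},
)
upload_url = response.json()["url"]

# Step 2: upload the file directly to S3, bypassing the 10 MB
# API Gateway payload quota.
with open("report.pdf", "rb") as f:
    requests.put(upload_url, data=f)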


Fetch

Now let's look at the Fetch flow, basically downloading a file. This is implemented as 100% request-response, as the client needs the response with the file back. It's not possible to do this in an event-driven way. Or rather, it would be if you instead sent the user a download link by e-mail; that is a good approach if some form of processing needs to be done before downloading.

Once again the API will return a pre-signed S3 URL that the client can use to download the file. A different approach would be to use the new Lambda streaming response together with a Lambda Function URL and CloudFront. But since we use API Gateway here, let's stick to the pre-signed URL pattern.

Image showing the cloud design for fetching a file.

So let's create the Lambda Function and logic that will handle this flow.

  FetchFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: Api/Fetch
      Handler: fetch.handler
      Policies:
        - S3CrudPolicy:
            BucketName: !Ref FileBucket
        - DynamoDBReadPolicy:
            TableName: !Ref InventoryTable
      Environment:
        Variables:
          BUCKET: !Ref FileBucket
          INVENTORY_TABLE: !Ref InventoryTable
      Events:
        Fetch:
          Type: HttpApi
          Properties:
            Path: /files/fetch
            Method: post
            ApiId: !Ref HttpApi

import json
import boto3
import os

def handler(event, context):
    body = json.loads(event["body"])
    bucket = os.environ["BUCKET"]

    # Make sure a non-empty path ends with a trailing slash
    if not body["path"].endswith("/") and len(body["path"]) > 0:
        body["path"] = body["path"] + "/"

    fullpath = body["user"] + "/" + body["path"] + body["name"]

    # Only pre-sign a download URL for files that exist in the inventory
    if check_file_exists(body["user"], fullpath):
        client = boto3.client("s3")
        response = client.generate_presigned_url(
            "get_object",
            Params={"Bucket": bucket, "Key": fullpath},
            ExpiresIn=600,
        )

        return {"url": response}

    else:
        return {"statusCode": 404, "body": "File not found"}


def check_file_exists(user, path):
    table = os.environ["INVENTORY_TABLE"]

    # The inventory stores files with PK = user and SK = full file path
    dynamodb_client = boto3.client("dynamodb")
    response = dynamodb_client.query(
        TableName=table,
        KeyConditionExpression="PK = :pk AND SK = :sk",
        ExpressionAttributeValues={":pk": {"S": user}, ":sk": {"S": path}},
    )

    return len(response["Items"]) > 0


List

Just like the Fetch flow, the List flow is a classical request-response, for the same reason: the client expects to get a list of files back. We will query the file inventory to get the actual list and return it to the client.

Image showing the cloud design for listing files.

So let's create infrastructure and logic that will handle this flow.

  ListFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: Api/List
      Handler: list.handler
      Policies:
        - DynamoDBReadPolicy:
            TableName: !Ref InventoryTable
      Environment:
        Variables:
          INVENTORY_TABLE: !Ref InventoryTable
      Events:
        List:
          Type: HttpApi
          Properties:
            Path: /files/list
            Method: post
            ApiId: !Ref HttpApi

import json
import boto3
import os


def handler(event, context):
    body = json.loads(event["body"])
    table = os.environ["INVENTORY_TABLE"]

    # Query all inventory items for the user whose SK (the full path)
    # starts with <user>/<filter>
    dynamodb_client = boto3.client("dynamodb")
    response = dynamodb_client.query(
        TableName=table,
        KeyConditionExpression="PK = :user AND begins_with ( SK , :filter )",
        ExpressionAttributeValues={
            ":user": {"S": body["user"]},
            ":filter": {"S": f'{body["user"]}/{body["filter"]}'},
        },
    )

    file_list = []

    # Strip the user prefix from each path before returning the list
    for item in response["Items"]:
        file = {
            "Path": item["Path"]["S"].replace(f'{body["user"]}/', ""),
            "Size": item["Size"]["N"],
            "Etag": item["Etag"]["S"],
        }
        file_list.append(file)

    return file_list

Delete

In the delete flow we are starting to touch on a fully event-driven flow. Here the delete command is added directly onto the EventBridge custom event-bus; we use the storage-first pattern. Directly after adding the command to the bus, API Gateway returns an HTTP 200, Status OK, to the client. All the work is then carried out in an asynchronous manner.
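To make this concrete, here is roughly what the command looks like once API Gateway has put it on the custom bus (values are illustrative). The detail is simply the client's request body, which the EventBridge rule further down passes on to the state machine via InputPath: $.detail:

# Assumed shape of the DeleteFile command on the custom event-bus.
delete_command = {
    "source": "FileApi",          # set in the API integration below
    "detail-type": "DeleteFile",  # matched by the EventBridge rule
    "detail": {                   # the raw request body from the client
        "user": "alice",
        "path": "documents",
        "name": "report.pdf",
    },
}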

Image showing the cloud design for deleting a file.

We create an OpenAPI specification to set up the direct integration between API Gateway and EventBridge. When doing that we also need to give API Gateway permission to call EventBridge.

openapi: 3.0.1
info:
  title: File Management API
paths:
  /files/delete:
    post:
      responses:
        default:
          description: Send delete command to EventBridge
      x-amazon-apigateway-integration:
        integrationSubtype: EventBridge-PutEvents
        credentials:
          Fn::GetAtt: [ApiRole, Arn]
        requestParameters:
          Detail: $request.body
          DetailType: DeleteFile
          Source: FileApi
          EventBusName:
            Fn::GetAtt: [EventBridge, Name]
        payloadFormatVersion: 1.0
        type: aws_proxy
        connectionType: INTERNET
x-amazon-apigateway-importexport-version: 1.0

  ApiRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              Service: apigateway.amazonaws.com
            Action:
              - sts:AssumeRole
      Policies:
        - PolicyName: ApiDirectWriteEventBridge
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              Action:
                - events:PutEvents
              Effect: Allow
              Resource:
                - !Sub arn:aws:events:${AWS::Region}:${AWS::AccountId}:event-bus/${EventBridge}

As shown in the flow chart, let's use a StepFunction to delete the file from S3. By doing that, we can actually delete the file without writing a single line of code; instead we use StepFunctions' built-in support for calling AWS services.

  DeleteStateMachineStandard:
    Type: AWS::Serverless::StateMachine
    Properties:
      DefinitionUri: Events/Delete/StateMachine/delete.asl.yaml
      Tracing:
        Enabled: true
      DefinitionSubstitutions:
        InventoryTable: !Ref InventoryTable
        FileBucket: !Ref FileBucket
      Policies:
        - Statement:
            - Effect: Allow
              Action:
                - logs:*
              Resource: "*"
        - S3CrudPolicy:
            BucketName: !Ref FileBucket
        - DynamoDBCrudPolicy:
            TableName: !Ref InventoryTable
      Events:
        Delete:
          Type: EventBridgeRule
          Properties:
            EventBusName: !Ref EventBridge
            InputPath: $.detail
            Pattern:
              source:
                - FileApi
              detail-type:
                - DeleteFile

Comment: Delete File
StartAt: Get Path Length
States:
    Get Path Length:
        Type: Pass
        Parameters:
            user.$: $.user
            path.$: $.path
            name.$: $.name
            length.$: States.ArrayLength(States.StringSplit($.path,'/'))
        Next: Does path need an update?
    Does path need an update?:
        Type: Choice
        Choices:
            - And:
                  - Not:
                        Variable: $.path
                        StringMatches: "*/"
                  - Variable: $.length
                    NumericGreaterThan: 0
              Next: Update Path
        Default: Create Full Path
    Update Path:
        Type: Pass
        Parameters:
            user.$: $.user
            path.$: States.Format('{}/',$.path)
            name.$: $.name
        Next: Create Full Path
    Create Full Path:
        Type: Pass
        Parameters:
            User.$: $.user
            Path.$: $.path
            Name.$: $.name
            FullPath.$: States.Format('{}/{}{}',$.user,$.path,$.name)
        Next: Find Item
    Find Item:
        Type: Task
        Next: File Exists?
        Parameters:
            TableName: ${InventoryTable}
            KeyConditionExpression: PK = :pk AND SK = :sk
            ExpressionAttributeValues:
                ":pk":
                    S.$: $.User
                ":sk":
                    S.$: $.FullPath
        ResultPath: $.ItemQueryResult
        Resource: arn:aws:states:::aws-sdk:dynamodb:query
    File Exists?:
        Type: Choice
        Choices:
            - Variable: $.ItemQueryResult.Count
              NumericGreaterThan: 0
              Comment: Delete File
              Next: DeleteObject
        Default: File Not Found
    DeleteObject:
        Type: Task
        Parameters:
            Bucket: ${FileBucket}
            Key.$: $.FullPath
        Resource: arn:aws:states:::aws-sdk:s3:deleteObject
        Next: The End
    File Not Found:
        Type: Pass
        Next: The End
    The End:
        Type: Pass
        End: true

Update inventory

The update inventory StepFunction is invoked via the default event-bus when files are added, updated, or deleted in S3. This functionality is responsible for keeping a correct inventory of all files and for keeping track of how much storage space a user has used. It runs in an asynchronous and event-driven way; one thing to remember is that the inventory becomes eventually consistent. So a list directly after a put might not return the newly added file.

Image showing the update inventory stepfunction.

As seen, we use a choice state to route to either the Put or the Delete path in the flow, and also to determine whether it's a new or an updated file. This way it's possible to isolate the logic into different flows, making updates and debugging easier. Throughout the entire flow I only use service or SDK integrations and StepFunctions' built-in intrinsic functions. Not a single Lambda function is used; StepFunctions is a very powerful low-code service.
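For reference, this is roughly the shape of the detail section of an S3 "Object Created" event arriving from the default bus (values are illustrative); these are the fields the Choice and Pass states below read:

# Approximate detail payload of an S3 "Object Created" event.
s3_event_detail = {
    "bucket": {"name": "file-manager-files"},       # $.bucket.name
    "object": {
        "key": "alice/documents/report.pdf",        # $.object.key
        "size": 52431,                              # $.object.size
        "etag": "9b2cf535f27731c974343645a3985328", # $.object.etag
    },
    "reason": "PutObject",  # $.reason drives the Operation choice state
}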

Comment: Update File Inventory
StartAt: Operation
States:
    Operation:
        Type: Choice
        Choices:
            - Variable: $.reason
              StringEquals: PutObject
              Next: Put Flow
              Comment: PutObject
            - Variable: $.reason
              StringEquals: DeleteObject
              Next: Delete Flow
              Comment: DeleteObject
        Default: Unknown Operation
    Put Flow:
        Type: Pass
        Parameters:
            User.$: States.ArrayGetItem(States.StringSplit($.object.key,'/'),0)
            Bucket.$: $.bucket.name
            Path.$: $.object.key
            Size.$: $.object.size
            Etag.$: $.object.etag
            Reason.$: $.reason
        ResultPath: $
        Next: Find Item
    Find Item:
        Type: Task
        Next: What to do?
        Parameters:
            TableName: ${InventoryTable}
            KeyConditionExpression: PK = :pk AND SK = :sk
            ExpressionAttributeValues:
                ":pk":
                    S.$: $.User
                ":sk":
                    S.$: $.Path
        ResultPath: $.ItemQueryResult
        Resource: arn:aws:states:::aws-sdk:dynamodb:query
    What to do?:
        Type: Choice
        Choices:
            - And:
                  - Variable: $.ItemQueryResult.Count
                    NumericGreaterThan: 0
                  - Not:
                        Variable: $.Reason
                        StringMatches: DeleteObject
              Comment: Update File
              Next: Update Object Flow
            - And:
                  - Variable: $.ItemQueryResult.Count
                    NumericLessThanEquals: 0
                  - Not:
                        Variable: $.Reason
                        StringMatches: DeleteObject
              Next: New Object Flow
              Comment: New File
            - And:
                  - Variable: $.Reason
                    StringMatches: DeleteObject
                  - Variable: $.ItemQueryResult.Count
                    NumericGreaterThan: 0
              Next: Delete Object Flow
              Comment: Delete File
        Default: Unknown Operation
    Delete Object Flow:
        Type: Pass
        Parameters:
            User.$: $.User
            Bucket.$: $.Bucket
            Path.$: $.Path
            Reason.$: $.Reason
            DeletionType.$: $.DeletionType
            Count.$: $.ItemQueryResult.Count
            OldItemSize.$: $.ItemQueryResult.Items[0].Size.N
            OldItemEtag.$: $.ItemQueryResult.Items[0].Etag.S
            OldItemUser.$: $.ItemQueryResult.Items[0].User.S
            OldItemPath.$: $.ItemQueryResult.Items[0].Path.S
            OldItemBucket.$: $.ItemQueryResult.Items[0].Bucket.S
        ResultPath: $
        Next: Set Delete File Diff
    Set Delete File Diff:
        Type: Pass
        Parameters:
            Size.$: States.Format('-{}',$.OldItemSize)
        ResultPath: $.Diff
        Next: Delete Item
    Delete Item:
        Type: Task
        Resource: arn:aws:states:::dynamodb:deleteItem
        Parameters:
            TableName: ${InventoryTable}
            Key:
                PK:
                    S.$: $.User
                SK:
                    S.$: $.Path
        ResultPath: null
        Next: Update User Quota
    Update Object Flow:
        Type: Pass
        Parameters:
            User.$: $.User
            Bucket.$: $.Bucket
            Path.$: $.Path
            Size.$: $.Size
            Etag.$: $.Etag
            Reason.$: $.Reason
            Count.$: $.ItemQueryResult.Count
            OldItemSize.$: $.ItemQueryResult.Items[0].Size.N
            OldItemEtag.$: $.ItemQueryResult.Items[0].Etag.S
            OldItemUser.$: $.ItemQueryResult.Items[0].User.S
            OldItemPath.$: $.ItemQueryResult.Items[0].Path.S
            OldItemBucket.$: $.ItemQueryResult.Items[0].Bucket.S
        ResultPath: $
        Next: Calculate Update File Diff
    Calculate Update File Diff:
        Type: Pass
        Parameters:
            Size.$: >-
                States.MathAdd(States.StringToJson(States.Format('{}',$.Size)),
                States.StringToJson(States.Format('-{}', $.OldItemSize)))
        ResultPath: $.Diff
        Next: Update Item
    Update Item:
        Type: Task
        Resource: arn:aws:states:::dynamodb:updateItem
        Parameters:
            TableName: ${InventoryTable}
            Key:
                PK:
                    S.$: $.User
                SK:
                    S.$: $.Path
            UpdateExpression: SET #size = :size, Etag = :etag
            ExpressionAttributeNames:
                "#size": Size
            ExpressionAttributeValues:
                ":size":
                    N.$: States.Format('{}', $.Size)
                ":etag":
                    S.$: $.Etag
        ResultPath: null
        Next: Update User Quota
    Update User Quota:
        Type: Task
        Resource: arn:aws:states:::dynamodb:updateItem
        ResultPath: null
        Parameters:
            TableName: ${InventoryTable}
            Key:
                PK:
                    S.$: $.User
                SK:
                    S: Quota
            UpdateExpression: ADD TotalSize :size
            ExpressionAttributeValues:
                ":size":
                    N.$: States.Format('{}', $.Diff.Size)
        End: true
    Unknown Operation:
        Type: Pass
        Next: The End
    Delete Flow:
        Type: Pass
        Parameters:
            User.$: States.ArrayGetItem(States.StringSplit($.object.key,'/'),0)
            Bucket.$: $.bucket.name
            Path.$: $.object.key
            Reason.$: $.reason
            DeletionType.$: $.deletion-type
        Next: Find Item
    The End:
        Type: Pass
        End: true
    New Object Flow:
        Type: Pass
        Next: Set Add File Diff
    Set Add File Diff:
        Type: Pass
        Parameters:
            Size.$: $.Size
        ResultPath: $.Diff
        Next: Add New Item
    Add New Item:
        Type: Task
        Resource: arn:aws:states:::dynamodb:putItem
        Parameters:
            TableName: ${InventoryTable}
            Item:
                PK:
                    S.$: $.User
                SK:
                    S.$: $.Path
                User:
                    S.$: $.User
                Path:
                    S.$: $.Path
                Bucket:
                    S.$: $.Bucket
                Size:
                    N.$: States.Format('{}',$.Size)
                Etag:
                    S.$: $.Etag
        ResultPath: null
        Next: Update User Quota

Patterns and best-practices used

In the design I used a couple of common patterns, let's take a look at them.

Event-bus design

Since we rely on events from S3 we must use the default event-bus for those; all AWS services send their events to the default bus. We post our application-specific events to a custom event-bus. When using EventBridge as your event broker this is best practice: leave the default bus to AWS service events and use one or more custom buses for your application events.
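As a sketch, publishing an application event to the custom bus with boto3 looks something like this (the bus name and payload are assumptions based on this setup):

import json
import boto3

events = boto3.client("events")

# Put an application event on the custom bus, leaving the default
# bus to AWS service events.
events.put_events(
    Entries=[
        {
            "EventBusName": "file-manager-eventbus",  # assumed bus name
            "Source": "FileApi",
            "DetailType": "DeleteFile",
            "Detail": json.dumps(
                {"user": "alice", "path": "documents", "name": "report.pdf"}
            ),
        }
    ]
)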

Storage-first

Always aim to use the storage-first pattern. This ensures that data and messages are not lost. You can read more about storage-first in my post Serverless patterns.

CRUD pattern

The CRUD pattern is primarily used in the Update Inventory StepFunction, where the powerful choice state lets us route different operations down different paths. This gives an easy-to-overview way of handling each operation, and it also makes debugging much easier in case of a failure.

You can read more details in my previous blog, or you can watch James Beswick talk about building Serverless Espresso at re:Invent.

Use JSON

Always create your commands and events in JSON. Why? JSON is widely supported by AWS services; SQS, SNS, and EventBridge can all apply powerful logic based on JSON-formatted data. Don't use YAML, Protobuf, XML, or any other format.
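For example, EventBridge rules can filter on fields inside the JSON payload. A hedged sketch of creating a rule with content filtering (the rule name and bus name are hypothetical):

import json
import boto3

events = boto3.client("events")

# Hypothetical rule that only matches DeleteFile commands for one user,
# using EventBridge content filtering on the JSON detail.
events.put_rule(
    Name="delete-file-for-alice",
    EventBusName="file-manager-eventbus",
    EventPattern=json.dumps(
        {
            "source": ["FileApi"],
            "detail-type": ["DeleteFile"],
            "detail": {"user": ["alice"]},
        }
    ),
)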

Final Words

This was a quick walkthrough of the service used in my blog on serverless and event-driven design thinking. The source code for the entire setup is available in my GitHub repo, Serverless File Manager.

Don't forget to follow me on LinkedIn and Twitter for more content, and read the rest of my Blogs.

Top comments (5)

Marko Djakovic

I think this is a pretty cool series of articles, thanks for sharing. I wanted to comment on a few things, specifically around Step Functions.

It is mentioned that by using StepFunctions "we can actually delete the file without writing a single line of code", while the YAML for the state machine is around 100 lines of more or less unreadable code. For the update flow, it's around 200 lines of YAML for the state machine.

And while I understand what is meant by "low code", these delete and update flows seem to me like some sort of business logic - remove the file, update the quotas, maybe something else in the future, etc. It is also said that this approach makes updating and debugging easier, however I am not convinced - because this business logic is now written in this state machine YAML code, instead of, for example, a lambda function in Python that does the same thing. With that in mind, I am wondering what the tradeoffs are regarding testability of this application? Are there any "unit tests" for the state machine that can run quickly, rather than having some integration-level tests running in the cloud every time?

Perhaps it wasn't in scope of the article, but I think it's an important consideration when deciding on the tech stack to use.

Cheers!

Jimmy Dahlqvist

Since I always aim to keep my Lambda Functions super small, I would never implement either the Update or the Delete flow in one single function. I would always use a StepFunctions state-machine to coordinate the flow. So I define no-code in this post as no Lambda Functions, instead using the built-in SDK / service integrations in StepFunctions.

From a debugging perspective, I think it's easier to debug a state-machine failure as I can find the exact step that failed and inspect its input / output parameters. If the entire logic was in a single Lambda function, I think the debugging would be harder. Now, not everyone thinks that, and it all depends on the way you work.

From a testing perspective, you can always invoke the StepFunction with different event inputs and validate the output. If you would like to unit test a single step in the state-machine, no, that is not possible. I however prefer testing the StepFunction logic as one entity. And I always prefer testing in the cloud with cloud resources running.

Now, everyone has their own way of working and what is easier or not. In the teams I have been working in we have preferred this way of working and felt it gave us better debugging abilities.

Jefferson

Nice design! You're missing a couple of common operations in a file system: moving files/folders to other locations and renaming objects.

Jimmy Dahlqvist

That is absolutely true. I should have made it clear that it's not a full solution, rather an embryo and a way to demonstrate serverless and event-driven systems.

Karishma Shukla

This is great.