DEV Community

David Israel for Uclusion

Posted on • Edited on

Uploading user files to S3 without passing through your Lambdas

Uclusion’s back end is built entirely on AWS Lambda, which means we don’t have big beefy servers waiting around to service file upload requests. This presents a few challenges,

  1. User uploads can go slow, so we can easily hit lambda timeouts

  2. The API gateway in front of the lambda has a relatively low payload limit in the 10s of MBs.

The obvious solution to that problem is to have the users upload to S3 directly, ideally in a way that doesn’t open the S3 bucket up to world writes. I’m going to cover how to do that.

Step 1: Give a Lambda Put Permissions on your Bucket

If you don’t already have a Lambda with permission to put files in bucket you’ll need to create an IAM role granting that privilege to the Lambda. Here’s an example one written in the Serverless Framework’s YAML format:

- Effect: "Allow"
  Action:
    - "s3:PutObject"
  Resource: arn:aws:s3:::${your_bucket_name}/
Enter fullscreen mode Exit fullscreen mode

Step 2: Generate a Presigned POST

Now that you have a Lambda with write permissions to the bucket, with the S3 SDK you can have that lambda generate a URL for a POST and field structure that lets that lambda share it’s write access for a *specific path in the bucket, subject to certain conditions. *This is where the first part of your security comes into play, because this is where you can do the following

  1. Authenticate your user. No access to the Post Lambda via the gateway means’ they are not uploading to your bucket

  2. Capture metadata about the upload BEFORE they send it, and make decisions on whether to grant the upload. For example, Uclusion requires that clients tell the back end what the Content Type and Content Length are before allowing the upload

  3. Structure your paths in a way that clients don’t step on each other. Each call to the back end for a post URL should generate a new unique path, and return that path in the URL. You may even chose to encode information in that path to help with security decisions later on.

Here’s the code that we use to generate the a post back to the user, however the data_dictionary is compared against business rules in other code, so I’m not presenting the code that validates the content length and type

content_length = data['content_length']
content_type = data['content_type']
file_path = str(uuid.uuid4()) #this is pretty much guaranteed unique
# more bits of entropy, and now I can find user's files later
used_id = get_user_id() # a UUID itself
path = used_id + '/' + file_path
return create_post(path, content_type, content_length, bucket)

def create_post(path, content_type, content_length, bucket):
    expiration = 60 * 60 * 24  *# Upload must occur within 24 hours
    *conditions = [
        ['content-length-range', content_length, content_length],
        {'Content-Type': content_type}
    ]
    post = s3_client.generate_presigned_post(bucket,
                                             path,
                                             Fields={"Content-Type": content_type},
                                             Conditions=conditions,
                                             ExpiresIn=expiration)
    return {
        'metadata': {
            'path': path,
            'content_length': content_length,
            'content_type': content_type
        },
        'presigned_post': post
    }
Enter fullscreen mode Exit fullscreen mode

You can see from the above conditions, that we only let them upload files of the EXACT size the tell us, make them set the content type exactly the same as they told me, and only let the returned signed POST be valid for 24 hours.

Step 3: Write the front end code to do the upload.

You’ve now created a back end that generates a presigned POST Body, and returns it in a packet. So how do you use it? That requires a little bit of trickery on the front end, as you have to build a multipart POST grammatically. Here’s how we do it:

// get the presigned post packet from the backend
.then((client) => backendApi.getFileUploadData(type, size))
.then((data) => {
  const { metadata, presigned_post } = data;
  const { url, fields } = presigned_post;
  // load up the fields and file data into the post body
  const body = new FormData();
  for (const [field, value] of Object.entries(fields)) {
    body.append(field, value);
  }
  // aws ignores all fields after the file field, so the data has to be last
  body.append('file', file);
  const fetchParams = { method: 'POST', body };
  return fetch(url, fetchParams)
    .then(() => metadata); //return metadata from the backend
*})
Enter fullscreen mode Exit fullscreen mode

That’s it, your users can now securely upload their files to your bucket without having access themselves. In the next post I will cover how to securely let your users READ files from S3 without having to front S3 with any Lambdas or servers.

Image of AssemblyAI

Automatic Speech Recognition with AssemblyAI

Experience near-human accuracy, low-latency performance, and advanced Speech AI capabilities with AssemblyAI's Speech-to-Text API. Sign up today and get $50 in API credit. No credit card required.

Try the API

Top comments (0)

👋 Kindness is contagious

Explore a sea of insights with this enlightening post, highly esteemed within the nurturing DEV Community. Coders of all stripes are invited to participate and contribute to our shared knowledge.

Expressing gratitude with a simple "thank you" can make a big impact. Leave your thanks in the comments!

On DEV, exchanging ideas smooths our way and strengthens our community bonds. Found this useful? A quick note of thanks to the author can mean a lot.

Okay