Rajeshwari Vakharia
Build a Serverless File to Link Converter using AWS Services✨👩‍💻

Build your own file-to-link converter with the magic of a URL shortener🪄

While working in an Ubuntu VM, the drag-and-drop functionality was glitchy and did not work as intended, which made transferring files from Ubuntu to Windows (or vice versa) a nightmare.

So I found a solution, one I built myself, and on the cloud at that.

File to Link Converter

Yes! Rather than moving the whole file, why not convert it into a link and make sharing much easier?

Tech Stack Used

Cloud Services Used

  • API Gateway (for better API handling)

  • Lambda (for making the app serverless)

  • S3 (for storing files)

  • DynamoDB (for mapping short keys to original URLs)

Languages Used

  • Python

Architecture of the App

[Architecture diagram of the app's AWS infrastructure]

What is happening in the diagram?

The API Gateway has two endpoints, one for POST and another for GET, and both are integrated with a single Lambda function that handles both requests.

A user making a POST request

  1. When a user wants to convert a file into a link, they send it to POST /shorten, and the Lambda puts the object into S3, with the key being the filename given by the user.
  2. Then a shortKey and a pre-signed URL for that object are created, with a TimeToLive of only 5 minutes. So the user gets a total of 5 minutes to share the URL or paste it into a browser to download the file. All of this is stored in DynamoDB to create a mapping between the short key, the pre-signed URL, and the TTL.
  3. In response to the POST request, the user gets the generated shortened URL.
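To make the 5-minute window concrete, here is a minimal sketch (with hypothetical helper names, not code from the project) of how such a TTL value can be computed for the DynamoDB item:

```python
import time

TTL_SECONDS = 300  # 5 minutes, matching the pre-signed URL expiry

def make_ttl(now=None):
    """Return the epoch timestamp (in seconds) that DynamoDB's TTL feature
    expects in a Number attribute."""
    now = int(time.time()) if now is None else now
    return now + TTL_SECONDS

def is_expired(ttl, now=None):
    """True once the item's TTL has passed (DynamoDB deletes it shortly after)."""
    now = int(time.time()) if now is None else now
    return now >= ttl
```

Note that DynamoDB may take some time after expiry to actually delete the item, so the pre-signed URL's own 300-second `ExpiresIn` is what really enforces the cutoff.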

The file being downloaded after the GET request

  1. Now, with this URL, the user can share it or paste it into another browser to GET the file.
  2. This works by redirecting to the corresponding pre-signed URL, so the download starts automatically.
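For readers who want to exercise the POST endpoint from a script, here is one way to hand-assemble a multipart/form-data body with only the standard library. The field names "file" and "filename" match what the handler shown later in the post expects; everything else (function name, sample filename) is illustrative:

```python
import uuid

def build_multipart_body(filename, file_bytes, custom_name=None):
    """Assemble a multipart/form-data request body by hand (stdlib only)."""
    boundary = uuid.uuid4().hex
    parts = [
        (f'--{boundary}\r\n'
         f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
         f'Content-Type: application/octet-stream\r\n\r\n').encode(),
        file_bytes,
        b'\r\n',
    ]
    if custom_name:
        # Optional "filename" field the handler uses to rename the object
        parts.append(
            (f'--{boundary}\r\n'
             f'Content-Disposition: form-data; name="filename"\r\n\r\n'
             f'{custom_name}\r\n').encode())
    parts.append(f'--{boundary}--\r\n'.encode())
    return f'multipart/form-data; boundary={boundary}', b''.join(parts)

content_type, body = build_multipart_body('notes.txt', b'hello world',
                                          custom_name='my_notes')
```

The returned `content_type` goes in the request's Content-Type header, and `body` is the raw POST payload.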

Setting up the Infrastructure

  • Here is the complete Terraform code to automate the setup without manual intervention.
#create a bucket resource
resource "aws_s3_bucket" "s3bucket" {
  bucket = "your-bucket-name"
}

#lambda iam role
resource "aws_iam_role" "lambda_exec_role" {
    name = "lambda_exec_role"

    assume_role_policy = jsonencode({
      Version = "2012-10-17",
      Statement = [{
      Action    = "sts:AssumeRole",
      Effect    = "Allow",
      Principal = {
        Service = "lambda.amazonaws.com"
      }
    }]
  })
}

#give that iam role permissions to access dynamodb and s3
resource "aws_iam_policy" "lambda_policy" {
  name = "lambda_policy"

  policy = jsonencode({
    Version = "2012-10-17",
    Statement = [
      {
        Effect = "Allow",
        Action = ["logs:*"],
        Resource = "arn:aws:logs:*:*:*",
      },
      {
        Effect = "Allow",
        Action = ["s3:GetObject", "s3:PutObject"],
        Resource = "${aws_s3_bucket.s3bucket.arn}/*",
      },
      {
        Effect = "Allow",
        Action = ["dynamodb:PutItem", "dynamodb:GetItem"],
        Resource = aws_dynamodb_table.db.arn,
      },
    ]
  })
}

resource "aws_iam_role_policy_attachment" "lambda_policy" {
  role = aws_iam_role.lambda_exec_role.name
  policy_arn = aws_iam_policy.lambda_policy.arn

}
#create a zip file of the lambda code
data "archive_file" "zip_file_lambda" {
  type = "zip"
  source_file = "/path/to/lambda/lambda_function.py"
  output_path = "/path/to/lambda/function.zip"
}

#create lambda function
resource "aws_lambda_function" "lambda_func" {
  function_name = "MyProjectFunction"
  role          = aws_iam_role.lambda_exec_role.arn
  handler       = "lambda_function.lambda_handler" 
  runtime       = "python3.9"                         
  filename         = "/path/to/lambda/function.zip"
  source_code_hash = filebase64sha256("/path/to/lambda/function.zip")

  environment {
    variables = {
      TABLE_NAME = aws_dynamodb_table.db.name
      API_ID = aws_api_gateway_rest_api.rest_api.id
      BUCKET_NAME = aws_s3_bucket.s3bucket.bucket
    }
  }
}

#creating a rest api
resource "aws_api_gateway_rest_api" "rest_api" {
  name = "myprojectAPI"
  description = "MyProject Rest API"
  binary_media_types = ["multipart/form-data"]
}

#creating the resource for shorten to post files
resource "aws_api_gateway_resource" "shorten" {
  rest_api_id = aws_api_gateway_rest_api.rest_api.id
  parent_id = aws_api_gateway_rest_api.rest_api.root_resource_id
  path_part = "shorten"
}

#creating a method for the /shorten endpoint
resource "aws_api_gateway_method" "shorten_method" {
  rest_api_id = aws_api_gateway_rest_api.rest_api.id
  resource_id =  aws_api_gateway_resource.shorten.id
  http_method = "POST"
  authorization = "NONE"

  request_parameters = {
    "method.request.header.Content-Type" = true
    "method.request.header.Accept" = true
  }
}

#creating integration
resource "aws_api_gateway_integration" "integration_shorten" {
  rest_api_id = aws_api_gateway_rest_api.rest_api.id
  resource_id =  aws_api_gateway_resource.shorten.id
  http_method = aws_api_gateway_method.shorten_method.http_method
  type = "AWS_PROXY"
  integration_http_method = "POST"
  uri = aws_lambda_function.lambda_func.invoke_arn

  request_parameters = {
    "integration.request.header.Content-Type" = "method.request.header.Content-Type"
    "integration.request.header.Accept" = "method.request.header.Accept"
  }
}

resource "aws_api_gateway_resource" "short" {
  rest_api_id = aws_api_gateway_rest_api.rest_api.id
  parent_id = aws_api_gateway_rest_api.rest_api.root_resource_id
  path_part = "short"
}

#alloting method to the path
resource "aws_api_gateway_method" "method" {
  rest_api_id = aws_api_gateway_rest_api.rest_api.id
  resource_id = aws_api_gateway_resource.short.id
  http_method = "GET"
  authorization = "NONE"
}

#integrating the created api with lambda function
resource "aws_api_gateway_integration" "integration_short" {
  rest_api_id = aws_api_gateway_rest_api.rest_api.id
  resource_id = aws_api_gateway_resource.short.id
  http_method = aws_api_gateway_method.method.http_method
  type = "AWS_PROXY"
  integration_http_method = "POST"
  uri = aws_lambda_function.lambda_func.invoke_arn
}

#make a deployment
resource "aws_api_gateway_deployment" "api_deployment" {
  depends_on = [ 
    aws_api_gateway_integration.integration_shorten,
    aws_api_gateway_integration.integration_short,
   ]
   rest_api_id = aws_api_gateway_rest_api.rest_api.id  
}

#create a stage
resource "aws_api_gateway_stage" "stage" {
  stage_name = "project"
  rest_api_id = aws_api_gateway_rest_api.rest_api.id
  deployment_id = aws_api_gateway_deployment.api_deployment.id

}

#permission for lambda for apigw
resource "aws_lambda_permission" "api_gw" {
  statement_id  = "AllowExecutionFromAPIGateway"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.lambda_func.function_name
  principal     = "apigateway.amazonaws.com"
  source_arn    = "${aws_api_gateway_rest_api.rest_api.execution_arn}/*/*"
}

#create a dynamodb table with ttl enabled to store mapping of short code with the originalURL
resource "aws_dynamodb_table" "db" {
    name = "URLShortner"
    billing_mode = "PAY_PER_REQUEST"
    hash_key = "shortKey"
    attribute {
      name = "shortKey"
      type = "S"
    }

    ttl {
      attribute_name = "TimeToExist"
      enabled = true
    }
}

output "myprojectlink" {
  value = "https://${aws_api_gateway_rest_api.rest_api.id}.execute-api.us-east-1.amazonaws.com/${aws_api_gateway_stage.stage.stage_name}/shorten"

}

Let us go through it resource by resource.

Resource 1: Creating an S3 Bucket

First, we named and created an S3 bucket where the actual files will be stored. Pre-signed URLs will be generated for these objects, and those URLs let the files be downloaded automatically.

Resource 2: Creating a Lambda function

Next, we need to create an AWS Lambda function that puts objects into S3, creates the pre-signed URL, generates a short key, builds the shortened URL, and finally redirects to the pre-signed URL.

Steps:

#lambda iam role
resource "aws_iam_role" "lambda_exec_role" {
    name = "lambda_exec_role"

    assume_role_policy = jsonencode({
      Version = "2012-10-17",
      Statement = [{
      Action    = "sts:AssumeRole",
      Effect    = "Allow",
      Principal = {
        Service = "lambda.amazonaws.com"
      }
    }]
  })
}

#give that iam role permissions to access dynamodb and s3
resource "aws_iam_policy" "lambda_policy" {
  name = "lambda_policy"

  policy = jsonencode({
    Version = "2012-10-17",
    Statement = [
      {
        Effect = "Allow",
        Action = ["logs:*"],
        Resource = "arn:aws:logs:*:*:*",
      },
      {
        Effect = "Allow",
        Action = ["s3:GetObject", "s3:PutObject"],
        Resource = "${aws_s3_bucket.s3bucket.arn}/*",
      },
      {
        Effect = "Allow",
        Action = ["dynamodb:PutItem", "dynamodb:GetItem"],
        Resource = aws_dynamodb_table.db.arn,
      },
    ]
  })
}

resource "aws_iam_role_policy_attachment" "lambda_policy" {
  role = aws_iam_role.lambda_exec_role.name
  policy_arn = aws_iam_policy.lambda_policy.arn

}

The above code creates a role for Lambda named lambda_exec_role and a policy granting it access to S3, DynamoDB, and CloudWatch Logs.
This policy is then attached to the Lambda role using its ARN.

Creating the Lambda Function

#create a zip file of the lambda code
data "archive_file" "zip_file_lambda" {
  type = "zip"
  source_file = "/path/to/lambda/lambda_function.py"
  output_path = "/path/to/lambda/function.zip"
}

#create lambda function
resource "aws_lambda_function" "lambda_func" {
  function_name = "MyProjectFunction"
  role          = aws_iam_role.lambda_exec_role.arn
  handler       = "lambda_function.lambda_handler" 
  runtime       = "python3.9"                         
  filename         = "/path/to/lambda/function.zip"
  source_code_hash = filebase64sha256("/path/to/lambda/function.zip")

  environment {
    variables = {
      TABLE_NAME = aws_dynamodb_table.db.name
      API_ID = aws_api_gateway_rest_api.rest_api.id
      BUCKET_NAME = aws_s3_bucket.s3bucket.bucket
    }
  }
}
  • Here, we first create a zip file of the Lambda code.

  • Then we create the Lambda function, specifying the handler ({filename_of_lambda}.{main_entry_point_in_the_code}), the runtime python3.9, the role we just created above, and the zip file.

  • One of the best Terraform features I've found is the environment attribute. Since Lambda runs inside a container, I used this block to declare environment variables in that container so that I can use those values without hardcoding them in the code.
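As a sketch of that pattern, the hypothetical helper below (not part of the project code) reads the variables the Terraform environment block injects and fails fast if any is missing, which makes misconfiguration obvious at cold start:

```python
import os

REQUIRED = ('BUCKET_NAME', 'TABLE_NAME', 'API_ID')

def load_config(env=None):
    """Read the variables injected by Terraform's environment block.

    Raises immediately if any variable is missing, rather than failing
    later on the first S3 or DynamoDB call.
    """
    env = os.environ if env is None else env
    missing = [k for k in REQUIRED if not env.get(k)]
    if missing:
        raise RuntimeError(f"missing environment variables: {missing}")
    return {k: env[k] for k in REQUIRED}
```

Passing a dict instead of `os.environ` also makes the function easy to unit-test locally.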

Also, here's the Python code for the Lambda that we created, so now let's go through it:

import json
import boto3
import os
import io
import base64
import random
import string
import time
import re
import cgi  # deprecated since Python 3.11 (removed in 3.13); fine on the python3.9 runtime used here

s3_client = boto3.client('s3')
dynamodb = boto3.client('dynamodb')

BUCKET_NAME = os.environ.get('BUCKET_NAME')
TABLE_NAME = os.environ.get('TABLE_NAME')
API_ID = os.environ.get('API_ID')

def generate_short_key(length=6):
    return ''.join(random.choices(string.ascii_letters + string.digits, k=length))

def sanitize_filename(name):
    return re.sub(r'[^a-zA-Z0-9._-]', '_', name)

def upload_file_to_s3(file_content, key):
    s3_client.put_object(
        Bucket=BUCKET_NAME,
        Key=key,
        Body=file_content
    )

def create_presigned_url(key):
    return s3_client.generate_presigned_url(
        'get_object',
        Params={'Bucket': BUCKET_NAME, 'Key': key},
        ExpiresIn=300
    )

def store_url_in_dynamodb(short_key, url):
    ttl = int(time.time()) + 300
    dynamodb.put_item(
        TableName=TABLE_NAME,
        Item={
            'shortKey': {'S': short_key},
            'originalUrl': {'S': url},
            'TimeToExist': {'N': str(ttl)}
        }
    )

def get_url_from_dynamodb(short_key):
    response = dynamodb.get_item(
        TableName=TABLE_NAME,
        Key={'shortKey': {'S': short_key}}
    )
    if 'Item' not in response:
        raise Exception("Invalid Key")
    return response['Item']['originalUrl']['S']

def lambda_handler(event, context):
    method = event.get("httpMethod")

    if method == "POST":
        try:
            # Decode base64-encoded body (multipart/form-data comes in like this)
            content_type = event["headers"].get("Content-Type") or event["headers"].get("content-type")
            body = base64.b64decode(event["body"])

            # Use cgi to parse multipart data
            environ = {'REQUEST_METHOD': 'POST'}
            headers = {'content-type': content_type}
            fs = cgi.FieldStorage(fp=io.BytesIO(body), environ=environ, headers=headers)

            # Get the file
            file_field = fs['file']
            file_bytes = file_field.file.read()

            # Get custom filename
            custom_name = fs.getvalue('filename')
            filename = sanitize_filename(custom_name) if custom_name else file_field.filename

            # Detect if it's a binary file (e.g., based on extension)
            binary_extensions = ['.pdf', '.jpg', '.jpeg', '.png', '.xlsx']
            is_binary = any(filename.endswith(ext) for ext in binary_extensions)

            content = file_bytes if is_binary else file_bytes.decode("utf-8")

            upload_file_to_s3(content, filename)
            presigned_url = create_presigned_url(filename)
            short_key = generate_short_key()
            store_url_in_dynamodb(short_key, presigned_url)

            short_url = f"https://{API_ID}.execute-api.us-east-1.amazonaws.com/project/short?shortKey={short_key}"

            return {
                "statusCode": 200,
                "headers": {"Content-Type": "application/json"},
                "body": json.dumps({"msg": short_url})
            }

        except Exception as e:
            return {
                "statusCode": 500,
                "headers": {"Content-Type": "application/json"},
                "body": json.dumps({"msg": "Error: " + str(e)})
            }

    elif method == "GET":
        try:
            # queryStringParameters is None (not {}) when no query string is sent
            short_key = (event.get("queryStringParameters") or {}).get("shortKey")
            original_url = get_url_from_dynamodb(short_key)

            return {
                "statusCode": 301,
                "headers": {'Location': original_url},
            }

        except Exception as e:
            return {
                "statusCode": 500,
                "headers": {"Content-Type": "application/json"},
                "body": json.dumps({"msg": "Error: " + str(e)})
            }

    else:
        return {
            "statusCode": 405,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps({"msg": "Method not allowed"})
        }

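The sanitize_filename helper above can be exercised locally without any AWS setup; it replaces every character outside [a-zA-Z0-9._-] with an underscore, which keeps S3 keys and URLs well-behaved:

```python
import re

def sanitize_filename(name):
    # Same logic as in the handler above
    return re.sub(r'[^a-zA-Z0-9._-]', '_', name)

print(sanitize_filename("my report (final).pdf"))  # my_report__final_.pdf
```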

Resource 3: Configuring the API Gateway

#creating a rest api
resource "aws_api_gateway_rest_api" "rest_api" {
  name = "myprojectAPI"
  description = "MyProject Rest API"
  binary_media_types = ["multipart/form-data"]
}

#creating the resource for shorten to post files
resource "aws_api_gateway_resource" "shorten" {
  rest_api_id = aws_api_gateway_rest_api.rest_api.id
  parent_id = aws_api_gateway_rest_api.rest_api.root_resource_id
  path_part = "shorten"
}

#creating a method for the /shorten endpoint
resource "aws_api_gateway_method" "shorten_method" {
  rest_api_id = aws_api_gateway_rest_api.rest_api.id
  resource_id =  aws_api_gateway_resource.shorten.id
  http_method = "POST"
  authorization = "NONE"

  request_parameters = {
    "method.request.header.Content-Type" = true
    "method.request.header.Accept" = true
  }
}

#creating integration
resource "aws_api_gateway_integration" "integration_shorten" {
  rest_api_id = aws_api_gateway_rest_api.rest_api.id
  resource_id =  aws_api_gateway_resource.shorten.id
  http_method = aws_api_gateway_method.shorten_method.http_method
  type = "AWS_PROXY"
  integration_http_method = "POST"
  uri = aws_lambda_function.lambda_func.invoke_arn

  request_parameters = {
    "integration.request.header.Content-Type" = "method.request.header.Content-Type"
    "integration.request.header.Accept" = "method.request.header.Accept"
  }
}

resource "aws_api_gateway_resource" "short" {
  rest_api_id = aws_api_gateway_rest_api.rest_api.id
  parent_id = aws_api_gateway_rest_api.rest_api.root_resource_id
  path_part = "short"
}

#alloting method to the path
resource "aws_api_gateway_method" "method" {
  rest_api_id = aws_api_gateway_rest_api.rest_api.id
  resource_id = aws_api_gateway_resource.short.id
  http_method = "GET"
  authorization = "NONE"
}

#integrating the created api with lambda function
resource "aws_api_gateway_integration" "integration_short" {
  rest_api_id = aws_api_gateway_rest_api.rest_api.id
  resource_id = aws_api_gateway_resource.short.id
  http_method = aws_api_gateway_method.method.http_method
  type = "AWS_PROXY"
  integration_http_method = "POST"
  uri = aws_lambda_function.lambda_func.invoke_arn
}

#make a deployment
resource "aws_api_gateway_deployment" "api_deployment" {
  depends_on = [ 
    aws_api_gateway_integration.integration_shorten,
    aws_api_gateway_integration.integration_short,
   ]
   rest_api_id = aws_api_gateway_rest_api.rest_api.id  
}

#create a stage
resource "aws_api_gateway_stage" "stage" {
  stage_name = "project"
  rest_api_id = aws_api_gateway_rest_api.rest_api.id
  deployment_id = aws_api_gateway_deployment.api_deployment.id

}

Here, we created a REST API.

  • As my app transfers file contents using multipart/form-data, I configured the API to accept it via binary_media_types; this is a mandatory step.

  • I created 2 endpoints:
    POST /shorten: This route handles putting objects into S3 and creating the short URLs.
    GET /short?shortKey={shortKey}: This route redirects users to the original URL based on the short key provided.

  • Integrate routes with Lambda:
    For each route, select the Lambda function that we created.
    API Gateway will automatically link the function to the routes.

  • Deploy the API:
    After setting up the routes, deploy the API together with its Lambda integrations.

  • And finally, attach everything to a stage named project.
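The shortened URL the handler returns follows the standard execute-api URL format. A small sketch (hypothetical helper; the region is hardcoded to us-east-1 elsewhere in the post) shows how the pieces fit together:

```python
def short_url(api_id, stage, short_key, region='us-east-1'):
    # Mirrors the URL format the Lambda handler builds for the GET endpoint
    return (f"https://{api_id}.execute-api.{region}.amazonaws.com"
            f"/{stage}/short?shortKey={short_key}")

print(short_url('abc123', 'project', 'Xy12Ab'))
```

Hitting that URL triggers the GET /short route, which looks up the short key in DynamoDB and 301-redirects to the pre-signed S3 URL.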

Resource 4: Creating a DB in DynamoDB to store the mapping.

#create a dynamodb table with ttl enabled to store mapping of short code with the originalURL
resource "aws_dynamodb_table" "db" {
    name = "URLShortner"
    billing_mode = "PAY_PER_REQUEST"
    hash_key = "shortKey"
    attribute {
      name = "shortKey"
      type = "S"
    }

    ttl {
      attribute_name = "TimeToExist"
      enabled = true
    }
}

This was really easy to set up. All you need to do is:

  • Create a DynamoDB table named URLShortner with billing mode PAY_PER_REQUEST. If you use PROVISIONED mode instead, be careful: it starts billing you as soon as the table is created.

  • Then I made shortKey the hash key (the partition/primary key), as it has to be unique for each entry.

  • I also enabled TTL, since in my app the window to use the URL and download the file is only 5 minutes.
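A quick sanity check on shortKey uniqueness: with 6 characters drawn from letters and digits, the keyspace is large, though random keys can still collide. A purely illustrative birthday-bound estimate:

```python
import string

ALPHABET = string.ascii_letters + string.digits  # 62 characters, as in the handler

def keyspace(length=6):
    """Number of distinct short keys of the given length."""
    return len(ALPHABET) ** length

def collision_probability(n_keys, length=6):
    """Birthday-bound approximation: p ≈ n² / (2 · keyspace)."""
    return n_keys ** 2 / (2 * keyspace(length))

print(keyspace())                     # 56800235584 (~56.8 billion)
print(collision_probability(10_000))  # roughly 0.00088
```

For a personal file-sharing tool this risk is negligible, but a production URL shortener would check for an existing key before writing (e.g. a conditional PutItem).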

Resource 5: Generating the Output

output "myprojectlink" {
  value = "https://${aws_api_gateway_rest_api.rest_api.id}.execute-api.us-east-1.amazonaws.com/${aws_api_gateway_stage.stage.stage_name}/shorten" 
}

In the end, it's always a good choice to output the API link so that it's easy to use.

Drawbacks

  • One major drawback of this project is that the setup is a monolith: the POST handler does everything, from putting the object into the bucket to generating the short key, creating the pre-signed URL, and finally writing it all to DynamoDB. A better architecture would make these services loosely coupled to each other.

  • It also generates a different short key every time, even for the same file, so identical uploads are not deduplicated.

  • The S3 object stays around even after its URL mapping in DynamoDB expires.

  • And of course, a custom domain name is needed, as the raw API Gateway URL can't be used in a real-life application.
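One way to address the duplicate-key drawback (a sketch only, not what the post implements): derive the short key from the file's content, so identical uploads always map to the same key and the existing mapping can be reused:

```python
import hashlib

def content_key(file_bytes, length=12):
    """Deterministic short key derived from file content.

    The same bytes always yield the same key, so re-uploading a file
    can reuse its existing S3 object and DynamoDB mapping.
    """
    return hashlib.sha256(file_bytes).hexdigest()[:length]

print(content_key(b'hello world'))
```

The trade-off is that content-derived keys are longer than 6 random characters and reveal when two users have uploaded the same file.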

Thanks for reading till here.🙌
Leave down your thoughts in the comment to discuss how to improve this app more.⭐
