Note: This article was authored by Michael Lavers
As you've undoubtedly heard, serverless has grown by leaps and bounds since the release of AWS Lambda in November 2014. And while the idea of not having to manage a server, paying only for the compute you actually use, and getting horizontal scaling out of the box may sound really appealing, where do you start? As a Python developer, maybe you've heard of the Serverless Framework or Zappa, and either the prospect of working with an unfamiliar ecosystem like Node.js (in Serverless' case) is daunting, or you're unclear on what these tools actually do and whether they're overkill when you're just getting started. If that's the case, you've come to the right place. In this blog post we'll take a brief tour of serverless as it applies to Python, along with some handy tips and tricks to help you get your feet wet.
Is my app/workload right for serverless?
This is probably the most important question you need to ask about serverless. While serverless has a lot to offer, it isn't ideal for all apps/workloads. For example, the current maximum duration for an AWS Lambda function invocation is five minutes. This means that if your app/workload can't be broken up into five-minute chunks, serverless might not be the best option. If your app uses websockets, for example, which maintain a persistent connection with the server, on AWS Lambda that connection would be closed and need to be reestablished every five minutes. It would also mean you're paying for all that compute time for a workload that is really just keeping a socket open. But if your app is mostly stateless, like a REST or GraphQL API, then serverless may be a great fit, as HTTP requests rarely take longer than 30 seconds, let alone five minutes.
Where serverless really shines is workloads with spikes and/or periods of low activity. If you think about it, that description likely applies to most apps/workloads you've worked on. When your app/workload spikes, AWS Lambda is there to provide considerable horizontal scaling, handling up to 1,000 concurrent executions out of the box (a limit that can be increased). And when your app/workload is in a low-activity period, your meter isn't running at full tilt, which can save you a lot on operating expenses. Think about it: most apps/workloads serve a range of timezones, so why should you pay full price to run your app/workload while your customers are asleep?
If you're still not sure whether or not your app/workload is a good fit, here's a handy calculator to compare AWS Lambda to EC2.
Python 2 or 3?
The Python ecosystem has gone through a lot of changes over the past decade -- the most significant being the release of Python 3 and the transition of many codebases from Python 2.x to 3.x. In the serverless world, the advice here is consistent with any other Python project: use Python 3.6 if it's a new project. While Python 2.7 has served many of us well, it will cease to receive updates after 2020. So if you haven't already started your transition to 3.6, you have about two years to do so.
If you have an existing Python 2.7 project, you're in luck, as AWS Lambda supports 2.7, but you should seriously consider starting to port your code to Python 3.6 soon. Here's a super helpful cheat sheet on how to write Python 2 and 3 compatible code. And like that cheat sheet, the advice here will be 2.7/3.6 compatible.
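For a taste of what the cheat sheet encourages, here's a small sketch of code that runs unchanged on both 2.7 and 3.6 (the function and names are just for illustration):

```python
# __future__ imports make Python 2.7 behave like 3.x for print and strings
from __future__ import print_function, unicode_literals

def greet(name):
    # str.format works identically on 2.7 and 3.6
    print("Hello, {}!".format(name))

greet("serverless")
```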
Web API or Worker?
Before we go over the serverless tools available for Python, let's drill down a little more into our app/workload. If you have a web app that serves a front-end with several web assets (HTML, JavaScript, CSS, images, etc.), you shouldn't be serving those assets via your AWS Lambda function. That's not to say that you can't, just that you shouldn't. Remember that with AWS Lambda you pay for the time your function is running, and it doesn't make much sense to spend that time serving up web assets. In fact, since your front-end likely has many web assets, this could be a relatively expensive way of doing what is otherwise a simple task. For serving web assets, you should look at a CDN (Content Delivery Network). AWS has a service built specifically for this purpose called CloudFront, and here's a guide on how to use it with S3.
But that really only covers your web app's front-end. Or maybe your app/workload doesn't have a front-end at all. We're going to break these workloads into two categories: web APIs (REST, GraphQL, etc.) and workers. You're likely already familiar with the web API variety, and AWS Lambda allows us to serve them using a service called API Gateway (more on that later). The other category, what I'm going to call "workers," covers all the other tasks your app might need to do: sending email, uploading files, push notifications, etc. Serverless can handle these workloads too, and even offers its own cron-like service for scheduled tasks.
The reason I'm defining these two categories is that while these workloads may look similar in serverless (or indistinguishable, as they're both essentially functions), some tools are better suited to, or designed for, one over the other. In this guide, I'll point out where that's the case. Hopefully you're already thinking about which parts of your app will be served via a web API and which parts can be worker tasks run in the background, so you can pick the right tool for the job. For a taste of the worker category, see the sketch below.
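Here's a hedged sketch of what a worker might look like: a function that sends an email with Amazon SES via boto3. The addresses and event fields are hypothetical, and we'll cover the handler signature and deployment shortly:

```python
import boto3

# Hypothetical sender address; SES requires a verified identity in your account
SENDER = "noreply@example.com"

def handler(event, context):
    ses = boto3.client("ses")
    # The event fields ("to", "subject", "body") are made up for this sketch
    ses.send_email(
        Source=SENDER,
        Destination={"ToAddresses": [event["to"]]},
        Message={
            "Subject": {"Data": event["subject"]},
            "Body": {"Text": {"Data": event["body"]}},
        },
    )
    return {"status": "sent"}
```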
Serverless Framework
The big kahuna in the serverless space is undoubtedly the Serverless Framework, and for good reason. The team behind it has poured considerable time and effort into developer experience to make it one of the most intuitive and accessible serverless tools out there. It also offers a feature set that is second to none, supporting multiple clouds in addition to AWS Lambda and a growing plugin ecosystem. For a Python developer, this would seem like the obvious starting point and it certainly is a great one.
But the Serverless Framework is not without some Python-specific caveats. For starters, Serverless is written in Node.js, which may not be your cup of tea. But like any good tool, if done right, you shouldn't even notice what language it's implemented in, and the case could certainly be made here for Serverless. That said, you'll still need to install Node and npm, but after that you're only an `npm install -g serverless` away from having the Serverless CLI as your primary interface. You won't need to know any JavaScript, although your configs will be YAML by default.
Let's give it a try.
```bash
npm install -g serverless
```
The CLI can be accessed using either `serverless` or the shorthand `sls`. I prefer `sls`, so let's create a project:
```bash
mkdir ~/my-serverless-project
cd ~/my-serverless-project
sls create -n my-serverless-project -t aws-python3
```
Here I've created a directory called `my-serverless-project` and created a project using `sls create`. I'm also specifying a template with `-t aws-python3`. Serverless comes bundled with several templates that set sensible defaults in the `serverless.yml` created when you run `sls create`. In this case, I'm specifying the AWS template for Python 3.6. There's also an `aws-python2` template if your project is Python 2.7. There are other templates for other languages and clouds, but those are outside the scope of this guide. The `-n my-serverless-project` flag specifies a service name; you can change this to whatever you want to name your project. Once you run these commands, you should have a `serverless.yml` in your `my-serverless-project` directory. Let's take a look at its contents:
```bash
cat serverless.yml
```
As you'll see, the `serverless.yml` is YAML and comes loaded with several helpful comments explaining each section of the config. At this point I recommend reading through those comments, as they'll be helpful later on. Once done, let's write the hello world equivalent of a lambda function:
```python
def handler(event, context):
    return {"message": "hi there"}
```
Save that in your `my-serverless-project` directory as `hello.py`. The name `handler` is just what lambda functions are commonly called; you can name your functions whatever you like. Now that we have a function, let's make Serverless aware of it by adding it to the `serverless.yml`. First open up your `serverless.yml` and look for the `functions` section. Then replace that section with the following:
```yaml
functions:
  hello:
    handler: hello.handler
```
Now save your `serverless.yml`. To deploy this function, we'll first need to make sure we have our AWS credentials configured.
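If you haven't set up credentials before, one approach (a sketch; swap in your own IAM key pair for the placeholders) is to let the Serverless CLI store them for you:

```bash
# Store an AWS key pair for Serverless to use (values are placeholders)
sls config credentials --provider aws --key YOUR_AWS_KEY --secret YOUR_AWS_SECRET
```

With credentials in place, we can deploy: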
```bash
sls deploy
```
Deployment will take a little while. In short, here are the steps Serverless is performing:

- Creating a CloudFormation template based on the `serverless.yml`
- Compressing the CloudFormation template and your `hello.py` into a zip archive
- Creating an S3 bucket and uploading the zip archive to it
- Executing the CloudFormation template, which includes configuring an AWS Lambda function and pointing it at the S3 zip archive
You could do all of these steps manually, but why would you want to? When the deploy is complete, we can test the function with the following command:
```bash
sls invoke -f hello
```
Here `-f hello` is the name of the function we specified in our `serverless.yml` file. The `sls invoke` command invokes the function ("invocation" is just a fancy word for calling the function). When we invoke this function, it should respond with:
{"message": "hi there"}
If that's what you see, congratulations: you've just created your first AWS Lambda function! But this one isn't terribly useful. What if we wanted to do something a little more challenging, like making an HTTP request and returning the result? Let's create a new file called `httprequest.py` and add the following:
```python
import requests

def handler(event, context):
    r = requests.get("https://news.ycombinator.com/news")
    # Use r.text (a str) rather than r.content (bytes on Python 3),
    # since the return value must be JSON serializable
    return {"content": r.text}
```
Update the `functions` section of your `serverless.yml`:
```yaml
functions:
  hello:
    handler: hello.handler
  httprequest:
    handler: httprequest.handler
```
And re-deploy and invoke our new function:
```bash
sls deploy
sls invoke -f httprequest
```
You should now see an `ImportError`. This is because `requests` is not installed. With AWS Lambda, we need to bundle any libraries we want to use with our function. We can do this a number of ways; with `pip` we could run:
```bash
# Don't run this just yet, keep reading
pip install requests -t .
```
This would install the `requests` wheel (and its dependencies) in our `my-serverless-project` directory. But there's a catch: AWS Lambda runs on 64-bit Linux. Because `requests` comes with dependencies that are compiled, if you're not also running 64-bit Linux, the copy of `requests` you just installed won't run on AWS Lambda. So what do you do if you're running macOS? Or Windows? Or FreeBSD? Or, well, you get the picture.
Serverless plugins (and Docker) to the rescue.
Thankfully, Serverless comes with a plugin ecosystem to fill in the gaps, and for Python, packages with compiled code happen to be one of those gaps. The relevant plugin here is aptly named serverless-python-requirements. To install it:
```bash
npm install -g serverless-python-requirements
```
And add the following lines to the end of your `serverless.yml`:
```yaml
plugins:
  - serverless-python-requirements
```
Right away, this plugin enables `requirements.txt` support. So instead of the `pip` command above, we just need to add a `requirements.txt` file to our project directory, and the next time we run `sls deploy` the requirements will be installed and bundled automatically. Let's do that now:
echo "requests" >> requirements.txt
But we haven't solved our compilation problem yet. To do that, we'll need to add a `custom` section to our `serverless.yml`. This section is where we can put our own configuration options, and it's also where plugins look for their config options. Our new `custom` section should look like this:
```yaml
custom:
  pythonRequirements:
    dockerizePip: true
```
This tells the `serverless-python-requirements` plugin to compile the Python packages in a Docker container before bundling them into the zip archive, ensuring they're compiled for 64-bit Linux. You'll also need to install Docker for this to work, but once that's done, the plugin will handle the dependencies you define in your `requirements.txt` file automatically. Pretty cool, huh?
Now let's re-deploy and re-invoke our function:
```bash
sls deploy
sls invoke -f httprequest
```
Even if you are running 64-bit Linux, this is way cleaner, don't you think?
Now before we continue, you likely have a few questions. What are `event` and `context`, and how is any of this useful to me?
AWS Lambda functions are event-driven, so when you invoke a function, you're actually triggering an event within AWS Lambda. The first argument of your lambda function contains the event that triggered it, which is represented within AWS Lambda as a JSON object, but what is passed to Python is a dict of that object. When you run `sls invoke -f hello`, an empty dict is passed to the lambda function. But if it were an API request, the event would contain the entire HTTP request represented as a dict. In other words, the `event` dict acts as your lambda function's input parameters, and what your lambda function returns is the output. With AWS Lambda, your output needs to be JSON serializable as well. Here are some example events you might see with AWS Lambda.
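To make that concrete, here's a trimmed sketch of what an API Gateway-triggered event dict might look like (the field values are hypothetical, and real events carry many more keys):

```python
# A pared-down example of an API Gateway event dict.
# Values are made up for illustration; real events include many more keys.
event = {
    "httpMethod": "GET",
    "path": "/",
    "headers": {"User-Agent": "curl/7.54.0"},
    "queryStringParameters": {"name": "world"},
    "body": None,  # a string for requests that carry a body
}
```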
The second argument is the AWS Lambda context, which is a Python object with useful metadata about the lambda function and the current invocation. For example, every invocation has an `aws_request_id`, which is useful if you want to track down what happened in a specific invocation within your logs. Check here for more information about the context object. You probably won't need to worry about the context object right away, but you'll eventually find it useful when debugging.
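As a quick illustration, here's a minimal sketch of a handler that logs two of the documented context attributes:

```python
def handler(event, context):
    # Unique ID for this invocation; handy for correlating log entries
    print("request id: %s" % context.aws_request_id)
    # Milliseconds left before this invocation hits its timeout
    print("time remaining (ms): %d" % context.get_remaining_time_in_millis())
    return {"message": "hi there"}
```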
So how is this useful? Well, if your app/workload can work with a JSON serializable input and produce a JSON serializable output, then it can plug right into AWS Lambda. If we go back to our two workload categories, web API and worker, we've already pretty much implemented what you'll need for a worker. Let's say we wanted to run our `httprequest` function every ten minutes. To do that, all we need to do is add the following to our `serverless.yml`:
```yaml
functions:
  httprequest:
    handler: httprequest.handler
    events:
      - schedule: rate(10 minutes)
```
And re-deploy:
```bash
sls deploy
```
And we're done. Now `httprequest` is triggered automatically every ten minutes. If you want more fine-grained control, you can also schedule your function for specific times using a cron expression, as sketched below. And you can build more complex workflows using SNS, SQS or other AWS services.
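For instance, here's a sketch of a cron-style schedule using AWS's six-field cron syntax (the expression below would fire daily at 12:00 UTC; adjust to taste):

```yaml
functions:
  httprequest:
    handler: httprequest.handler
    events:
      # Fires every day at 12:00 UTC (AWS six-field cron syntax)
      - schedule: cron(0 12 * * ? *)
```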
What about the web API? Remember earlier we covered events and mentioned that an HTTP request can be represented as an event? In the case of web APIs, it's Amazon's API Gateway service that triggers the events for us. API Gateway provides a hostname that can receive HTTP requests, transforms each HTTP request into an event object, invokes our lambda function, collects the response, and passes it on to the requester as an HTTP response. That might sound complex, but thankfully the Serverless Framework abstracts much of this away for us. Let's add an HTTP endpoint to our `serverless.yml`:
```yaml
functions:
  webapi:
    handler: webapi.handler
    events:
      - http:
          path: /
          method: get
```
This looks a lot like our scheduled worker task earlier, doesn't it? As with the scheduled task, here we configure this handler to handle `http` events and specify a `path` and a `method`. As you've likely guessed, the `path` is the HTTP request path and the `method` is the HTTP method this handler will handle. You'll also notice we've referenced a new handler; let's create that now in `webapi.py`:
```python
import json

def handler(event, context):
    return {
        "statusCode": 200,
        "body": json.dumps({"message": "I'm an HTTP response"}),
    }
```
Here our handler accepts an event from API Gateway and responds with a JSON serializable dict. The dict has two keys: `statusCode`, the HTTP status code we want API Gateway to respond with, and `body`, the HTTP body of the response. You'll also notice that we serialize the `body` into JSON. The reason we do this is that API Gateway expects the HTTP response body to be a string, so if we want our web API to respond with JSON, we need to serialize it before handing it back to API Gateway. This also means that if you wanted to support some other form of serialization, like XML, YAML or something else, you could do that, too.
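For example, here's a sketch of an XML-speaking variant (the payload is made up; the point is that `body` is just a string, and the `headers` key lets us set a matching `Content-Type`):

```python
def handler(event, context):
    # Any serialization format works, as long as the body is a string
    # and the Content-Type header matches it.
    xml = "<message>I'm an XML response</message>"
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/xml"},
        "body": xml,
    }
```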
Now let's deploy it:
```bash
sls deploy
```
Once done, the Serverless Framework will provide us with an endpoint:
```
endpoints:
  GET - https://XXXXXXXXXX.execute-api.us-east-1.amazonaws.com/dev
```
Your endpoint will look a little different. What just happened here? Well, in short, Serverless created our new AWS Lambda function and then configured an AWS API Gateway to point to this lambda function. The endpoint you see above is the one provided by API Gateway.
Let's take our endpoint for a spin:
```bash
curl https://XXXXXXXXXX.execute-api.us-east-1.amazonaws.com/dev
```
And you should see the following response:
{"message": "I'm an HTTP response"}
Congratulations, you've just created your first serverless web API! Now, you've probably noticed that the URL API Gateway provides is pretty ugly. It would be a lot nicer if it were something more readable, like `https://api.mywebapi.com/`. Well, there's a plugin for that, too.
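One popular option is the serverless-domain-manager plugin. Here's a configuration sketch, assuming (hypothetically) that you own `mywebapi.com`, have a Route 53 hosted zone for it, and have issued a matching certificate in AWS Certificate Manager:

```yaml
plugins:
  - serverless-python-requirements
  - serverless-domain-manager

custom:
  customDomain:
    # Hypothetical domain; requires a Route 53 zone and an ACM certificate
    domainName: api.mywebapi.com
    createRoute53Record: true
```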
Cleaning Up
In this post we've created three Lambda functions and an API Gateway endpoint. But these were really just examples to help you get your feet wet with serverless. You'll probably want to clean up, and to do that just run:
```bash
sls remove
```
And the Serverless framework will take care of the rest.
In Summation
In this post we've gone over some of the AWS Lambda basics and created a simple web API and worker using the Serverless framework. In Part 2 we cover some more advanced topics, including an introduction to two serverless frameworks specifically built for Python: Zappa and Chalice. We also demonstrate how quickly you can build a web API using these tools, including examples using popular web frameworks such as Django and Flask.
Top comments (1)

AWS Lambda now goes up to 15 minutes.

For Google Cloud it's the same (1 to 15 minutes), depending on which option you choose.

Azure is 5/10/unlimited minutes depending on plan, although you only have ~4 minutes to respond to the request.