Mohamed Mahmoud

From Zero to Serverless

I was first exposed to Lambda functions at work when the team needed to implement an API service that would receive requests from other services, parse the data, and then send it to Kafka. Today this service handles around 1M requests each month πŸŽ‰πŸŽ‰

I was very confused about where to start and how to get this working in the development, testing, and production environments. I came across the Serverless Framework, Knative, and Kubeless on the internet, and it took me quite some time to decide what to choose, since we are using K8s in our environments.

In the end, I found the Serverless Framework the most suitable one to use in all our environments, so I started learning how to build a serverless function with it. It's pretty easy to learn and use, and the configuration is easy to read.

The directory structure

It's better to abstract each function into a separate directory with its components. You will also see a Dockerfile and a .docker directory, which are used in the development environment, as well as a jenkinsfile, since we use Jenkins in our CI/CD pipeline. The programming language we used is NodeJS. We will talk about each environment's setup next.

(Screenshot: the project's directory structure.)

The function handler's code is always inside handler.js. We create as many files inside lib/ as necessary. The corresponding tests to all this code are under tests/.
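
For reference, the layout looks roughly like this (a sketch reconstructed from the files mentioned throughout this post):

functions/
  api-service/
    handler.js
    app.js
    lib/
    tests/
    node_modules/
    .env.defaults
.docker/
  startup.sh
Dockerfile
docker-compose.yml
jenkinsfile
serverless.yml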

The serverless.yml

Let me start by sharing our "simple" serverless.yml πŸ˜…, and then I'll comment on pieces of it and explain how we use this one file for all of our environments:

service: api-service
provider:
  name: aws
  runtime: nodejs12.x
  memorySize: 512
  stage: ${opt:stage, 'dev'}
  logRetentionInDays: ${self:custom.logRetention.${self:provider.stage}, self:custom.logRetention.other}
  vpc:
    subnetIds:
      - ${self:custom.subnetIds.${self:provider.stage}, self:custom.subnetIds.other}
    securityGroupIds:
      - ${self:custom.securityGroupIds.${self:provider.stage}, self:custom.securityGroupIds.other}
  deploymentBucket:
    name: api-service

First, you have to specify the name you want to give the service. In the provider section, you name the cloud provider you are going to use; I am using AWS as shown above, but if you are using GCP it will be google. You can see an example for GCP here. Then comes the runtime your serverless application will use. Don't forget to pin the version to the one you are using in the development environment.
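
For comparison, a minimal provider block for GCP would look something like this (the project and credentials values are placeholders, not from our setup):

provider:
  name: google
  runtime: nodejs10
  project: my-gcp-project
  credentials: ~/.gcloud/keyfile.json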

One of the important options in the provider block is the memorySize value. The default is 1024 MB, but often your function won't use that much memory, so this is a good place to save some money. Next is the stage option: you can set it by passing an sls argument during deployment, or fall back to a default value as above, dev in our case.

Next is setting a retention policy for your function's logs (logRetentionInDays). Some people forget this option, but it's a good way to get rid of old logs and save money by cutting down the tons of logs stored on CloudWatch. We use a different value in testing than in production, since we need the production logs to be kept longer. The values here come from the custom section, picked according to the stage we are deploying to, and we use this approach in most of our fields; it's good to write a generic serverless.yml that can handle multiple stages/environments. So if you deploy the prod stage, the framework looks for custom.logRetention.prod; if it isn't defined, it falls back to custom.logRetention.other, which we use for all the testing environments.
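
To make the fallback concrete, here is how that expression resolves with the values from the custom section further down (staging is just a hypothetical non-prod stage name):

# ${self:custom.logRetention.${self:provider.stage}, self:custom.logRetention.other}
# stage = prod    -> custom.logRetention.prod exists  -> 14 days
# stage = staging -> custom.logRetention.staging is not defined
#                    -> falls back to custom.logRetention.other -> 7 days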

As for the subnetIds and securityGroupIds options, you will use them if you need to connect the function to an internal resource in your cluster, so skip them if not. You will have to grab the subnets and security groups that your other resources are hosted in.

Be careful: as soon as you connect your function to a VPC, it is no longer able to access the Internet, even if you choose a public subnet with a route to the Internet gateway for your function (see Internet Access for Lambda Functions to learn more). So you can't have both: access to resources within your VPC and access through the Internet.

To be able to access your VPC as well as the Internet, you need to spin up a NAT Gateway. Or in some cases, you might get away with a VPC Endpoint.

The last part here is adding your deploymentBucket. It's important to put all of your deployments in one bucket; you get each stage, function, and deployment in organized directories with timestamps. If you skip this, the framework will create a random bucket for each deployment.

package:
  include:
    - functions/api-service/.env.defaults
    - functions/api-service/app.js
    - functions/api-service/handler.js
    - functions/api-service/lib/**
    - functions/api-service/node_modules/**

The package option is used to include (and also exclude) files and directories in your functions' packages, which also helps you decrease the application size that gets uploaded to S3 on every deployment.
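
As an illustration, an exclude list works the same way; something like this (the patterns here are just examples, not from our setup) would keep tests out of the deployment artifact:

package:
  exclude:
    - functions/api-service/tests/**
    - "**/*.md"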

functions:
  api-service:
    handler: functions/api-service/handler.api-service
    description: this function will receive data from multiple services
    environment:
      KAFKA_TOPIC: api.service
      KAFKA_BROKERS: ${self:custom.KAFKA_BROKERS.${self:provider.stage}, self:custom.KAFKA_BROKERS.other}
    events:
      - http:
          path: api-service
          method: post
          cors:
            origins:
              - https://*.example.com
            headers:
              - Content-Type
              - X-Amz-Date
              - Authorization
              - X-Api-Key
              - X-Amz-Security-Token
              - X-Amz-User-Agent
            allowCredentials: true

The most interesting part is defining your functions array. In our case we are using just one function, but you can add as many as you need depending on the pipeline you are working on, or if you're kind of a nerd and using step functions πŸ€“. I recommend looking here if you are interested in playing with step functions.

Name your function and add it alongside its description and handler. Make sure you point at the exported handler function, not just the file name. Then you can use the environment option to export some environment variables for your function to use while it's running. You will notice we use the same stage-fallback approach for environment variables that differ between stages, as in ${self:custom.KAFKA_BROKERS.${self:provider.stage}, self:custom.KAFKA_BROKERS.other}
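
To make the handler mapping concrete, here is a minimal sketch of what functions/api-service/handler.js could export. The part after the dot in handler.api-service must match the exported property name (the body below is illustrative, not our actual Kafka code):

// functions/api-service/handler.js
module.exports['api-service'] = async (event) => {
  // API Gateway delivers the POST body as a string
  const data = JSON.parse(event.body);
  // ... parse `data` and produce it to process.env.KAFKA_TOPIC ...
  return { statusCode: 200, body: JSON.stringify({ received: true }) };
};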

The events section depends on what triggers your function: API Gateway, WebSocket, Kinesis & DynamoDB, S3, etc. Look here for more details.
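
For instance, if the same function were triggered by S3 uploads instead, the event would look something like this (the bucket name is hypothetical):

events:
  - s3:
      bucket: my-uploads-bucket
      event: s3:ObjectCreated:*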

We are using an API Gateway HTTP endpoint handling only POST requests on /api-service. You can also add some CORS configuration to your function, as you see above.

plugins:
  - serverless-domain-manager
  - serverless-offline

A good thing the Serverless Framework provides you with is plugins. We use serverless-offline to run our setup locally in the development environment and the serverless-domain-manager plugin to handle the API gateway domain configuration.
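
Plugins are regular npm packages, so in a typical setup you would install them as dev dependencies before listing them in serverless.yml:

$ npm install --save-dev serverless-offline serverless-domain-manager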

custom:
  customDomain:
    domainName: ${self:custom.subDomain.${self:provider.stage}, self:custom.subDomain.other}.${self:custom.domain.${self:provider.stage}, self:custom.domain.other}
    certificateName: "*.${self:custom.domain.${self:provider.stage}, self:custom.domain.other}"
    basePath: "${self:custom.basePath.${self:provider.stage}, self:custom.basePath.other}"
    stage: ${self:provider.stage}
    createRoute53Record: true
  domain:
    prod: example-prod.com
    other: example.com
  subDomain:
    prod: api-service
    other: api-service
  basePath:
    prod: 
    other: ${self:provider.stage}
  subnetIds:
    prod: subnet-xxxxxxx
    other: subnet-yyyyyyy
  securityGroupIds:
    prod: 'sg-xxxxxxx'
    other: 'sg-yyyyyyy'
  KAFKA_BROKERS:
    prod: kafka-prod.example.com
    other: kafka-${self:provider.stage}.example.com
  logRetention:
    prod: 14
    other: 7

The custom section is like declaring variables; variables allow you to dynamically replace config values in serverless.yml. We declare one value for prod and an other value to fall back to when the stage is not prod, as in ${self:custom.subnetIds.${self:provider.stage}, self:custom.subnetIds.other}.

The customDomain variable is used here to define the API gateway config. We set domainName, certificateName, basePath (which is very helpful if the certificate you are using doesn't handle alternate domain names (CNAMEs); look here for more info), and the stage field. You can also set createRoute53Record to true to have the framework create Route53 records for you to use in your other external services.
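
Putting the pieces together, the endpoints resolve per stage like this (staging again being a hypothetical non-prod stage):

# stage = prod    -> api-service.example-prod.com      (empty basePath)
# stage = staging -> api-service.example.com/staging   (basePath = stage)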

Setting up the development environment

For the development environment, I preferred to run the serverless function using Docker to better handle the dependencies. This is what the Dockerfile and the entrypoint look like:

FROM node:12.15.0-stretch

COPY ./.docker/startup.sh /etc/
RUN chmod +x /etc/startup.sh
CMD /etc/startup.sh

And the entrypoint script, .docker/startup.sh:

#!/bin/bash

cd /var/www/app
npm install -g serverless
npm install
sls offline -o api-service

You can see above how simple the development setup is: we just run the sls command with the serverless-offline plugin. This runs a local API gateway on port 3000 inside the container and triggers the function whenever we hit it.

Our parameters are stored in functions/api-service/.env.defaults. We will handle our parameters in a different way in the other environments.
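
As an example, .env.defaults could look something like this (the broker address is a local placeholder; only the topic name comes from the config above):

KAFKA_TOPIC=api.service
KAFKA_BROKERS=localhost:9092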

To run this setup locally, you can either use plain docker commands or docker-compose instead.

Docker command:

$ docker build -t api-service .
$ docker run --name api-service -d -p 80:3000 -v ${PWD}:/var/www/app api-service

docker-compose:

version: '3.7'
services:
  # api service
  api:
    build:
      context: .
    ports:
      - "80:3000"
    volumes:
      - .:/var/www/app

Then bring it up with:

$ docker-compose up -d
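
Once the container is up, you can exercise the endpoint locally with something like the following (the payload is arbitrary, and depending on your serverless-offline version the stage prefix in the path may or may not be required):

$ curl -X POST http://localhost/dev/api-service \
    -H 'Content-Type: application/json' \
    -d '{"hello": "world"}'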

Setting up testing and production environments

Luckily, because we have written a very generic serverless.yml file, there isn't much to do in these environments. You just need to run:

sls deploy -s [stage] -v

This step creates a .serverless directory containing the generated CloudFormation template, the serverless state file, and the zip file that gets uploaded to the S3 bucket. I also like the -v option to see every resource created and every step serverless takes πŸ‘€
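
Roughly, the generated directory looks like this (file names as the framework produces them for a service called api-service):

.serverless/
  cloudformation-template-create-stack.json
  cloudformation-template-update-stack.json
  serverless-state.json
  api-service.zip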

Automating production deployments

If you want to take your pipeline a step further and add some automation, you can use Jenkins to run your CI/CD pipeline and, for example, auto-deploy your application when changes are merged to master.

An example jenkinsfile looks like this:

pipeline {
  agent {
    kubernetes {
      label 'api-service'
      defaultContainer 'jnlp'
      yaml """
apiVersion: v1
kind: Pod
metadata:
  labels:
    component: ci
spec:
  # Use service account that can deploy to all namespaces
  containers:
  - name: node
    image: node:12.15.0-stretch
    command:
    - cat
    tty: true
  - name: awscli
    image: organs/awscli
    command:
    - cat
    tty: true

"""
}
  }
  stages {
    stage('build') {
      environment {
          BRANCH_NAME = sh ( script: "echo ${env.GIT_BRANCH} | rev | cut -f1 -d '/' | rev | sed 's/ *//g'", returnStdout: true ).trim()
      }
      steps {
        container('node') {
            sh """
               npm install
               """
        }
        container('awscli') {
            sh """
               # deploy the prod stage if the branch is master
               if [ "${env.BRANCH_NAME}" = "master" ]; then
                 sls deploy -s prod -v
               fi
               """
        }
      }
    }
  }
}


Closing

Serverless is an amazing technology, or a mindset if you will. It's a major step towards delegating infrastructure problems to companies that are much better positioned to deal with them. No matter how good you become at DevOps, you sometimes need those companies to handle and manage the parts of your applications you don't want to bother yourself with. The main point is building your architecture without spending ages doing it all yourself, and saving tons of money hosting it.

I will continue writing about various serverless topics and scenarios I have faced that I find interesting and challenging.

Appreciate your comments and feedback 😊

Top comments (1)

Mostafa Ali Mansour

Really awesome and fruitful article! I have 2 suggestions though:

  • Set the runtime to be dynamically defined and fetched from an endpoint that both the DevOps and development teams can access, so that you won't run into a missing dependency issue.

  • Kudos on being frugal with the memory, but you never know what traffic could bring, so I suggest either setting up an SNS notification to alert you when memory is about to cross a certain threshold, or migrating to EKS, which can autoscale.

But seriously bravo and really proud <3