golambda: A suite of Go utilities for AWS Lambda functions to ease adopting best practices

#go #aws #lambda

I often use Go language + AWS Lambda at work, especially developing backend processing for security monitoring related infrastructure. While advancing the development, there were various tips that said "this is convenient", but since the processing was too detailed, I used it for development by copying it between projects. However, as the number of projects managed has increased, the behavior has become different, and the number of tips has accumulated to some extent, so I cut it out as a package.

https://github.com/m-mizutani/golambda

Basically, I assume Lambda for data processing pipeline and a little integration, and implement the following four functions.

Event decapsulation
Structured logging
Error handling
Secret values management

Implemented features

Event decapsulation

AWS Lambda can have event source(s) and be invoked by trigger notifications from them. When triggered, the Lambda function is started by passing the data structure of the event source such as SQS and SNS. Therefore, it is necessary to extract the data to be used by yourself from various structural data. If you specify callback (Handler in the example below) in the function golambda.Start(), the necessary information is stored in golambda.Event and can be retrieved from there.

package main

import (
    "strings"

    "github.com/m-mizutani/golambda"
)

type MyEvent struct {
    Message string `json:"message"`
}

// Handler concat messages of SQS
func Handler(event golambda.Event) (interface{}, error) {
    var response []string

    // Extract body of SQS
    bodies, err := event.DecapSQSBody()
    if err != nil {
        return nil, err
    }

    // Handling as multiple data because of batch SQS messages
    for _, body := range bodies {
        var msg MyEvent
        if err := body.Bind(&msg); err != nil {
            return nil, err
        }

        // Save messages to slice
        response = append(response, msg.Message)
    }

    // concat
    return strings.Join(response, ":"), nil
}

func main() {
    golambda.Start(Handler)
}

This sample code is located in the ./example/deployable directory where you can deploy and try it out.

Contrary to the function DecapXxx that implements the process of retrieving data, the process of embedding data is prepared asEncapXxx. This allows you to write a test for the above Lambda Function as follows:

package main_test

import (
    "testing"

    "github.com/m-mizutani/golambda"
    "github.com/stretchr/testify/require"

    main "github.com/m-mizutani/golambda/example/decapEvent"
)

func TestHandler(t *testing.T) {
    var event golambda.Event
    messages := []main.MyEvent{
        {
            Message: "blue",
        },
        {
            Message: "orange",
        },
    }
    // Inject event data
    require.NoError(t, event.EncapSQS(messages))

    resp, err := main.Handler(event)
    require.NoError(t, err)
    require.Equal(t, "blue:orange", resp)
}

Currently, golambda supports SQS, SNS, and SNS over SQS (SQS queue that subscribes to SNS), but I'm planning to implement DynamoDB stream and Kinesis stream later.

Structured Logging

Standard log output destination of Lambda function is CloudWatch Logs, but Logs and Insights (log viewer) supports JSON-formatted logs. Therefore, it is useful to have a logging tool that can output in JSON format instead of using the Go language standard log package.

Logging on Lambda, including the log output format, generally has the same requirements. General logging tools have many options for output methods and formats, but we don't need to change the settings for each Lambda function. Also, in most cases, only the variables necessary for explaining the message + context are sufficient for the output content, so I implemented a wrapper with such a simplification in golambda. In the actual output part, zerolog is used. Actually, it was good to expose the logger created with zerolog as it is, but I thought it would be easier for me to narrow down what I could do, so I dared to wrap it.

A global variable called Logger is exported so that messages for each log level such as Trace, Debug, Info, and Error can be output. We have Set, which allows you to permanently embed any variable, and With, which allows you to add values in a method chain.

// ------------
// Use With() to embed temporary variables
v1 := "say hello"
golambda.Logger.With("var1", v1).Info("Hello, hello, hello")
/* Output:
{
    "level": "info",
    "lambda.requestID": "565389dc-c13f-4fc0-b113-xxxxxxxxxxxx",
    "time": "2020-12-13T02:44:30Z",
    "var1": "say hello",
    "message": "Hello, hello, hello"
}
*/

// ------------
// Use Set () to embed variables that you want to output permanently, such as request ID
golambda.Logger.Set("myRequestID", myRequestID)
// ~~~~~~~ snip ~~~~~~
golambda.Logger.Error("oops")
/* Output:
{
    "level": "error",
    "lambda.requestID": "565389dc-c13f-4fc0-b113-xxxxxxxxxxxx",
    "time": "2020-11-12T02:44:30Z",
    "myRequestID": "xxxxxxxxxxxxxxxxx",
    "message": "oops"
}
*/

In addition, CloudWatch Logs is relatively expensive for writing logs, and if you constantly output detailed logs, it will greatly affect the cost. Therefore, it is usually convenient to output only the minimum log so that detailed output can be performed only when troubleshooting or debugging. In golambda, the log output level can be tampered with externally by setting the LOG_LEVEL environment variable. (Because only environment variables can be easily changed from the AWS console etc.)

Error Handling

AWS Lambda should be implemented so that each function is as single feature as possible, and to implement complicated workflow, multiple Lambda can be combined with SNS, SQS, Kinesis Stream, Step Functions, etc. Therefore, if an error occurs in the middle of processing, do not try to recover forcibly in the Lambda code, but return the error as straightforwardly as possible to make it easier to notice by external monitoring, or benefit from Lambda's own retry function. It will be easier to receive.

On the other hand, Lambda itself does not handle errors very carefully, so you need to prepare your own error handling. As mentioned earlier, it is convenient to configure the Lambda function so that if something happens, it will just return an error and fail. So, in most cases, if an error occurs, if the main function ( Handler() in the example described later) returns an error, it will output all the information about the error, and it will be output here and there. There is no need to write a process to output a log at the location where an error occurs or to skip an error somewhere.

golambda mainly handles the following two errors called by golambda.Start().

Output detailed log of the error generated by golambda.NewError or golambda.WrapError
Send the error to the exception monitoring service (Sentry)

Output detailed log of the error

From my experience, when an error occurs, there are two main things you want to know for debugging: "where it happened" and "what kind of situation it happened".

Strategies to find out where the error occurred include adding a context using the Wrap function, or having a stack trace like the github.com/pkg/errors package. In the case of Lambda, if you implement it as simple as possible, in most cases you can find out where the error occurred and how it occurred in the stack trace.

Also, by knowing the contents of the variable that caused the error, you can understand the conditions for reproducing the error. This can be dealt with by logging the variables that are likely to be relevant each time an error occurs, but it will result in poor log visibility across multiple output lines (especially if the call is deep). Also, you have to simply write the log output code repeatedly, which makes it redundant, and it is difficult to write it simply, and it is troublesome when you want to make changes related to log output.

Therefore, for the error generated by golambda.NewError() or golambda.WrapError(), the variable related to the error can be routed by the function With(). The entity is simply stored in the variable map [string] interface {} in the form of key / value. When the main logic (Handler() in the example below) returns an error generated by golambda.NewError() or golambda.WrapError(), the variables stored byWith()and the error Prints the stack trace of the generated function to CloudWatch Logs. Below is an example of the code.

package main

import (
    "github.com/m-mizutani/golambda"
)

// Handler is exported for test
func Handler(event golambda.Event) (interface{}, error) {
    trigger := "something wrong"
    return nil, golambda.NewError("oops").With("trigger", trigger)
}

func main() {
    golambda.Start(Handler)
}

Lambda with above code will output a log with the variables stored in With inerror.values and the stack trace in error.stacktrace, as shown below. The stack trace is also output as text in the %+v format of github.com/pkg/errors, but golambda.Error supports JSON format according to the output of the structured log.

{
    "level": "error",
    "lambda.requestID": "565389dc-c13f-4fc0-b113-f903909dbd45",
    "error.values": {
        "trigger": "something wrong"
    },
    "error.stacktrace": [
        {
            "func": "main.Handler",
            "file": "xxx/your/project/src/main.go",
            "line": 10
        },
        {
            "func": "github.com/m-mizutani/golambda.Start.func1",
            "file": "xxx/github.com/m-mizutani/golambda/lambda.go",
            "line": 127
        }
    ],
    "time": "2020-12-13T02:42:48Z",
    "message": "oops"
}

Send the error to the exception monitoring service

There is no particular reason why using Sentry, but it is desirable to use some kind of exception monitoring service not only for API but also for Lambda function like Web application. The reasons are as follows.

Since it is not possible to determine whether the log ended normally or abnormally from the log output to CloudWatch Logs by default, it is difficult to extract only the log of the execution that ended abnormally.
CloudWatch Logs doesn't have a function to group errors, so it's difficult to find one that has a different type of error in only one out of 100 errors.

Both can be solved to some extent by devising the error log output method, but it is recommended to use the exception monitoring service obediently because you have to be careful and implement it.

golambda sends the error returned by the main logic to Sentry by specifying the Sentry's DSN (Data Source Name) as the environment variableSENTRY_DSN (Sentry + Go Details). It doesn't matter which error you send, but the errors generated by golambda.NewError orgolambda.WrapError implement a function called StackTrace () that is compatible with github.com/pkg/errors. Therefore, the stack trace is also displayed on the Sentry side.

This is the same as what is output to CloudWatch Logs, but since you can also check it on the Sentry side screen, "View notification" → "View Sentry screen" → "Search logs with CloudWatch Logs and check details" You may be able to guess the error in the second step. Also, the search for CloudWatch Logs is slightly heavy UI, so if you don't have to search, it's better...

By the way, when you send an error to Sentry, the Sentry event ID is embedded in the CloudWatch Logs log as error.sentryEventID, so you can search from the Sentry error.

Secret values management

In Lambda, parameters that change depending on the execution environment are often stored in environment variables and used. If it is an AWS account used by an individual, it is sufficient to store it in an environment variable, but in an AWS account shared by multiple people, by separating the secret value and the environment variable, Lambda information You can separate the person (or Role) who can only refer to the secret value and the person (or Role) who can also refer to the secret value. Even if it is used by an individual, if it deals with truly dangerous information, there may be cases where the authority is separated so that even if some access key is leaked, it will not die instantly.

In my case, I often use AWS Secrets Manager to separate permissions [^ use-param-store]. Retrieving the value from Secrets Manager is relatively easy by calling the API, but I got tired of writing the same process about 100 times, so I modularized it. You can get the value by adding the json meta tag to the field of the structure.

type mySecret struct {
    Token string `json:"token"`
}
var secret mySecret
if err := golambda.GetSecretValues(os.Getenv("SECRET_ARN"), &secret); err != nil {
    log.Fatal("Failed: ", err)
}

Not implemented features

I thought it would be useful, but I didn't implement followings.

Execute arbitrary processing just before timeout: Lambda will shutdown silently after the configured maximum execution time, so there is a technique to call some processing just before timeout to output performance analysis information. However, I was not in trouble in may cases by silent shutdown of Lambda timeout because I could find out bottleneck by a few logs.
Tracing: Powertools in Python provide the ability to measure performance on AWS X-Ray using annotations and so on. If you try to do this with Go, you can enjoy it more than Use official SDK. At the moment, I didn't come up with a better way in Go than [Use the official SDK], so I didn't work on it.