Observability becomes significantly more challenging when transitioning to distributed systems, particularly in serverless architectures. While serverless design is beneficial for decomposition and scalability, its granular nature imposes challenges for observability. It is therefore important to find ways to instrument the software without tightly coupling the instrumentation to the environment dedicated to core software processing.
This article explores two serverless computing services, AWS Lambda and AWS Fargate, and presents straightforward approaches to centralize logging for serverless applications:
CloudWatch Subscription Filters
AWS Lambda Zip package with Extensions
AWS Lambda Custom Image with Extensions
AWS Lambda Web Adapter Image with Extensions
AWS Fargate with sidecar
About Provided Source Code
💡 The complete examples can be found in the source code GitHub repository
The source code is designed as a monorepo using NX and pnpm. The core package is a private library that shares some central helpers with the other modules. The observability-core module is the prerequisite for all other modules and provides:
Lambda Extension Layer
ECR Repository
Base Container Image with Extension
Kinesis Data Stream
IAM Managed Policy
The dependencies are configured via the nx.json file in the root of the repository for the cdk and build targets. This forces the prerequisite module to be built and deployed before the other modules.
{
  "targetDefaults": {
    "cdk": {
      "dependsOn": [
        {
          "projects": "@xaaxaax/observability-core",
          "target": "cdk",
          "params": "forward",
          "required": [ "projects", "target" ]
        }
      ]
    },
    "build": {
      "dependsOn": [
        {
          "projects": "@xaaxaax/observability-core",
          "target": "build",
          "params": "forward",
          "required": [ "projects", "target" ]
        }
      ]
    }
  }
}
The targets are defined in the scripts section of the root package.json file as below:
{
  "scripts": {
    "nx:build:all": "nx run-many --target=build --output-style static --skip-nx-cache",
    "nx:cdk:all": "nx run-many --target=cdk --output-style static --skip-nx-cache --require-approval never"
  }
}
💡 For simplicity, the created functions are configured with function URLs and can be triggered easily. The only caveat is that a function gets triggered twice when invoked from a browser, since the browser makes an additional request for favicon.ico.
CloudWatch Subscription Filters
AWS services like Lambda and Fargate have native integration with CloudWatch, but for critical workloads, the cost of log ingestion can become prohibitive. Depending on usage needs, CloudWatch logs can be utilized selectively, with different approaches available.
When using CloudWatch, Subscription Filters offer a way to forward logs to various destinations, including OpenSearch, Kinesis Data Streams, Amazon Data Firehose, or AWS Lambda.
In this section, CloudWatch Subscription Filters are used to stream logs to an Amazon Kinesis Data Stream for further processing and analysis.
The following code snippet showcases how to implement this log forwarding mechanism using AWS CDK:
const logGroup = new LogGroup(this, 'LogGroup', {
  logGroupName: `/aws/lambda/${lambdaWithCloudwatch.functionName}`,
  retention: RetentionDays.ONE_DAY,
  removalPolicy: RemovalPolicy.DESTROY,
});
const logsDeliveryRole = new Role(this, `LogsDeliveryRole`, {
  assumedBy: new ServicePrincipal('logs.amazonaws.com')
});
logGroup.addSubscriptionFilter('SubscriptionFilter', {
  destination: new KinesisDestination(LogStream, {
    role: logsDeliveryRole
  }),
  filterPattern: {
    logPatternString: ' ', // an empty pattern matches and forwards all log events
  }
});
LogStream.grantWrite(logsDeliveryRole);
Invoking the Lambda function results in log records being sent to Kinesis via the CloudWatch subscription filter, as shown in the following figure.
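On the consuming side, it is worth noting that CloudWatch Logs delivers subscription data gzip-compressed, and the Kinesis record base64-encodes the payload. The following is a minimal consumer sketch (a hypothetical Lambda consumer on the stream, not part of the example stacks) showing how such records can be decoded:
import { gunzipSync } from 'node:zlib';
import type { KinesisStreamEvent } from 'aws-lambda';
export const handler = async (event: KinesisStreamEvent) => {
  for (const record of event.Records) {
    // CloudWatch Logs subscription payloads arrive gzip-compressed and base64-encoded.
    const payload = JSON.parse(
      gunzipSync(Buffer.from(record.kinesis.data, 'base64')).toString('utf8')
    );
    // payload.messageType is 'DATA_MESSAGE' for log data; logEvents holds the individual lines.
    for (const logEvent of payload.logEvents ?? []) {
      console.log(payload.logGroup, logEvent.message);
    }
  }
};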
Lambda Telemetry API
Lambda offers a Telemetry API, which is an excellent choice for capturing function log records without using CloudWatch Logs. The logs received through the Telemetry API follow a straightforward format, as shown below.
{
  "time": "2025-01-30T00:00:00.000Z",
  "type": "function",
  "record": {
    "timestamp": "2025-01-30T00:00:09.429Z",
    "level": "INFO",
    "requestId": "79b4f56e-95b1-4643-9700-2807f4e68189",
    "message": "Log Message HERE"
  }
}
If the Lambda LogFormat is TEXT, the received event is formatted as in the following snippet.
{
  "time": "2025-01-30T00:00:00.000Z",
  "type": "function",
  "record": "2025-01-30T00:00:09.429Z 79b4f56e-95b1-4643-9700-2807f4e68189 [INFO] Log Message HERE"
}
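For reference, a hedged TypeScript shape for these function log events (type names are assumptions, derived from the two snippets above) could be:
// record is a structured object when the function LogFormat is JSON,
// and a plain string when it is TEXT.
type FunctionTelemetryEvent = {
  time: string;
  type: 'function';
  record:
    | { timestamp: string; level: string; requestId: string; message: string }
    | string;
};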
Lambda Extensions
For high-throughput applications, relying on CloudWatch Logs can lead to substantial costs. One way to mitigate this is to deny CloudWatch Logs permissions in the Lambda execution role. This prevents the Lambda service from sending logs to CloudWatch and, as a consequence, prevents the use of Subscription Filters.
However, Lambda provides a Telemetry API that captures all logs, even when CloudWatch logging is disabled. By using the Extension API, you can subscribe to the Telemetry API and register for specific log categories, such as platform, function, or extension logs.
💡 The source code repository provides the extension module here, which is used in the Lambda Zip Package, Custom Image, and Web Adapter Image sections
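To give an idea of what such an extension does, the following is a minimal sketch (not the exact module from the repository) of registering with the Extensions API, subscribing to the Telemetry API for function logs, and forwarding the received batches to a Kinesis Data Stream. The stream name environment variable, extension name, and listener port are assumptions:
import http from 'node:http';
import { KinesisClient, PutRecordsCommand } from '@aws-sdk/client-kinesis';
const RUNTIME_API = process.env.AWS_LAMBDA_RUNTIME_API!;
const LISTENER_PORT = 4243; // assumption: any free port inside the sandbox
const kinesis = new KinesisClient({});
// 1. Register the extension for INVOKE and SHUTDOWN lifecycle events.
const registration = await fetch(`http://${RUNTIME_API}/2020-01-01/extension/register`, {
  method: 'POST',
  headers: { 'Lambda-Extension-Name': 'kinesis-telemetry-extension' },
  body: JSON.stringify({ events: ['INVOKE', 'SHUTDOWN'] }),
});
const extensionId = registration.headers.get('lambda-extension-identifier')!;
// 2. Local HTTP listener that receives telemetry batches and forwards function logs to Kinesis.
http.createServer(async (req, res) => {
  let body = '';
  for await (const chunk of req) body += chunk;
  const events: Array<{ time: string; type: string; record: unknown }> = JSON.parse(body);
  const functionLogs = events.filter((e) => e.type === 'function');
  if (functionLogs.length > 0) {
    await kinesis.send(new PutRecordsCommand({
      StreamName: process.env.TELEMETRY_STREAM_NAME, // assumption: injected by the stack
      Records: functionLogs.map((e) => ({
        Data: Buffer.from(JSON.stringify(e)),
        PartitionKey: e.time,
      })),
    }));
  }
  res.writeHead(200).end();
}).listen(LISTENER_PORT, '0.0.0.0');
// 3. Subscribe to the Telemetry API, pointing it at the local listener.
await fetch(`http://${RUNTIME_API}/2022-07-01/telemetry`, {
  method: 'PUT',
  headers: { 'Lambda-Extension-Identifier': extensionId, 'Content-Type': 'application/json' },
  body: JSON.stringify({
    schemaVersion: '2022-12-13',
    types: ['function'],
    buffering: { maxItems: 1000, maxBytes: 262144, timeoutMs: 100 },
    destination: { protocol: 'HTTP', URI: `http://sandbox.localdomain:${LISTENER_PORT}` },
  }),
});
// 4. Keep polling for the next lifecycle event until SHUTDOWN is received.
while (true) {
  const next = await fetch(`http://${RUNTIME_API}/2020-01-01/extension/event/next`, {
    headers: { 'Lambda-Extension-Identifier': extensionId },
  });
  const event = await next.json();
  if (event.eventType === 'SHUTDOWN') break;
}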
The execution of extensions, whether as a standard ZIP package or a custom image, is managed by the Lambda runtime. The Lambda service scans the /opt/extensions directory and automatically executes any extensions found in that location.
For ZIP package deployments, this attachment occurs during the Lambda initialization phase, where the extensions path is constructed by aggregating all attached layers. However, for custom images, this structure must be manually set up during the container image build process.
This project generates two final assets from the same extension source, along with the other previously mentioned resources. The extension itself is built using esbuild and bundled as JavaScript, with a post-build script handling the folder structure setup.
The provided final assets are:
A Lambda Layer
An ECR Base Container Image
Lambda Layer
For the layer, the build process does all the necessary steps. The only remaining step is to create the layer using CDK. The following snippet demonstrates how to do so.
const extension = new LayerVersion(this, 'kinesis-telemetry-api-extension', {
layerVersionName: `${props?.extensionName}`,
code: Code.fromAsset(resolve(process.cwd(), `build`)),
compatibleArchitectures: [
Architecture.X86_64,
Architecture.ARM_64
],
compatibleRuntimes: [
Runtime.NODEJS_20_X,
Runtime.NODEJS_22_X
],
description: props?.extensionName
});
// Exporting the Layer Arn to parameter store
new StringParameter(this, `LambdaExtensionArnParam`, {
parameterName: `/${props.contextVariables.stage}/${props.contextVariables.context}/telemetry/kinesis/extension/arn`,
stringValue: extension.layerVersionArn
});
The LayerVersion resource points to the build directory generated by the build script. The underlying build folder structure is shown below.
- build
  - extensions
    - kinesis-telemetry-extension
  - kinesis-telemetry-extension
    - index.js
The kinesis-telemetry-extension file under the extensions folder is an executable that serves as the entry point for the Lambda service to detect and run the extension. The file name must be equal to the extension directory name, as in this example executable:
#!/bin/bash
set -euo pipefail
# The extension name is derived from this wrapper's file name and must match
# the directory that contains the bundled entry point.
OWN_FILENAME="$(basename "$0")"
LAMBDA_EXTENSION_NAME="$OWN_FILENAME"
echo "[extension:bash] launching ${LAMBDA_EXTENSION_NAME}"
exec "/opt/${LAMBDA_EXTENSION_NAME}/index.js"
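For context, here is a hedged sketch of a build script that could produce this layout; the repository's actual esbuild configuration and post-build script may differ, and the wrapper's source path is an assumption:
import { build } from 'esbuild';
import { chmodSync, cpSync, mkdirSync } from 'node:fs';
// Bundle the extension source into build/kinesis-telemetry-extension/index.js.
await build({
  entryPoints: ['src/index.ts'],
  bundle: true,
  platform: 'node',
  target: 'node22',
  banner: { js: '#!/usr/bin/env node' }, // the wrapper execs this file directly
  outfile: 'build/kinesis-telemetry-extension/index.js',
});
chmodSync('build/kinesis-telemetry-extension/index.js', 0o755);
// Post-build: place the bash wrapper under build/extensions and mark it executable.
mkdirSync('build/extensions', { recursive: true });
cpSync('scripts/kinesis-telemetry-extension', 'build/extensions/kinesis-telemetry-extension'); // assumed wrapper location
chmodSync('build/extensions/kinesis-telemetry-extension', 0o755);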
Base Container Image
The base custom image with the extension included is created using a Dockerfile. The Dockerfile simply takes the built asset (the contents of the build folder) and copies it into the /opt/ directory of the resulting image.
FROM node:22.13.1-slim
COPY build /opt/
WORKDIR /opt/extensions
The image is built and pushed to the ECR repository created via cdk, since the repository must exist before the image can be pushed. This is done using post scripts, as shown below.
{
  "name": "@xaaxaax/observability-core",
  ...
  "scripts": {
    "build:docker": "docker buildx build --platform linux/arm64 --no-cache -t $ECR_REPOSITORY:latest .",
    "postbuild:docker": "pnpm run build:docker:login && pnpm run build:docker:tag && pnpm run build:docker:push",
    "build:docker:login": "aws ecr get-login-password --region $REGION --profile admin@dev | docker login --username AWS --password-stdin $ECR_URI",
    "build:docker:tag": "docker tag $ECR_REPOSITORY:latest $ECR_URI/$ECR_REPOSITORY:latest",
    "build:docker:push": "docker push $ECR_URI/$ECR_REPOSITORY:latest",
    "cdk": "cdk --profile admin@dev --app 'tsx ./cdk/bin/app.ts' -c env=dev",
    "postcdk": "cross-env REGION=eu-west-1 ECR_URI=904233108557.dkr.ecr.eu-west-1.amazonaws.com ECR_REPOSITORY=lambda-telemetry-image pnpm run build:docker"
  }
  ...
}
Lambda Zip package with Extensions
Dealing with a Zip Lambda package is the simplest option: attach the layer to the Lambda function and grant the associated role the permissions required for the underlying infrastructure the extension interacts with, which in the provided example is the Kinesis Data Stream.
The following CDK code shows how the extension layer is attached to the function and which permissions are required.
const extensionArn = StringParameter.fromStringParameterName(
this,
'extensionId',
`/${props.contextVariables.stage}/logs-collector-lambda-extension/telemetry/kinesis/extension/arn`
).stringValue;
const managedPolicyArn = StringParameter.fromStringParameterName(
this,
'policyName',
`/${props.contextVariables.stage}/logs-collector-lambda-extension/telemetry/kinesis/runtime/policy/arn`
).stringValue;
const functionRole = new Role(
this,
'LambdaFunctionRole', {
assumedBy: new ServicePrincipal('lambda.amazonaws.com'),
managedPolicies: [
ManagedPolicy.fromAwsManagedPolicyName('service-role/AWSLambdaBasicExecutionRole'),
ManagedPolicy.fromManagedPolicyArn(this, 'managed-policy', managedPolicyArn)] });
const lambdaFunction = new NodejsFunction(this, 'LambdaZipFunction', {
entry: resolve(process.cwd(), 'src/handler.ts'),
...
bundling: { ... },
layers: [
LayerVersion.fromLayerVersionArn(this, 'ExtensionArn', extensionArn)
]
});
Lambda Custom Image With Extensions
The custom image example builds an image-based Lambda function from a Dockerfile that uses the provided base image with the extension included.
The Dockerfile is based on both the extension image and the Lambda Node.js 22 image provided by AWS. The interesting aspect of the AWS-provided image is that it can be run locally and invoked, for example using a curl-style request, as sketched below.
FROM 904233108557.dkr.ecr.eu-west-1.amazonaws.com/lambda-telemetry-image:latest AS extensions
FROM public.ecr.aws/lambda/nodejs:22
WORKDIR ${LAMBDA_TASK_ROOT}
COPY dist/* ./
COPY --from=extensions ./opt/ /opt/
CMD ["index.handler"]
The built asset is copied to the /var/task path, which can be accessed via the LAMBDA_TASK_ROOT environment variable, and finally the CMD instruction points to the handler inside the index.js file.
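For local testing, the AWS base image ships the Runtime Interface Emulator, which exposes the standard invocation endpoint. A minimal sketch, assuming the image is built and started locally with docker run -p 9000:8080 <image> (a curl request against the same URL works equally well):
// Invoke the locally running function through the Runtime Interface Emulator.
const response = await fetch(
  'http://localhost:9000/2015-03-31/functions/function/invocations',
  { method: 'POST', body: JSON.stringify({}) }
);
console.log(await response.json());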
💡 In the example, the base image URI is hardcoded in the Dockerfile, but it can be parameterized using Parameter Store and passed as a Docker build ARG, as sketched below.
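A hedged sketch of that parameterization (the parameter name and construct IDs are assumptions): the base image URI is looked up from Parameter Store at synth time and forwarded to the Docker build, with the Dockerfile declaring ARG BASE_IMAGE_URI and using FROM ${BASE_IMAGE_URI} AS extensions.
import { StringParameter } from 'aws-cdk-lib/aws-ssm';
import { DockerImageCode, DockerImageFunction } from 'aws-cdk-lib/aws-lambda';
// Resolve the base image URI at synth time so it can be used as a build argument.
const baseImageUri = StringParameter.valueFromLookup(
  this,
  `/${props.contextVariables.stage}/observability-core/telemetry/image/uri` // assumed parameter name
);
new DockerImageFunction(this, 'LambdaImageFunction', {
  code: DockerImageCode.fromImageAsset(process.cwd(), {
    buildArgs: { BASE_IMAGE_URI: baseImageUri },
  }),
});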
Lambda Web Adapter Image With Extensions
While the Lambda Web Adapter provides a custom runtime, it brings some particularities to the way the Dockerfile must be written. As per the LWA documentation and examples, the base image used is public.ecr.aws/docker/library/node:22.9.0-slim and not public.ecr.aws/lambda/nodejs:22, which means the Lambda API interface can no longer be used for local invocation, e.g. using curl.
The example Dockerfile uses multiple stages:
FROM 904233108557.dkr.ecr.eu-west-1.amazonaws.com/lambda-telemetry-image:latest AS extensions
FROM public.ecr.aws/awsguru/aws-lambda-adapter:0.9.0-aarch64 AS webadapter
FROM public.ecr.aws/docker/library/node:22.9.0-slim
WORKDIR ${LAMBDA_TASK_ROOT}
COPY dist/* ./
COPY --from=extensions ./opt/ /opt/
COPY --from=webadapter /lambda-adapter /opt/extensions/lambda-adapter
CMD ["node", "index.js"]
The image uses the extension base image alongside the Lambda Web Adapter base image, copying the /opt contents of the extension image and the lambda-adapter binary into /opt/extensions. It also copies the built function code (here, an HTTP server listening on 8080, the default LWA port).
The particular behavior of LWA in this example is the way function logs are received in the extension. Only the function logs suffer from this unfortunate behavior: they are not formatted as a valid JSON object but are delivered as text events, even though the Lambda LogFormat is set to JSON. As a result, the official format shown above does not apply and element.record.message resolves to an undefined value. The following shows how the record is received: a representation of a JavaScript object surrounded by double quotes.
{
  "time": "2025-01-29T21:24:33.665Z",
  "type": "function",
  "record": "{ name: 'omid' }"
}
To resolve the problem, the extension is adapted to fall back to element.record when element.record.message has no value. But even that change is not sufficient, since the received record is a double-quoted representation of a JS object rather than valid JSON. So, on the application side, the log data must be serialized using JSON.stringify().
console.log(JSON.stringify( logObject ));
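On the extension side, the fallback described above can be sketched as follows, where element is the received telemetry event:
// Use record.message when the record is a structured object (JSON log format);
// otherwise fall back to the raw record string delivered with LWA.
const message =
  typeof element.record === 'string'
    ? element.record
    : element.record.message ?? JSON.stringify(element.record);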
Fargate with FireLens sidecar
Fargate, as a serverless solution for running containers on demand, supports both short-lived and long-running tasks. Regardless of the use case, enabling containers to communicate and complement each other's capabilities is essential for building scalable and efficient architectures.
In line with the examples in this article, this section demonstrates how to forward container logs to a central Kinesis Data Stream. To achieve this, the Fargate task can include a sidecar container responsible for collecting logs and forwarding them to the data stream.
The application container is built from the following Dockerfile:
FROM --platform=linux/arm64 public.ecr.aws/docker/library/node:22-slim
COPY dist/* ./
CMD ["node", "index.js"]
As mentioned above, there is another Dockerfile for the log forwarder container, which uses the FluentBit image provided by AWS.
FROM amazon/aws-for-fluent-bit:latest
ADD container.conf /container.conf
ADD parsers.conf /parsers.conf
As shown in the Dockerfile, there are two configuration files: one for parsing and one for container-specific configuration such as filtering. The content of both files is as below.
# parsers.conf file
[PARSER]
    Name    log_json
    Format  json

# container.conf file
[SERVICE]
    Parsers_File  parsers.conf

[FILTER]
    Name      parser
    Match     *
    Key_Name  log
    Parser    log_json

[FILTER]
    Name   grep
    Match  *
    Regex  app_name fargate-example-app
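For these filters to keep anything, the application must write one JSON object per log line that includes an app_name field matching the grep pattern. A minimal sketch of such an application container follows; the server port and extra field names are assumptions:
import { createServer } from 'node:http';
// Each log line is a single JSON object; FluentBit parses the `log` key with
// log_json and the grep filter keeps only lines whose app_name matches.
const log = (fields: Record<string, unknown>) =>
  console.log(JSON.stringify({ app_name: 'fargate-example-app', ...fields }));
createServer((req, res) => {
  log({ level: 'INFO', message: 'request received', path: req.url });
  res.end('ok');
}).listen(80); // assumed port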
Let's see how these are deployed and how the resources are created. AWS CDK provides L2 constructs that simplify the infrastructure-as-code steps. The example uses the FargateTaskDefinition and FargateService constructs.
const jobDefinition = new FargateTaskDefinition(this, 'JobDefinition', {
cpu: 256,
memoryLimitMiB: 512,
runtimePlatform: {
cpuArchitecture: CpuArchitecture.ARM64,
operatingSystemFamily: OperatingSystemFamily.LINUX
},
taskRole: jobTaskRole,
executionRole: jobTaskExecutionRole
});
After creating the base task definition, the application container and the FireLens log router are added as below.
jobDefinition.addContainer('Container', {
image: ContainerImage.fromAsset(join(process.cwd())),
logging: LogDrivers.firelens({
options: {
Name: 'kinesis_streams',
region,
stream: props.streamName
}
})
});
jobDefinition.addFirelensLogRouter('LoggingContainer', {
image: ContainerImage.fromAsset(join(process.cwd(), 'fluent-bit')),
logging: LogDrivers.awsLogs({
streamPrefix: 'logging',
logGroup: new LogGroup(this, 'FireLensLogGroup', {
logGroupName: `/ecs/${props.contextVariables.context}`,
retention: RetentionDays.ONE_DAY,
removalPolicy: RemovalPolicy.DESTROY
})
}),
environment: { FLB_LOG_LEVEL: 'info' },
firelensConfig: {
type: FirelensLogRouterType.FLUENTBIT,
options: {
configFileType: FirelensConfigFileType.FILE,
configFileValue: '/container.conf'
}
}
});
A service must be created to run the task consisting of the two side-by-side containers. This is simple and straightforward.
const service = new FargateService(this, 'Service', {
cluster,
capacityProviderStrategies: capacityStrategy,
desiredCount: 1,
platformVersion: FargatePlatformVersion.VERSION1_4,
propagateTags: PropagatedTagSource.TASK_DEFINITION,
taskDefinition: jobDefinition,
assignPublicIp: true,
vpcSubnets: {
subnets: vpc.publicSubnets
},
securityGroups: [taskSecurityGroup]
});
💡 For simplicity, the example allows assigning a public IP address to the task and places the service in public subnets. This is required because FargatePlatformVersion.VERSION1_4 tasks run under the managed awsvpc networking mode, and a public IP is the simplest way to let Fargate pull images from ECR. This is not recommended for production use.
The task role must have permission for the Kinesis PutRecords action. Here, the observability-core stack provides a managed policy that can be attached to the role.
const managedPolicyArn = StringParameter.fromStringParameterName(
this,
'ObservabilityManagedPolicy',
`/${props.contextVariables.stage}/logs-collector-observability-core/telemetry/kinesis/runtime/policy/arn`
).stringValue;
const jobTaskRole = new Role(this, 'JobTaskRole', {
assumedBy: new ServicePrincipal('ecs-tasks.amazonaws.com'),
managedPolicies:[
ManagedPolicy.fromManagedPolicyArn(this, 'TaskRoleManagedPolicy', managedPolicyArn)
]
});
const jobTaskExecutionRole = new Role(this, 'JobTaskExecutionRole', {
assumedBy: new ServicePrincipal('ecs-tasks.amazonaws.com'),
managedPolicies: [
ManagedPolicy.fromAwsManagedPolicyName('service-role/AmazonECSTaskExecutionRolePolicy')
]
});
After deployment, the public IP attached to the created ENI can be used over the HTTP protocol. The logs are sent to the Kinesis Data Stream as shown in the following screenshot.
💡 FireLens has the same problem as LWA mentioned before: the log metadata object must be stringified. If a JS object is logged directly, the same behavior as with the Web Adapter will occur.
Conclusion
While serverless offers a wide range of managed services that scale with demand, it is important not to forget the shared responsibility model, which requires engineering teams to stay engaged on their side. As part of software development, the use of processor capacity and memory remains under the engineering team's ownership. This is not far from traditional software principles, but it is somewhat forgotten amid the fascinating nature of managed services.
Using Lambda extensions, multiple containers, or background processes is a way to isolate non-critical processing and make the software running as the foreground process more trustworthy.
This article focused on log aggregation to show how to decouple critical processing from non-critical processing via isolation, and provided examples showcasing the implementation in different scenarios.