## Introduction
In real-world AWS platforms, a single CDK codebase is often deployed across multiple AWS accounts, each representing a different environment such as development, staging, or production.
While AWS CDK excels at defining infrastructure, it has a limitation:
it cannot make decisions at deploy time based on values stored inside the account, such as values stored in AWS Systems Manager (SSM) Parameter Store.
In this blog, we solve a practical platform-engineering problem using a Lambda-backed Custom Resource to make environment-aware decisions when installing an EKS Helm add-on.
## The Real-Life Problem
You are a platform engineer managing EKS clusters across multiple AWS accounts:
| Environment | Expectation |
|---|---|
| Development | Low cost, minimal redundancy |
| Staging | Production-like but smaller |
| Production | High availability |
Your organization already stores the environment type centrally as an SSM parameter:
`/platform/account/env`
with values like:
`development`, `staging`, or `production`.
Now you want to install **ingress-nginx** on every EKS cluster, but configure it differently:

- development → `replicaCount = 1`
- staging / production → `replicaCount = 2`
## Why Not Use CDK Context? (Short Version)
At first glance, CDK context variables may seem like a simpler solution for environment-based configuration. However, context values are resolved at synthesis time, not during deployment. This means they must be provided externally (via cdk.json or CI/CD pipelines) and are unaware of account-level metadata such as values stored in SSM Parameter Store. In multi-account platforms, this often leads to manual coordination, configuration drift, and governance issues. Since the environment classification already lives inside the AWS account and should be owned by the platform, using a deploy-time Custom Resource ensures the configuration is accurate, consistent, and centrally controlled.
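For contrast, a context-based approach would look roughly like the sketch below (the function and context-key names are illustrative, not from the stack in this post). The point is that the value must be supplied from outside and is frozen into the template at synthesis time:

```python
# Sketch of the context-based alternative (illustrative names only).
# The value must come from cdk.json or `cdk synth -c env=staging`,
# so it is baked into the template at synthesis time and cannot
# reflect what is actually stored in the target account's SSM.
def replica_count_from_context(env):
    if env is None:
        raise ValueError("context value 'env' must be supplied via -c or cdk.json")
    return 1 if env == "development" else 2

# Inside a stack this would be driven by:
#   env = self.node.try_get_context("env")
print(replica_count_from_context("development"))  # -> 1
```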
## Why CDK Alone Is Not Enough
AWS CDK evaluates logic during synthesis, but the SSM parameter value is only reliably available during deployment.
This means:
- You cannot branch with Python `if` statements on the parameter value
- You cannot hardcode environment values
- You cannot rely on CDK context safely
What you need is deploy-time logic.
## The Solution: Lambda-Backed Custom Resource
A Custom Resource allows CloudFormation to:
- Invoke a Lambda function during stack creation or update
- Wait for the result
- Consume returned attributes as inputs for other resources
In this case, the Custom Resource:
- Reads the environment value from SSM
- Computes the correct Helm value
- Returns it to CDK
- CDK passes it into the Helm chart
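Concretely, the handler behind a Custom Resource works against a small, well-defined contract. The shapes below are a simplified sketch of the event CloudFormation sends and the response the CDK Provider framework expects (fields trimmed, placeholder values):

```python
# Minimal sketch of the custom-resource contract (trimmed for readability).
sample_event = {
    "RequestType": "Create",             # or "Update" / "Delete"
    "ServiceToken": "arn:aws:lambda:eu-west-1:111111111111:function:example",
    "LogicalResourceId": "EnvToHelmValues",
    "ResourceProperties": {},
}

# With the CDK Provider framework, the handler only returns this shape;
# the framework takes care of posting it to the signed S3 response URL.
sample_response = {
    "PhysicalResourceId": "/platform/account/env:development",
    "Data": {"ReplicaCount": 1},         # read via env_cr.get_att("ReplicaCount")
}
```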
## Architecture Overview
Deployment flow:
1. CDK creates:
   - the EKS cluster
   - the SSM parameter `/platform/account/env`
   - the Lambda function
2. The Custom Resource triggers the Lambda
3. The Lambda computes the Helm values
4. The Helm chart is installed using the returned values
This gives you:
- One CDK codebase
- Zero manual steps
- Environment-aware behavior
## CDK Stack Code (Python)
Below is the CDK stack that creates:
- EKS cluster
- SSM parameter
- Lambda function
- Custom Resource
- Helm chart with dynamic values
```python
from aws_cdk import (
    Stack,
    aws_eks as eks,
    aws_ec2 as ec2,
    aws_iam as iam,
    aws_ssm as ssm,
    aws_signer as signer,
    aws_lambda as _lambda,
    custom_resources as cr,
    CustomResource,
    Token,
)
from aws_cdk.lambda_layer_kubectl_v34 import KubectlV34Layer
from constructs import Construct
import os


class TestStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, *,
                 vpc: ec2.IVpc = None, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Create the EKS cluster
        cluster = eks.Cluster(
            self, "MyEKS",
            version=eks.KubernetesVersion.V1_34,
            endpoint_access=eks.EndpointAccess.PUBLIC_AND_PRIVATE,
            default_capacity=0,
            default_capacity_instance=ec2.InstanceType.of(
                ec2.InstanceClass.T3,
                ec2.InstanceSize.MEDIUM
            ),
            kubectl_layer=KubectlV34Layer(self, "kubectl"),
            vpc=vpc,
            vpc_subnets=[
                ec2.SubnetSelection(
                    subnet_type=ec2.SubnetType.PRIVATE_WITH_EGRESS
                )
            ],
            cluster_name="MyEKS",
            tags={"Name": "MyEKS", "Purpose": "Swisscom-Interview"}
        )

        # EKS admin access
        admin_user = iam.User(self, "EKSAdmin")
        cluster.aws_auth.add_user_mapping(
            admin_user, groups=["system:masters"]
        )

        # Store the environment classification in SSM
        env_param = ssm.StringParameter(
            self, "MyEnvParam",
            parameter_name="/platform/account/env",
            string_value="development",
            description="Environment Name"
        )

        # Lambda code signing
        signing_profile = signer.SigningProfile(
            self, "SigningProfile",
            platform=signer.Platform.AWS_LAMBDA_SHA384_ECDSA
        )
        code_signing_config = _lambda.CodeSigningConfig(
            self, "CodeSigningConfig",
            signing_profiles=[signing_profile]
        )

        # Lambda function that reads the SSM parameter at deploy time
        fn = _lambda.Function(
            self, "MySSMParamLambda",
            runtime=_lambda.Runtime.PYTHON_3_13,
            handler="index.lambda_handler",
            code=_lambda.Code.from_asset(
                os.path.join(
                    os.path.dirname(__file__),
                    "lambda_functions"
                )
            ),
            environment={
                "SSM_PARAM_NAME": "/platform/account/env"
            },
            code_signing_config=code_signing_config
        )
        fn.add_to_role_policy(
            iam.PolicyStatement(
                actions=["ssm:GetParameter"],
                resources=[
                    f"arn:aws:ssm:{self.region}:{self.account}:parameter/platform/account/env"
                ]
            )
        )

        # Custom Resource provider
        provider = cr.Provider(
            self, "EnvToHelmProvider",
            on_event_handler=fn
        )
        env_cr = CustomResource(
            self, "EnvToHelmValues",
            service_token=provider.service_token
        )
        # Ensure the parameter exists before the Lambda tries to read it
        env_cr.node.add_dependency(env_param)

        replica_count = Token.as_number(
            env_cr.get_att("ReplicaCount")
        )

        # Install the ingress-nginx Helm chart with the deploy-time value
        cluster.add_helm_chart(
            "ingress-nginx",
            chart="ingress-nginx",
            repository="https://kubernetes.github.io/ingress-nginx",
            namespace="kube-system",
            values={
                "controller": {
                    "replicaCount": replica_count
                }
            }
        )
```
## Lambda Code (Custom Resource Logic)
This Lambda:
- Reads the environment from SSM
- Computes the correct replica count
- Returns values back to CloudFormation
```python
import boto3
import os
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

ssm = boto3.client("ssm")
PARA_NAME = os.environ["SSM_PARAM_NAME"]


def get_parameter(para_name):
    parameter = ssm.get_parameter(Name=para_name)
    env = parameter["Parameter"]["Value"].strip().lower()

    if env == "development":
        value = 1
    elif env in ["staging", "production"]:
        value = 2
    else:
        raise ValueError(
            f"Invalid environment {env} in SSM Parameter {para_name}"
        )

    logger.info(f"Computed replicaCount={value} for env={env}")
    return {
        "Environment": env,
        "ReplicaCount": value,
    }


def lambda_handler(event, context):
    # Nothing to clean up on delete; just echo the physical ID back
    if event.get("RequestType") == "Delete":
        return {
            "PhysicalResourceId": event.get("PhysicalResourceId", "env"),
            "Data": {},
        }

    data = get_parameter(PARA_NAME)
    return {
        "PhysicalResourceId": f"{PARA_NAME}:{data['Environment']}",
        "Data": data,
    }
```
## Why This Pattern Works Well
This approach provides:
- Single CDK codebase
- Environment-aware behavior
- No manual Helm overrides
- Deploy-time decision making
- Testable business logic
It scales well as you add more add-ons such as:
- cert-manager
- external-dns
- cluster-autoscaler
- logging agents
## Conclusion
AWS CDK is declarative by nature, but real platforms require deploy-time intelligence. By combining Lambda-backed Custom Resources with CDK, you can make infrastructure decisions based on real account metadata, not hardcoded assumptions.
This pattern is a powerful tool for platform teams aiming for consistency, safety, and automation across environments.