DEV Community

Ian Carlson

Mastering AWS CDK Dependencies: How to Avoid the "Stuck Stack" Deadlock

As applications built with the AWS Cloud Development Kit (CDK) grow in complexity, developers often encounter a specific, frustrating architectural hurdle: the CloudFormation export deadlock.

It usually begins innocently. You might have a DatabaseStack and an ApiStack. To allow the API to access the database, you pass the DynamoDB table object from one stack to the other. Behind the scenes, CDK turns that reference into a CloudFormation export/import pair: an exported Output in the producer's template and an Fn::ImportValue in the consumer's.
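
A minimal sketch of how that coupling comes about, assuming aws-cdk-lib v2. The stack shapes, the TABLE_NAME variable, and the inline handler are illustrative assumptions, not code from a real project.

```typescript
import * as cdk from 'aws-cdk-lib';
import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';
import * as lambda from 'aws-cdk-lib/aws-lambda';

// Producer: owns the table and exposes it as a public property.
class DatabaseStack extends cdk.Stack {
  public readonly table: dynamodb.Table;

  constructor(scope: cdk.App, id: string) {
    super(scope, id);
    this.table = new dynamodb.Table(this, 'MyTable', {
      partitionKey: { name: 'pk', type: dynamodb.AttributeType.STRING },
    });
  }
}

// Consumer: receives the table object from the other stack.
class ApiStack extends cdk.Stack {
  constructor(scope: cdk.App, id: string, table: dynamodb.ITable) {
    super(scope, id);
    new lambda.Function(this, 'ApiHandler', {
      runtime: lambda.Runtime.NODEJS_20_X,
      handler: 'index.handler',
      // Inline placeholder so the sketch needs no asset directory.
      code: lambda.Code.fromInline('exports.handler = async () => ({ statusCode: 200 });'),
      // Using a value from another stack is what triggers the export/import pair.
      environment: { TABLE_NAME: table.tableName },
    });
  }
}

const app = new cdk.App();
const db = new DatabaseStack(app, 'DatabaseStack');
const api = new ApiStack(app, 'ApiStack', db.table);
```

After synthesis, the DatabaseStack template contains an exported Output for the table name, and the ApiStack template reads it with Fn::ImportValue. That pair is what CloudFormation later refuses to break.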

Months later, when you attempt to rename a resource or refactor the DatabaseStack, the deployment fails with a blocking error:

Export Output:MyTableExport is in use by stack ApiStack

You are now in a dependency deadlock. You cannot change or remove the export in the producer (Database) because the deployed consumer (API) still imports it. Conversely, removing the reference in code is not enough on its own: once nothing references the table, CDK also drops the export from the producer's template, so any deployment that updates the producer before the consumer fails with the same error.

There are robust architectural patterns to prevent this (Consolidation and Runtime Decoupling), but first, we must address the immediate operational issue: How do you resolve an active deadlock?


Immediate Remediation: Breaking the Deadlock

If you are currently facing this error, standard deployments will not work because CloudFormation enforces strict referential integrity. You must manually break the dependency chain using a multi-step deployment process.

Step 1: Sever the Dependency (The "Detach" Deploy)

Modify your Consumer Stack (e.g., ApiStack) to stop referencing the Producer Stack's output.

  • The Code Change: Temporarily hard-code the value you were previously importing. For example, replace the imported Table Object with a string literal: const tableName = "my-actual-table-name".
  • The Deploy: Deploy only the Consumer Stack (cdk deploy ApiStack). Adding --exclusively (-e) tells CDK not to pull in any stacks that ApiStack still depends on.
  • The Result: CloudFormation updates the API stack. It removes the Import reference, freeing the Export.
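
One way the detached consumer from Step 1 might look, as a sketch assuming aws-cdk-lib v2. It uses the placeholder name my-actual-table-name from the bullet above and wraps it with Table.fromTableName so that IAM grants keep working. The construct IDs, the TABLE_NAME variable, and the inline handler are hypothetical.

```typescript
import * as cdk from 'aws-cdk-lib';
import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';
import * as lambda from 'aws-cdk-lib/aws-lambda';

// Temporary consumer: it no longer receives anything from DatabaseStack.
class ApiStack extends cdk.Stack {
  constructor(scope: cdk.App, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // Hard-coded for the duration of the refactor; must match the
    // physical name of the table that is actually deployed.
    const table = dynamodb.Table.fromTableName(
      this,
      'DetachedTable',
      'my-actual-table-name',
    );

    const apiHandler = new lambda.Function(this, 'ApiHandler', {
      runtime: lambda.Runtime.NODEJS_20_X,
      handler: 'index.handler',
      code: lambda.Code.fromInline('exports.handler = async () => ({ statusCode: 200 });'),
      environment: { TABLE_NAME: table.tableName },
    });

    // The grant is built from the literal name, not from a cross-stack token.
    table.grantReadWriteData(apiHandler);
  }
}

const app = new cdk.App();
new ApiStack(app, 'ApiStack');
```

Because the table is now identified by a literal, the synthesized ApiStack template contains no Fn::ImportValue, which is what frees the export once this stack is deployed.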

Step 2: Update the Producer

With the export no longer "in use," you are free to modify the Producer Stack.

  • The Code Change: Apply your intended refactor, rename, or deletion in the DatabaseStack.
  • The Deploy: Deploy the Producer Stack (cdk deploy DatabaseStack).
  • The Result: The deployment succeeds.

Step 3: Re-bind the Resources

Now that the infrastructure is in a clean state, you can restore the connection using one of the sustainable strategies outlined below.


Strategy 1: The Simplest Solution (Stack Consolidation)

Before implementing complex decoupling patterns, it is worth evaluating whether multiple stacks are necessary for your use case.

Historically, developers split applications into NetworkStack, DatabaseStack, and ApiStack to avoid hitting the CloudFormation limit of 200 resources per stack. AWS has since increased this limit to 500 resources. Consequently, most medium-sized applications can comfortably fit within a single stack context.

The Case for Consolidation

When resources exist in the same stack, CloudFormation resolves dependencies internally through direct references (Ref and Fn::GetAtt) rather than exports and imports.

  • Eliminates Deadlocks: CloudFormation automatically determines the correct order of operations for updates and replacements.
  • Atomic Deployments: The entire application either deploys successfully or rolls back completely, preventing inconsistent states.
  • Simplified Code: There is no need to pass properties between stack classes or manage complex interface contracts.
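
The points above can be sketched as a single consolidated stack, assuming aws-cdk-lib v2. AppStack, the TABLE_NAME variable, and the inline handler are hypothetical names chosen for illustration.

```typescript
import * as cdk from 'aws-cdk-lib';
import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';
import * as lambda from 'aws-cdk-lib/aws-lambda';

// Table and function live in the same stack, so no export is needed.
class AppStack extends cdk.Stack {
  constructor(scope: cdk.App, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const table = new dynamodb.Table(this, 'MyTable', {
      partitionKey: { name: 'pk', type: dynamodb.AttributeType.STRING },
      billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
    });

    const apiHandler = new lambda.Function(this, 'ApiHandler', {
      runtime: lambda.Runtime.NODEJS_20_X,
      handler: 'index.handler',
      code: lambda.Code.fromInline('exports.handler = async () => ({ statusCode: 200 });'),
      // Resolves to a plain { Ref: ... } inside the same template.
      environment: { TABLE_NAME: table.tableName },
    });

    // Same-stack grant: CloudFormation orders the resources itself.
    table.grantReadWriteData(apiHandler);
  }
}

const app = new cdk.App();
new AppStack(app, 'AppStack');
```

The synthesized template has no Outputs section at all; the function's environment variable points at the table with an ordinary Ref.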

For single teams managing individual services, consolidation is often the most pragmatic approach.


Strategy 2: The Scalable Solution (Runtime Decoupling)

If organizational boundaries or lifecycle differences (e.g., stateful vs. stateless resources) require you to maintain multiple stacks, you must decouple them to avoid future deadlocks.

To do this, we shift from Deploy-time dependencies (hard CloudFormation Exports) to Runtime dependencies.

  1. Stack A writes a configuration value (e.g., a Table Name) to a specialized store, such as AWS Systems Manager (SSM) Parameter Store.
  2. Stack B provides its Lambda function with the key to that parameter.
  3. The Lambda fetches the value at runtime.

In this model, the stacks are unaware of each other's existence; they share only a contract regarding the parameter key.


Implementation: The SSM Parameter Store Pattern

Step 1: The Contract (Shared Configuration)

Define the parameter name in a shared constant file. This ensures both the producer and consumer reference the exact same location in the Parameter Store.

// lib/config.ts
export const TABLE_PARAM_KEY = '/my-app/prod/dynamodb-table-name';
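
The key above hard-codes the prod stage. If several environments share the same code, a small builder can keep producer and consumer aligned. This is only a sketch: paramKey and Stage are hypothetical names, and the character check reflects the characters Parameter Store accepts in a name segment (letters, digits, and . - _).

```typescript
// lib/config.ts (hypothetical variant with a key builder)
type Stage = 'dev' | 'staging' | 'prod';

// Builds a hierarchical Parameter Store name such as /my-app/prod/some-name.
export function paramKey(appName: string, stage: Stage, name: string): string {
  const segments = [appName, stage, name];
  for (const segment of segments) {
    // Reject empty segments and characters that are not valid in a parameter name.
    if (!/^[A-Za-z0-9_.-]+$/.test(segment)) {
      throw new Error(`Invalid parameter path segment: "${segment}"`);
    }
  }
  return '/' + segments.join('/');
}

export const TABLE_PARAM_KEY = paramKey('my-app', 'prod', 'dynamodb-table-name');
```

Both stacks would import the same function, so a typo in the path fails at synthesis time on both sides instead of surfacing as a missing parameter at runtime.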

Step 2: The Producer (Database Stack)

Create the resource and publish its identifier to SSM.

// lib/database-stack.ts
import * as cdk from 'aws-cdk-lib';
import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';
import * as ssm from 'aws-cdk-lib/aws-ssm';
import { TABLE_PARAM_KEY } from './config';

export class DatabaseStack extends cdk.Stack {
  constructor(scope: cdk.App, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // 1. Create the resource
    const table = new dynamodb.Table(this, 'MyTable', {
      partitionKey: { name: 'pk', type: dynamodb.AttributeType.STRING },
      billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
    });

    // 2. Publish the value to Parameter Store
    new ssm.StringParameter(this, 'TableNameParam', {
      parameterName: TABLE_PARAM_KEY,
      stringValue: table.tableName,
    });
  }
}

Step 3: The Consumer (API Stack)

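The API stack does not import the DatabaseStack. It simply passes the parameter key to the Lambda environment and grants the function permission to read that parameter. Note that the function also needs IAM permissions on the table itself (for example dynamodb:PutItem), which this example does not grant.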

// lib/api-stack.ts
import * as cdk from 'aws-cdk-lib';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import * as ssm from 'aws-cdk-lib/aws-ssm';
import { TABLE_PARAM_KEY } from './config';

export class ApiStack extends cdk.Stack {
  constructor(scope: cdk.App, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // 1. Create the Lambda Function
    const apiHandler = new lambda.Function(this, 'ApiHandler', {
      runtime: lambda.Runtime.NODEJS_20_X, // Node.js 18 has reached end of life
      code: lambda.Code.fromAsset('lambda'),
      handler: 'index.handler',
      environment: {
        // Pass the KEY, not the value
        TABLE_CONFIG_KEY: TABLE_PARAM_KEY, 
      },
    });

    // 2. Grant Permissions
    // Construct a reference to the parameter to grant specific access
    const param = ssm.StringParameter.fromStringParameterName(
      this, 
      'TableParamRef', 
      TABLE_PARAM_KEY
    );

    param.grantRead(apiHandler);
  }
}
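
The stack above lets the function read the parameter, but nothing gives it access to the table. One possible way to close that gap without reintroducing an export is sketched below, assuming aws-cdk-lib v2. It assumes the producer also publishes table.tableArn under a second, hypothetical key (TABLE_ARN_PARAM_KEY), which CloudFormation resolves at deploy time through an SSM-typed template parameter.

```typescript
import * as cdk from 'aws-cdk-lib';
import * as iam from 'aws-cdk-lib/aws-iam';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import * as ssm from 'aws-cdk-lib/aws-ssm';

// Hypothetical second contract key; not part of the original config file.
const TABLE_ARN_PARAM_KEY = '/my-app/prod/dynamodb-table-arn';

class ApiStack extends cdk.Stack {
  constructor(scope: cdk.App, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const apiHandler = new lambda.Function(this, 'ApiHandler', {
      runtime: lambda.Runtime.NODEJS_20_X,
      handler: 'index.handler',
      code: lambda.Code.fromInline('exports.handler = async () => ({ statusCode: 200 });'),
    });

    // Becomes a template parameter of type AWS::SSM::Parameter::Value<String>,
    // so the ARN is looked up in Parameter Store when the stack is deployed.
    const tableArn = ssm.StringParameter.valueForStringParameter(
      this,
      TABLE_ARN_PARAM_KEY,
    );

    // Least-privilege statement for the single action the handler uses.
    apiHandler.addToRolePolicy(
      new iam.PolicyStatement({
        actions: ['dynamodb:PutItem'],
        resources: [tableArn],
      }),
    );
  }
}

const app = new cdk.App();
new ApiStack(app, 'ApiStack');
```

This trades one constraint for another: the ARN parameter has to exist before ApiStack is deployed, and the policy only picks up a replaced table on the next ApiStack deployment. It does not create an export, so it cannot lock the producer.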

Step 4: Runtime Logic with Caching

Fetching parameters from SSM involves an API call, which introduces latency and cost. To reduce both, cache the value in the Lambda execution environment: variables defined outside the handler persist between invocations for as long as that environment stays active ("warm").

// lambda/index.mjs
import { SSMClient, GetParameterCommand } from "@aws-sdk/client-ssm";
import { DynamoDBClient, PutItemCommand } from "@aws-sdk/client-dynamodb";

const ssmClient = new SSMClient();
const dynamoClient = new DynamoDBClient();

// GLOBAL CACHE
// These variables persist across warm invocations
let cachedTableName = null;
let cacheTimestamp = 0;
const CACHE_TTL_MS = 60 * 1000 * 5; // Cache for 5 minutes

const getTableName = async () => {
  const now = Date.now();

  // 1. Check if cache is valid
  if (cachedTableName && (now - cacheTimestamp < CACHE_TTL_MS)) {
    console.log("Using cached table name");
    return cachedTableName;
  }

  // 2. If not, fetch from SSM
  console.log("Cache expired or empty. Fetching from SSM...");
  const command = new GetParameterCommand({
    Name: process.env.TABLE_CONFIG_KEY 
  });

  const response = await ssmClient.send(command);

  // 3. Update Cache
  cachedTableName = response.Parameter.Value;
  cacheTimestamp = now;

  return cachedTableName;
};

export const handler = async (event) => {
  try {
    const tableName = await getTableName();

    const command = new PutItemCommand({
      TableName: tableName,
      Item: {
        pk: { S: "user_123" },
        data: { S: "some data" }
      }
    });

    await dynamoClient.send(command);
    return { statusCode: 200, body: `Written to ${tableName}` };

  } catch (error) {
    console.error(error);
    return { statusCode: 500, body: "Error" };
  }
};
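
The caching logic above can also be factored into a small reusable helper. The sketch below is written in TypeScript with no AWS dependency so it can be unit-tested; cachedLoader is a hypothetical name, and the injectable clock exists only to make expiry testable. Unlike the handler above, it also shares one in-flight fetch between concurrent callers, which is an addition rather than part of the original logic.

```typescript
type Loader<T> = () => Promise<T>;

// Wraps an async loader with a time-based cache.
function cachedLoader<T>(
  load: Loader<T>,
  ttlMs: number,
  now: () => number = Date.now,
): Loader<T> {
  let value: T | undefined;
  let expiresAt = 0;
  let inFlight: Promise<T> | undefined;

  return async () => {
    // Serve from cache while the entry is still fresh.
    if (value !== undefined && now() < expiresAt) {
      return value;
    }
    // Start a fetch only if none is already running.
    if (!inFlight) {
      inFlight = load().then(
        (loaded) => {
          value = loaded;
          expiresAt = now() + ttlMs;
          inFlight = undefined;
          return loaded;
        },
        (err) => {
          // Clear the slot so the next call can retry.
          inFlight = undefined;
          throw err;
        },
      );
    }
    return inFlight;
  };
}
```

In the handler, getTableName could then be defined once at module scope by passing cachedLoader a function that performs the GetParameterCommand call, together with the five-minute TTL.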

Beyond Parameter Store: Advanced Decoupling

While SSM is the standard solution, specific requirements may dictate alternative stores.

1. Sensitive Data: AWS Secrets Manager

If the shared value is sensitive (e.g., database credentials, API keys), do not store it as a plain String parameter in SSM Parameter Store. AWS Secrets Manager is the appropriate tool; Parameter Store's SecureString type encrypts values but offers no built-in rotation. The implementation pattern remains identical: the Producer writes the secret, and the Consumer fetches it at runtime using the SDK, ensuring data remains encrypted and rotatable.

2. Service Discovery: AWS Cloud Map

For more complex, dynamic architectures, AWS Cloud Map offers a robust alternative to SSM. It allows you to register arbitrary resources (like S3 buckets or Event Buses) as instances of named "services" with custom attributes.

Why Cloud Map?

  • Logical Naming: Resolve my-app/user-service to dynamic endpoints.
  • Health Checks: Unlike SSM, Cloud Map can verify a resource's health before returning its address.
  • Metadata: Attach rich metadata (version, region, protocol) to the resource registration.

Conclusion

Dependency deadlocks in CloudFormation are a common rite of passage for CDK engineers, but they are not inevitable.

If your architecture allows, Strategy 1 (Consolidation) is the most efficient path, resolving dependencies natively within the CloudFormation engine. For distributed systems requiring modularity, Strategy 2 (Runtime Decoupling) provides the flexibility needed to evolve stacks independently without fear of "locking" your infrastructure.
