Steven Smiley

Posted on Mar 3, 2022 • Edited on Mar 15, 2022

12-Factor AWS CDK Apps

#aws #devops #awscdk #cloud

The twelve-factor app methodology outlines best practices for building web apps to enable scalability and reliability on cloud platforms. AWS CDK focuses on deploying cloud infrastructure, but enables you to put infrastructure, application code, and configuration all in one place. What can we learn if we apply the twelve-factor app methodology to AWS CDK? Is it still relevant in an increasingly serverless world?

I won't bury the lede: CDK makes it easy to adhere to the 12-factor model, and embracing serverless on AWS makes many of the factors so easy that you won't have to worry about them. Let's examine each factor and uncover practices we can apply.

The Twelve Factors

I. Codebase

One codebase tracked in revision control, many deploys

CDK enables you to have one codebase by grouping an app's infrastructure and application code into the same repository. Without this, an app's source code isn't really complete -- how is this app deployed?

CDK enables multiple apps to share the same code via constructs, which are then imported as libraries. Without CDK, infrastructure code is often copied and pasted multiple times across apps or even within the same app -- violating this factor and the Don't Repeat Yourself (DRY) principle.

CDK makes many deploys easy as well, by introducing the concept of environments, especially in conjunction with CDK Pipelines. By defining pipeline stages, logical groupings of stacks, you can deploy to as many environments as you want using the same codebase.

II. Dependencies

Explicitly declare and isolate dependencies

Each CDK-supported language has a mechanism for both declaring dependencies (for example Python's requirements.txt) and isolating dependencies (for example Python's virtualenv). This greatly simplifies environment setup and ensures reliability when deploying, especially through a pipeline.

Be sure to avoid installing or updating packages outside of this mechanism. If you pip install xxxxx into your virtualenv but forget to add it to your requirements.txt, your app will cdk synth locally but fail on deployment.

III. Config

Store config in the environment

Separating config from app code is much easier with CDK than with purely declarative formats like CloudFormation. It can be tempting to hard-code values in templates rather than declare dozens or hundreds of variables, but this would prevent them from being portable.

The line between app and environment can be difficult to discern with CDK, however, as CDK is deploying the infrastructure which makes up the environment. For specific application components, you should use CDK to provision infrastructure with the appropriate environment variables or equivalent configuration: for example, Lambda environment variables, SSM parameters, or Secrets Manager secrets. But how do we handle configuration of the CDK app itself?

Each CDK app has an App construct which defines the entry point and takes an environment parameter, binding the app to the environment. With CDK Pipelines, the App contains a pipeline stack, which then defines the rest of the deployment environments. In each pipeline stage, you'll need to provide the configuration parameters for that stage.

We can store this config as context variables in the cdk.json, but this doesn't scale well and it doesn't support types. Since CDK gives us the power of a general purpose programming language, let's use it! For example, in Python we can create a config.py file and define variables, or even objects with types, which we then import into the pipeline.
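Here's a minimal sketch of what such a config.py might look like. The StageConfig class, its field names, and the account IDs are all hypothetical and only illustrate the idea of typed, per-environment config objects; adapt them to whatever your stages actually need.

```python
# config.py -- hypothetical module; every name and value here is illustrative.
from dataclasses import dataclass


@dataclass(frozen=True)
class StageConfig:
    """Settings that differ between deployment environments."""

    stage_name: str
    account: str  # AWS account ID, kept as a string so leading zeros survive
    region: str
    instance_count: int
    log_retention_days: int
    require_approval: bool = False


DEV = StageConfig(
    stage_name="dev",
    account="111111111111",  # placeholder account ID
    region="us-east-1",
    instance_count=1,
    log_retention_days=7,
)

PROD = StageConfig(
    stage_name="prod",
    account="222222222222",  # placeholder account ID
    region="us-east-1",
    instance_count=3,
    log_retention_days=90,
    require_approval=True,
)

# The pipeline can loop over this tuple to create one stage per environment.
STAGES = (DEV, PROD)
```

The pipeline definition could then import STAGES and hand each StageConfig to the corresponding stage as a parameter. Because the dataclass is frozen, a stage can't accidentally mutate its config after the fact, and a type checker can catch a misspelled field that a cdk.json context lookup would silently miss.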

You may be tempted to avoid declaring the environment in the CDK application altogether, to strictly separate the two. This is possible, but not recommended for production -- during CDK's synth phase, it will resolve environment-specific details that you want to ensure are deterministic. For example, you might look up an existing VPC or distribute resources across availability zones. This information will be stored in cdk.context.json and won't need to be looked up again, preventing unexpected behavior.

IV. Backing services

Treat backing services as attached resources

CDK apps do not use backing services in the way this factor is traditionally viewed, but we can still learn from the principle. At its core, this factor advocates for strong encapsulation and loose coupling between app components. This allows us to separate concerns across clear boundaries, iterate on independent components without affecting interfacing components, and potentially swap them out completely if needed.

To achieve this with CDK apps, encapsulate logical components into constructs and compose them into stacks and stages for deployment. This may increase the verbosity of an app since it requires passing parameters across construct boundaries but the resulting constructs are portable, testable, and maintainable.

V. Build, release, run

Strictly separate build and run stages

This is straightforward: build occurs during cdk synth, release occurs during cdk deploy, and run is the resulting resources managed by AWS.

To go a step further, CDK Pipelines automatically defines separate stages to build the CDK app and then perform CloudFormation deployments, each with separate prepare and deploy steps.

This factor is built into CDK -- awesome!

VI. Processes

Execute the app as one or more stateless processes

This factor does not directly translate to CDK apps but there is a principle we can extract from it -- strong delineation between stateful and stateless resources.

In CDK, this is best performed at the stack level, separating stateful and stateless stacks. When defining a database, for instance, place it in a separate stack and use cross-stack references to connect other resources to it. You can also enable termination protection on stateful stacks.

At the resource level, set RemovalPolicy to RETAIN or SNAPSHOT to avoid deleting data.
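RemovalPolicy.RETAIN and RemovalPolicy.SNAPSHOT show up in the synthesized CloudFormation template as a DeletionPolicy attribute on the resource, so one way to guard against regressions is to inspect the template. The following is a rough sketch, not an official CDK feature: the function name and the list of "stateful" resource types are my own assumptions, and the list is deliberately incomplete.

```python
from typing import Any

# Resource types treated as stateful in this sketch (illustrative, not exhaustive).
STATEFUL_TYPES = {
    "AWS::DynamoDB::Table",
    "AWS::RDS::DBInstance",
    "AWS::RDS::DBCluster",
    "AWS::S3::Bucket",
}

# DeletionPolicy values that keep the data around when the stack is deleted.
# Note: CloudFormation only supports Snapshot on some resource types; this
# sketch does not validate that.
SAFE_POLICIES = {"Retain", "Snapshot"}


def unprotected_stateful_resources(template: dict[str, Any]) -> list[str]:
    """Return logical IDs of stateful resources lacking a safe DeletionPolicy."""
    unprotected = []
    for logical_id, resource in template.get("Resources", {}).items():
        if resource.get("Type") not in STATEFUL_TYPES:
            continue  # stateless resources are fine to delete
        if resource.get("DeletionPolicy") not in SAFE_POLICIES:
            unprotected.append(logical_id)
    return sorted(unprotected)


# A hand-written template fragment standing in for real `cdk synth` output.
example_template = {
    "Resources": {
        "OrdersTable": {"Type": "AWS::DynamoDB::Table", "DeletionPolicy": "Retain"},
        "Database": {"Type": "AWS::RDS::DBInstance", "DeletionPolicy": "Snapshot"},
        "ScratchBucket": {"Type": "AWS::S3::Bucket", "DeletionPolicy": "Delete"},
        "Handler": {"Type": "AWS::Lambda::Function"},
    }
}

print(unprotected_stateful_resources(example_template))  # ['ScratchBucket']
```

A check like this could run in a unit test against each stateful stack's template, failing the build before a risky change ever reaches deployment.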

VII. Port binding

Export services via port binding

Again, this factor does not apply to the CDK portion of an app, but there's still an underlying principle: create self-contained apps that can interface with other apps through web services.

We accomplish this with CDK by grouping all related resources into a single CDK app, which may provide backing services for other apps using deliberately exposed interfaces. This also encourages us to prevent other apps from modifying our app's underlying resources.

In AWS, the strictest boundary is at the account level. By deploying CDK apps into separate AWS accounts, we prevent direct access to underlying resources, allowing access only across deliberately exposed interfaces, such as web APIs or cross-account shared resources.

VIII. Concurrency

Scale out via the process model

One of the primary benefits of using AWS in the first place is that most services are designed from the ground up with this principle built-in. The more we build apps by connecting those services together instead of running our own processes, the less we have to think about concurrency.

We do have to consider AWS service limits, however. CDK doesn't check whether we have a sufficiently high limit during the build phase, so the deployment will fail if these limits are reached. For example, currently CloudFormation limits each stack to 500 resources. I don't know of any tool that can check templates against service limits, but this could be an opportunity for the future.
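For the per-stack resource limit specifically, a small script over the synthesized templates can give an early warning. This is only a sketch under a few assumptions: it expects templates named *.template.json under the synth output directory, it uses the 500-resource figure quoted above (check the current CloudFormation quotas before relying on it), and the function names and the 80% warning threshold are my own choices. It says nothing about account-level service quotas.

```python
import json
from pathlib import Path

# Per-stack resource limit quoted in the text; verify against current quotas.
MAX_RESOURCES_PER_STACK = 500


def count_resources(template: dict) -> int:
    """Count the resources declared in one CloudFormation template."""
    return len(template.get("Resources", {}))


def stacks_near_limit(
    assembly_dir: str,
    limit: int = MAX_RESOURCES_PER_STACK,
    warn_ratio: float = 0.8,
) -> dict[str, int]:
    """Map template path -> resource count for templates at or above warn_ratio * limit."""
    root = Path(assembly_dir)
    flagged = {}
    # rglob also picks up templates nested in sub-assemblies (e.g. pipeline stages).
    for path in sorted(root.rglob("*.template.json")):
        count = count_resources(json.loads(path.read_text()))
        if count >= warn_ratio * limit:
            flagged[str(path.relative_to(root))] = count
    return flagged
```

After running cdk synth, calling stacks_near_limit("cdk.out") returns an empty dict when every stack is comfortably under the threshold. Running it as a build step turns a late deployment failure into an early, cheap one.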

IX. Disposability

Maximize robustness with fast startup and graceful shutdown

In the context of a CDK app, this factor encourages us to minimize deployment time and design stateless stacks that can be deleted without side effects.

CloudFormation deployments can feel slow compared to competing tools like Terraform because they perform many built-in checks to improve deployment safety. But that doesn't mean they have to be slow. One easy way to improve deployment speed is using CDK Pipelines waves, which deploy stacks and stages in parallel. This achieves a healthy balance between speed and safety.

X. Dev/prod parity

Keep development, staging, and production as similar as possible

Without Infrastructure as Code, this factor is essentially impossible. But even with other approaches to IaC, this can be difficult to achieve and often requires significant effort in developing parameterized templates and a deployment pipeline. CDK Pipelines makes this so easy!

Define each stage and stack to take configuration parameters for anything that needs to be different across environments, then use those same constructs to deploy to each environment with its respective parameters. This one pipeline can deploy across AWS accounts, so each environment is isolated. You can even set a ManualApprovalStep between environments if desired, to gate deployment to production.

XI. Logs

Treat logs as event streams

This is already the model for CloudWatch Logs, so there isn't much for us to consider here. One CDK tip, however, is to always set log retention policies. The default log retention is infinite, and you don't want to pay for accumulating a bunch of logs you don't need.

XII. Admin processes

Run admin/management tasks as one-off processes

If there are admin tasks that you want to be prepared to execute on-demand, consider creating them with Systems Manager Automation documents or Step Functions, which can now perform any AWS API action directly. You can invoke them on-demand with input parameters from the AWS Console. By bundling maintenance operations with the app, whoever needs them later will be able to find them easily and will be very happy. That might even be future you!

Conclusion

We've seen that CDK is not only compatible with the 12-factor model, but much of that model is built into CDK itself. And the more we leverage 'serverless' AWS services, the less we have to worry about these factors altogether.

Below I have summarized the key practices for building 12-factor CDK apps. Unsurprisingly, this list is compatible with the CDK developer guide's best practices, but through this exercise we've learned some specific techniques and the reasoning behind them.

Key Practices for 12-Factor CDK Apps

- Model apps using constructs, composed into stacks and stages for deployment.
  - When constructs need to be used in multiple apps, move them to their own repository so they can be maintained independently of the application's lifecycle.
  - Separate stateful and stateless stacks. When defining a database, for instance, place it in a separate stack and use cross-stack references to connect other resources to it.
    - Enable termination protection on stateful stacks and resources so they aren't accidentally deleted.
    - Set RemovalPolicy to RETAIN or SNAPSHOT to avoid deleting data.
  - Use CDK Pipelines for painless Continuous Delivery
    - Group stacks and stages into waves for parallel deployment
    - Deploy CDK apps into separate AWS accounts to prevent direct access to underlying resources, allowing access only across deliberately exposed interfaces, such as web APIs or cross-account shared resources.
- Ensure you are using the dependency isolation mechanism expected by your CDK app's language. In Python, source .venv/bin/activate.
  - Declare and install dependencies using only the mechanism supported by your CDK app's language. In Python, add it to requirements.txt and run pip install -r requirements.txt.
- Define separate files for constants and config
  - Import config only to your main app and pipeline. If you find yourself importing config directly into a stage, stack, or construct -- stop. If it varies across environments it needs to be a parameter, and if it doesn't then it needs to be defined as a constant.
  - Constants can be used across the app, as long as their values would not change across environments. For example, it can be useful to define tag keys and values that will apply globally.
  - Explicitly specify production environments and commit cdk.context.json to avoid non-deterministic behavior.
- Set LogRetention policies
- Create admin processes in your app to support operations
