Overview
When introducing breaking changes in CDK, implementing feature flags is mandatory.
Since there isn't much information about feature flags, I created this article with two goals:
- Summarize the implementation method for contributors
- Explain the role and purpose of feature flags for CDK users who are unfamiliar with them
If you're in the latter category, feel free to skip the implementation steps section.
Introduction
In 2023, AWS Network Load Balancer (NLB) finally introduced the long-awaited security group (SG) support.
This was a feature that most NLB users had been eagerly waiting for, and with its introduction, there's no longer any benefit to using the legacy "NLB without SG."
https://docs.aws.amazon.com/elasticloadbalancing/latest/network/load-balancer-security-groups.html
AWS CDK also enabled SG configuration immediately after the release, but to avoid breaking changes, the default behavior was still to create NLBs without SGs.
However, once you create an NLB without SG, it's impossible to add SG configuration later. (The NLB will be replaced!)
declare const vpc: ec2.IVpc;
declare const sg: ec2.ISecurityGroup;
// Creates NLB without SG
new elbv2.NetworkLoadBalancer(this, 'Nlb', {
  vpc,
});
// Creates NLB with SG
new elbv2.NetworkLoadBalancer(this, 'Nlb', {
  vpc,
  securityGroups: [sg], // Having to specify the SG every time is tedious
});
// Desired behavior: NLB with SG is created by default
new elbv2.NetworkLoadBalancer(this, 'Nlb', {
  vpc,
  // securityGroups: [sg] <- No explicit SG configuration needed
});
To improve this situation, I submitted a PR using feature flags to make implicit SG enablement the default behavior. It took about six months to merge, but it was successfully released in v2.221.1.
From here, I'll summarize the know-how for implementing feature flags in CDK using the PR creation process as an example.
What Are Feature Flags?
Feature flags in CDK are a mechanism for gradually introducing breaking changes. New projects get the new behavior by default, while existing projects maintain the old behavior and can explicitly enable the flag to use the new feature.
For a detailed explanation, I'll defer to this excellent existing article. (Sorry, it's in Japanese.)
https://www.ogis-ri.co.jp/otc/hiroba/technical/cdk-concepts/part8.html
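The "new projects get the new behavior, existing projects keep the old one" rule can be sketched as a small lookup. This is a simplified model, not the aws-cdk-lib implementation: `isFlagEnabled` and the three sample contexts are hypothetical names introduced only for illustration, and the sketch assumes that a flag missing from cdk.json falls back to the old behavior.

```typescript
// Simplified model of feature flag resolution. `isFlagEnabled` is a
// hypothetical helper for illustration, not the aws-cdk-lib implementation.
type Context = Record<string, boolean | undefined>;

const NLB_SG_FLAG =
  '@aws-cdk/aws-elasticloadbalancingv2:networkLoadBalancerWithSecurityGroupByDefault';

function isFlagEnabled(context: Context, flag: string): boolean {
  // An explicit entry in the cdk.json "context" always wins;
  // a missing entry falls back to the old behavior (assumed to be false here).
  return context[flag] ?? false;
}

// New project: `cdk init` generated a cdk.json that already enables the flag.
const newProject: Context = { [NLB_SG_FLAG]: true };
// Existing project: its cdk.json was written before the flag existed.
const existingProject: Context = {};
// Existing project whose owner opted in by editing cdk.json.
const optedInProject: Context = { [NLB_SG_FLAG]: true };

const results = {
  newProject: isFlagEnabled(newProject, NLB_SG_FLAG),
  existingProject: isFlagEnabled(existingProject, NLB_SG_FLAG),
  optedInProject: isFlagEnabled(optedInProject, NLB_SG_FLAG),
};
console.log(results);
```

Only the existing project with an untouched cdk.json resolves to the old behavior, which is why upgrading aws-cdk-lib alone does not change what gets deployed.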
Existing Implementation
Let's first review the implementation before the modification.
export class NetworkLoadBalancer extends BaseLoadBalancer implements INetworkLoadBalancer {
  ...
  this.connections = new ec2.Connections({
    securityGroups: props.securityGroups,
  });
  ...
}
This code might seem a bit confusing at first...
The NetworkLoadBalancer class implements IConnectable through INetworkLoadBalancer, which lets you configure security group rules through its connections property.
Many of you are probably familiar with allowTo/From() methods like the following. The IConnectable interface makes this possible.
declare const vpc: ec2.IVpc;
declare const instance: ec2.Instance;
const nlb = new elbv2.NetworkLoadBalancer(this, 'Nlb', { vpc });
nlb.connections.allowTo(instance, ec2.Port.tcp(8080));
Classes that implement IConnectable must expose an ec2.Connections instance, constructed with their security groups, as this.connections. That is what the CDK implementation above does.
this.connections = new ec2.Connections({
  securityGroups: props.securityGroups,
});
When you pass undefined to securityGroups of ec2.Connections, no security group is created, resulting in a legacy NLB without any security group attached.
In other words, with the existing implementation, you cannot create an NLB with a security group unless you explicitly pass one via props.securityGroups.
The Naive Approach
How can we make a security group get created by default?
The simplest approach would be to create a security group within the L2 construct if props.securityGroups is undefined.
export class NetworkLoadBalancer extends BaseLoadBalancer implements INetworkLoadBalancer {
  ...
  this.connections = new ec2.Connections({
    // Create a new SG if props.securityGroups is undefined
    // (securityGroups expects an array, so the new SG is wrapped in one)
    securityGroups: props.securityGroups ?? [new ec2.SecurityGroup({...})],
  });
  ...
}
However, if we actually released this implementation, a very serious problem would occur.
For example, suppose you have already created legacy NLBs without SGs using CDK. If you upgraded aws-cdk-lib and deployed, every one of them would switch to the SG-enabled configuration and therefore be replaced. Replacing an NLB changes its DNS name and IP addresses, so it's easy to imagine production systems going down one after another.
To avoid this problem, let's implement a feature flag to achieve:
- (i) No changes for existing users
- (ii) Provide NLBs with security group configuration for new users
Implementation Steps
1. Define the Feature Flag
First, add the flag to packages/aws-cdk-lib/cx-api/lib/features.ts. This file contains a list of feature flags defined as constants.
export const NETWORK_LOAD_BALANCER_WITH_SECURITY_GROUP_BY_DEFAULT = '@aws-cdk/aws-elasticloadbalancingv2:networkLoadBalancerWithSecurityGroupByDefault';
For the feature flag name, just write out what changes the flag causes in lowerCamelCase.
Don't worry about the length - define it freely like @aws-cdk/aws-ecs:reduceEc2FargateCloudWatchPermissions or @aws-cdk/s3-notifications:addS3TrustKeyPolicyForSnsSubscriptions.
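As a quick sanity check while picking a name, the shape of the flag names quoted in this article can be expressed as a regular expression. The pattern and the `looksLikeFlagName` helper below are assumptions inferred from those three names only; they are not an official CDK rule, and flags with a different shape may well exist.

```typescript
// Pattern inferred from the flag names quoted in this article:
//   @aws-cdk/<module-name>:<descriptionInLowerCamelCase>
// This is an illustrative assumption, not a rule enforced by CDK.
const FLAG_NAME_PATTERN = /^@aws-cdk\/[a-z0-9-]+:[a-z][A-Za-z0-9]*$/;

// Hypothetical helper for illustration.
function looksLikeFlagName(name: string): boolean {
  return FLAG_NAME_PATTERN.test(name);
}

const candidates = [
  '@aws-cdk/aws-elasticloadbalancingv2:networkLoadBalancerWithSecurityGroupByDefault',
  '@aws-cdk/aws-ecs:reduceEc2FargateCloudWatchPermissions',
  '@aws-cdk/s3-notifications:addS3TrustKeyPolicyForSnsSubscriptions',
  '@aws-cdk/aws-elasticloadbalancingv2:NetworkLoadBalancerWithSg', // starts with upper case
  'networkLoadBalancerWithSecurityGroupByDefault', // missing the module prefix
];

for (const name of candidates) {
  console.log(`${looksLikeFlagName(name) ? 'ok ' : 'bad'} ${name}`);
}
```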
Also, add the feature flag definition to FLAGS in features.ts. It should be appended at the bottom, otherwise you'll get a rosetta error (unverified).
export const FLAGS: Record<string, FlagInfo> = {
  ...(existing feature flag definitions),
  //////////////////////////////////////////////////////////////////////
  [NETWORK_LOAD_BALANCER_WITH_SECURITY_GROUP_BY_DEFAULT]: {
    // Choose from ApiDefault, BugFix, VisibleContext, Temporary. Almost always ApiDefault or BugFix.
    type: FlagType.ApiDefault,
    // Summary
    summary: 'When enabled, Network Load Balancer will be created with a security group by default.',
    // Detailed description
    detailsMd: `
      When this feature flag is enabled, Network Load Balancer will be created with a security group by default.
    `,
    // When it was introduced. V2NEXT is fine. It will be replaced by sed during release.
    introducedIn: { v2: 'V2NEXT' },
    // Recommended value
    // This value is set in cdk.json when creating a new project (cdk init)
    recommendedValue: true,
    // Default value when not set in cdk.json
    // This value is used when existing users upgrade aws-cdk-lib, so set a value that reproduces existing behavior
    // If not set, defaults to false.
    unconfiguredBehavesLike: { v2: false },
    // How to achieve the old behavior
    compatibilityWithOldBehaviorMd: 'Disable the feature flag to create Network Load Balancer without a security group by default.',
  },
};
The recommended convention is that true means the new behavior and false means the old behavior.
If you follow this guideline, you only need to set recommendedValue: true and can leave unconfiguredBehavesLike undefined.
https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md#feature-flags
(I find the mysterious /////// section dividers a bit quirky)
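To make the roles of recommendedValue and unconfiguredBehavesLike concrete, here is a minimal sketch of how the two fields could be read. It is a simplified model under stated assumptions: the `FlagInfo` shape is trimmed down to the fields discussed above, the `@example/...` flag names are made up, and `recommendedContext` and `unconfiguredValue` are hypothetical helpers, not functions from aws-cdk-lib.

```typescript
// Trimmed-down model of a flag definition (only the fields discussed above).
interface FlagInfo {
  summary: string;
  recommendedValue: boolean;
  unconfiguredBehavesLike?: { v2?: boolean };
}

// Made-up flags for illustration.
const exampleFlags: Record<string, FlagInfo> = {
  // Follows the guideline: unconfiguredBehavesLike is simply omitted.
  '@example/module:followsGuideline': {
    summary: 'New behavior when true.',
    recommendedValue: true,
  },
  // Spells out the unconfigured default explicitly.
  '@example/module:explicitDefault': {
    summary: 'New behavior when true.',
    recommendedValue: true,
    unconfiguredBehavesLike: { v2: false },
  },
};

// Hypothetical helper: the "context" a newly initialized project would start with.
function recommendedContext(flags: Record<string, FlagInfo>): Record<string, boolean> {
  const context: Record<string, boolean> = {};
  for (const name of Object.keys(flags)) {
    context[name] = flags[name].recommendedValue;
  }
  return context;
}

// Hypothetical helper: the value an existing project sees when cdk.json has no entry.
function unconfiguredValue(info: FlagInfo): boolean {
  return info.unconfiguredBehavesLike?.v2 ?? false;
}

console.log(recommendedContext(exampleFlags));
for (const name of Object.keys(exampleFlags)) {
  console.log(name, 'unconfigured ->', unconfiguredValue(exampleFlags[name]));
}
```

Both example flags behave identically, which illustrates why omitting unconfiguredBehavesLike is enough when true means the new behavior.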
2. Add Feature Flag Documentation
This is the part where I'm still not sure what the correct approach is.
First, there are two candidate places for the documentation: README.md or FEATURE_FLAGS.md. The official documentation recommends the former, but the majority of merged PRs use the latter. Sometimes both are updated in the same PR.
Furthermore, there are two candidate FEATURE_FLAGS.md files, and it's unclear which one to edit.
Several merged PRs modified only packages/aws-cdk-lib/cx-api/FEATURE_FLAGS.md, so updating that file alone should be sufficient.
If modifying README.md
Add to packages/aws-cdk-lib/cx-api/README.md.
* `@aws-cdk/aws-elasticloadbalancingv2:networkLoadBalancerWithSecurityGroupByDefault`
When this feature flag is enabled, Network Load Balancer will be created with a security group by default.
_cdk.json_
{
  "context": {
    "@aws-cdk/aws-elasticloadbalancingv2:networkLoadBalancerWithSecurityGroupByDefault": true
  }
}
If modifying FEATURE_FLAGS.md
Add to packages/aws-cdk-lib/cx-api/FEATURE_FLAGS.md or packages/@aws-cdk/cx-api/FEATURE_FLAGS.md.
Personally, I only modify the former.
First, add a summary to the overview table.
| Flag | Summary | Since | Type |
| ----- | ----- | ----- | ----- |
| [@aws-cdk/aws-elasticloadbalancingv2:networkLoadBalancerWithSecurityGroupByDefault](#aws-cdkaws-elasticloadbalancingv2networkloadbalancerwithsecuritygroupbydefault) | When enabled, Network Load Balancer will be created with a security group by default. | V2NEXT | new default |
Next, add the recommended value to the cdk.json example list.
{
  "context": {
    ... (existing flag list),
    "@aws-cdk/aws-elasticloadbalancingv2:networkLoadBalancerWithSecurityGroupByDefault": true
  }
}
Finally, add the feature flag summary.
### @aws-cdk/aws-elasticloadbalancingv2:networkLoadBalancerWithSecurityGroupByDefault
*When enabled, Network Load Balancer will be created with a security group by default.*
Flag type: New default behavior
When this feature flag is enabled, Network Load Balancer will be created with a security group by default.
| Since | Unset behaves like | Recommended value |
| ----- | ----- | ----- |
| (not in v1) | | |
| V2NEXT | `false` | `true` |
**Compatibility with old behavior:** Disable the feature flag to create Network Load Balancer without a security group by default.
For "Since", write the version when it was introduced, but at PR time, V2NEXT is fine. It will be replaced with the actual version during release.
3. Implementation in the Construct
This is the main part. Check the feature flag in the actual L2 construct and branch the behavior based on its value.
Make it perform the new behavior when true and the old behavior when false.
import { FeatureFlags } from '../../core';
import * as cxapi from '../../cx-api';
constructor(scope: Construct, id: string, props: NetworkLoadBalancerProps) {
  // ... (super() call and other setup omitted)
  // Get the feature flag value configured for this CDK App
  const enableDefaultSg = FeatureFlags.of(this).isEnabled(
    cxapi.NETWORK_LOAD_BALANCER_WITH_SECURITY_GROUP_BY_DEFAULT,
  );
  if (enableDefaultSg) {
    // New behavior: create a security group by default if none is passed
    // (securityGroups expects an array, so the new SG is wrapped in one)
    this.connections = new ec2.Connections({
      securityGroups: props.securityGroups ?? [new ec2.SecurityGroup({...})],
    });
  } else {
    // Old behavior: no security group unless one is passed explicitly
    this.connections = new ec2.Connections({
      securityGroups: props.securityGroups,
    });
  }
}
The implementation itself is quite simple.
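The branch above boils down to a small decision table over two inputs: the flag value and whether the caller passed security groups. The following sketch models only that decision; `securityGroupOutcome` and the `SgOutcome` labels are hypothetical names for illustration and do not exist in aws-cdk-lib.

```typescript
// Which security groups the NLB ends up with, as a plain label.
type SgOutcome = 'explicit' | 'auto-created' | 'none';

// Hypothetical model of the branch in the constructor.
function securityGroupOutcome(flagEnabled: boolean, explicitSgs?: string[]): SgOutcome {
  if (explicitSgs !== undefined) {
    // Explicitly passed security groups are used regardless of the flag.
    return 'explicit';
  }
  // Nothing passed: the flag decides between the new and the old behavior.
  return flagEnabled ? 'auto-created' : 'none';
}

const cases: Array<{ flag: boolean; sgs?: string[] }> = [
  { flag: false },
  { flag: false, sgs: ['sg-placeholder'] },
  { flag: true },
  { flag: true, sgs: ['sg-placeholder'] },
];

for (const c of cases) {
  const passed = c.sgs === undefined ? 'no' : 'yes';
  console.log(`flag=${c.flag} sgPassed=${passed} -> ${securityGroupOutcome(c.flag, c.sgs)}`);
}
```

Only the combination "flag enabled, nothing passed" differs from the pre-flag behavior, which is what keeps existing deployments unchanged.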
4. Create Unit Tests
Test the synthesized CloudFormation template with the feature flag both enabled and disabled.
In most cases, unit tests for the disabled flag (the existing behavior) are already in place, so you only need to add unit tests with the flag enabled.
test('creates NLB with auto-generated security group', () => {
  // GIVEN
  const app = new cdk.App({
    postCliContext: { // Inject feature flag into the App here
      '@aws-cdk/aws-elasticloadbalancingv2:networkLoadBalancerWithSecurityGroupByDefault': true,
    },
  });
  // Define the Stack under the App with the feature flag enabled
  const stack = new cdk.Stack(app);
  const vpc = new ec2.Vpc(stack, 'Stack');
  // WHEN
  new elbv2.NetworkLoadBalancer(stack, 'LB', {
    vpc,
    internetFacing: true,
  });
  // THEN
  const template = Template.fromStack(stack);
  template.hasResourceProperties('AWS::ElasticLoadBalancingV2::LoadBalancer', {
    Scheme: 'internet-facing',
    SecurityGroups: [
      {
        'Fn::GetAtt': [
          Match.stringLikeRegexp('LBSecurityGroup.*'),
          'GroupId',
        ],
      },
    ],
    Type: 'network',
  });
  template.hasResourceProperties('AWS::EC2::SecurityGroup', {
    GroupDescription: Match.stringLikeRegexp('Automatically created Security Group for ELB.*'),
    VpcId: { Ref: Match.stringLikeRegexp('Stack.*') },
    SecurityGroupEgress: [{
      CidrIp: '255.255.255.255/32',
      Description: 'Disallow all traffic',
      FromPort: 252,
      IpProtocol: 'icmp',
      ToPort: 86,
    }],
  });
});
By enabling the feature flag, we can confirm that a security group is now configured on the NLB by default.
5. Create Integration Tests (integ test)
Create integ tests to verify actual deployment behavior.
As with unit tests, integ tests for the disabled flag (the old behavior) usually already exist, so add integ tests with the flag enabled.
const app = new App({
  // Set the feature flag as context when creating the CDK App
  postCliContext: {
    '@aws-cdk/aws-elasticloadbalancingv2:networkLoadBalancerWithSecurityGroupByDefault': true, // Enable feature flag
  },
});
const stack = new Stack(app, 'NetworkLoadBalancerSecurityGroupFlagStack');
// VPC for the NLB (definition assumed; adjust to your test setup)
const vpc = new Vpc(stack, 'Vpc');
new NetworkLoadBalancer(stack, 'NLB', {
  vpc,
  internetFacing: true,
});
new IntegTest(app, 'NetworkLoadBalancerSecurityGroupFlag', {
  testCases: [stack],
});
In the actual PR, I implemented assertions to verify security group configuration.
The necessity of assertions in integ tests varies depending on the modification, so implement appropriate integ tests as needed.
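The assertions from the PR are not reproduced here. As a rough, library-independent sketch of the kind of check such an assertion could make, the code below inspects a response shaped like the ELBv2 DescribeLoadBalancers API output (only the fields used are modeled). The `everyNlbHasSecurityGroup` helper and the sample ARNs and IDs are made up for illustration.

```typescript
// Partial model of a DescribeLoadBalancers response (only the fields used here).
interface LoadBalancerDescription {
  LoadBalancerArn?: string;
  Type?: string; // e.g. 'network' or 'application'
  SecurityGroups?: string[];
}
interface DescribeLoadBalancersResponse {
  LoadBalancers?: LoadBalancerDescription[];
}

// Hypothetical check: at least one NLB exists and every NLB has a security group attached.
function everyNlbHasSecurityGroup(response: DescribeLoadBalancersResponse): boolean {
  const nlbs = (response.LoadBalancers ?? []).filter((lb) => lb.Type === 'network');
  return nlbs.length > 0 && nlbs.every((lb) => (lb.SecurityGroups ?? []).length > 0);
}

// Made-up sample responses for illustration.
const withSg: DescribeLoadBalancersResponse = {
  LoadBalancers: [
    { LoadBalancerArn: 'arn:placeholder:nlb-with-sg', Type: 'network', SecurityGroups: ['sg-placeholder'] },
  ],
};
const withoutSg: DescribeLoadBalancersResponse = {
  LoadBalancers: [
    { LoadBalancerArn: 'arn:placeholder:legacy-nlb', Type: 'network' },
  ],
};

console.log('flag enabled:', everyNlbHasSecurityGroup(withSg));
console.log('legacy NLB:', everyNlbHasSecurityGroup(withoutSg));
```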
Summary
With the above steps, you can introduce a feature flag to CDK.
I hope this shows how important feature flags are for maintaining backward compatibility, even though most CDK users rarely need to think about them. (Which is itself a testament to CDK's good design.)
However, contribution guides rarely cover a niche topic like this.
So I put extra effort into writing this article.
If you've ever thought "Ugh, this CDK behavior is not good... I want to fix it my way...", please use this as a reference to submit a PR.
Appendix
The Position of Feature Flags
As mentioned in the article above, feature flags are originally meant to be a last resort, so they should not be added carelessly.
However, it's true that many changes cannot be achieved without feature flags, and I feel that they will inevitably continue to increase at a certain pace.
Personally, I think CDK still carries many outdated default behaviors because of backward compatibility concerns, and such cases should be actively fixed by introducing feature flags.
Currently, I'm proposing changes that bring the defaults up to date for CloudFront Functions runtime settings and for IP address types of Lambda function URLs used as CloudFront origins. Hopefully they'll be merged within a year...
Feature Flags in Alpha Modules
CDK alpha modules before GA allow breaking changes, so feature flag implementation is not required.
However, you need to clearly state in the PR body that it's a breaking change.
BREAKING CHANGE: Corrected LogRetention IDs for DatabaseCluster. Previously, regardless of the log type, the string 'objectObject' was always included, but after the correction, the log type is now included.
Let's submit PRs that aggressively change behaviors before GA to create more polished modules.