In this post, I am sharing the lessons learned from working with CloudFormation.
My first experience with Infrastructure-as-Code tools was with CloudFormation. Getting started with CloudFormation is quite straightforward, but experimenting and managing a critical workload are two different things. Understanding how CloudFormation works is essential for managing a production environment.
Be aware of what type of update your change will result in
When you submit an update, AWS CloudFormation updates resources based on what you submit compared to the stack's current template. An update to a resource could result in one of the following update behaviors:
Update with No Interruption: the resource is updated without disrupting its operation and without changing its physical ID. An example is changing the memory size of a Lambda function.
Update with Some Interruption: the resource is updated with some interruption, but keeps its physical ID. An example is changing the SSESpecification of a DynamoDB table, the property that controls server-side encryption.
Replacement: CloudFormation recreates the resource during the update, which also generates a new physical ID. Once the new resource is created, references from dependent resources are changed to point to the replacement, and then the old resource is deleted. An example is changing the LocalSecondaryIndexes property of a DynamoDB table.
You must plan carefully before making a change that results in a replacement. In the DynamoDB example, the replacement table starts out empty, so you need to take a snapshot of the data that you can restore from, and prepare a strategy for the applications that use the table so that they can handle the interruption while it is being replaced.
Which update method will be applied depends on which property you update for a given resource type. The update behavior for each property is described in the AWS Resource Types Reference. Knowing what will be affected by your change is crucial so that you do not lose any data.
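One way to check the update behavior before applying a change is to create a change set and inspect it. The sketch below is not a complete tool: it assumes you already have the response of the CloudFormation DescribeChangeSet API as a dictionary (for example from boto3's `describe_change_set`), and the helper name `find_replacements` and the sample data are made up for illustration.

```python
def find_replacements(change_set):
    """Return the modified resources that will, or may, be replaced.

    `change_set` is assumed to have the shape of a DescribeChangeSet
    response: a "Changes" list whose items carry a "ResourceChange".
    """
    risky = []
    for change in change_set.get("Changes", []):
        rc = change.get("ResourceChange", {})
        # "Replacement" is "True", "False" or "Conditional" for Modify actions.
        if rc.get("Action") == "Modify" and rc.get("Replacement") in ("True", "Conditional"):
            risky.append((rc["LogicalResourceId"], rc["ResourceType"], rc["Replacement"]))
    return risky


# Hypothetical sample: one harmless change and one replacement.
sample_change_set = {
    "Changes": [
        {"Type": "Resource", "ResourceChange": {
            "Action": "Modify", "LogicalResourceId": "OrdersFunction",
            "ResourceType": "AWS::Lambda::Function", "Replacement": "False"}},
        {"Type": "Resource", "ResourceChange": {
            "Action": "Modify", "LogicalResourceId": "OrdersTable",
            "ResourceType": "AWS::DynamoDB::Table", "Replacement": "True"}},
    ]
}

for logical_id, resource_type, replacement in find_replacements(sample_change_set):
    print(f"{logical_id} ({resource_type}) replacement: {replacement}")
```

A check like this could run in a deployment pipeline and stop the update whenever a stateful resource shows up in the result.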
Be aware of the resources that are created outside of your CloudFormation stack
Even if you specify every resource you need in the CloudFormation template, some resources can still be created outside of the stack. For example, if your template defines a Lambda function with permissions to write logs to CloudWatch, the log group for that function is created automatically on the first log write. That log group now exists outside of your CloudFormation stack.
This will not cause any issues at the start, but if you delete your stack to clean up all the resources, the log group won't be deleted. You end up with so-called orphaned resources.
Also, as your stack evolves, you might add new resources that depend on that log group. Even though the log group is not part of your stack, it exists in your AWS environment, so CloudFormation will allow you to create resources that depend on it. Again, no issues at the start. But what if you want to reuse the same template to create a new stack? Since the log group does not exist yet when the new stack is created, the creation of the dependent resources will fail and you won't be able to create the stack.
If you know which resources would otherwise be created outside of your stack, you can plan ahead and define them in your CloudFormation template instead.
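For the Lambda example, that means declaring the log group yourself. Below is a minimal sketch that builds such a template fragment as a Python dictionary and prints it as JSON. The helper name, the logical ID `OrdersFunction` and the 14-day retention are assumptions for illustration. The log group name relies on Lambda's default `/aws/lambda/<function name>` naming and on the fact that referencing an `AWS::Lambda::Function` resolves to the function name.

```python
import json


def explicit_log_group(function_logical_id, retention_days=14):
    """Build a Resources fragment declaring the Lambda function's log group."""
    return {
        # Hypothetical naming convention for the log group's logical ID.
        function_logical_id + "LogGroup": {
            "Type": "AWS::Logs::LogGroup",
            "Properties": {
                # ${LogicalId} inside Fn::Sub resolves to the function name.
                "LogGroupName": {"Fn::Sub": "/aws/lambda/${" + function_logical_id + "}"},
                "RetentionInDays": retention_days,
            },
        }
    }


fragment = explicit_log_group("OrdersFunction")
print(json.dumps(fragment, indent=2))
```

With the log group in the template, it is deleted together with the stack, and any resource that depends on it can reference it in a fresh stack as well.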
Always protect your critical (database) resources
As mentioned above, changing a property of a resource can trigger a replacement, which in the case of a database means losing data. This can happen not only through intentional changes but also by accident, e.g. by changing the name of the resource by mistake. And just like that, your resource is deleted.
While this is not so critical for resources that do not hold any data, if it happens to a database the data will be lost. Even if you had automatic snapshots turned on, all the automatic snapshots will be deleted as well. So always protect your critical resources from getting deleted.
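One protection CloudFormation offers is the pair of resource attributes `DeletionPolicy` and `UpdateReplacePolicy`. Setting both to `Retain` keeps the old physical resource when it is removed from the stack or replaced during an update. The sketch below adds them to every stateful resource in a template loaded as a dictionary. The helper name and the set of types treated as stateful are assumptions, so adjust them to your own stack.

```python
import copy

# Assumed list of resource types that hold data; extend as needed.
STATEFUL_TYPES = {"AWS::DynamoDB::Table", "AWS::RDS::DBInstance"}


def protect_stateful_resources(template, stateful_types=STATEFUL_TYPES):
    """Return a copy of the template with Retain policies on stateful resources."""
    protected = copy.deepcopy(template)
    for resource in protected.get("Resources", {}).values():
        if resource.get("Type") in stateful_types:
            # setdefault keeps any policy the template author already chose.
            resource.setdefault("DeletionPolicy", "Retain")
            resource.setdefault("UpdateReplacePolicy", "Retain")
    return protected


# Hypothetical template with one table and one function.
template = {
    "Resources": {
        "OrdersTable": {"Type": "AWS::DynamoDB::Table", "Properties": {}},
        "OrdersFunction": {"Type": "AWS::Lambda::Function", "Properties": {}},
    }
}

protected = protect_stateful_resources(template)
print(protected["Resources"]["OrdersTable"]["DeletionPolicy"])
```

Retained resources stay in your account after they leave the stack, so you have to clean them up yourself once you are sure the data is no longer needed.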
Don't make changes outside of CloudFormation
Don't make changes to stack resources outside of CloudFormation. Doing so can create a mismatch between your stack's template and the current state of your stack resources, which can cause errors if you update or delete the stack.
You might be tempted to do a quick fix from the AWS Console, but this can come at a price. If a change is made outside of CloudFormation and never propagated back into the template, the template no longer describes what is actually deployed. It is even worse if the mismatch results in errors on a stack update. So do not mix CloudFormation with manual changes.
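If you suspect that manual changes have already been made, CloudFormation's drift detection can report them. The sketch below assumes you have run drift detection and hold the response of the DescribeStackResourceDrifts API as a dictionary (for example from boto3's `describe_stack_resource_drifts`). The helper name and the sample data are made up for illustration.

```python
def drifted_resources(drift_response):
    """Return (logical_id, status) for resources that differ from the template."""
    return [
        (drift["LogicalResourceId"], drift["StackResourceDriftStatus"])
        for drift in drift_response.get("StackResourceDrifts", [])
        # IN_SYNC and NOT_CHECKED entries are not reported.
        if drift.get("StackResourceDriftStatus") in ("MODIFIED", "DELETED")
    ]


# Hypothetical sample: the function was edited in the console.
sample_drifts = {
    "StackResourceDrifts": [
        {"LogicalResourceId": "OrdersTable", "StackResourceDriftStatus": "IN_SYNC"},
        {"LogicalResourceId": "OrdersFunction", "StackResourceDriftStatus": "MODIFIED"},
    ]
}

for logical_id, status in drifted_resources(sample_drifts):
    print(f"{logical_id}: {status}")
```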