Originally created by Canonical for Ubuntu on AWS EC2, Cloud Init is now the de facto early boot configuration method.
- Cloud Init is an invaluable resource for Cloud Engineers and Software Developers alike.
- It's a straightforward service on the surface but is highly customizable to whatever an org needs.
- Cloud Init isn't only AWS EC2 user data; it does network configuration, vendor configuration, and provides metadata services.
You're probably using Cloud Init and don't even realize it. Created by Canonical in the early days of EC2, it helped revolutionize how we treat our servers and how runtime initialization is conducted. Since its inception, it has been one of the primary methods of early configurations for our infrastructure. It's also run in the stacks of every major public cloud provider and many private cloud environments like LXD, KVM, and OpenStack.
Cloud Init allows engineers to reduce or even eliminate package installs or configurations during application deployment. "Why should I need to install ImageMagick on every single Rails deployment?" Similarly, Cloud Init can provide breathing room between OS image builds: because security patching can happen as part of Cloud Init, you can rotate your AMI on a more manageable basis, such as a weekly cadence.
Cloud Init works in five stages.
First, on systemd machines, is the Generator stage. If you're unfamiliar with systemd, a generator is a binary executed early in the boot process to dynamically generate unit files, symlinks, and more. Cloud Init's generator determines if the rest of the Cloud Init process should continue. If so, Cloud Init is included in the list of boot goals for the system.
Next is the Local phase. This phase runs the cloud-init-local.service systemd service and runs as early as possible. Essentially its entire purpose is to locate data sources and generate (or apply) networking configurations for the system. It's worth noting that this phase blocks much of the boot process, including the network initialization.
The Network phase continues the Cloud Init boot. This phase relies on networking being up (and, by association, the Local phase). This stage will run any cloud_init modules found. These might be things such as
After the Network phase is the Config phase, which runs the modules that don't affect any other stages. Specifically, it runs the cloud_config modules in the Cloud Init config directory. runcmd is included in this step.
Cloud Init closes out with the Final phase. Running any cloud_final modules, this phase runs as late as possible. It is the stage that includes any user data scripts and configuration management tooling (Puppet, Chef, etc.).
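As a rough map of the stages above, the sketch below pairs each post-generator stage with the systemd unit and CLI invocation that commonly drive it. Only cloud-init-local.service is named in this article; the other unit names are the ones I understand many cloud-init packages to ship, and newer releases may name them differently, so treat the mapping as an assumption to verify on your own image. `stage_unit` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Map a Cloud Init stage to the systemd unit that usually runs it.
# Assumption: unit names as shipped by many cloud-init packages; verify with
# `systemctl list-unit-files 'cloud-*'` on your own image.
stage_unit() {
  case "$1" in
    local)   echo "cloud-init-local.service" ;;  # cloud-init init --local
    network) echo "cloud-init.service" ;;        # cloud-init init
    config)  echo "cloud-config.service" ;;      # cloud-init modules --mode=config
    final)   echo "cloud-final.service" ;;       # cloud-init modules --mode=final
    *)       echo "unknown stage: $1" >&2; return 1 ;;
  esac
}

# Print the stages in boot order with their units.
for stage in local network config final; do
  printf '%-8s %s\n' "$stage" "$(stage_unit "$stage")"
done
```

On a running instance, something like `systemctl status "$(stage_unit final)"` is one way to check whether a given stage ran and how it exited.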
Each server using Cloud Init also has a collection of data that Cloud Init uses to configure the instance. This includes what we generally think of as instance metadata on EC2 instances, and more.
Some providers will create or attach a config drive containing metadata service information files. OpenStack is an example of one such provider.
While we interact with user data, cloud providers can also implement vendor data. The idea here is the same as user data; it exists to allow the cloud provider to customize the image at runtime. Some potential vendor data tasks might involve setting the instance's hostname or configuring package repository paths. Vendor data can be disabled if desired. It's also worth mentioning that user data overrides vendor data when Cloud Init determines the final configuration.
Cloud Init can be instrumented in two ways: a shell script or a YAML-formatted cloud-config file. Both approaches are pretty straightforward:
#!/bin/bash
# User data must start with a shebang to be treated as a script.
sudo yum --assumeyes --security update-minimal
Or, the equivalent cloud-config:
#cloud-config
# In list form, each argument is its own list element.
runcmd:
  - [ sudo, yum, --assumeyes, --security, update-minimal ]
The script option is pretty easy to understand. As mentioned above, it's executed in the Final phase. The cloud-config option is more interesting since you can set up modules to run in the different phases, such as the bootcmd option. Check out the module reference page for a complete list of available modules. There is also a great list of example configurations on the cloud-config examples page.
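To make the idea of targeting different phases concrete, here is a minimal sketch that writes a cloud-config file using both bootcmd and runcmd, then validates it if the tooling is present. The file name, the log path, and the bootcmd command are illustrative assumptions, and the `cloud-init schema` subcommand is assumed to exist on the installed release (older releases exposed it as `cloud-init devel schema`).

```shell
#!/usr/bin/env bash
# Write a cloud-config user data file that uses modules from two phases.
# Assumption: the file name and the demo commands are illustrative only.
user_data_file="${TMPDIR:-/tmp}/user-data.yaml"

cat > "$user_data_file" <<'EOF'
#cloud-config
# bootcmd runs early, on every boot (Network phase, cloud_init modules).
bootcmd:
  - [ sh, -c, "echo bootcmd ran >> /var/log/bootcmd-demo.log" ]
# runcmd is registered in the Config phase; its commands execute late in first boot.
runcmd:
  - [ yum, --assumeyes, --security, update-minimal ]
EOF

# Validate against the cloud-config schema when the tool is available.
# Assumption: recent releases expose `cloud-init schema`; older ones may not.
if command -v cloud-init >/dev/null 2>&1; then
  cloud-init schema --config-file "$user_data_file" \
    || echo "schema check failed or unsupported on this release" >&2
fi

# Show the header line that marks the file as cloud-config.
head -n 1 "$user_data_file"
```

The quoted heredoc (`<<'EOF'`) keeps the shell from expanding anything inside the YAML, so the file is written exactly as shown.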
If, for some reason, you want to prevent Cloud Init from running, you can. This can be accomplished in a couple of different ways. The easiest is to add a file at AMI build time:
You can also add a parameter to
It's also possible to disable user data by setting the allow_userdata parameter in
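For the first two approaches, a hedged sketch of how you might check whether Cloud Init would be disabled on an image. The marker file path `/etc/cloud/cloud-init.disabled` and the kernel parameter `cloud-init=disabled` are taken from my reading of the upstream Cloud Init documentation rather than from this article, so confirm them against the docs for your release. `cloud_init_disabled` is a hypothetical helper, and it works on a scratch directory so nothing on the real system changes.

```shell
#!/usr/bin/env bash
# Report whether Cloud Init would be disabled, given a root dir and a kernel command line.
# Assumptions: marker file /etc/cloud/cloud-init.disabled and kernel parameter
# cloud-init=disabled, per the upstream docs; verify for your release.
cloud_init_disabled() {
  local root="${1:-}" cmdline="${2:-}"
  if [ -e "$root/etc/cloud/cloud-init.disabled" ]; then
    echo "disabled by marker file"
    return 0
  fi
  case " $cmdline " in
    *" cloud-init=disabled "*)
      echo "disabled by kernel command line"
      return 0
      ;;
  esac
  echo "enabled"
  return 1
}

# Demonstrate against a scratch root instead of the live filesystem.
scratch="$(mktemp -d)"
mkdir -p "$scratch/etc/cloud"
cloud_init_disabled "$scratch" "ro quiet" || true   # no marker, no parameter
touch "$scratch/etc/cloud/cloud-init.disabled"      # what an AMI build step might do
cloud_init_disabled "$scratch" "ro quiet"           # marker file now present
```

On a real instance, the equivalent check would pass an empty root and the contents of `/proc/cmdline`.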
Occasionally, you may want to dig deeper into Cloud Init. Maybe your user data isn't executing how you expect, or it is taking longer than expected. Fortunately, Cloud Init tracks a lot of details for debugging.
The main logs are:
These logs can be examined with the cloud-init command's analyze sub-command, which parses them into a more usable format.
There are also logs in the /run/cloud-init directory. These logs are more related to some of the inner workings and decisions of Cloud Init.
The /var/lib/cloud/ directory is where the data files are kept. A handy file in this directory is the status.json file. It lists the stages that ran and the start and finish times for each one (in epoch format).
[ec2-user@ip-10-0-0-60 data]$ cat /var/lib/cloud/data/status.json
...File snipped for brevity
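Because the times in status.json are epoch values, a couple of small helpers can make them easier to read. This is a minimal sketch assuming GNU `date` and `awk` are available; the helper names and the sample timestamps are made up for illustration and are not taken from a real instance.

```shell
#!/usr/bin/env bash
# Helpers for reading the epoch timestamps found in status.json.
# Assumption: GNU date and awk are available; helper names are hypothetical.

# Convert an epoch value (fractional seconds are dropped) to UTC ISO 8601.
epoch_to_utc() {
  date -u -d "@${1%.*}" '+%Y-%m-%dT%H:%M:%SZ'
}

# Seconds elapsed between a stage's start and finish values.
stage_duration() {
  awk -v s="$1" -v f="$2" 'BEGIN { printf "%.2f\n", f - s }'
}

# Example values, made up for illustration.
start=1655095660.25
finish=1655095665.75
echo "started:  $(epoch_to_utc "$start")"
echo "duration: $(stage_duration "$start" "$finish")s"
```

Feeding each stage's start and finish values through `stage_duration` gives a quick view of where boot time is going before reaching for heavier tooling.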
Config files are kept in /etc/cloud/cloud.cfg and the
Systems equipped with Cloud Init come with a binary used to interact with it. The command to use is
One of the most useful commands is cloud-init status, which returns the status of the Cloud Init run. An optional --long flag gives more detail:
[ec2-user@ip-10-0-0-41 ~]# sudo cloud-init status
[ec2-user@ip-10-0-0-41 ~]# sudo cloud-init status --long
time: Mon, 13 Jun 2022 04:47:45 +0000
The cloud-init status command also has another great flag: --wait. This flag waits until Cloud Init has completed before returning. It's helpful if you are using AWS CodeDeploy or a configuration management system that phones home on startup but isn't tied to Cloud Init for some reason. There is a very real chance that CodeDeploy may start up before Cloud Init is finished, which means any configuration, binaries, or environment variables set by your user data script would not yet be available.
[ec2-user@ip-10-0-0-41 ~]$ sudo cloud-init status --wait
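One way to use this in practice is a small wrapper placed at the top of an agent start script or deploy hook. The sketch below is an assumption-laden example: `wait_for_cloud_init` and the `CLOUD_INIT_BIN` override are hypothetical names, and treating a non-zero exit from `status --wait` as a failed run is a choice made here for illustration.

```shell
#!/usr/bin/env bash
# Block until Cloud Init finishes before starting a dependent step.
# Assumption: CLOUD_INIT_BIN is a hypothetical override, handy for testing.
wait_for_cloud_init() {
  local bin="${CLOUD_INIT_BIN:-cloud-init}"
  if ! command -v "$bin" >/dev/null 2>&1; then
    echo "cloud-init not found; skipping wait"
    return 0
  fi
  # `status --wait` returns once Cloud Init has finished; a non-zero exit
  # is treated here as a failed or degraded run.
  if "$bin" status --wait >/dev/null; then
    echo "cloud-init finished"
  else
    echo "cloud-init reported a problem" >&2
    return 1
  fi
}

# Typical use at the top of a deploy hook or agent start script:
#   wait_for_cloud_init && ./start-my-agent.sh   # start-my-agent.sh is a placeholder
```

Skipping the wait when the binary is missing keeps the same script usable on images that were built without Cloud Init.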
Another useful command is cloud-init query, which references the cached instance metadata that was captured by Cloud Init:
[ec2-user@ip-10-0-0-41 ~]$ sudo cloud-init query cloud_name
[ec2-user@ip-10-0-0-41 ~]$ sudo cloud-init query availability_zone
Knowing more about Cloud Init and how to properly leverage it can be extremely advantageous to multiple facets of an org. It can make Cloud Engineers and System Administrators' lives easier by reducing the need for configuration tooling and AMI rotations. It can also speed up application deployments.
The documentation for Cloud Init is pretty in-depth and a valuable resource. It has great details on many of the cloud providers' implementations of the metadata service. The documentation also has information about creating custom modules that can be injected and executed just like