I’m a big fan of using serverless components within AWS to build the systems I need: it’s an easy, low-threshold way to set up systems without the hassle of managing resources. Nonetheless, sometimes you need to spin up an old-fashioned EC2 virtual machine. You might even have to launch that instance from a non-Amazon Linux AMI when, for example, the service you want to run is unavailable in any of the Amazon Linux repositories. However, non-Amazon Linux AMIs don’t have the CloudWatch Agent installed and configured out of the box.
You still want the logs of your EC2 instance available in CloudWatch, for easy and secure access and for analyzing the behavior of the instance and the services running on it. In this article, I explain how to manually install and configure the CloudWatch Agent to get your instance’s logs into CloudWatch.
If you want to jump ahead: A working example can be found on GitHub.
Installing the CloudWatch Agent
Whenever the CloudWatch Agent is unavailable as a package in the repository of the Linux distribution you are running, you can install it manually by leveraging the user data initialization script of the EC2 instance. The user data initialization script runs when an EC2 instance is launched and can be used to execute commands and scripts that configure your instance. The CloudWatch Agent is available for several distributions and other OSs. A complete overview of the available flavors can be found here. In this example, we will launch an EC2 instance with Debian 12, so we need the .deb package to install the CloudWatch Agent manually. Let’s spin up some resources!
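The download location of the package depends on the distribution and the CPU architecture. As a minimal sketch, the helper below builds the URL for a given combination. Only the Debian/amd64 URL is taken from this article; the other entries in the hypothetical `PACKAGE_FORMATS` table assume that the bucket follows the same `<distribution>/<architecture>/latest/` layout, so verify them against the AWS download overview before relying on them.

```python
# Base URL as used in this article for the Debian package.
BASE_URL = "https://s3.amazonaws.com/amazoncloudwatch-agent"

# Hypothetical lookup table: package extension per distribution directory.
# Only "debian" is confirmed by this article; the rest are assumptions.
PACKAGE_FORMATS = {
    "debian": "deb",
    "ubuntu": "deb",
    "amazon_linux": "rpm",
}


def agent_package_url(distribution: str, architecture: str = "amd64") -> str:
    """Build the download URL of the latest agent package."""
    if distribution not in PACKAGE_FORMATS:
        raise ValueError(f"unknown distribution: {distribution}")
    extension = PACKAGE_FORMATS[distribution]
    return (
        f"{BASE_URL}/{distribution}/{architecture}/latest/"
        f"amazon-cloudwatch-agent.{extension}"
    )


# URL for the Debian 12 instance used in this article.
print(agent_package_url("debian"))
```

Keeping the URL in one helper makes it easier to reuse the same stack for another distribution later on.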
Launching an EC2 instance
Let’s set up a stack with AWS CDK that deploys an EC2 instance based on Debian 12. The Debian distribution is a good candidate to illustrate this topic since it doesn’t have the CloudWatch Agent package in its repository.
I’m using Python for these examples, and I’ve created a project using poetry and initialized an empty CDK stack:
from aws_cdk import Stack
from constructs import Construct


class Ec2CloudwatchAgentStack(Stack):
    def __init__(self, scope: Construct, id: str, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)
I’ve chosen the Debian 12 AMI from the AMI catalog since this AMI doesn’t have the standard Amazon Linux tooling installed and configured, which makes it a perfect fit to illustrate the topic at hand.
To configure an EC2 instance with a custom AMI, you need to look up the AMI identifier. The ID of the Debian 12 AMI is ami-0715d656023fe21b4. Note that AMI IDs are region-specific, so look up the ID for the region you deploy to.
Let’s add the EC2 configuration to our empty stack. We need to define a VPC, and to be able to log in after the instance is created, we place the instance in a public subnet. We also need a key pair to authenticate.
Configuring an EC2 instance inside a public subnet means that it is reachable from the internet. This is bad practice and should be avoided, especially when no security measures such as firewalls are in place.
# Requires at the top of the stack file: from aws_cdk import aws_ec2 as ec2

# VPC with public subnets only
vpc = ec2.Vpc(self, "VPC",
              max_azs=2,
              subnet_configuration=[ec2.SubnetConfiguration(
                  name="public", cidr_mask=24, subnet_type=ec2.SubnetType.PUBLIC
              )])

# security group
security_group = ec2.SecurityGroup(self, "SecurityGroup",
                                   vpc=vpc,
                                   description="Allow ssh access to ec2 instances",
                                   allow_all_outbound=True)

# open ssh to the ec2 instance
security_group.add_ingress_rule(
    ec2.Peer.any_ipv4(),
    ec2.Port.tcp(22),
    "Allow ssh access from the world"
)

# import existing keypair key-pair-van-auke
key_pair = ec2.KeyPair.from_key_pair_name(self, "MyKeyPair",
                                          key_pair_name="key-pair-van-auke")

# EC2 instance based on Debian 12
instance = ec2.Instance(self, "MyEc2Instance",
                        instance_type=ec2.InstanceType.of(
                            ec2.InstanceClass.BURSTABLE2, ec2.InstanceSize.MICRO),
                        machine_image=ec2.MachineImage.generic_linux({
                            'eu-west-1': 'ami-0584590e5f0e97daa'
                        }),
                        vpc=vpc,
                        vpc_subnets=ec2.SubnetSelection(subnet_type=ec2.SubnetType.PUBLIC),
                        key_pair=key_pair,
                        security_group=security_group)
This configuration can now be deployed to your AWS account using the cdk deploy command in your terminal. This will take a few minutes, and when it’s deployed we can test the configuration by trying to ssh into the instance:
$ ssh -i key-pair-van-auke.pem admin@54.73.44.74
Linux ip-10-0-0-15 6.1.0-23-cloud-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.99-1 (2024-07-15) x86_64
The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.
Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Mon Dec 2 10:56:14 2024 from 85.146.13.35
admin@ip-10-0-0-15:~$
As you can see, the EC2 instance is deployed successfully and we can log in.
EC2 User data
As you might know, EC2 instances can run a script when the instance is created: the so-called user data. To quote the documentation:
When you launch an Amazon EC2 instance, you can pass user data to the instance that is used to perform automated configuration tasks or to run scripts after the instance starts.
In this example, we will leverage this user data to install and configure the Amazon CloudWatch Agent on the EC2 instance to forward specific logs to Amazon CloudWatch for easy access and analysis.
We will start by extending the CDK configuration with the user data instructions to download the .deb package of the Amazon CloudWatch Agent, install the package, and enable and start the service:
# edit user data
instance.user_data.add_commands(
    "apt-get update",
    "apt-get install -y gpg",
    "apt-get install -y wget",
    "wget https://s3.amazonaws.com/amazoncloudwatch-agent/debian/amd64/latest/amazon-cloudwatch-agent.deb",
    "dpkg -i -E ./amazon-cloudwatch-agent.deb",
    "systemctl enable amazon-cloudwatch-agent",
    "systemctl start amazon-cloudwatch-agent"
)
If we then execute the cdk deploy command, the EC2 instance will be recreated and these instructions will be executed on the first boot of the instance.
Let’s check the service status:
admin@ip-10-0-0-130:~$ systemctl status amazon-cloudwatch-agent.service
○ amazon-cloudwatch-agent.service - Amazon CloudWatch Agent
Loaded: loaded (/etc/systemd/system/amazon-cloudwatch-agent.service; enabled; preset: enabled)
Active: inactive (dead) since Fri 2024-12-20 07:24:07 UTC; 5min ago
Duration: 54ms
Process: 890 ExecStart=/opt/aws/amazon-cloudwatch-agent/bin/start-amazon-cloudwatch-agent (code=exited, status=0/SUCCESS)
Main PID: 890 (code=exited, status=0/SUCCESS)
CPU: 35ms
Dec 20 07:24:07 ip-10-0-0-130 systemd[1]: Started amazon-cloudwatch-agent.service - Amazon CloudWatch Agent.
Dec 20 07:24:07 ip-10-0-0-130 start-amazon-cloudwatch-agent[894]: D! [EC2] Found active network interface
Dec 20 07:24:07 ip-10-0-0-130 start-amazon-cloudwatch-agent[894]: I! imds retry client will retry 1 timesI! Detected the instance is EC2
Dec 20 07:24:07 ip-10-0-0-130 start-amazon-cloudwatch-agent[894]: 2024/12/20 07:24:07 Reading json config file path: /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json ...
Dec 20 07:24:07 ip-10-0-0-130 start-amazon-cloudwatch-agent[894]: /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json does not exist or cannot read. Skipping it.
Dec 20 07:24:07 ip-10-0-0-130 systemd[1]: amazon-cloudwatch-agent.service: Deactivated successfully.
Cool, the service is installed and was started, but it is no longer running: the log shows that the configuration file is missing, so the agent shut down again. To tell the CloudWatch Agent which log file(s) it needs to watch and forward to Amazon CloudWatch, we need to supply a configuration file.
Configuring the CloudWatch Agent
The CloudWatch Agent needs a configuration file to know which data it needs to forward to the Amazon CloudWatch service. There are multiple possible approaches: we can provide the configuration file in the user data, or download it from S3.
In this situation, the CloudWatch Agent configuration JSON file will have these contents:
{
  "agent": {
    "metrics_collection_interval": 60,
    "logfile": "/opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log"
  },
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [
          {
            "file_path": "/var/log/cloud-init-output.log", 1️⃣
            "log_group_name": "/aws/ec2-cloudwatch-agent", 2️⃣
            "log_stream_name": "{instance_id}/messages", 3️⃣
            "timestamp_format": "%b %d %H:%M:%S" 4️⃣
          }
        ]
      }
    }
  }
}
The important part of the config file is the collect_list array. Here you can define:
1️⃣ Which log files need to be watched and forwarded to CloudWatch
2️⃣ The name of the log group to submit the logs to
3️⃣ The name of the log stream. You can see that I’ve added the instance_id as a variable, so each instance will have its own log stream.
4️⃣ The timestamp format of the messages
For a more comprehensive overview of what the config file supports, see the corresponding documentation.
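CloudWatch Configuration file via User-Data
To provide the configuration file to the CloudWatch Agent via user data, we add commands that write the configuration to the location the agent reads on startup. The file has to be in place before the service starts, so the enable and start commands come last: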
# edit user data (the script runs as root, so sudo is not needed)
instance.user_data.add_commands(
    "apt-get update",
    "apt-get install -y gpg",
    "apt-get install -y wget",
    "wget https://s3.amazonaws.com/amazoncloudwatch-agent/debian/amd64/latest/amazon-cloudwatch-agent.deb",
    "dpkg -i -E ./amazon-cloudwatch-agent.deb",
    "mkdir -p /opt/aws/amazon-cloudwatch-agent/etc",
    "cat <<EOF > /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json",
    # The content to write to the config file
    '{',
    '  "agent": {',
    '    "metrics_collection_interval": 60,',
    '    "logfile": "/opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log"',
    '  },',
    '  "logs": {',
    '    "logs_collected": {',
    '      "files": {',
    '        "collect_list": [',
    '          {',
    '            "file_path": "/var/log/cloud-init-output.log",',
    '            "log_group_name": "/aws/ec2-cloudwatch-agent",',
    '            "log_stream_name": "{instance_id}/messages",',
    '            "timestamp_format": "%b %d %H:%M:%S"',
    '          }',
    '        ]',
    '      }',
    '    }',
    '  }',
    '}',
    "EOF",
    # Start the agent only after the config file exists; without it the agent exits
    "systemctl enable amazon-cloudwatch-agent",
    "systemctl start amazon-cloudwatch-agent",
)
This approach doesn’t necessarily produce the most readable CDK code, but it is easy to manage, and it makes it easy to build the configuration dynamically from data that is available in or produced by the stack this code runs in. We can, for example, inject tag values or data from Parameter Store to distinguish between instances in Amazon CloudWatch.
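To illustrate the dynamic part, here is a minimal sketch that builds the configuration as a Python dictionary and renders it into user data commands with `json.dumps`. The helper names `build_agent_config` and `config_commands`, the `environment` value, and the log group suffix derived from it are assumptions made for this example, not part of the original stack.

```python
import json

# Path the agent reads on startup, as shown in the service log earlier.
CONFIG_PATH = "/opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json"


def build_agent_config(environment: str) -> dict:
    """Build the agent configuration as a dictionary.

    `environment` is a hypothetical stack value (for example a tag or a
    Parameter Store entry) that ends up in the log group name.
    """
    return {
        "agent": {
            "metrics_collection_interval": 60,
            "logfile": "/opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log",
        },
        "logs": {
            "logs_collected": {
                "files": {
                    "collect_list": [
                        {
                            "file_path": "/var/log/cloud-init-output.log",
                            "log_group_name": f"/aws/ec2-cloudwatch-agent/{environment}",
                            "log_stream_name": "{instance_id}/messages",
                            "timestamp_format": "%b %d %H:%M:%S",
                        }
                    ]
                }
            }
        },
    }


def config_commands(config: dict, path: str = CONFIG_PATH) -> list[str]:
    """Return shell commands that write `config` as JSON to `path`."""
    directory = path.rsplit("/", 1)[0]
    return [
        f"mkdir -p {directory}",
        # Quoting the delimiter ('EOF') keeps the shell from expanding
        # anything inside the here-document.
        f"cat <<'EOF' > {path}",
        *json.dumps(config, indent=2).splitlines(),
        "EOF",
    ]


commands = config_commands(build_agent_config("staging"))
print("\n".join(commands))
```

In the stack, the result could be passed on with `instance.user_data.add_commands(*commands)`, placed before the commands that start the service. Generating the JSON from a dictionary avoids hand-escaped strings and guarantees that the file is valid JSON.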
When you deploy this configuration, a log group named /aws/ec2-cloudwatch-agent is created in CloudWatch, and it contains the log lines from /var/log/cloud-init-output.log. This requires that the instance role is allowed to write to CloudWatch Logs, for example through the AWS-managed CloudWatchAgentServerPolicy. If you want to forward other log files to CloudWatch, you can extend the collect_list configuration array with the location of the file and the destination log group.
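Extending the collect_list array could look like the sketch below, which appends an entry to a configuration held as a Python dictionary. The helper `add_log_file`, the extra file /var/log/dpkg.log, and its log group name are hypothetical and only serve as an illustration.

```python
import copy
import json

# Configuration with the single entry used in this article.
config = {
    "logs": {
        "logs_collected": {
            "files": {
                "collect_list": [
                    {
                        "file_path": "/var/log/cloud-init-output.log",
                        "log_group_name": "/aws/ec2-cloudwatch-agent",
                        "log_stream_name": "{instance_id}/messages",
                    }
                ]
            }
        }
    }
}


def add_log_file(config: dict, file_path: str, log_group_name: str,
                 log_stream_name: str = "{instance_id}/messages") -> dict:
    """Return a copy of `config` with one more entry in collect_list."""
    extended = copy.deepcopy(config)  # leave the caller's dictionary untouched
    collect_list = extended["logs"]["logs_collected"]["files"]["collect_list"]
    collect_list.append({
        "file_path": file_path,
        "log_group_name": log_group_name,
        "log_stream_name": log_stream_name,
    })
    return extended


# Hypothetical second log file and destination log group.
extended = add_log_file(
    config,
    "/var/log/dpkg.log",
    "/aws/ec2-cloudwatch-agent/dpkg",
    log_stream_name="{instance_id}/dpkg",
)
print(json.dumps(extended, indent=2))
```

Each entry can point to its own log group and stream, so logs from different files stay separated in CloudWatch.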
Conclusion
Sometimes, when you need to fall back on EC2 instances, you don’t want them to be a black box that you have to SSH into to see what is happening inside. I’ve shown you how to deploy an EC2 instance and install the Amazon CloudWatch Agent on a Linux distribution that neither ships the agent out of the box nor offers it through its package manager. I’ve also explained how to configure the agent to forward the contents of a specific log file to Amazon CloudWatch, so you can analyze the logs without having to SSH into the instance.