DEV Community

How to Set Up Runtime Monitoring for an ECS Cluster Using GuardDuty

Image description

Introduction:

Amazon Web Services announced a new feature at re:Invent 2023 that enables GuardDuty to monitor and detect potential security threats at runtime in ECS clusters (Fargate and EC2) and in EKS. GuardDuty uses ML algorithms for this, and they depend on an agent installed on the container host to gather the events occurring on that host. On EC2 the agent can be installed manually with the help of a Systems Manager document, while on Fargate it is installed automatically.

Architecture:

Image description

Implementation Steps:

  • Enable GuardDuty Runtime Monitoring

  • Create VPC Endpoint for GuardDuty Service

  • Create ECS Cluster with EC2 Host

  • Set up ALB across EC2 hosts

  • Post Provision EC2 Instances and Install GD Agent

  • Validation

Enable GuardDuty Runtime Monitoring:

To capture runtime events from the start, it's recommended to enable Runtime Monitoring before the EC2 instances are deployed.

  • Open GuardDuty console https://console.aws.amazon.com/guardduty/

  • Enable GuardDuty service in your AWS account.

  • Select "Runtime Monitoring" in the left navigation pane.

  • Under the "Configuration" section, enable the "Runtime Monitoring" option.

Image description
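The same setting can also be applied through the GuardDuty API instead of the console. The sketch below is a minimal example, assuming boto3 is installed and credentials are configured. The function names, and the choice to leave EC2 agent management disabled (because this article installs the EC2 agent manually), are assumptions made for illustration.

```python
def runtime_monitoring_features(manage_fargate_agent: bool = True) -> list:
    """Build the Features payload for GuardDuty's UpdateDetector call.

    EC2 agent management stays DISABLED here because this walkthrough
    installs the agent on the EC2 hosts manually (an assumption of the sketch).
    """
    fargate_status = "ENABLED" if manage_fargate_agent else "DISABLED"
    return [
        {
            "Name": "RUNTIME_MONITORING",
            "Status": "ENABLED",
            "AdditionalConfiguration": [
                {"Name": "ECS_FARGATE_AGENT_MANAGEMENT", "Status": fargate_status},
                {"Name": "EC2_AGENT_MANAGEMENT", "Status": "DISABLED"},
            ],
        }
    ]


def enable_runtime_monitoring(detector_id: str, region: str) -> None:
    """Send the payload with boto3 (imported lazily, so the builder has no dependency)."""
    import boto3

    client = boto3.client("guardduty", region_name=region)
    client.update_detector(
        DetectorId=detector_id,
        Features=runtime_monitoring_features(),
    )
```

Calling `enable_runtime_monitoring("<detector-id>", "us-east-1")` would apply it to an existing detector. The detector ID is shown on the GuardDuty console's Settings page.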

Create VPC Endpoint for GuardDuty Service:

Before the agent is installed, a VPC endpoint for the GuardDuty service needs to be created so that the agent can connect to GuardDuty. Please follow the steps below -

  • Open VPC console https://console.aws.amazon.com/vpc/

  • Select "Endpoints" from left navigation bar.

  • Create a new endpoint with the service name "com.amazonaws.<region>.guardduty-data".

  • Make sure "Enable DNS name" is checked under "Additional Settings".

  • Choose required subnets and security groups for endpoint.

  • Use the custom policy below, which allows all actions but denies any principal outside your own account -

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": "*",
            "Resource": "*",
            "Effect": "Allow",
            "Principal": "*"
        },
        {
            "Condition": {
                "StringNotEquals": {
                    "aws:PrincipalAccount": "<account-id>"
                }
            },
            "Action": "*",
            "Resource": "*",
            "Effect": "Deny",
            "Principal": "*"
        }
    ]
}
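To avoid hand-editing the account ID placeholder, the policy can be generated. This is a small sketch rather than an official tool: the function names are made up for illustration, and the 12-digit check is only a basic sanity test.

```python
import json


def guardduty_endpoint_service(region: str) -> str:
    """Service name of the GuardDuty data endpoint for a region."""
    return f"com.amazonaws.{region}.guardduty-data"


def endpoint_policy(account_id: str) -> dict:
    """Allow everything, then deny every principal outside `account_id`."""
    if not (account_id.isascii() and account_id.isdigit() and len(account_id) == 12):
        raise ValueError("an AWS account ID is a 12-digit number")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Action": "*", "Resource": "*", "Effect": "Allow", "Principal": "*"},
            {
                "Condition": {
                    "StringNotEquals": {"aws:PrincipalAccount": account_id}
                },
                "Action": "*",
                "Resource": "*",
                "Effect": "Deny",
                "Principal": "*",
            },
        ],
    }


def endpoint_policy_json(account_id: str) -> str:
    """Render the policy as JSON text ready to paste into the console."""
    return json.dumps(endpoint_policy(account_id), indent=4)
```

For example, `print(endpoint_policy_json("123456789012"))` prints the policy for a made-up account ID, and `guardduty_endpoint_service("us-east-1")` gives the service name to search for when creating the endpoint.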

Create ECS Cluster with EC2 Host:

Here, we create an ECS cluster backed by EC2 hosts. There are two options for the underlying compute: Fargate (serverless) or EC2 (IaaS). Fargate is a good choice when you want to avoid the administrative effort of managing hosts. An EC2-backed cluster, on the other hand, gives complete insight into activity at the host level, which helps with visibility and troubleshooting.
Note: During cluster creation, the ECS service automatically builds the underlying EC2 hosts. If you are using the Amazon Linux 2 AMI, choose the kernel 5.10 version. On an older kernel, the GuardDuty agent cannot be installed and the installation fails with the error below.

Error:

install errors: error: Failed dependencies:
    kernel >= 5.4.0 is needed by amazon-guardduty-agent-1.0-0.x86_64
failed to run commands: exit status 1
Failed to install package; install status Failed

The containers in this cluster run a basic httpd image stored in an ECR repository.

Create a Cluster:

  • Open ECS console https://console.aws.amazon.com/ecs/
    • Click on Create Cluster.
      • Enter a cluster name.
      • Under "Infrastructure" section, choose below settings -
        • Click on checkbox: Amazon EC2 instances
        • Auto Scaling group (ASG): Create new ASG
        • Provisioning Model: On-Demand
        • Operating system/Architecture: Amazon Linux 2 (kernel 5.10)
        • EC2 Instance Type: t2.micro
        • Desired Capacity:
          • Minimum: 2 Maximum: 4
        • SSH Keypair: Choose a keypair
      • Under "Network settings for Amazon EC2 instances" -
        • Choose correct VPC, Subnets and Security Groups.
        • Auto Assign Public IP: Turn On
      • Submit the form to create the cluster.

Create a Task Definition:

A task definition is a blueprint for the containers to be created. It takes a couple of inputs from the user, such as CPU and memory requirements, the task role, the container image and port details.

  • Open "Task Definitions" from left navigation bar.
  • Click on "Create new task definition"
  • Under "Infrastructure requirements" section, choose below -
    • Launch Type: Amazon EC2 Instances
    • Operating system/Architecture: Linux X86_64
    • Network Mode: Default (later discussed why "default")
    • Task Size:
      • CPU: 0.8 vCPU, Memory: 0.9 GB
    • Task Role:
    • Under "Container" section, choose below -
      • Name:
      • Image URI:
      • Essential Container: YES
      • Host Port: 80, Container Port: 80 (For simplicity, port 80 is taken)
      • Resource Allocation Limits:
        • CPU: 0.5 vCPU, Memory Hard Limit: 0.7GB, Soft Limit: 0.5 GB
      • Under "Logging" section, keep this as default
      • Under HealthCheck section -
        • Command: CMD-SHELL,echo hello world
        • Interval: 5 seconds
        • Timeout: 5 seconds
        • Start Period: 10 seconds
        • Retries: 3

This generates the task definition JSON below -
{
    "taskDefinitionArn": "arn:aws:ecs:us-east-1:<account-id>:task-definition/mytaskdef1:2",
    "containerDefinitions": [
        {
            "name": "httpd",
            "image": "<image-uri>",
            "cpu": 512,
            "memory": 717,
            "memoryReservation": 512,
            "portMappings": [
                {
                    "name": "httpd-80-tcp",
                    "containerPort": 80,
                    "hostPort": 80,
                    "protocol": "tcp",
                    "appProtocol": "http"
                }
            ],
            "essential": true,
            "environment": [],
            "environmentFiles": [],
            "mountPoints": [],
            "volumesFrom": [],
            "workingDirectory": "/",
            "ulimits": [],
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-create-group": "true",
                    "awslogs-group": "/ecs/mytaskdef1",
                    "awslogs-region": "us-east-1",
                    "awslogs-stream-prefix": "ecs"
                },
                "secretOptions": []
            },
            "healthCheck": {
                "command": [
                    "CMD-SHELL",
                    "echo hello world"
                ],
                "interval": 5,
                "timeout": 5,
                "retries": 3,
                "startPeriod": 10
            }
        }
    ],
    "family": "mytaskdef1",
    "taskRoleArn": "arn:aws:iam::<account-id>:role/ecsTaskExecutionRole",
    "executionRoleArn": "arn:aws:iam::<account-id>:role/ecsTaskExecutionRole",
    "revision": 2,
    "volumes": [],
    "status": "ACTIVE",
    "requiresAttributes": [
        {
            "name": "com.amazonaws.ecs.capability.logging-driver.awslogs"
        },
        {
            "name": "ecs.capability.execution-role-awslogs"
        },
        {
            "name": "com.amazonaws.ecs.capability.docker-remote-api.1.19"
        },
        {
            "name": "com.amazonaws.ecs.capability.docker-remote-api.1.17"
        },
        {
            "name": "com.amazonaws.ecs.capability.docker-remote-api.1.21"
        },
        {
            "name": "com.amazonaws.ecs.capability.task-iam-role"
        },
        {
            "name": "ecs.capability.container-health-check"
        },
        {
            "name": "com.amazonaws.ecs.capability.docker-remote-api.1.18"
        },
        {
            "name": "com.amazonaws.ecs.capability.docker-remote-api.1.29"
        }
    ],
    "placementConstraints": [],
    "compatibilities": [
        "EC2"
    ],
    "requiresCompatibilities": [
        "EC2"
    ],
    "cpu": "819",
    "memory": "922",
    "runtimePlatform": {
        "cpuArchitecture": "X86_64",
        "operatingSystemFamily": "LINUX"
    },
    "registeredAt": "2024-01-26T15:20:05.159Z",
    "registeredBy": "arn:aws:iam::<account-id>:root",
    "tags": []
}
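The numbers in the JSON are the console inputs expressed in CPU units (1 vCPU = 1024 units) and MiB. The sketch below reproduces that arithmetic and adds a basic sanity check on the limits. Rounding to the nearest integer is an assumption that happens to match the values above, and the function names are illustrative.

```python
def vcpu_to_cpu_units(vcpu: float) -> int:
    """Convert vCPUs to ECS CPU units (1 vCPU = 1024 units)."""
    return round(vcpu * 1024)


def gb_to_mib(gb: float) -> int:
    """Convert the console's GB value to the MiB value stored in the JSON."""
    return round(gb * 1024)


def check_container_fits(task_cpu: int, task_memory: int,
                         container_cpu: int, hard_limit: int,
                         soft_limit: int) -> list:
    """Return a list of problems; an empty list means the sizes are consistent."""
    problems = []
    if container_cpu > task_cpu:
        problems.append("container CPU exceeds task CPU")
    if hard_limit > task_memory:
        problems.append("container hard limit exceeds task memory")
    if soft_limit > hard_limit:
        problems.append("soft limit exceeds hard limit")
    return problems


# Values entered in the console for this article's task definition.
task_cpu = vcpu_to_cpu_units(0.8)        # "cpu": "819"
task_memory = gb_to_mib(0.9)             # "memory": "922"
container_cpu = vcpu_to_cpu_units(0.5)   # "cpu": 512
hard_limit = gb_to_mib(0.7)              # "memory": 717
soft_limit = gb_to_mib(0.5)              # "memoryReservation": 512
issues = check_container_fits(task_cpu, task_memory,
                              container_cpu, hard_limit, soft_limit)
```

With the article's inputs, `issues` is an empty list, meaning the container limits fit inside the task size.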

Create Tasks:

  • Open the cluster and click on "Run Tasks" option.
  • Under Compute option, choose "launch Type" as EC2.
  • Under "Deployment Configuration" choose Application Type as "Task"
  • Choose Task Definition which we created in last step.
  • Choose Desired Tasks: 2

Image description
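The same launch can be expressed as an ECS RunTask request. The sketch below only builds the request parameters. One detail worth checking first: the task definition maps a fixed host port (80), so a host can run at most one copy of the task, and the desired count should not exceed the number of container instances. The function name and the cluster name in the usage note are hypothetical.

```python
def run_task_request(cluster: str, task_definition: str,
                     desired_tasks: int, instance_count: int,
                     static_host_port: bool = True) -> dict:
    """Build keyword arguments for the ECS RunTask API (EC2 launch type)."""
    if not 1 <= desired_tasks <= 10:
        raise ValueError("RunTask accepts between 1 and 10 tasks per call")
    if static_host_port and desired_tasks > instance_count:
        # A fixed host port can be bound only once per EC2 host.
        raise ValueError("more tasks than hosts while using a fixed host port")
    return {
        "cluster": cluster,
        "taskDefinition": task_definition,
        "count": desired_tasks,
        "launchType": "EC2",
    }
```

With boto3 installed, `boto3.client("ecs").run_task(**run_task_request("demo-cluster", "mytaskdef1", 2, 2))` would start the two tasks on the two hosts created earlier.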

Set up ALB across EC2 hosts:

We have set up an Application Load Balancer that distributes the incoming traffic on port 80 across all the ECS tasks registered in the target group.

Image description

The snapshot above shows that both targets are in a healthy state, as they return a 200 success code. The default website also loads as expected.

Image description

Post Provision EC2 Instances and Install GD Agent:

So far, we haven't installed the GuardDuty agent on the EC2 instances, so the image below shows both the instances and the cluster as unhealthy, with the message "Agent not reporting" under the Issue section. The agent therefore has to be installed manually, and there are two different ways to do this -

  • Install the GuardDuty agent through a Systems Manager document.
  • Install the agent manually by downloading RPM scripts.

Before installing the agent, let's make sure the kernel version is >= 5.10, as below -
[root@ecs-host-1b ~]# uname -r
5.10.205-195.804.amzn2.x86_64
[root@ecs-host-1b ~]#
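When several hosts have to be checked, the comparison can be scripted. This is a minimal sketch that parses the text printed by `uname -r`. The helper names are illustrative, and the default minimum of 5.10 simply follows this article's recommendation.

```python
import re


def parse_kernel_version(uname_r: str) -> tuple:
    """Extract (major, minor) from a `uname -r` string such as 5.10.205-195.804.amzn2.x86_64."""
    match = re.match(r"(\d+)\.(\d+)", uname_r.strip())
    if match is None:
        raise ValueError(f"unrecognized kernel version: {uname_r!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_meets_minimum(uname_r: str, minimum: tuple = (5, 10)) -> bool:
    """True when the kernel is at least `minimum` (major, minor)."""
    # Tuples compare element by element, so (5, 10) >= (5, 4) works as expected.
    return parse_kernel_version(uname_r) >= minimum
```

On a host, the input can come from `platform.release()` in the standard library, which returns the same string as `uname -r` on Linux.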

Here, we have chosen to install the GuardDuty agent using Systems Manager -

  • Open the Systems Manager console and click on the "Documents" option. Look for the "AmazonGuardDuty-ConfigureRuntimeMonitoringSsmPlugin" document.
  • Provide the package name "AmazonGuardDuty-RuntimeMonitoringSsmPlugin".
  • Choose the instances where the agent needs to be installed and click Run.
  • After a successful installation, validate the agent status by running
    • sudo systemctl status amazon-guardduty-agent
[root@ecs-host-1b ~]# sudo systemctl status amazon-guardduty-agent
● amazon-guardduty-agent.service - Amazon GuardDuty Agent
   Loaded: loaded (/usr/lib/systemd/system/amazon-guardduty-agent.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2024-01-26 16:07:07 UTC; 4min 42s ago
 Main PID: 22504 (amazon-guarddut)
    Tasks: 14
   Memory: 114.6M (limit: 128.0M)
   CGroup: /system.slice/amazon-guardduty-agent.service
           └─22504 /opt/aws/amazon-guardduty-agent/bin/amazon-guardduty-agent --worker-threads 8
  • Now we can see that both the EC2 instances and the ECS cluster are in a healthy state in the GuardDuty console, as below -

Image description
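The `systemctl status` output from the earlier step can also be checked by a script, for example when validating many hosts. The sketch below parses the text format shown in this article. It assumes memory is reported in megabytes ("M"), as in that output, and the function name is illustrative.

```python
import re

# Patterns for the "Active:" and "Memory:" lines of `systemctl status` output.
ACTIVE_RE = re.compile(r"^\s*Active:\s+(\w+)\s+\((\w+)\)", re.MULTILINE)
MEMORY_RE = re.compile(r"^\s*Memory:\s+([\d.]+)M\s+\(limit:\s+([\d.]+)M\)",
                       re.MULTILINE)


def parse_agent_status(status_text: str) -> dict:
    """Summarize whether the agent is running and how much memory it uses."""
    active = ACTIVE_RE.search(status_text)
    memory = MEMORY_RE.search(status_text)
    return {
        "running": bool(active)
        and active.group(1) == "active"
        and active.group(2) == "running",
        "memory_mb": float(memory.group(1)) if memory else None,
        "memory_limit_mb": float(memory.group(2)) if memory else None,
    }
```

The text can be captured with `subprocess.run(["systemctl", "status", "amazon-guardduty-agent"], capture_output=True, text=True).stdout` and passed to `parse_agent_status`. In the output above, the agent is using 114.6M of its 128.0M limit.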

Validation:

Once the agent is reporting successfully, security findings start to appear in the GuardDuty console. To generate findings quickly, we used the https://github.com/awslabs/amazon-guardduty-tester/tree/master repository. After some time, a couple of findings are visible in the console, as below.

Image description
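To separate Runtime Monitoring findings from the rest, the finding type can be used: GuardDuty finding types follow the pattern `ThreatPurpose:ResourceType/ThreatFamilyName`, and runtime findings use `Runtime` as the resource type. The sketch below filters a list of findings on that basis. It assumes each finding is a dictionary with a `Type` key, as returned by the GetFindings API, and the sample input in the usage note is hypothetical rather than output from this setup.

```python
def runtime_findings(findings: list) -> list:
    """Keep only findings whose type marks them as Runtime Monitoring findings."""
    return [f for f in findings if ":Runtime/" in f.get("Type", "")]


def finding_types(findings: list) -> list:
    """Return the sorted, de-duplicated finding types for a quick overview."""
    return sorted({f["Type"] for f in findings if "Type" in f})
```

For example, given `[{"Type": "Execution:Runtime/NewBinaryExecuted"}, {"Type": "UnauthorizedAccess:EC2/SSHBruteForce"}]`, `runtime_findings` keeps only the first entry.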

I hope this article helps you get the configuration working as expected. Here are a few important links that you might need while setting this up -

  1. https://docs.aws.amazon.com/guardduty/latest/ug/how-runtime-monitoring-works-ec2.html
  2. https://docs.aws.amazon.com/guardduty/latest/ug/prereq-runtime-monitoring-ec2-support.html
  3. https://docs.aws.amazon.com/guardduty/latest/ug/managing-gdu-agent-ec2-manually.html
  4. https://docs.aws.amazon.com/guardduty/latest/ug/runtime-monitoring-agent-release-history.html

Thanks for reading!! Let's connect on https://www.linkedin.com/in/anirban-das-507816169/
Happy Learning !! Cheers!!
