
Tanja Bayer for Cubesoft GmbH


Mastering Log Retrieval in ECS Services: From Initial Trials to Advanced Solutions

In containerized applications, managing and accessing logs is a common challenge for developers and system administrators alike. Effective log management is crucial for operational efficiency, security, and prompt troubleshooting, but the variety of applications and the complexity of container orchestration platforms can make it a daunting task. This blog post covers a common yet intricate log management problem in Amazon ECS (Elastic Container Service) environments. We ran into it firsthand while working with various applications, including Shopware, where plugin-generated logs posed their own set of hurdles. Our path from an initial, straightforward approach to a more robust solution offers lessons for anyone facing similar log management challenges in containerized services.

The Problem

The challenge we faced was integrating Shopware's file-based logs into a centralized logging system such as AWS CloudWatch. Containerized applications are usually expected to log to stdout and stderr, where the container runtime can pick the output up. Shopware instead writes its logs to separate files, which complicates capturing and monitoring them in Amazon ECS environments.

The First Approach

Initially, we attempted to tackle the log integration issue using a configuration within supervisord.conf. Our strategy was to employ the tail command to continuously read the latest entries from Shopware's log files and redirect these entries to the standard output (stdout) and standard error (stderr). This method was designed to mimic the behavior of console logging, thereby facilitating the integration of these logs into AWS CloudWatch. The relevant part of our supervisord.conf looked something like this:

[program:prod_logs]
command=tail -f /sw6/var/log/prod*.log
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0
autorestart=false
startretries=0

This setup aimed to ensure that all log data from Shopware's files were captured as if they were console outputs, theoretically allowing for seamless integration into CloudWatch.

However, we encountered a significant limitation. The tail command in our supervisor configuration was initiated at the start of the container's lifecycle. This meant that if the log files specified did not exist at startup (a common occurrence with dynamically created plugin logs), tail would fail to track these files, resulting in missed log entries. Consequently, this approach proved to be partially effective but not entirely reliable, especially for logs generated after the initial startup process.
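To make the limitation concrete, here is a minimal sketch of the difference between resolving a file pattern once at startup and resolving it again later. The `Tailer` class, the `matchesGlob` helper, and the file names are hypothetical stand-ins for illustration; they do not reproduce how `tail` or supervisord work internally.

```typescript
// Minimal glob matcher supporting only "*" (enough for "prod*.log").
function matchesGlob(pattern: string, fileName: string): boolean {
  const escaped = pattern
    .split("*")
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp(`^${escaped}$`).test(fileName);
}

// Hypothetical stand-in for a tailer: it remembers which files it follows.
class Tailer {
  private followed: string[] = [];

  // Resolve the pattern against the files that exist right now.
  scan(pattern: string, existingFiles: string[]): void {
    for (const file of existingFiles) {
      if (matchesGlob(pattern, file) && this.followed.indexOf(file) === -1) {
        this.followed.push(file);
      }
    }
  }

  isFollowing(file: string): boolean {
    return this.followed.indexOf(file) !== -1;
  }
}

const filesAtStartup = ["prod-2024-01-01.log"];
// A plugin creates its own log file some time after the container started.
const filesLater = filesAtStartup.concat(["prod_my_plugin.log"]);

// One-shot behaviour: the pattern is resolved once, at container start.
const oneShot = new Tailer();
oneShot.scan("prod*.log", filesAtStartup);

// Rescanning behaviour: the pattern is resolved again at a later point.
const rescanning = new Tailer();
rescanning.scan("prod*.log", filesAtStartup);
rescanning.scan("prod*.log", filesLater);

console.log(oneShot.isFollowing("prod_my_plugin.log"));    // false
console.log(rescanning.isFollowing("prod_my_plugin.log")); // true
```

The one-shot tailer never learns about the plugin's log file, which matches the missed entries we observed.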

Recognizing these limitations, we realized the need for a more robust solution that could dynamically adapt to the creation of new log files and ensure comprehensive log capture for our monitoring needs.

A Better Approach

Realizing the limitations of our initial method, we shifted our focus towards a more dynamic and robust solution: the implementation of the FireLens Log Driver in our ECS service. FireLens, a log router for Amazon ECS, offers the flexibility to route container logs to various destinations, including AWS CloudWatch, with enhanced processing capabilities.

We configured our ECS tasks to use FireLens, specifically utilizing Fluent Bit, an open-source log processor and forwarder. This setup enabled us to define custom log routing and processing rules, tailored to the unique structure of Shopware's log files.

Here’s an overview of how we configured FireLens with Fluent Bit:

  1. Task Definition Enhancement: We modified our ECS task definitions to include the FireLens configuration, specifying Fluent Bit as the log router.
  2. Custom Log Processing: Fluent Bit allowed us to create custom configurations to tail the Shopware log files, process the log data (like adding metadata, filtering, or transforming log formats), and then route them effectively to CloudWatch.
  3. Dynamic Log File Handling: Unlike our initial approach, Fluent Bit's ability to dynamically detect and tail new log files resolved the issue of missing logs that were created after the container started.
  4. Streamlined Integration: This setup streamlined the integration of Shopware’s logs into CloudWatch, ensuring that all log data, regardless of when it was generated, was consistently captured and made available for monitoring and analysis.

By leveraging FireLens's flexibility and Fluent Bit’s dynamic file tailing capabilities, we successfully integrated Shopware's logs into CloudWatch, enhancing our overall monitoring and operational efficiency in the ECS environment.

Implementation Details

Implementing the FireLens Log Driver in our ECS service involved several key steps. Below, I outline the process with generic code snippets, providing a blueprint that can be adapted to similar scenarios.

Configuring the Logs for the FireLens Container

First, we set up the FireLens container, ensuring that its logs, particularly errors, were captured for monitoring:

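import { LogDriver } from 'aws-cdk-lib/aws-ecs'; // assuming AWS CDK v2

// Sends the FireLens container's own output, including its errors, to CloudWatch
const containerLogDriver = LogDriver.awsLogs({
    streamPrefix: 'firelens', // Replace if you want
    logGroup: logGroup // an existing LogGroup construct
});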

FireLens Log Driver Setup

We configured the FireLens Log Driver with specific options to route logs to AWS CloudWatch:

import { FireLensLogDriver } from 'aws-cdk-lib/aws-ecs'; // assuming AWS CDK v2
import { v4 as uuid } from 'uuid'; // assumption: uuid() comes from the 'uuid' package

const fireLensLogDriver = new FireLensLogDriver({
    options: {
        Name: 'cloudwatch',
        region: 'your-region', // Specify the AWS region
        log_group_name: 'name-of-aws-log-group',
        log_stream_name: `/application/${uuid()}`, // Replace with a unique identifier
        log_key: 'log',
        auto_create_group: 'true'
    }
});
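As far as we understand FireLens, the options on the log driver are translated into a Fluent Bit `[OUTPUT]` section whose `Match` pattern targets that container's tag. The sketch below illustrates the translation under that assumption; `renderOutputSection` is a hypothetical helper written for this post, not ECS or CDK code, and the option values are placeholders.

```typescript
// Hypothetical helper: renders FireLens-style options as a Fluent Bit [OUTPUT] section.
function renderOutputSection(
  containerName: string,
  options: { [key: string]: string }
): string {
  const lines: string[] = ["[OUTPUT]"];
  lines.push(`    Name ${options["Name"]}`);
  // Container stdout/stderr records are tagged "<container name>-firelens-<task id>".
  lines.push(`    Match ${containerName}-firelens*`);
  for (const key of Object.keys(options)) {
    if (key === "Name") continue; // already written as the first entry
    lines.push(`    ${key} ${options[key]}`);
  }
  return lines.join("\n");
}

console.log(
  renderOutputSection("application", {
    Name: "cloudwatch",
    region: "eu-west-1",
    log_group_name: "name-of-aws-log-group",
    log_stream_name: "/application/example-id",
    auto_create_group: "true",
  })
);
```

Seeing the options as a generated `[OUTPUT]` section makes it easier to compare them with the hand-written sections in a custom configuration file.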

Task Definition for the Application Container

In the task definition for the application container, we included the FireLens Log Driver and specified mount points for log volumes:

const taskDefinition = new TaskDefinition(this, 'TaskDefinition', {
    // Task definition properties
    volumes: [
        // Other volumes
        { name: logVolumeName } // use a preferred name here
    ]
});

taskDefinition
    .addContainer('ApplicationContainer', {
        // Other container properties
        containerName: 'application',
        essential: true, 
        logging: fireLensLogDriver
    }).addMountPoints({
        sourceVolume: logVolumeName,
        containerPath: '/path/to/logs', // Path to the log directory; for Shopware: /sw6/var/log
        readOnly: false
    });

Custom FireLens Image

We created a custom Docker image for FireLens to tailor it to our specific needs. This involved starting with the base image from Amazon's AWS for Fluent Bit and adding our custom configuration:

# We run on ARM64; for x86, remove the --platform flag
FROM --platform=linux/arm64 amazon/aws-for-fluent-bit:latest

RUN mkdir -p /sw6/var/log
COPY logConfig.conf /logConfig.conf

Fluent Bit Configuration: logConfig.conf

The logConfig.conf file is crucial as it defines how Fluent Bit will process and forward logs:

[INPUT]
    Name              tail
    Tag               container-logs
    # Change if you have another directory, according to the path in the
    Path              /sw6/var/log/*.log
    DB                /sw6/var/log/flb_service.db
    DB.locking        true
    Skip_Long_Lines   On
    Refresh_Interval  10
    Rotate_Wait       30

[OUTPUT]
    Name cloudwatch
    # make sure the part before -firelens* matches your container name in the task definition
    Match   shopware-firelens*
    region eu-west-1
    log_group_name shopware-$(ecs_cluster)
    log_stream_name /console/$(ecs_task_id)
    auto_create_group true
    retry_limit 2


[OUTPUT]
    Name cloudwatch
    # use same name as in the Tag section of Input
    Match   container-logs
    region eu-west-1
    log_group_name shopware-$(ecs_cluster)
    log_stream_name /container/$(ecs_task_id)
    auto_create_group true
    retry_limit 2

This configuration allows us to tail all log files dynamically in the Shopware logs folder and manage console output effectively.
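The routing in logConfig.conf depends on the two Match patterns lining up with the tags of the incoming records: console output arrives with a tag of the form `<container name>-firelens-<task id>`, while the tailed files carry the tag set in the input section. The following sketch mimics that matching with a simplified wildcard check. The `tagMatches` and `routeTag` helpers and the task ID are hypothetical, and Fluent Bit's real matcher may differ in details.

```typescript
// Simplified Match semantics: "*" is a wildcard, everything else is literal.
function tagMatches(pattern: string, tag: string): boolean {
  const escaped = pattern
    .split("*")
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp(`^${escaped}$`).test(tag);
}

// Routing table mirroring the two Match patterns and log stream prefixes.
const outputRoutes = [
  { match: "shopware-firelens*", streamPrefix: "/console" },
  { match: "container-logs", streamPrefix: "/container" },
];

// Returns the stream prefixes of every output that would receive the tag.
function routeTag(tag: string): string[] {
  return outputRoutes
    .filter((route) => tagMatches(route.match, tag))
    .map((route) => route.streamPrefix);
}

console.log(routeTag("shopware-firelens-0123abcd"));    // ["/console"]
console.log(routeTag("container-logs"));                // ["/container"]
console.log(routeTag("application-firelens-0123abcd")); // []
```

The last call shows why the container name matters: console output from a container named `application` would not match `shopware-firelens*` and would reach neither output.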

Adaptations to the Dockerfile Used in the Service

If your service (Shopware, in our case) cannot write to its log files because of missing permissions, make sure you create the directory in your Dockerfile and declare it as a volume:

# Create the log directory and hand it to the application user before switching to that user
RUN mkdir -p /sw6/var/log && chown -R sw6:sw6 /sw6/var/log
USER sw6

VOLUME ["/sw6/var/log"]

Repository Creation and Docker Image Deployment

We then created a repository and deployed the custom Fluent Bit Docker image:

// Assumed imports (AWS CDK v2 and the cdk-docker-image-deployment package)
import { Repository } from 'aws-cdk-lib/aws-ecr';
import { Platform } from 'aws-cdk-lib/aws-ecr-assets';
import { DockerImageDeployment, Source, Destination } from 'cdk-docker-image-deployment';

const tag = Date.now().toString();
const fluentbitRepository = new Repository(this, 'aws-for-fluentbit');

// Allow the task's execution role to pull the custom Fluent Bit image
fluentbitRepository.grantPull(taskDefinition.obtainExecutionRole());

new DockerImageDeployment(this, 'fluentbit-docker-image-deployment', {
    // Make sure the platform matches the one specified in your Dockerfile
    source: Source.directory('/path/to/docker/fluent-bit/', { platform: Platform.LINUX_ARM64 }),
    destination: Destination.ecr(fluentbitRepository, {
        tag
    })
});

FireLens Log Router Configuration

Finally, we configured the FireLens log router with Fluent Bit:

import { ContainerImage, FirelensConfigFileType, FirelensLogRouterType } from 'aws-cdk-lib/aws-ecs'; // assuming AWS CDK v2

taskDefinition
    .addFirelensLogRouter('firelog-router', {
        firelensConfig: {
            type: FirelensLogRouterType.FLUENTBIT,
            options: {
                configFileType: FirelensConfigFileType.FILE,
                configFileValue: '/logConfig.conf', // path inside the custom Fluent Bit image
                enableECSLogMetadata: false
            }
        },
        logging: containerLogDriver,
        // The repository and tag the custom Fluent Bit image was deployed to
        image: ContainerImage.fromEcrRepository(fluentbitRepository, tag),
        // Additional properties
        containerName: 'firelens-router',
        cpu: cpuValue,
        memoryLimitMiB: memoryValue
    })
    .addMountPoints({
        sourceVolume: logVolumeName,
        containerPath: '/path/to/logs', // Same log directory as in the application container; must contain the Path tailed in logConfig.conf
        readOnly: false
    });
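The two addMountPoints calls only work together if both containers reference the same volume and the router's container path contains the directory that Fluent Bit's Path setting tails. A minimal sanity check for that could look like the sketch below. The `MountPoint` shape and the `sharedLogVolumeProblems` function are hypothetical and not part of the CDK API, and the volume name `logs` is a placeholder.

```typescript
interface MountPoint {
  sourceVolume: string;
  containerPath: string;
}

// Hypothetical sanity check: returns a list of detected misconfigurations.
function sharedLogVolumeProblems(
  app: MountPoint,
  router: MountPoint,
  tailPath: string
): string[] {
  const problems: string[] = [];
  if (app.sourceVolume !== router.sourceVolume) {
    problems.push("application and router mount different volumes");
  }
  // Directory part of the tail path, e.g. "/sw6/var/log" for "/sw6/var/log/*.log".
  const tailDir = tailPath.slice(0, tailPath.lastIndexOf("/"));
  // Router mount path without trailing slashes.
  const routerDir = router.containerPath.replace(/\/+$/, "");
  const tailDirInsideMount =
    tailDir === routerDir || tailDir.indexOf(routerDir + "/") === 0;
  if (!tailDirInsideMount) {
    problems.push(
      `router mounts ${router.containerPath}, but Fluent Bit tails ${tailPath}`
    );
  }
  return problems;
}

const appMount: MountPoint = { sourceVolume: "logs", containerPath: "/sw6/var/log" };
const goodRouterMount: MountPoint = { sourceVolume: "logs", containerPath: "/sw6/var/log" };
const badRouterMount: MountPoint = { sourceVolume: "logs", containerPath: "/var/log" };

console.log(sharedLogVolumeProblems(appMount, goodRouterMount, "/sw6/var/log/*.log").length); // 0
console.log(sharedLogVolumeProblems(appMount, badRouterMount, "/sw6/var/log/*.log").length);  // 1
```

A check like this could run in a unit test for the stack, so that a changed path in one place does not silently stop log collection.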

Permission Policy for CloudWatch Logs

Lastly, we added a policy to the task role to allow the necessary actions for logging to CloudWatch:

import { Effect, PolicyStatement } from 'aws-cdk-lib/aws-iam'; // assuming AWS CDK v2

const logPolicyStatement = new PolicyStatement({
    effect: Effect.ALLOW,
    actions: ['logs:CreateLogGroup', 'logs:CreateLogStream', 'logs:PutLogEvents', 'logs:DescribeLogStreams'],
    resources: ['*'] // Adjust as needed for specific log groups
});

// Fluent Bit runs inside the task, so the permissions belong on the task role
taskDefinition.addToTaskRolePolicy(logPolicyStatement);
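To narrow the `'*'` resource down, the policy can list the ARNs of the log groups Fluent Bit writes to. The sketch below builds such ARNs from a region, an account ID, and a log group name pattern. The `logGroupArns` helper is hypothetical, the account ID is a placeholder, and the `shopware-*` pattern is an assumption based on the log group names used in logConfig.conf.

```typescript
// Hypothetical helper: builds log group ARNs to use instead of "*".
function logGroupArns(
  region: string,
  accountId: string,
  logGroupPattern: string
): string[] {
  const groupArn = `arn:aws:logs:${region}:${accountId}:log-group:${logGroupPattern}`;
  // The second ARN covers the log streams inside the matching groups.
  return [groupArn, `${groupArn}:*`];
}

console.log(logGroupArns("eu-west-1", "123456789012", "shopware-*"));
```

The returned array could be passed as `resources` in the policy statement in place of `['*']`.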

These steps illustrate the general process we followed to configure FireLens with Fluent Bit for efficient log management. The configuration ensures dynamic log file handling and seamless integration of application logs into CloudWatch, significantly improving our monitoring and troubleshooting capabilities.

Conclusion

Our journey in mastering log retrieval in ECS services, particularly with Shopware, highlights the importance of adaptable and dynamic log management strategies in containerized environments. By moving from a basic supervisord.conf configuration to a more sophisticated approach using FireLens and Fluent Bit, we were able to overcome the challenges posed by file-based logging. This transition not only enhanced our ability to effectively integrate logs into AWS CloudWatch but also underscored the significance of choosing the right tools and configurations for specific use cases. Our experience demonstrates that with the right approach, even complex logging scenarios can be managed efficiently, ensuring comprehensive monitoring and analysis capabilities in cloud-based applications.

Hey there, dear readers! Just a quick heads-up: we're code whisperers, not Shakespearean poets, so we've enlisted the help of a snazzy AI buddy to jazz up our written word a bit. Don't fret, the information is top-notch, but if any phrases seem to twinkle with literary brilliance, credit our bot. Remember, behind every great blog post is a sleep-deprived developer and their trusty AI sidekick.

