DEV Community

Paul Eggerling-Boeck

Posted on • Originally published at Medium on

Spring Boot Logging for AWS CloudWatch

What I learned while implementing JSON logging in my Spring Boot-enabled Java Lambda function.

As I was working on my weather-data-collecting AWS Lambda function, I came across an article suggesting structured JSON logging as a best practice for using CloudWatch logs. In this article, I’ll go over what I learned while implementing such logging in my Spring Boot-enabled Java Lambda function.


Photo by Mildly Useful on Unsplash

Since I’m using the Spring Boot framework in my Lambda function, I get Logback logging configured by default, and I decided to run with that framework as the path of least resistance. With the default configuration I could generate simple log statements, but going beyond that to a more structured log format was going to require overriding the default configuration. So off to the internet I went in search of answers, starting with Google searches for examples of how to implement structured JSON logging with Logback. I found a number of examples and was able to put together a logback.xml file that used the net.logstash.logback.encoder.LogstashEncoder encoder to produce JSON output.

The problem was that the configuration didn’t work. My Spring Boot application wouldn’t start; all I could see was that there was an error in the Logback configuration, with no detail about what that error actually was. I could not, for the life of me, figure out what was wrong.

That’s when I turned to ChatGPT. I provided my current logback.xml file and the error message I was getting. Without any further prompting, ChatGPT told me that I needed to add a dependency on logstash-logback-encoder. None of the other articles/examples I had found mentioned needing a separate encoder dependency beyond the basic logback-classic dependency that comes along with spring-boot-starter. Once I added that dependency, I was off to the races!
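For reference, here’s roughly what that dependency looks like in a Maven pom.xml (the version shown is just an example; check Maven Central for the current release):

```xml
<!-- Provides LogstashEncoder and related classes; Logback itself
     comes in transitively via spring-boot-starter -->
<dependency>
  <groupId>net.logstash.logback</groupId>
  <artifactId>logstash-logback-encoder</artifactId>
  <version>7.4</version>
</dependency>
```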

Here’s the first Logback configuration file I used:

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <include resource="org/springframework/boot/logging/logback/base.xml"/>
  <appender name="jsonConsoleAppender" class="ch.qos.logback.core.ConsoleAppender">
    <encoder class="net.logstash.logback.encoder.LogstashEncoder"/>
  </appender>
  <root level="INFO">
    <appender-ref ref="jsonConsoleAppender"/>
  </root>
</configuration>

Which generated a log output that looked like this:

2023-05-21T14:06:30.778-05:00 INFO 45777 --- [main] org.weathertracker.Handler : My Log Message
{"@timestamp":"2023-05-21T14:06:30.778622-05:00","@version":"1","message":"My Log Message","logger_name":"org.weathertracker.Handler","thread_name":"main","level":"INFO","level_value":20000}

Yep, there’s definitely JSON-formatted information in there! But, I have two problems with log statements that look like this:

  1. The JSON is pretty hard to read in that single-line format.
  2. The logger name and log message are printed twice: once on the first line of the log entry, and once in the JSON body.

So I asked ChatGPT what modifications would be needed to ‘pretty print’ the JSON. Here’s the configuration it suggested (adding the jsonGeneratorDecorator element).

<configuration>
    <include resource="org/springframework/boot/logging/logback/base.xml" />

    <appender name="jsonConsoleAppender" class="ch.qos.logback.core.ConsoleAppender">
        <encoder class="net.logstash.logback.encoder.LogstashEncoder">
            <jsonGeneratorDecorator class="net.logstash.logback.decorate.PrettyPrintingJsonGeneratorDecorator" />
        </encoder>
    </appender>

    <root level="INFO">
        <appender-ref ref="jsonConsoleAppender" />
    </root>
</configuration>

Which generated log statements that looked like this:

2023-05-21T14:25:57.280-05:00 INFO 46244 --- [main] org.weathertracker.Handler : My Log Message
{
  "@timestamp" : "2023-05-21T14:25:57.280827-05:00",
  "@version" : "1",
  "message" : "My Log Message",
  "logger_name" : "org.weathertracker.Handler",
  "thread_name" : "main",
  "level" : "INFO",
  "level_value" : 20000
}

So that solved the first problem, but I was still getting the duplicated first line of the log statement. I asked ChatGPT how to get rid of that duplication, and here’s what it suggested (adding a new logger element with additivity="false"):

<configuration>
    <include resource="org/springframework/boot/logging/logback/base.xml" />

    <appender name="jsonConsoleAppender" class="ch.qos.logback.core.ConsoleAppender">
        <encoder class="net.logstash.logback.encoder.LogstashEncoder">
            <jsonGeneratorDecorator class="net.logstash.logback.decorate.PrettyPrintingJsonGeneratorDecorator" />
        </encoder>
    </appender>

    <!-- Disable the default pattern layout -->
    <root level="INFO">
        <appender-ref ref="jsonConsoleAppender" />
    </root>

    <!-- Additivity is set to false to prevent duplication of logs in the parent appenders -->
    <logger name="<your package name>" level="INFO" additivity="false">
        <appender-ref ref="jsonConsoleAppender" />
    </logger>
</configuration>

That did the trick! Now each log statement is a simple block of JSON without the duplicated leading line, which is exactly what I was looking for. My JSON-formatted logs looked great when I was running the application locally and watching the console output, so it was time to upload my Lambda function to AWS and see what the CloudWatch logs looked like. Here’s what my first attempt generated:

Screenshot of a partial AWS CloudWatch log listing

Awesome, the JSON formatting is there! But what’s up with every line of the JSON object being a separate CloudWatch log entry? That’s not gonna work at all. Imagine searching thousands of CloudWatch log entries trying to find an ERROR entry to troubleshoot an issue. With this setup, all you’d find would be the lines that said "level": "ERROR". Not super useful. So I went round and round with different configurations for another couple of hours trying to get exactly what I needed. Here’s what I finally landed on.

<configuration>
  <include resource="org/springframework/boot/logging/logback/base.xml"/>
  <appender name="jsonConsoleAppender" class="ch.qos.logback.core.ConsoleAppender">
    <encoder class="net.logstash.logback.encoder.LoggingEventCompositeJsonEncoder">
      <providers>
        <pattern>
          <pattern>
            {
            "level": "%level",
            "message": "%msg",
            "logger": "%logger",
            "stacktrace": "%ex"
            }
          </pattern>
        </pattern>
      </providers>
    </encoder>
  </appender>

  <!-- Disable the default pattern layout -->
  <root level="WARN">
    <appender-ref ref="jsonConsoleAppender"/>
  </root>

  <!-- Additivity is set to false to prevent duplication of logs in the parent appenders -->
  <logger name="org" level="INFO" additivity="false">
    <appender-ref ref="jsonConsoleAppender"/>
  </logger>
</configuration>

The trick, I finally realized, lies in removing the jsonGeneratorDecorator used for pretty printing, switching the encoder to LoggingEventCompositeJsonEncoder, and adding a pattern provider. With this setup, I was able to customize the format of my log messages and have them display the way I wanted in CloudWatch logs. In CloudWatch, the log message displays as a single line in the summary, but when you expand the log entry as shown below, you get the full JSON object in a single log entry.

Screenshot of an AWS Cloudwatch log entry.

This came at the expense of pretty printing the JSON object when testing the function locally, but that’s a tradeoff I’m willing to accept; I can always add the jsonGeneratorDecorator back and pretty print locally when I’m testing. You may notice that I also removed the timestamp and thread_name elements from the JSON object: every CloudWatch entry gets a timestamp by default when it’s ingested, and since I don’t do any multi-threading, I didn’t need that information in my logs.

By the way, there’s nothing special that needs to be done on the Java side to get this functionality in the log statements. All of the above example log statements were generated by this one line of Java code:

LOG.info("My Log Message");
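For completeness, here’s a minimal sketch of what that Java side looks like. The class name matches the org.weathertracker.Handler logger name seen in the output above, but the handle() method is just a placeholder; the SLF4J API comes in transitively via spring-boot-starter:

```java
package org.weathertracker;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class Handler {

    // SLF4J logger backed by Logback, which picks up logback.xml
    // from the classpath automatically
    private static final Logger LOG = LoggerFactory.getLogger(Handler.class);

    public void handle() {
        LOG.info("My Log Message");
    }
}
```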

Remember that article about Lambda best practices that I mentioned at the beginning of this article? The other best practice I read about is associating the Lambda request ID with each log entry. You might be able to imagine a real-world environment where a Lambda function is being executed dozens or even hundreds of times per second. For better or worse, all of the executions of a single Lambda function share the same CloudWatch log group. So when looking at the basic listing of log entries in a CloudWatch log group, you’ll potentially see messages from many function invocations interleaved with one another. Trying to get valuable information from logs in that situation is a real disaster.

To get around this issue, you can leverage the Mapped Diagnostic Context (MDC) functionality built into SLF4J and Logback. All it takes is two small changes, one to the logback.xml configuration, and one to the Java code. Let’s start with the Java code. We can add the Lambda request ID to every log entry generated, regardless of logger class, by adding the ID to the MDC context with a single statement:

MDC.put("awsRequestId", context.getAwsRequestId());
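In context, the handler might look something like this sketch (the String input/output types are placeholders for whatever event types your function actually uses; Context and RequestHandler come from the aws-lambda-java-core library):

```java
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import org.slf4j.MDC;

public class Handler implements RequestHandler<String, String> {

    @Override
    public String handleRequest(String input, Context context) {
        // Put the request ID into the MDC before any logging happens,
        // so every log entry in this invocation carries it
        MDC.put("awsRequestId", context.getAwsRequestId());

        // ... function logic and logging goes here ...
        return "done";
    }
}
```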

I added that statement as the first line in my Lambda function handler, which receives the Context object as a parameter when AWS invokes the function. To output the request ID in the log entries, I only needed to add one line to the pattern element in the Logback XML configuration so that it looks like this:

<pattern>
  {
  "level": "%level",
  "message": "%msg",
  "logger": "%logger",
  "awsRequestId": "%X{awsRequestId}",
  "stacktrace": "%ex"
  }
</pattern>

The magic lies in the pattern “%X{awsRequestId}” which pulls the value from the MDC context and places it in the log entry. Easy peasy, right!? Now I have a way to filter my CloudWatch log entries so that I can isolate entries from a single Lambda function invocation. Filtering CloudWatch logs is a bit beyond the scope of this (already long) article, so I won’t dive into the details, but the AWS documentation covers it well.
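As one hypothetical example, a CloudWatch Logs Insights query that isolates a single invocation might look like this (the request ID value is a placeholder; Logs Insights discovers fields in JSON log lines automatically):

```
fields @timestamp, level, message
| filter awsRequestId = "f8ab35cc-0000-0000-0000-000000000000"
| sort @timestamp asc
```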

One thing to note about viewing log entries generated with this configuration in CloudWatch: if you’re looking at the listing in the Log groups section of the CloudWatch console, you will not see nice line wrapping of a multi-line message or a stack trace. However, if you view the log in the Logs Insights section of the console, you will see nicely formatted multi-line messages and stack traces.

Was this article helpful? Did you learn something? Was it worth your time, or a waste of time? I’d love to hear from you if you have questions or feedback on this article and/or if I can help you get past any stumbling blocks you may have encountered along the way!

